
Our Database Will Betray Us
A wake-up call on why the web stopped fitting its data into neat little tables and what happened next.
It works perfectly right now. 100 users, no problems. But something is coming, and when it does, our carefully designed SQL schema will be the thing that kills our app.
The Story
Last semester, as a final project, I worked on a football pitch booking website for the College of Science and Technology. At first everything was perfect. The database schema was easy to understand. The tables were set up correctly and the relationships between them were clear.
After building the app, we started testing it. During stress, smoke, and integration testing, I got an error. We tested with 100 users and then with 1,000 users. Two users trying to book the same slot at the same time, and the database just… not knowing what to do with that. Pages timing out for no obvious reason. It was genuinely confusing because the code looked fine. The logic looked fine. Everything looked fine.
I used PostgreSQL, which is one of the best relational databases out there. But back then I didn’t know anything about NoSQL databases. I now think PostgreSQL was not a good choice, because the job isn’t just about storing data neatly in rows and columns. It’s about handling two people hitting the same slot at the exact same millisecond. It’s about locking and real-time availability.
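To make the "two people, one slot" problem concrete, here is a minimal sketch in Python. It assumes an in-memory dictionary stands in for the bookings table; `SlotBook`, `race`, and the slot name are hypothetical names made up for illustration, not code from the project.

```python
import threading

class SlotBook:
    """Hypothetical in-memory stand-in for a bookings table."""

    def __init__(self):
        self._bookings = {}             # slot -> user who holds it
        self._lock = threading.Lock()   # serialises check-and-book

    def book(self, slot, user):
        # The check and the write happen under one lock, so no second
        # user can slip in between "is it free?" and "it is mine".
        with self._lock:
            if slot in self._bookings:
                return False
            self._bookings[slot] = user
            return True

def race(book, slot, users):
    """Have every user try to book the same slot at the same moment."""
    results = {}
    barrier = threading.Barrier(len(users))

    def attempt(user):
        barrier.wait()                  # release all threads together
        results[user] = book.book(slot, user)

    threads = [threading.Thread(target=attempt, args=(u,)) for u in users]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

results = race(SlotBook(), "saturday-18:00", ["user_a", "user_b"])
winners = [user for user, ok in results.items() if ok]
```

Exactly one of the two attempts succeeds; which one depends on thread scheduling. In a real database the same guarantee usually comes from a transaction or a uniqueness constraint rather than an application-level lock.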
Looking back, the database wasn’t wrong. It just wasn’t the right tool for my job. A NoSQL database like MongoDB or DynamoDB, designed to handle high concurrency and real-time updates with far less lock contention, would probably have been a better fit.
| Users | What Happens |
|---|---|
| 100 users | Queries run in ~12ms. Everything is fine. |
| 10,000 users | Queries slow to 800ms. Page loads drag. Users notice. We refresh our monitoring dashboard obsessively. |
| 100,000 users | Database locks up. Connections pile up. The server returns 503 errors. Our inbox fills with angry emails. |
Here’s the question nobody asks themselves:
What changed? The code didn’t. The queries didn’t. The schema didn’t. Only the users did.
That’s the betrayal. The database that saved us at 100 users is the same one strangling us at 100,000. And the worst part? We designed it correctly. We normalized our tables. We added indexes. We followed every rule.
It just doesn’t matter at scale.
This is not a bug. It’s a fundamental property of how relational databases are built, and understanding it is the first step to understanding why NoSQL was invented.
The Proof
Even the Best Engineers Couldn’t Fix It
We might be thinking: “Maybe the app just wasn’t optimized enough. Better engineers would have made it work.”
Let’s look at what happened to some of the best engineering teams in the world.
Facebook - The Wall Problem
In 2007, Facebook’s social graph, who knows whom, who likes what, became impossible to model in relational tables. Every friend-of-a-friend query was a join across millions of rows. They eventually built their own graph storage engine. SQL just isn’t shaped like relationships.
→ Read the official Facebook Engineering post
Netflix - The Scale Problem
Netflix serves 200+ million users across the globe simultaneously. A single database server, even the most powerful one money can buy, cannot handle that. Netflix moved to Apache Cassandra, a NoSQL database that runs on hundreds of servers at once. They literally cannot go back.
→ Read the official Netflix Tech Blog post
Twitter - The Write Speed Problem
At peak, Twitter handles 500,000 tweets per second. A relational database typically sends every write to a single primary node. That bottleneck becomes lethal at this volume. Twitter routes writes to distributed systems that don’t need to coordinate the way an RDBMS does.
→ Read the official Twitter Engineering post
Amazon - The Latency Problem
Amazon’s internal research found that every 100ms of latency costs them 1% in sales. They built DynamoDB, one of the world’s most influential NoSQL databases, because even tiny query slowdowns at their scale translate to millions in lost revenue.
→ Read the official Amazon Science post
These aren’t startup mistakes. These are the companies that wrote the textbooks your professors are teaching from. And they all hit the same wall.
The relational database wasn’t broken. It was just designed for a world where “massive scale” meant a few thousand concurrent users, not a few hundred million. The internet changed the definition of scale. The database had to follow.
What If Some Databases Return Wrong Data, On Purpose?
Our relational database makes us a promise: every read returns the most recent, correct version of our data. Always. No exceptions. This promise is called ACID.
But it comes at a cost. To keep that promise, the database must lock rows, check constraints, synchronize every node, and make the entire cluster agree before responding. At scale, all of that coordination becomes your bottleneck.
So distributed systems engineers asked a radical question: What if we relax the promise?

Introducing BASE - The NoSQL Philosophy
BASE is the alternative to ACID. It does not promise that everything will be perfect. It promises something more practical: the system will keep working, it will keep responding quickly, and every node will eventually have the same data, even if for a short time some users see something a little different from what other users see.
BA - Basically Available
The system always sends a response. Even if one node is down, another handles the request. No timeouts, no errors, maybe slightly stale data, but something, not nothing.
S - Soft State
The state of the system is fluid and always shifting. Nodes are constantly communicating to sync. Data is “in motion”, settling toward truth rather than being rigidly locked in it.
E - Eventually Consistent
If we write data to Node A, Nodes B and C will catch up shortly afterwards, typically within milliseconds. Not instantly, but soon. For most use cases, “soon” is more than good enough.
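The three properties above can be sketched as a toy cluster. This is only an illustration, assuming a made-up `Cluster` class in which a write lands on one node and is shipped to the others later; real databases replicate continuously in the background rather than through an explicit `sync()` call.

```python
from collections import deque

class Replica:
    """One node holding its own copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}

class Cluster:
    """Hypothetical cluster: writes land on one node, others catch up later."""

    def __init__(self, names):
        self.nodes = {name: Replica(name) for name in names}
        self.pending = deque()   # (target node, key, value) not yet delivered

    def write(self, node, key, value):
        # Apply locally and answer immediately (basically available).
        self.nodes[node].data[key] = value
        for other in self.nodes:
            if other != node:
                self.pending.append((other, key, value))

    def read(self, node, key):
        # Always answers, even if this node is behind (soft state).
        return self.nodes[node].data.get(key)

    def sync(self):
        # Deliver every queued update (eventually consistent).
        while self.pending:
            target, key, value = self.pending.popleft()
            self.nodes[target].data[key] = value

cluster = Cluster(["A", "B", "C"])
cluster.write("A", "likes", 10_000)
stale = cluster.read("B", "likes")   # B has not caught up yet
cluster.sync()
fresh = cluster.read("B", "likes")   # B now agrees with A
```

Before `sync()` runs, Node B answers with whatever it has (here, nothing yet); afterwards every node returns the same value.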
Here’s BASE in real life: when we tap the Like button on an Instagram post, the number we see may be wrong for a short time, because servers in different parts of the world are still sharing the update with each other. A moment later the Like count is correct again. We do not even notice this happening. What matters to us is the Like itself, not how the count gets updated.
But if our bank balance worked that way? We’d care enormously. That’s why banks still use ACID. The right model depends entirely on your problem.
The Iron Rule
The Law That Governs Every Distributed Database
In 2000, a computer scientist named Eric Brewer proposed something uncomfortable, and in 2002 Seth Gilbert and Nancy Lynch proved it formally. A distributed database system cannot guarantee Consistency, Availability, and Partition Tolerance all at the same time. At most, we can guarantee two of these three properties. We cannot have all three. Ever. Understanding this balance helps developers choose the right priorities and build systems that perform reliably in real-world distributed environments.
The theorem is frequently associated with NoSQL databases, because they scale out (horizontally) easily.
CAP Theorem Properties
- Consistency
Every node shows the same information. All clients see the same thing at the same time. The data is always up to date.
Example: Let us say we have ₹500. Then we spend ₹200. Now every database node must show that we have ₹300. It is not okay if one node says we have ₹500 and another node says we have ₹300. The ₹500 is the old data and the ₹300 is the new data. Every database node must have the new data, which is ₹300.

- Availability
Every time we make a request, we get a response. No part of the system will ever tell us “Error: try again later!” Even if some parts of the system are not working, the system still gives us a response and keeps going.
Example: User B is far from User A but tries to subscribe. An available system must process that request, no matter the distance or node state.

- Partition Tolerance
The system keeps going even if the nodes lose contact with each other. If a network cable breaks, the system does not crash. It just carries on.
Example: A network outage splits the database into two halves. User B still sees the subscriber count from the replica; the system stayed alive despite the split.

Brewer’s CAP theorem: in a distributed system, we can only pick two.
Possible combinations
CP - Consistency + Partition Tolerance
When we choose CP, our system stays consistent even when partitions happen, but there is no guarantee that it will be fast and available all the time. If a partition happens, the system prioritizes consistency over availability, which means that some requests may be rejected or delayed until the partition is resolved.
Refuses requests during a network split rather than return stale data. Correct or silent. Used in banking, ticket booking, stock markets.
(Dis)Advantages:
- Slower performance;
- Consistent system;
- System/nodes can be unavailable.
Examples: Apache HBase, MongoDB, Redis.
Real-World Example - The Bank Balance
We have ₹500 in our account. We open our banking app on our phone and transfer ₹200 to a friend. At the exact same moment, our wife uses the debit card at a shop to pay ₹400.
Both transactions hit different database nodes at the same time.
Now imagine the bank used an AP system - prioritizing availability over consistency.
- Node A processes our transfer: sees ₹500, deducts ₹200, shows ₹300
- Node B processes the card payment: also sees ₹500 (hasn’t synced yet), deducts ₹400, shows ₹100
Both transactions go through. But we only had ₹500. We effectively spent ₹600, so the bank is short ₹100 that we never had. This is why banks use CP systems.
When our wife swipes the card, the system locks the account, checks the real balance across all nodes, and only then approves or declines. If two nodes can’t agree, one transaction gets blocked, not both approved.
Correct or silent. Never wrong.
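The arithmetic of that scenario can be sketched in a few lines. This is a toy model under stated assumptions: `ap_style` pretends each node approves against its own unsynced copy of the balance, and `CPAccount` uses a single lock as a stand-in for cross-node agreement. Both names are hypothetical, and no real bank works from a Python dictionary.

```python
import threading

START_BALANCE = 500

def ap_style(debits):
    """Each node checks its own stale copy, so every debit sees the full 500."""
    approved = [amount for amount in debits if amount <= START_BALANCE]
    return approved, START_BALANCE - sum(approved)

class CPAccount:
    """One agreed balance; the lock stands in for agreement between nodes."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def debit(self, amount):
        with self._lock:
            if amount > self.balance:
                return False        # declined: not enough real money left
            self.balance -= amount
            return True

# AP: the transfer (200) and the card payment (400) are both approved.
approved, ap_final = ap_style([200, 400])

# CP: the second debit is checked against the real remaining balance.
account = CPAccount(START_BALANCE)
cp_results = [account.debit(200), account.debit(400)]
```

In the AP version both debits pass and the balance ends at ₹-100; in the CP version the ₹400 payment is declined and the balance stays at ₹300.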
AP - Availability + Partition Tolerance
Choosing AP, our system loses consistency but gains availability (it responds to every request, though responses can differ between nodes) and speed (nodes do not need to agree before responding, so requests never wait for synchronization).
Here, when a write happens on one node, another node won’t have the same state right away, because they aren’t synchronized yet. Also, when a partition happens, the partitioned node can still respond and accept writes, even with an outdated state.
Always responds, even during failures. Data might be slightly stale; it corrects later. Used in social media, streaming, content platforms. Example: Facebook like counts or Twitter timelines.
(Dis)Advantages:
- High availability;
- Fast;
- Poor consistency.
Examples: DynamoDB, Cassandra, CouchDB
Real-World Example - The Instagram Like Count
We are scrolling Instagram. A viral post just hit 10,000 likes. We see 9,998 likes. Our friend sitting next to us, on the same post, sees 10,000 likes.
We are both looking at the same post. We are seeing different numbers.
Is the app broken? No. This is AP working exactly as designed.
Instagram’s servers are spread across data centres in the US, Europe, and Asia. When someone in Japan likes the post, that like hits the Japan node first. The node serving us in Bhutan hasn’t received that update yet; it takes a few milliseconds to sync.
For those few milliseconds, different nodes show different counts. Then they catch up. We refresh. We see 10,000.
Nobody was harmed. The app never went down. That is the trade-off AP makes, always available, eventually correct.
What about CA - Consistency + Availability?
CA sounds ideal, but it’s a trap. CA systems cannot tolerate network partitions, which means they must run on a single server (monolithic). The moment you distribute across multiple machines, partitions become inevitable. So CA is only possible without distribution, which defeats the whole purpose of building a distributed system. This is why real-world distributed databases only choose CP or AP.

So what’s the actual choice?
In real distributed systems, P (Partition Tolerance) is not optional. Networks fail. Cables break. Cloud regions go down. You must tolerate partitions or your system crashes the moment anything goes wrong. So the real choice is always: CP or AP?
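The CP-or-AP choice comes down to what a node does when it cannot reach its peers. A minimal sketch, assuming a hypothetical `Node` class with a `mode` flag (real systems make this decision through replication and consensus protocols, not a single attribute):

```python
class Unavailable(Exception):
    """Raised by a CP node that cannot confirm it holds the latest data."""

class Node:
    def __init__(self, mode, value):
        self.mode = mode            # "CP" or "AP"
        self.value = value          # last value this node received
        self.partitioned = False    # True while cut off from its peers

    def read(self):
        if self.partitioned and self.mode == "CP":
            # Correct or silent: refuse rather than risk a stale answer.
            raise Unavailable("cannot reach peers; refusing to answer")
        # AP: answer anyway, even though the value may be out of date.
        return self.value

cp, ap = Node("CP", 300), Node("AP", 300)
cp.partitioned = ap.partitioned = True   # the network splits

ap_answer = ap.read()
try:
    cp.read()
    cp_answer = "answered"
except Unavailable:
    cp_answer = "refused"
```

During the split the AP node still returns its last known value, while the CP node refuses until it can talk to its peers again.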
Choose CP when…
- Bank transfers - wrong balance = disaster
- Ticket booking - two users, one last seat
- Stock trading - stale price = lawsuits
Choose AP when…
- Social media likes - off by 3 for 50ms? Fine.
- Netflix streaming - availability matters most
- News feeds - slightly stale is acceptable
Tunable Consistency - the best of both worlds
Many modern NoSQL systems like Cassandra don’t lock you into one choice globally. We can set consistency per operation: a strong quorum for a payment write, a relaxed level for a social feed read. This is called tunable consistency, and it’s one of the most powerful features in distributed database design.
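The idea behind quorums can be sketched with a toy model. With N replicas, a write acknowledged by W of them and a read that asks R of them must share at least one replica whenever W + R > N, so the read sees the latest write. The functions below are hypothetical and deliberately pick the worst case; this is not the Cassandra driver API, only the counting argument behind consistency levels such as ONE and QUORUM.

```python
N = 3  # replicas per key (assumed replication factor)

def write(replicas, version, value, w):
    """Acknowledge the write once the first w replicas have stored it."""
    for i in range(w):
        replicas[i] = (version, value)

def read(replicas, r):
    """Ask r replicas and keep the answer with the highest version.

    Worst case on purpose: we contact the replicas least likely to have
    seen the write (the last r in the list).
    """
    contacted = replicas[-r:]
    return max(contacted, key=lambda pair: pair[0])[1]

def overlaps(w, r, n=N):
    """True when every read set must share a replica with every write set."""
    return w + r > n

# Payment: quorum write plus quorum read (2 + 2 > 3).
payment = [(0, "unpaid")] * N
write(payment, 1, "paid", w=2)
payment_read = read(payment, r=2)

# Social feed: one acknowledgement each way (1 + 1 <= 3).
feed = [(0, "old post")] * N
write(feed, 1, "new post", w=1)
feed_read = read(feed, r=1)
```

The payment read always sees `"paid"`, while the feed read can still return `"old post"` until the remaining replicas catch up. That is the trade: the relaxed setting waits on fewer nodes, so it is faster.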
Not all data fits into rows and columns
The truth is that if we built our system using only relational tables, we have already made a mistake. This is not because SQL is bad; it is actually really good at what it does. However, not every problem can be solved with tables.
Let’s think about this: how do we store a social network in a table? We could have a table called friends with two columns, one for the user id and one for the friend id. If we want to find the friends of our friends, we have to join the friends table to itself, and join it again for every extra hop. It works, but each self-join gets slower as the table grows, and with millions of rows it becomes painfully slow. The data is fine; the table is simply the wrong shape for this kind of question.
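For contrast, here is a minimal sketch of the same question asked of an adjacency list, which is roughly how a graph-shaped store thinks about it. The names in `graph` and the helpers `within_hops` and `friends_of_friends` are made up for illustration.

```python
from collections import deque

def within_hops(graph, start, max_hops):
    """Breadth-first search: return {person: hops} within max_hops of start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if seen[person] == max_hops:
            continue                     # do not walk past the hop limit
        for friend in graph.get(person, ()):
            if friend not in seen:
                seen[friend] = seen[person] + 1
                queue.append(friend)
    return seen

def friends_of_friends(graph, person):
    """People exactly two hops away: friends of friends, not direct friends."""
    hops = within_hops(graph, person, 2)
    return {name for name, distance in hops.items() if distance == 2}

# Hypothetical friendship data: each person maps to their direct friends.
graph = {
    "asha": {"bikram"},
    "bikram": {"asha", "chime"},
    "chime": {"bikram", "dorji"},
    "dorji": {"chime"},
}
suggestions = friends_of_friends(graph, "asha")
```

Each hop only touches the neighbours of people already reached, instead of re-joining the whole friends table.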
Relational tables have their uses, but they are not the answer to every question, and forcing everything into them causes a lot of problems.
The main focus of NoSQL databases is to provide horizontal scalability, a flexible data model, and high availability. Relationships between records are not their strong suit, and most NoSQL databases do not support joins.
However, NoSQL comes in several families, and some of them do offer ways to model relationships between data:
Key-Value Stores: This is like a dictionary spread across many computers. We give it a key, and it gives us a value right away. It does not do searches or joins. Key-value stores are very good at what they do because they are very, very fast. Perfect for caching, sessions, rate limiting, and leaderboards. Example: Redis · DynamoDB · Riak
Document Databases: We can store our data in a flexible way using JSON documents. We do not have to follow a fixed schema: each document can have its own set of fields. We can also query nested data, arrays, and objects embedded inside other objects.
This is really useful for things like user profiles, catalogues, and content that does not have a fixed structure, because documents can differ from one another. Example: MongoDB · CouchDB · Firestore
Column-Family Stores: Column-family NoSQL databases, also known as wide-column stores, organize data into rows and columns grouped into families. This gives us high write scalability, and it stores sparse data very efficiently. They are built for Internet of Things data, logging, and analytics at planet scale. Example: Apache Cassandra · HBase · ScyllaDB
Graph Databases: A graph database stores data as nodes and edges. This is really important for things like social networks, where we need to see how people are connected, and for recommendation systems that try to figure out what we might want to buy based on what our friends like. Graph databases are also used to find patterns that may indicate fraud, which is a big problem. Example: Neo4j · Amazon Neptune · JanusGraph
Time-Series Databases: Time-series databases are built for data that carries a timestamp, the date and time something happened. They can ingest a lot of data arriving at once and store it in a way that makes it easy to look at later. This makes them perfect for monitoring systems, where we need to keep track of what’s happening right now, and for sensors that are always sending new data. They are also used in finance, to keep track of what’s happening with stock prices over time. Example: InfluxDB · TimescaleDB · Prometheus
Vector Databases: Vector databases store embeddings, the long lists of numbers that artificial intelligence and machine learning models produce to represent text, images, and other data. We can use them to search for things that are similar, not just exact matches. This is really useful for finding words that mean the same thing or pictures that look similar. Vector databases are a growing field, and they are being used to make search results more accurate and to help computers understand what we are looking for. Example: Pinecone · Weaviate · Milvus
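Similarity search, the core idea behind the vector databases just described, can be sketched with cosine similarity over a few hand-written vectors. The three-number "embeddings" below are invented for illustration; real embeddings come from a model and have hundreds of dimensions, and real vector databases use approximate indexes instead of comparing against every item.

```python
import math

def cosine(a, b):
    """Cosine similarity: close to 1.0 for same direction, near 0 for unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, items, k=1):
    """Return the names of the k stored vectors most similar to the query."""
    ranked = sorted(items, key=lambda name: cosine(query, items[name]), reverse=True)
    return ranked[:k]

# Made-up embeddings: the first two point in a similar direction.
items = {
    "football pitch": [0.9, 0.1, 0.0],
    "soccer field": [0.8, 0.2, 0.1],
    "bank loan": [0.0, 0.1, 0.9],
}
matches = nearest([0.85, 0.15, 0.05], items, k=2)
```

The query matches "football pitch" and "soccer field" even though neither is an exact match for it, while "bank loan" ranks last.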

RDBMS vs NoSQL: the honest comparison
NoSQL is not a replacement for relational databases. It’s a different tool for different problems. Here’s the full comparison:
| Aspect | Relational DB (RDBMS) | NoSQL DB |
|---|---|---|
| Data Model | Tables with rows and columns | Key-value, document, column-family, graph, etc. |
| Schema | Rigid, defined upfront, hard to change | Flexible or schema-less, evolves with your app |
| Query Language | SQL - standardized, powerful | Proprietary APIs, JSON-based queries |
| Joins | Native, highly optimized | Usually avoided, denormalize instead |
| Transactions | Full ACID guarantees | Often BASE; some support multi-document ACID |
| Scaling Strategy | Vertical scaling | Horizontal scaling |
| Consistency | Strong by default | Tunable - from eventual to strong |
| Best For | Finance, OLTP, strictly structured data | Big data, real-time, AI/ML, IoT, social platforms |
The two questions that always work
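Every bad database decision comes from the same mistake: picking a technology first, then figuring out how to make the data fit. The right process is always the opposite. Ask these two questions before you do anything else: what does the data look like, and how will it be queried? The checklist below turns the answers into a starting point: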
Only need lookups by a single unique key?
- Key-Value Store (Redis, DynamoDB)
Flexible JSON-like documents, rich field queries?
- Document Store (MongoDB, CouchDB)
Massive write volume, time-based or append-heavy workload?
- Column-Family (Cassandra) or Time-Series (InfluxDB)
Complex relationship traversals (friends, fraud, recommendations)?
- Graph Database (Neo4j, Amazon Neptune)
AI/ML semantic search, embeddings, similarity?
- Vector Database (Pinecone, Milvus)
Money, payments, inventory; legally critical transactions?
- Relational DB. Don’t switch. ACID is non-negotiable here.
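The checklist above can be written down as a small lookup. This is only a sketch: the trait names such as `key_lookup` are labels invented here, and the function returns every matching family rather than a single winner.

```python
# Each entry pairs a workload trait (hypothetical label) with the
# database family the checklist suggests for it.
CHECKLIST = [
    ("key_lookup", "Key-value store"),
    ("rich_field_queries", "Document store"),
    ("heavy_writes", "Column-family or time-series store"),
    ("relationship_traversal", "Graph database"),
    ("similarity_search", "Vector database"),
    ("critical_transactions", "Relational database"),
]

def suggest_stores(traits):
    """Return every database family whose trait appears in the workload."""
    return [store for trait, store in CHECKLIST if trait in traits]

# A booking app that caches sessions and takes payments needs two stores.
booking_app = suggest_stores({"key_lookup", "critical_transactions"})
```

A workload with several traits gets several answers, one per distinct need.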
Real production systems rarely use just one database. The best engineering teams practise what’s called polyglot persistence, using the right tool for each distinct workload within the same application.
Three Rules (Must Remember)
Everything else in NoSQL follows from these three.
- The data model and the query patterns decide our database, not trends, not what Netflix uses, not what our friend recommended.
- Scale is a spectrum. Know where we are today. Design for where we are going tomorrow.
- ACID and BASE are not good versus evil. They are tradeoffs. Know which one your specific problem actually needs.
