
Distributed SQL’s Moment Has Arrived: How TiDB Is Leading the Way

12 min read · May 15, 2025

At AWS re:Invent 2024, Amazon sparked fresh excitement in the distributed SQL space with the launch of Aurora DSQL — a globally distributed SQL database that delivers high availability and active-active performance across regions — without compromising SQL consistency or developer simplicity. While this announcement is a strong endorsement of the distributed SQL category, it’s worth noting that enterprises have been depending on distributed SQL databases for years to support some of the world’s most demanding, mission-critical workloads.

From global e-commerce platforms serving customers across continents with millisecond latency to financial institutions reconciling transactions in real time across regions, distributed SQL has already proven its value in production. Enterprises have relied on it for years to power fault-tolerant systems, ensure multi-region availability, and scale seamlessly with growing demand — well before Aurora DSQL entered the picture.

Despite its proven track record, distributed SQL remains a somewhat hidden gem in the broader database landscape. Now is the time to shine a light on its capabilities, showcasing how it has quietly powered some of the most complex, high-scale applications for years.

In this blog, we’ll explore why distributed SQL databases are becoming essential for organizations facing massive scale, cost-efficiency demands, and strict compliance requirements. And it’s not just the largest enterprises — high-growth SaaS companies are also recognizing the need for these capabilities to stay agile and competitive.

We’ll dive into the power of distributed SQL through the lens of TiDB, an innovative open-source database that unlocks petabyte-grade scale for data-intensive companies. We’ll illuminate its foundational capabilities, showcase compelling real-world use cases, and make the case for why this category is poised to take center stage in the forthcoming era of cloud-native data architectures.

To truly appreciate its significance, let’s first establish a clear understanding: what exactly is distributed SQL?

Defining Distributed SQL and its Challenges

A distributed database is a collection of data that is spread across multiple physical locations, such as servers, regions, or data centers. Unlike traditional databases confined to a single server, distributed databases divide the workload of storing and processing data across several nodes to achieve scalability, fault tolerance, and high availability. Each node operates independently; collectively, they function as a unified database system.

In the early 2000s, the inability of traditional databases to handle burgeoning data volumes led directly to the emergence of NoSQL. NoSQL provided a much-needed path to distributed scalability, but this advantage was generally achieved by forgoing complete ACID compliance. To bridge this gap, NewSQL databases reintroduced SQL semantics and transactions; however, many were not fully distributed. The next evolution — distributed SQL — combines the best of NoSQL and SQL: global scalability, strong consistency, and cloud-native resiliency.

Distributed SQL databases combine the relational consistency of traditional SQL systems with the scalability and resilience of cloud-native architectures.

Figure 1 lays out the timeline of select vendors in this space.

Figure 1: Distributed SQL databases have emerged as a powerful solution, combining the scalability and availability of NoSQL with the familiar SQL interface and ACID transaction guarantees.

Distributed databases have evolved significantly from their early NoSQL origins to the emergence of robust distributed SQL systems. This brings us to the core question: what is driving this surge in demand for distributed SQL? The next section examines how these systems address fundamental business needs that traditional databases simply can’t meet.

What is Driving the Need for Distributed SQL?

The modern enterprise, characterized by a multitude of applications and microservices, faces the challenge of managing ever-increasing data volume and growing user traffic. Additionally, disparate database infrastructures, especially at global scale, place immense pressure on IT budgets.

While AI is receiving increased investment and many teams are tightening traditional IT budgets, it’s a mistake to frame this as a zero-sum game. Scalable, high-performance databases are essential to successful AI initiatives — especially when powering use cases like retrieval-augmented generation (RAG), semantic search, or agent memory. Rather than competing, AI and database investments reinforce one another.

This environment necessitates a data architecture that optimizes operational efficiency while maintaining stringent performance and scalability requirements and addressing compliance with data sovereignty regulations. Distributed SQL emerges as a critical solution, driven not solely by the need for scale, performance, and cost optimization, but also because it aligns with the operational and regulatory realities of modern digital enterprises.

These challenges are common across large enterprises and software-as-a-service (SaaS) providers alike. For example, e-commerce platforms like Shopify must handle extreme traffic surges during events like Black Friday, where traditional NoSQL systems can falter under scale limitations, leading to potential data inconsistency and lost transactions. Similarly, SaaS leaders must ensure low-latency access and real-time synchronization across globally distributed data centers while complying with strict data residency laws. Even gaming platforms face massive, unpredictable user loads where strong consistency and availability are critical to user experience — demands that traditional monolithic or NoSQL systems often struggle to meet.

Key challenges include:

  • Dynamic Traffic Patterns: Many enterprises experience significant peaks and troughs in traffic. For example, e-commerce platforms must manage large variability in transaction volumes while providing real-time personalized recommendations. To accommodate these fluctuations, infrastructure is often either underprovisioned (leading to performance degradation) or overprovisioned (leading to cost inefficiency). Thus, there is a need for solutions that can scale elastically in real time, both up/down and in/out. While scaling up and out is often emphasized for handling increased workloads, scaling down or in is equally important; however, this downward scaling is frequently more challenging to implement and is not commonly offered by many providers.
  • Rapid Feature Releases: SaaS businesses are renowned for their fast-paced innovation, frequently launching new features. This requires systems that support agile development techniques, enabling rapid iteration and deployment.
  • Always-On Availability: Given their global reach, platforms must ensure high availability and fault-tolerance. Even routine tasks like database upgrades should be seamless, avoiding any downtime to maintain user satisfaction.
  • Cost Efficiency: Cost optimization is critical for all organizations, and especially for SaaS providers; hence the need for solutions that balance scalability and affordability.
  • Data Privacy and Security: Retaining customer trust hinges on robust data protection that adheres to regulatory requirements. Solutions that prioritize data security and governance are critical.
  • User Experience (Complexity Reduction): Reducing complexity enhances usability, increases user satisfaction, and reduces development and support costs. Solutions that integrate seamlessly with the existing ecosystem lead to simplified, streamlined, cost-effective architectures.

Table 1 summarizes SaaS business requirements and technical capabilities needed to accomplish them.

By addressing these factors, businesses can ensure they remain competitive while delivering exceptional service to their users. However, the sheer number of databases available — over 300 according to DB-Engines — presents a significant challenge. Like a child overwhelmed by too many choices, organizations can find themselves paralyzed by the complexity of database selection.

The next section explores how distributed SQL databases are architected to address complexities like maintaining consistency across nodes, handling network partitions, and optimizing query performance at scale.

Foundational Capabilities of Distributed SQL Databases

Distributed SQL databases inherit the key attributes of relational databases but differ mainly in that they automatically shard data across geographically distributed regions. The goal of distributing data across regions is to parallelize queries and deliver very high throughput on growing data and global query needs. This distribution also helps ensure compliance with data residency requirements.

The query layer often adheres to open standards like MySQL or PostgreSQL. However, MySQL and PostgreSQL were designed to run on single servers and lack native capabilities for handling simultaneous reads and writes to the same data across multiple nodes. In fact, supporting multiple writers on the same data is challenging and requires consensus protocols like Raft to replicate data synchronously and maintain consistency. Hence, all the providers in this category decouple storage and compute.

This section explores the foundational capabilities of distributed SQL databases, using TiDB as a representative example. We assess how it addresses core architectural and operational requirements — ranging from scalability and consistency to workload isolation and observability. Figure 2 outlines the evaluation criteria used to benchmark these capabilities.

Figure 2: The Seven Pillars of distributed SQL: Core Capabilities Powering Scalable, Resilient, and AI-Ready Data Infrastructure

1. Elastic Scalability

Scalability is a foundational capability because a high-quality user experience relies on maintaining near real-time latency — even during sudden surges in activity. To achieve this, the system must be both elastic and capable of rapid scaling, seamlessly handling increased loads while preserving low latency. Distributed SQL enhances scalability and resilience by spreading data and processing across multiple nodes.

TiDB partitions table data into 96 MB regions that are distributed across row-storage (TiKV) nodes. Its architecture scales SQL processing independently of storage. During peak traffic, additional nodes are automatically provisioned to handle the query load, achieving linear scalability without interrupting data persistence.

This process remains transparent to users, who interact solely with standard SQL tables. Meanwhile, continuous monitoring of metrics like disk usage and CPU load allows TiDB to dynamically redistribute regions and prevent hotspots — a stark contrast to some NoSQL databases where developers must manually design shard keys.
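The region-based sharding and rebalancing described above can be illustrated with a toy simulation. The 96 MB split size comes from the text; the greedy placement logic below is a deliberate simplification, not TiDB’s actual Placement Driver scheduling algorithm:

```python
# Toy sketch of region-based sharding: data is split into ~96 MB
# regions, and each region is placed on the least-loaded node to
# avoid hotspots. Illustration only, not TiDB's actual scheduler.

REGION_SIZE_MB = 96

def split_into_regions(total_data_mb):
    """Split a table's data into regions of at most 96 MB each."""
    regions = []
    remaining = total_data_mb
    while remaining > 0:
        regions.append(min(REGION_SIZE_MB, remaining))
        remaining -= REGION_SIZE_MB
    return regions

def place_regions(regions, num_nodes):
    """Greedy placement: each region goes to the currently
    least-loaded node, mimicking hotspot-avoiding rebalancing."""
    nodes = {n: [] for n in range(num_nodes)}
    for r in regions:
        target = min(nodes, key=lambda n: sum(nodes[n]))
        nodes[target].append(r)
    return nodes

regions = split_into_regions(1000)              # ~1 GB table -> 11 regions
placement = place_regions(regions, num_nodes=3)  # spread over 3 TiKV "nodes"
```

A real scheduler also weighs CPU load, disk usage, and read/write hotspots when moving regions, but the core idea is the same: many small, uniformly sized shards that can be redistributed cheaply.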

2. Strong Consistency and ACID Compliance

Consensus algorithms coordinate transactions across distributed nodes, providing ACID guarantees even in a distributed environment.

Data writes in row storage (TiKV) are replicated synchronously using Raft groups. Each region maintains three replicas, with leader nodes coordinating writes only after a majority acknowledge the operation. TiDB’s metadata orchestration layer (internally called Placement Driver) coordinates transactions and ensures ACID compliance.
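The majority-acknowledgment rule can be sketched as follows. This is a simplified illustration of quorum commit only; real Raft, as used by TiKV, also manages terms, replicated logs, and leader election:

```python
# Simplified sketch of quorum commit: a write succeeds only after
# a majority of replicas acknowledge it. Not TiKV's actual Raft
# implementation, which also handles terms, logs, and elections.

class Replica:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.log = []

    def append(self, entry):
        """Ack the write only if this replica is reachable."""
        if self.healthy:
            self.log.append(entry)
            return True
        return False

def quorum_write(replicas, entry):
    """Commit the entry only if a majority of replicas ack it."""
    acks = sum(1 for r in replicas if r.append(entry))
    return acks >= len(replicas) // 2 + 1

# A 3-replica group tolerates one failed replica: 2 of 3 acks commit.
group = [Replica("r1"), Replica("r2"), Replica("r3", healthy=False)]
committed = quorum_write(group, "SET balance = 100")
```

With three replicas per region, the quorum is two, which is why a single node failure neither blocks writes nor loses committed data.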

When writing across nodes in different regions, the concept of time becomes highly critical. TiDB assigns globally unique timestamps, enabling snapshot isolation and consistent reads across shards.
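The role of globally unique timestamps can be shown with a toy multi-version store. This is a simplification: TiDB’s real timestamps come from the Placement Driver acting as a timestamp oracle, not a local counter:

```python
import itertools

# Toy MVCC sketch: a central counter hands out monotonically
# increasing timestamps, and a snapshot read at timestamp T sees
# only versions committed at or before T. Illustration only --
# TiDB's real timestamps come from the Placement Driver (TSO).

_tso = itertools.count(1)

def next_ts():
    """Hand out a globally unique, monotonically increasing timestamp."""
    return next(_tso)

store = {}  # key -> list of (commit_ts, value) versions

def write(key, value):
    ts = next_ts()
    store.setdefault(key, []).append((ts, value))
    return ts

def snapshot_read(key, ts):
    """Return the latest value committed at or before ts, else None."""
    versions = [(t, v) for t, v in store.get(key, []) if t <= ts]
    return max(versions)[1] if versions else None

write("k", "v1")   # commits at ts = 1
snap = next_ts()   # take a snapshot at ts = 2
write("k", "v2")   # commits at ts = 3, invisible to the snapshot
```

A reader holding the snapshot timestamp keeps seeing `v1` even after `v2` commits, which is exactly the consistent-read behavior snapshot isolation provides across shards.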

3. High Availability

Since distributed SQL databases replicate data across multiple nodes, their high availability design is more complicated than that of traditional databases. If one node fails, the system should continue operating using data from other nodes, without any data loss and with full consistency.

TiDB maintains business continuity and fault tolerance by detecting node failures within seconds and promoting follower replicas to leaders using the Raft protocol. User requests are rerouted to healthy replicas without requiring application-side changes, while the impacted nodes are rebuilt automatically from surviving copies.

High availability must also be maintained during software patching (minor releases) and upgrades (major releases). TiDB achieves zero-downtime upgrades by rolling updates across individual nodes while others continue to handle traffic, and delivers 99.99% uptime even accounting for unplanned outages.

4. Operational and Analytical Use Cases

Historically, relational databases like MySQL and PostgreSQL were optimized for transactional and operational use cases (OLTP), where the primary focus was on efficiently handling everyday operations and transactions. As the need for data-driven insights grew, organizations began to extract data from these OLTP systems and load it into separate data warehouses for analytical processing (OLAP). This process, typically facilitated by change data capture (CDC) and extract, transform, load (ETL) pipelines, allowed companies to analyze historical data and derive business intelligence.

However, this traditional separation comes with several drawbacks. Managing separate systems for OLTP and OLAP introduces additional complexity, increases latency in data availability, and raises operational costs due to the overhead of maintaining synchronization between environments. In response, some database vendors have streamlined this process through what is now known as Zero-ETL. This approach automates and abstracts the ETL process, significantly reducing the complexity and lag traditionally associated with data synchronization.

An alternative and increasingly popular strategy is the Hybrid Transactional/Analytical Processing (HTAP) model. HTAP systems are designed to support both transactional and analytical workloads on the same platform while isolating the two to avoid resource contention. This unified architecture enables real-time analytics on live operational data, ensuring that insights can be derived instantly without the need for separate data movement processes. By eliminating the bottlenecks associated with traditional ETL pipelines, HTAP delivers immediate, actionable insights while maintaining robust transaction processing capabilities.

One of the standout features of TiDB is its ability to handle both transactional and analytical workloads. With the integration of a columnar storage extension (called TiFlash), TiDB supports real-time analytics without compromising transactional performance, enabling HTAP.

In addition, TiDB has added support for multiple other use cases, such as graph, vector search, and full-text search. By supporting these within one system, TiDB can act as a single database, avoiding the deployment of multiple database technologies that incur integration overhead and cost.
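A naive sketch of the HTAP idea is a router that sends scan-heavy analytical queries to a columnar replica and point lookups to row storage. TiDB’s optimizer actually makes this choice through cost estimation; the keyword heuristic below is purely illustrative:

```python
# Naive sketch of HTAP query routing: aggregate/scan queries go to
# a columnar replica, transactional point queries go to row storage.
# TiDB's optimizer chooses by cost estimation; this keyword
# heuristic is purely illustrative.

ANALYTICAL_HINTS = ("GROUP BY", "SUM(", "AVG(", "COUNT(")

def route(sql):
    q = sql.upper()
    if any(hint in q for hint in ANALYTICAL_HINTS):
        return "columnar"  # e.g. a TiFlash replica
    return "row"           # e.g. TiKV

oltp = route("SELECT balance FROM accounts WHERE id = 42")
olap = route("SELECT region, SUM(amount) FROM orders GROUP BY region")
```

The point of the sketch is that both queries arrive through the same SQL front door; the system, not the application, decides which storage engine serves each one.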

5. AI Applications Use Cases

Distributed SQL databases are a cornerstone for modern AI applications because they can power Retrieval-Augmented Generation (RAG) and other AI-driven use cases directly within the database with unlimited scalability. By embedding vector indexing and search capabilities along with the enterprise data, they avoid the need to deploy multiple databases. With this unified storage, TiDB makes it possible to run advanced AI workloads, such as semantic search and similarity matching, natively using SQL. It is also adding support for agentic workflows and model context protocol (MCP).

TiDB has strong integration with Amazon Bedrock, using Amazon’s native Nova models as well as third-party embedding models to convert documents or prompts into vectors. TiDB then stores those vectors and leverages Bedrock’s generative models to produce answers or insights based on relevant context retrieved from TiDB.

In addition, TiDB integrates with wider AI ecosystem frameworks such as LlamaIndex and LangChain, enabling developers to use TiDB as a vector store within those frameworks. TiDB handles a variety of data, including the original text chunks, associated metadata, and the vector embeddings that capture the semantic meaning of the text for efficient similarity search.
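The vector-store pattern boils down to storing embeddings next to text and ranking by similarity. A minimal in-memory sketch, using cosine similarity and made-up three-dimensional “embeddings” (real embedding models output hundreds or thousands of dimensions):

```python
import math

# Minimal vector-store sketch: keep (text, embedding) pairs and
# rank by cosine similarity. The 3-dimensional "embeddings" are
# made up for illustration only.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

docs = [
    ("TiDB scales horizontally",        [0.9, 0.1, 0.0]),
    ("Raft replicates data for safety", [0.1, 0.9, 0.1]),
    ("TiFlash stores data in columns",  [0.2, 0.2, 0.9]),
]

def search(query_vec, k=1):
    """Return the k document texts most similar to the query vector."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

top = search([0.85, 0.15, 0.05])  # nearest neighbor of the query
```

In a database-backed version, the `docs` list becomes a table with a vector column and `search` becomes a SQL query over a vector index, so retrieval scales with the data rather than with application memory.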

6. Cost Performance

Distributed SQL databases execute queries concurrently across multiple nodes, harnessing parallel processing to significantly improve performance, especially for complex queries. This parallelism minimizes query execution time by distributing the workload evenly across the cluster, ensuring that each node contributes to the overall processing power.
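The scatter-gather pattern behind this parallelism can be sketched with a thread pool: a partial aggregate runs on each “node,” and a coordinator merges the results. This illustrates the execution pattern only, not TiDB’s actual executor:

```python
from concurrent.futures import ThreadPoolExecutor

# Scatter-gather sketch: each "node" computes a partial aggregate
# over its shard of the data, and a coordinator merges the partial
# results. An illustration of the pattern, not TiDB's executor.

shards = [
    [120, 340, 75],   # rows held by node 1
    [210, 90],        # rows held by node 2
    [55, 400, 310],   # rows held by node 3
]

def partial_sum(shard):
    # In a real system this work happens on the storage node,
    # close to the data, so only a small partial result travels.
    return sum(shard)

with ThreadPoolExecutor(max_workers=len(shards)) as pool:
    partials = list(pool.map(partial_sum, shards))

total = sum(partials)  # coordinator merge: equals SUM over all rows
```

Pushing the partial aggregation down to the nodes is what keeps network traffic small: three integers cross the wire instead of eight rows, and the gap widens as the data grows.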

As we have seen, these systems are architected with a clear separation between compute and storage that allows organizations to scale each component independently. This flexibility not only streamlines resource allocation but also minimizes unnecessary expenditure, reducing the overall operational cost.

TiDB’s serverless, decoupled architecture delivers low-latency query performance at scale, supporting customers that manage thousands of clusters, petabytes of data, and millions of tables. This kind of scale is especially critical for SaaS platforms that need to manage multi-tenant workloads efficiently. By separating compute and storage, TiDB enables organizations to right-size infrastructure, reduce TCO, and scale cost-effectively as demand grows — without compromising performance.

7. User Experience

High customer satisfaction derives from an excellent user experience that benefits all types of users — developers, administrators, and end users. One way to achieve this is through adherence to open standards, which not only reduces the learning curve but is also crucial for avoiding vendor lock-in.

TiDB’s SQL layer is compatible with the MySQL ecosystem, making it easier for developers to adopt and migrate existing applications.

From an administrator’s point of view, TiDB supports containerized deployments and integrates well with orchestration systems like Kubernetes. TiDB’s decoupled architecture simplifies operations and supports online scaling and maintenance, reducing downtime and easing the management overhead in cloud or hybrid environments.

TiDB’s architecture handles increasing data volumes and traffic, while maintaining consistent performance and eliminating single points of failure. These features contribute to its high level of customer satisfaction.

8. Security

TiDB supports enterprise-grade security measures designed to meet the demands of regulated industries and mission-critical workloads. This includes encryption at rest and in transit, ensuring that data remains protected both on disk and during network transmission. Role-Based Access Control (RBAC) enables fine-grained user permissions, allowing organizations to enforce the principle of least privilege. Audit logging provides visibility into database activity, making it easier to detect anomalous behavior and meet compliance requirements. TiDB also supports customer-managed encryption keys (KMS/CMEK), giving customers full control over key lifecycle management and further aligning with regulatory standards such as HIPAA, SOC 2, and GDPR.

Crucially, TiDB is 100% open source, offering full transparency into the codebase. This openness fosters trust and allows security-conscious organizations to independently verify the database’s security architecture, conduct third-party audits, or contribute improvements.

Summary

As organizations continue to generate and rely on vast amounts of data — whether from customer interactions, IoT devices, AI applications, or internal systems — the need for databases that can scale seamlessly, perform under pressure, and deliver real-time insights has never been more critical. This has accelerated the shift toward distributed SQL databases, which combine the scalability and resilience of cloud-native architectures with the familiarity and transactional guarantees of traditional RDBMS.

Among these, TiDB has emerged as a leading platform, purpose-built to meet the evolving data infrastructure needs of modern enterprises. It offers global scalability, allowing organizations to elastically grow workloads across multiple regions and clouds. Its fault-tolerant design, based on the Raft consensus algorithm, ensures high availability and data consistency even in the face of infrastructure failures. With strong ACID compliance, it supports mission-critical transactional workloads while simultaneously enabling real-time analytics and AI through its HTAP (Hybrid Transactional/Analytical Processing) architecture.

Moreover, TiDB simplifies complex infrastructure by supporting diverse workloads — including operational OLTP, analytical OLAP, vector search, and full-text search — within a unified platform. It is MySQL-compatible, enabling easier migration, and supports cloud-native deployments with features like autoscaling, serverless operation, and decoupled compute/storage layers.

--


Sanjeev Mohan

Written by Sanjeev Mohan

Sanjeev researches the space of data and analytics. Most recently he was a research vice president at Gartner. He is now a principal with SanjMo.
