This is the second article of a series to summarize the key concepts of Chapter 3. Storage and Retrieval in the Designing Data Intensive Application book. The series consists of 3 articles, including log-structure storage engine (SSTables and LSM Tree), page-oriented storage engine (B Tree) and column based storage engine.
Landscape of database storages
In short, database storages are divided into two categories, row-based and column based. We will talk more about the difference of them in next article. In this article, we will talk more about the page-oriented storage under row-based storage.
Page-oriented storage engine
Almost log-structured storage engine have pro-found influence on database evolution, the most widely adopted engine is actually page-oriented storage engine.
Different from log-structure storage engine, page oriented storage engine update the value in place (in its original page), instead of appending data to latest segment file.
B Tree, which were adopted by MySQL, PostgreSQL and MongoDB, are the most adopted data structure.
How it works
For read request:
- Binary search for root page
- Go to corresponding child page if necessary or get the value of corresponding key directly
For write request:
- Find corresponding node first
- If found, update the value in that page directly
- If not found, insert a node in the page.
- If the page is full, split page into 2 pages, and equally distribute the nodes
Branching factor: the number of nodes in one page, usually it’s several hundreds.
- Since B Tree is a balanced tree, the depth of tree is O(log n), where n is the number of key-value pairs.
- A four-level tree of 4KB pages with a branching factor of 500 can store up to 256 TB.
Pros & Cons
- Better read performance: because we are searching a N-ary search tree, the time complexity is O(log n). Where N is branching factor. On the other hands, the worst case of log-structured storage engine is nearly O(n).
- Worse write performance: write request in log-structured storage engine is to intuitively append data, but in B Tree, we need to search key first, open the target page, update value, and then save the page back.
- More complex concurrency control & crash recovery.
Comparison matrix of Log-structure & Page-oriented Storage engine
|Category||Pros & Cons||Well know data structure||Real world DB|
|Log-structure||Better write performance||SSTables & LSM Tree||Cassandra LevelDB HBase Lucence|
|Page-oriented||Better read performance more complex concurrency control & crash recovery||B Tree||MySQL PostgreSQL MongoDB|
About write performance
Write Amplification: one write query can produce how many disk writes ?
- For write-intensive application, disk I/O is usually the bottleneck. Usually log-structured storage engine will have less write amplification, so its write performance is usually better.
Log structure engines have higher write latency at high percentiles:
- although log structure storage engines usually have better write performance, however, they are sometimes largely influenced by background compaction process disk I/O. So log structure storage engine usually have higher write latency at higher percentile.
- On the other hands, the write latency of page oriented storage engine is more stable.
In this article, we walk through the design of page-oriented storage engine, mainly B Tree. We can know that data was updated directly in page file, so in general we can have worse write performance. But Since it’s a balanced tree structure, it can have much better read performance.
So in short, if the scenario requires much more read than write, we can consider use page-oriented storage engine. On the other hands, if we to handle a write intensive application, we can consider using log-structured storage engine.
In next article, we will talk more about column-based storage engine, which has many different design philosophy comparing to row-based storage engine. Let’s go.