How ScyllaDB makes LSM-tree compaction state-of-art by leveraging RUM conjecture and controller theory

Talk in English

LSM-tree storage engines are known for providing very fast write throughput, as they allow new data to be flushed into immutable files called Sorted String Table, a.k.a. SSTable. But there's no such thing as a free lunch. This append-only nature of the LSM tree generates read amplification over time, because a given key can be redundantly stored across multiple immutable files. That's the reason storage engines have to employ a background compaction process, to reduce both read and space amplification. Compaction itself is very complex and it plays an essential role in the system's performance. There are multiple compaction policies, and each suits a particular workload better. For example, leveled policy is better for workloads performing overwrites that are very sensitive to read latency, but that comes at the cost of higher write amplification, meaning that the system cannot keep up with the write rate unless you throw faster disks or more instances at it. On the other hand, there's the tiered policy that favors write performance but at the cost of higher read and space amplification. There can also be a policy which mixes both tiers and levels, to have something in between, efficiency wise. The RUM conjecture, which states that only two amplification factors can be optimized at the expense of a third, is used by engineers to come up with efficient policies for a variety of workloads. A good example is the ScyllaDB incremental compaction approach that breaks large files into fixed-size fragments, to fix the 100% space overhead that affects most implementations of tiered policy out there. ScyllaDB also leverages industrial controller theory to avoid the endless tuning cycle for compaction speed. If compaction is slower than it should be, the compaction backlog increases, resulting in bad read latency and potentially the system running out of disk space. If compaction is running too fast, then resources are being needlessly stolen from foreground requests, affecting both latency and throughput. So nothing is better than a closed-loop control system that measures backlog and adjusts compaction speed accordingly. That frees your DBAs to perform more important tasks.

#databases
#internals
#lsm
#storage

Speakers

Raphael Carvalho
Company: ScyllaDB

Invited experts

Ilya Kokorin
Company: ITMO University

Other talks on «Database Internals»
- Watch recording
  Talk type: Talk
  Pragmatic Code Generation for Efficient Execution
  Satyam Shekhar
  Company: NetSpring
  Talk in English
- Watch recording
  Talk type: Talk
  OK S3
  Vadim Tsesko
  Company: Odnoklassniki
  Talk in Russian
- Watch recording
  Talk type: Talk
  HTAP Workloads: Challenges and Solutions
  Ilya Kokorin
  Company: ITMO University
  Talk in English
- Watch recording
  Talk type: Conversation
  Roundtable: Cloud Databases as a Service
  Vladimir Ozerov
  Company: Querify Labs
  Oleg Anastasyev
  Company: Odnoklassniki
  Ivan Prisyazhniy
  Company: Workato
  Talk in Russian

Schedule

How ScyllaDB makes LSM-tree compaction state-of-art by leveraging RUM conjecture and controller theory

Speakers

Invited experts

Other talks on «Database Internals»