ScyllaDB Leaps Forward
Dor Laor, Co-founder & CEO of ScyllaDB
Dor Laor
■ Hello world, Core router, KVM, OSv, ScyllaDB
■ PhD in Snowboard, aspiring MTB rider
■ Let’s shard!
3 Learnings from the KVM Hypervisor Dev Period
1. Layer == overhead
2. Locking == Evil
3. Simplicity == True
4. 1 million op/s == Expectations
5. Off-by == one || 0x10
ScyllaDB is Proud to Serve!
You’ll hear from many of our customers
Why ScyllaDB?
Best High Availability in the industry
Best Disaster Recovery in the industry
Best scalability in the industry
Best Price/Performance in the industry
Auto-tune - out of the box performance
Compatible with Cassandra & DynamoDB
The power of Cassandra at the speed of Redis with the usability of DynamoDB
No Lock-in
Open Source Software
Agenda
Part 1
■ Arch overview
■ New 2024 results & benchmarks
■ ScyllaDB Cloud brief
Part 2
■ ScyllaDB 6.0
■ Tablets
■ What’s coming
Shard Per Core Architecture
Shards
Threads
ScyllaDB Architecture
Homogeneous nodes Ring Architecture
ScyllaDB Architecture
Homogeneous nodes Ring Architecture?@$@#
Can we linearly scale up?
■ Time between failures (TBF)
■ Ease of maintenance
■ No noisy neighbours
■ No virtualization, container overhead
■ No other moving parts
■ Scale up before out!
Small vs Large Machines
Linear Scale Ingestion
2X 2X 2X 2X 2X
3 x 4-vcpu VMs: 1B keys
3 x 8-vcpu VMs: 2B keys
3 x 16-vcpu VMs: 4B keys
3 x 32-vcpu VMs: 8B keys
3 x 64-vcpu VMs: 16B keys
3 x 128-vcpu VMs: 32B keys
Credit: Felipe Cardeneti Mendes
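As a quick check of the figures in the Linear Scale Ingestion slide, the sketch below computes how many keys each vCPU holds at every cluster size. It assumes that "1B keys" means 10^9 keys and that every cluster has exactly 3 VMs, as listed; the function and variable names are illustrative only.

```python
# Cluster sizes and key counts as listed on the slide.
# Assumption: "1B keys" means 1e9 keys; all names here are illustrative.
NODES = 3
steps = [(4, 1), (8, 2), (16, 4), (32, 8), (64, 16), (128, 32)]  # (vCPUs per VM, billions of keys)


def keys_per_vcpu(vcpus_per_vm, billions_of_keys, nodes=NODES):
    """Return the number of keys each vCPU is responsible for."""
    return billions_of_keys * 1_000_000_000 / (nodes * vcpus_per_vm)


ratios = [keys_per_vcpu(v, k) for v, k in steps]

for (v, k), r in zip(steps, ratios):
    print(f"{NODES} x {v}-vcpu VMs, {k}B keys: {r / 1e6:.1f}M keys per vCPU")
```

The ratio is the same at every step, which is what doubling both the machine size and the data set implies.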
Linear Scale Ingestion (2023 vs 2018)
Constant Time Ingestion
2X 2X 2X 2X 2X
Ingestion: 17.9k OPS per shard *
16
Ingestion: <2ms P99 per shard
“Nodes must be small, in case they fail”
You can scale OPS but can you scale failures?
2X 2X 2X 2X 2X
“Nodes must be small, in case they fail”
No, they don’t! {Replace, add, remove} a node in constant time
What’s new in 2024.1?
2024.1 vs 2023.1 vs OSS 5.4
Benchmark: ScyllaDB vs MongoDB
■ Fair
■ 3rd party
■ SaaS - zero config
■ YCSB - standard
■ 132 wins in 133 workloads
■ 10x-20x better performance
■ 10x-20x better latency
■ MongoDB doesn’t scale!
Scale Benchmark: ScyllaDB vs MongoDB
ScyllaDB Cloud News
ScyllaDB Cloud - 65% of Customers
■ Terraform providers, launch, scale, manage
■ Multiple networking modes
■ Encryption at rest, BYOK
■ Certifications: SOC 2, ISO, PCI
■ Coming: Azure
Welcome ScyllaDB
Our journey from eventual to immediate consistency
Scylla today is awesome but
■ Topology changes are allowed only one at a time
■ Relies on 30+ second timeouts for consistency
■ A failed or down node blocks scaling
■ Streaming time is a function of the schema
■ Additional complex operations: cleanup, repair
ScyllaDB 6.0
■ Consistent schema changes (raft, >= 5.2)
■ Consistent topology changes (raft, 6.0)
■ Tablets
ScyllaDB 6.0 Value
Elasticity
■ Faster bootstrap
■ Concurrent node operations
■ Immediate request serving
Simplicity
■ Transparent cleanups
■ Semi-transparent repairs
■ Auto gc-grace period
■ Parallel maintenance operations
Speed
■ Streaming SSTables - 30x faster
■ Load balancing of tablets
Consistency
TCO
■ Shrink free space
■ Reduce static, over provisioned deployments
Behind the Scene
Raft - Consistent metadata
Protocol for state machine replication
Total order broadcast of state change commands
X = 0 X += 1 CAS(X, 0, 1)
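To illustrate why a total order of state change commands matters, here is a toy sketch that replays the three commands from the slide on several replicas. It is not a Raft implementation: leader election, log replication, and failure handling are left out, and the log is simply assumed to have been agreed on already. The `apply` and `replay` helpers and the tuple encoding of commands are hypothetical.

```python
# Toy illustration of state machine replication: NOT a Raft implementation.
# Assumption: the log below has already been agreed on in a single total order.

def apply(state, command):
    """Apply one command to a copy of the state and return the new state."""
    op, key, *args = command
    new_state = dict(state)
    if op == "set":
        new_state[key] = args[0]
    elif op == "add":
        new_state[key] = new_state.get(key, 0) + args[0]
    elif op == "cas":
        expected, new_value = args
        if new_state.get(key) == expected:
            new_state[key] = new_value
    else:
        raise ValueError(f"unknown op: {op}")
    return new_state


def replay(log):
    """Start from an empty state and apply every command in log order."""
    state = {}
    for command in log:
        state = apply(state, command)
    return state


# The three commands from the slide: X = 0, X += 1, CAS(X, 0, 1).
log = [("set", "X", 0), ("add", "X", 1), ("cas", "X", 0, 1)]

# Every replica that applies the same log in the same order ends in the same state.
replicas = [replay(log) for _ in range(3)]

# Swapping the last two commands yields a different final state.
reordered = replay([log[0], log[2], log[1]])
```

In the agreed order the CAS finds X already at 1 and does nothing; in the swapped order it succeeds and the increment then runs on top of it, so replicas that disagreed on order would diverge.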
Linearizable Token Metadata
[Diagram: node A, node B, node C; bootstrap; read barrier; system.token_metadata]
■ Fencing - each write is signed with topology version
■ If there is a version mismatch, the write doesn’t go through
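The fencing rule in the two bullets above can be sketched as a version check on the replica. This is a minimal illustration under stated assumptions, not ScyllaDB's code: the `Replica` class, the `StaleTopologyError` exception, and the integer version numbers are all hypothetical.

```python
# Hypothetical sketch of write fencing; names do not mirror ScyllaDB internals.

class StaleTopologyError(Exception):
    """Raised when a write carries a topology version the replica rejects."""


class Replica:
    def __init__(self, topology_version):
        self.topology_version = topology_version
        self.data = {}

    def write(self, key, value, topology_version):
        # Assumption: any version mismatch is rejected, as the slide states.
        if topology_version != self.topology_version:
            raise StaleTopologyError(
                f"write has version {topology_version}, "
                f"replica is at {self.topology_version}"
            )
        self.data[key] = value


replica = Replica(topology_version=7)
replica.write("key1", "a", topology_version=7)  # versions match: accepted

try:
    replica.write("key1", "b", topology_version=6)  # coordinator with a stale view
    rejected = False
except StaleTopologyError:
    rejected = True
```

The stale write never reaches the data map, so a coordinator with an outdated view of the ring cannot overwrite data it no longer owns.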
Changes in the Data Plane
[Diagram: Replica, Coordinator, Topology coordinator]
Consistent Metadata Journey
[Timeline: RAFT, safe schema changes, safe topology changes, dynamic partitioning, consistent tables, tablets; releases 5.0, 5.2, 5.2+, 6.0]
6.0 Tablets FTW
Standard tables
[Diagram: key1 and key2; replicas R1, R2, R3; replica set {R1, R2, R3}; replication metadata (per keyspace)]
Sharding function generates good load distribution between CPUs
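As a rough illustration of how a sharding function can spread keys evenly across CPUs, the sketch below hashes keys onto a fixed number of shards and counts the result. The SHA-256 based `shard_of` function is a stand-in chosen for this example, not ScyllaDB's actual sharding algorithm, and the shard count and key names are hypothetical.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 8  # hypothetical core count


def shard_of(key, num_shards=NUM_SHARDS):
    """Map a key to a shard with a stable hash (a stand-in, not ScyllaDB's function)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Count how many of 8000 synthetic keys land on each shard.
counts = Counter(shard_of(f"key{i}") for i in range(8000))

for shard in range(NUM_SHARDS):
    print(f"shard {shard}: {counts[shard]} keys")
```

With a uniform hash, each shard should receive close to 1000 of the 8000 keys.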
RAFT tables
[Diagram: key1 and key2; tablets and tablet replicas; RAFT groups No. 299236, No. 299237, No. 299238]
Good load distribution requires lots of RAFT groups.
Tablets - balancing
Table starts with a few tablets. Small tables end there.
Not fragmented into tiny pieces like with tokens.
Tablets - balancing
When a tablet becomes too heavy (disk, CPU, …), it is split.
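A minimal sketch of the split step, assuming a tablet is a contiguous token range with a known size. The threshold value, the midpoint split, and the even division of bytes are all assumptions made for illustration; the slide names disk and CPU as triggers but gives no numbers.

```python
from dataclasses import dataclass

# Hypothetical threshold; the slide does not state one.
MAX_TABLET_BYTES = 10 * 1024**3


@dataclass(frozen=True)
class Tablet:
    first_token: int  # inclusive
    last_token: int   # inclusive
    size_bytes: int


def split_if_heavy(tablet, max_bytes=MAX_TABLET_BYTES):
    """Return [tablet] unchanged, or two halves if it exceeds max_bytes."""
    if tablet.size_bytes <= max_bytes or tablet.first_token == tablet.last_token:
        return [tablet]
    mid = (tablet.first_token + tablet.last_token) // 2
    # Assumption: data is spread evenly, so each half gets about half the bytes.
    left_size = tablet.size_bytes // 2
    return [
        Tablet(tablet.first_token, mid, left_size),
        Tablet(mid + 1, tablet.last_token, tablet.size_bytes - left_size),
    ]


small = Tablet(0, 999, 1 * 1024**3)
heavy = Tablet(0, 999, 16 * 1024**3)
halves = split_if_heavy(heavy)
```

Here `small` stays as a single tablet, while `heavy` becomes two adjacent tablets covering tokens 0 to 499 and 500 to 999.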
Tablets - balancing
The load balancer can decide to move tablets.
Tablets - balancing
Depends on fault-tolerant, reliable, and fast topology changes.
Tablets
Resharding is cheap.
SSTables split at tablet boundary.
Reassign tablets to shards (logical operation).
Tablets
Cleanup after topology change is cheap.
Just delete SSTables.
Tablet Scheduler
Scheduler globally controls movement and maintenance operations on a per-tablet basis
[Diagram: scheduler; tablet 0, tablet 1; repair, migration, backup]
Tablet Scheduler
Goals:
■ Maximize throughput (saturate)
■ Keep migrations short (don’t overload)
Rules:
■ migrations-in <= 2 per shard
■ migrations-out <= 4 per shard
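The two rules above can be sketched as an admission check over candidate migrations. Only the limits (2 in, 4 out per shard) come from the slide; the greedy first-come ordering, the `admit` function, and the shard and tablet names are hypothetical.

```python
from collections import Counter

# Limits from the slide; everything else in this sketch is hypothetical.
MAX_MIGRATIONS_IN = 2   # per destination shard
MAX_MIGRATIONS_OUT = 4  # per source shard


def admit(candidates, max_in=MAX_MIGRATIONS_IN, max_out=MAX_MIGRATIONS_OUT):
    """Greedily pick migrations (tablet, src_shard, dst_shard) that fit the limits."""
    incoming, outgoing = Counter(), Counter()
    admitted = []
    for tablet, src, dst in candidates:
        if outgoing[src] < max_out and incoming[dst] < max_in:
            outgoing[src] += 1
            incoming[dst] += 1
            admitted.append((tablet, src, dst))
    return admitted


# Six tablets want to leave shard "A0"; three of them target shard "B0".
candidates = [
    ("t0", "A0", "B0"),
    ("t1", "A0", "B0"),
    ("t2", "A0", "B0"),  # B0 already has 2 incoming: deferred
    ("t3", "A0", "B1"),
    ("t4", "A0", "B1"),
    ("t5", "A0", "B2"),  # A0 already has 4 outgoing: deferred
]
admitted = admit(candidates)
```

Deferred migrations would be retried in a later scheduling round once earlier ones finish, which keeps each individual migration short without leaving shards idle.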
Tablets - Repair - Fire & forget
■ Tablet based
■ Continuous, transparent, controlled by the load balancer
■ Auto GC grace period
Tablets - Streaming (Enterprise)
Send files over RPC. No per-schema, per-row processing.
30x faster, saturates links.
Scylla Enterprise only
[Diagram: SSTable files]
Post 6.0
Since we have
■ 30x faster streaming
■ Parallel operations
■ Small unit size - based on tablet/shard, not hardware
■ Negligible performance impact
■ Incremental serving as you add machines
No reliance on cluster size, instance size or instance type
Tablets =~ Serverless
[Chart: Serverless; capacity required over time; base and on-demand; instance types i3en and i4i]
Typeless, sizeless, limitless
What’s Cooking?
■ Full transactional consistency with Raft
■ HDD and dense standard nodes (d3en)
■ S3 backend
■ Incremental repair
■ Point in time backup/restore
■ Tiered storage
What’s in the Oven
Eventual Consistency
Thank You! Keep Innovating!
IoT
Crypto
eCommerce
Telco
Feature Store
Streaming
Fintech
Cyber
Social network
Graph Storage Layer
Recommendation & Personalization Engine
Fraud & Threat Detection
AI/ML
Analytics
Customer Experience

MongoDB to ScyllaDB: Technical Comparison and the Path to Success
 
From NCSA to the National Research Platform
From NCSA to the National Research PlatformFrom NCSA to the National Research Platform
From NCSA to the National Research Platform
 
Chapter 5 - Managing Test Activities V4.0
Chapter 5 - Managing Test Activities V4.0Chapter 5 - Managing Test Activities V4.0
Chapter 5 - Managing Test Activities V4.0
 
An All-Around Benchmark of the DBaaS Market
An All-Around Benchmark of the DBaaS MarketAn All-Around Benchmark of the DBaaS Market
An All-Around Benchmark of the DBaaS Market
 
Introducing BoxLang : A new JVM language for productivity and modularity!
Introducing BoxLang : A new JVM language for productivity and modularity!Introducing BoxLang : A new JVM language for productivity and modularity!
Introducing BoxLang : A new JVM language for productivity and modularity!
 
intra-mart Accel series 2024 Spring updates_En
intra-mart Accel series 2024 Spring updates_Enintra-mart Accel series 2024 Spring updates_En
intra-mart Accel series 2024 Spring updates_En
 

ScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDB

  • 1. ScyllaDB Leaps Forward Dor Laor, Co-founder & CEO of ScyllaDB
  • 2. Dor Laor ■ Hello world, Core router, KVM, OSv, ScyllaDB ■ PhD in Snowboard, aspiring MTB rider ■ Let’s shard!
  • 3. 3 Learnings from the KVM Hypervisor Dev Period 1. Layer == overhead 2. Locking == Evil 3. Simplicity == True 4. 1 million op/s == Expectations 5. Off-by == one || 0x10
  • 4. ScyllaDB is Proud to Serve!
  • 5. You’ll hear from many of our customers
  • 6. Why ScyllaDB? Best High Availability in the industry Best Disaster Recovery in the industry Best scalability in the industry Best Price/Performance in the industry Auto-tune - out of the box performance Compatible with Cassandra & DynamoDB The power of Cassandra at the speed of Redis with the usability of DynamoDB No Lock-in Open Source Software
  • 7. Agenda Part 1 ■ Arch overview ■ New 2024 results & benchmarks ■ ScyllaDB Cloud brief Part 2 ■ ScyllaDB 6.0 ■ Tablets ■ What’s coming
  • 8. Shard Per Core Architecture Shards Threads
  • 10. ScyllaDB Architecture Homogeneous nodes Ring Architecture?@$@#
  • 13. ■ Time between failures TBF ■ Ease of maintenance ■ No noisy neighbours ■ No virtualization, container overhead ■ No other moving parts ■ Scale up before out! Small vs Large Machines
  • 14. Linear Scale Ingestion 2X 2X 2X 2X 3 x 4-vcpu VMs 1B keys 3 x 8-vcpu VMs 2B keys 3 x 16-vcpu VMs 4B keys 3 x 32-vcpu VMs 8B keys 3 x 64-vcpu VMs 16B keys 3 x 128-vcpu VMs 32B keys Credit: Felipe Cardeneti Mendes
  • 18. Linear Scale Ingestion 2X 2X 2X 2X 2X 3 x 4-vcpu VMs 1B keys 3 x 8-vcpu VMs 2B keys 3 x 16-vcpu VMs 4B keys 3 x 32-vcpu VMs 8B keys 3 x 64-vcpu VMs 16B keys 3 x 128-vcpu VMs 32B keys Credit: Felipe Cardeneti Mendes
  • 20. Linear Scale Ingestion (2023 vs 2018) Constant Time Ingestion 2X 2X 2X 2X 2X
  • 23. “Nodes must be small, in case they fail” You can scale OPS but can you scale failures?
  • 24. 2X 2X 2X 2X 2X “Nodes must be small, in case they fail” No they don’t! {Replace, Add, remove} Node at constant time
  • 27. What’s new in 2024.1?
  • 28. 2024.1 vs 2023.1 vs OSS 5.4
  • 31. Benchmark: ScyllaDB vs MongoDB ■ Fair ■ 3rd party ■ SaaS - zero config ■ YCSB - standard ■ 132 wins in 133 workloads ■ 10x-20x better performance ■ 10x-20x better latency ■ MongoDB doesn’t scale!
  • 34. ScyllaDB Cloud - 65% of Customers ■ Terraform providers, launch, scale, manage ■ Multiple networking modes ■ Encryption at rest, BYOK ■ Certifications: SOC 2, ISO, PCI ■ Coming: Azure
  • 35. Welcome ScyllaDB Our journey from eventual to immediate consistency
  • 36. Scylla today is awesome, but: ■ Topology changes are allowed one at a time ■ Relies on 30+ second timeouts for consistency ■ A failed or down node blocks scaling ■ Streaming time is a function of the schema ■ Additional complex operations: cleanup, repair
  • 37. ScyllaDB 6.0 ■ Consistent schema changes (raft, >= 5.2) ■ Consistent topology changes (raft, 6.0) ■ Tablets
  • 38. ScyllaDB 6.0 Value Elasticity: ■ Faster bootstrap ■ Concurrent node operations ■ Immediate request serving Simplicity: ■ Transparent cleanups ■ Semi-transparent repairs ■ Auto gc-grace period ■ Parallel maintenance operations Speed: ■ Streaming SSTables - 30x faster ■ Load balancing of tablets Consistency TCO: ■ Shrink free space ■ Reduce static, over-provisioned deployments
  • 40. Raft - Consistent metadata Protocol for state machine replication Total order broadcast of state change commands: X = 0; X += 1; CAS(X, 0, 1)
  • 41. Linearizable Token Metadata [Diagram labels: node A, node B, node C; bootstrap; system.token_metadata; read barrier]
  • 42. Changes in the Data Plane ■ Fencing - each write is signed with the topology version ■ If there is a version mismatch, the write doesn’t go through [Diagram labels: Replica, Coordinator, Topology coordinator]
  • 43. Consistent Metadata Journey RAFT Safe schema changes Safe topology changes Dynamic partitioning Consistent tables Tablets 5.0 5.2 5.2+ 6.0
  • 45. Standard tables {R1, R2, R3} R1 R2 R3 key1 replication metadata: (per keyspace)
  • 46. Standard tables {R1, R2, R3} R1 R2 R3 key1 key2 Sharding function generates good load distribution between CPUs
  • 47. RAFT tables [Diagram labels: key1, key2; tablet, tablet replica; RAFT groups No. 299236, 299237, 299238]
  • 48. RAFT tables key1 key2 Good load distribution requires lots of RAFT groups.
  • 49. Tablets - balancing Table starts with a few tablets. Small tables end there Not fragmented into tiny pieces like with tokens
  • 50. Tablets - balancing When tablet becomes too heavy (disk, CPU, …) it is split
  • 52. Tablets - balancing The load balancer can decide to move tablets
  • 53. Tablets - balancing Depends on fault-tolerant, reliable, and fast topology changes.
  • 54. Tablets Resharding is cheap. SSTables split at tablet boundary. Reassign tablets to shards (logical operation).
  • 55. Tablets Cleanup after topology change is cheap. Just delete SSTables.
  • 56. Tablet Scheduler The scheduler globally controls movement and maintenance operations on a per-tablet basis [Diagram labels: tablet 0, tablet 1; schedule; repair, migration, backup]
  • 57. Tablet Scheduler Goals: ■ Maximize throughput (saturate) ■ Keep migrations short (don’t overload) Rules: ■ migrations-in <= 2 per shard ■ migrations-out <= 4 per shard
  • 58. Tablets - Repair - Fire & forget ■ Tablet based ■ Continuous, transparent, controlled by the load balancer ■ Auto GC grace period
  • 59. Tablets - Streaming (Enterprise) Send files over RPC. No per-schema, per-row processing. 30x faster, saturates links. Scylla Enterprise only [Diagram labels: SSTable files]
  • 61. Since we have ■ 30x faster streaming ■ Parallel operations ■ Small unit size - based on tablet/shard, not hardware ■ Negligible performance impact ■ Incremental serving as you add machines No reliance on cluster size, instance size or instance type Tablets =~ Serverless
  • 64. ■ Full transactional consistency with Raft ■ HDD and dense standard nodes (d3en) ■ S3 backend ■ Incremental repair ■ Point in time backup/restore ■ Tiered storage What’s in the Oven Eventual Consistency
  • 65. Thank You! Keep Innovating! IoT Crypto eCommerce Telco Feature Store Streaming Fintech Cyber Social network Graph Storage Layer Recommendation & Personalization Engine Fraud & Threat Detection AI/ML Analytics Customer Experience
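
Slide 40 describes Raft as a total order broadcast of state change commands and lists three of them: X = 0, X += 1, CAS(X, 0, 1). The sketch below is not Raft itself. It only illustrates why the agreed order matters, by replaying an already-ordered log on two replicas. The command encoding (`set`, `add`, `cas` tuples) and the function names are assumptions made for illustration and do not come from the deck or from ScyllaDB.

```python
def apply_command(state, command):
    """Apply one command to a copy of the state and return the new state."""
    op, *args = command
    new_state = dict(state)
    if op == "set":
        name, value = args
        new_state[name] = value
    elif op == "add":
        name, delta = args
        new_state[name] = new_state[name] + delta
    elif op == "cas":
        # Compare-and-set: only write if the current value matches `expected`.
        name, expected, value = args
        if new_state[name] == expected:
            new_state[name] = value
    else:
        raise ValueError(f"unknown command: {op}")
    return new_state


def replay(log):
    """Replay a log from an empty state, as a replica would after agreement."""
    state = {}
    for command in log:
        state = apply_command(state, command)
    return state


# The three commands from slide 40, in one agreed order.
log = [("set", "X", 0), ("add", "X", 1), ("cas", "X", 0, 1)]

replica_a = replay(log)
replica_b = replay(log)

# The same commands in a different order give a different result.
reordered = replay([log[0], log[2], log[1]])
```

Both replicas end with the same value of X because they replay the same order. In the reordered log the CAS succeeds before the increment, so the final value differs, which is the divergence a total order prevents.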
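
The fencing rule on slide 42 (each write carries the topology version, and a mismatch means the write does not go through) can be sketched as a version check on the replica. This is a minimal illustration, not ScyllaDB's implementation: the `Replica` class, `apply_write` method, and `StaleTopologyError` exception are hypothetical names, and the strict equality check is an assumption about what "mismatch" means.

```python
from dataclasses import dataclass, field


class StaleTopologyError(Exception):
    """Raised when a write carries a topology version the replica rejects."""


@dataclass
class Replica:
    # Hypothetical model: the topology version this replica currently knows.
    topology_version: int
    data: dict = field(default_factory=dict)

    def apply_write(self, key, value, write_version):
        # Slide 42's rule: on a version mismatch the write does not go through.
        if write_version != self.topology_version:
            raise StaleTopologyError(
                f"write has version {write_version}, "
                f"replica is at {self.topology_version}"
            )
        self.data[key] = value


replica = Replica(topology_version=7)
replica.apply_write("key1", "a", write_version=7)  # versions match: accepted

try:
    # A coordinator still using the previous topology version.
    replica.apply_write("key2", "b", write_version=6)
    rejected = False
except StaleTopologyError:
    rejected = True
```

Rejecting the write instead of applying it means a coordinator with an outdated view cannot place data on a node that no longer owns the range.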
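
Slide 57 gives two scheduler rules: at most 2 incoming and at most 4 outgoing migrations per shard. A rough sketch of admission under those limits follows. The greedy, first-come order, the `(tablet, source shard, destination shard)` request shape, and the function name are assumptions for illustration. The deck does not say how the real scheduler chooses among candidates.

```python
from collections import Counter

MAX_MIGRATIONS_IN = 2   # per destination shard, from slide 57
MAX_MIGRATIONS_OUT = 4  # per source shard, from slide 57


def admit_migrations(requests):
    """Greedily admit (tablet, src_shard, dst_shard) requests within the limits.

    Returns (admitted, deferred). Deferred requests would wait for a later round.
    """
    incoming = Counter()
    outgoing = Counter()
    admitted, deferred = [], []
    for tablet, src, dst in requests:
        within_limits = (
            outgoing[src] < MAX_MIGRATIONS_OUT
            and incoming[dst] < MAX_MIGRATIONS_IN
        )
        if within_limits:
            outgoing[src] += 1
            incoming[dst] += 1
            admitted.append((tablet, src, dst))
        else:
            deferred.append((tablet, src, dst))
    return admitted, deferred


# Five tablets leave shard "A": three target shard "B", two target shard "C".
requests = [
    ("t0", "A", "B"),
    ("t1", "A", "B"),
    ("t2", "A", "B"),
    ("t3", "A", "C"),
    ("t4", "A", "C"),
]
admitted, deferred = admit_migrations(requests)
```

Here the third migration into shard "B" is deferred because "B" already has two incoming migrations, while shard "A" reaches but does not exceed its outgoing limit of four.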