MongoDB vs ScyllaDB:
Path to Success
Felipe Mendes, Solution Architect at ScyllaDB
Pete Aven, Solution Architect at ScyllaDB
Meet The Team
Felipe Cardeneti Mendes
■ Published Author
■ ScyllaDB Committer
■ IT Specialist & Solution Architect
Pete Aven
■ Data Engineer
■ Application Development
■ ScyllaDB Solution Architect
Soccer Football Goal?
Survival Rule #1
Lower Node
Count
Efficient hardware usage
results in direct savings
Where Do You Wanna Be?
Predictable,
Low Latencies
Consistent single-digit
millisecond p99 latencies
Autonomous
Self-Tuning
Self-optimizing, smaller
footprint, easy to use
Always-On
Architecture
Active-active deployments for
mission-critical workloads
Architecture
Differences
Data Model & Query Language
■ MQL – Mongo Query Language
■ Documents stored in BSON format
■ Schemaless
■ Favors Query Flexibility over Performance
Data Model & Query Language
■ CQL or DynamoDB API
■ Wide-Column Representation
■ Schema-oriented
■ Favors Performance over Query Flexibility
MongoDB – Replica Sets
REPLICA SET, 3x replication
[Diagram: reads go to the primary and, optionally, to each of the two secondaries]
■ Primary-Secondary
■ Impact on Write Throughput
■ Election on Failure
■ Scaling Limitations
■ Up to 50 members
■ Arbiters
■ Non-voting members
■ Limited Storage/Replica*
■ Affect Reads Scaling
MongoDB Atlas *
MongoDB – Sharded Cluster
■ A Shard is a Replica Set
■ Adds Overhead & Complexity
■ Further Availability Concerns
■ Underutilized Capacity
■ Sharding
■ Performed Manually
■ Hashed vs Ranged
■ Still Requires a Query Router
Reads
SHARDED CLUSTER
3 shards, 3x replication
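The hashed vs ranged choice listed above can be illustrated with a toy router. This is a minimal sketch, not MongoDB's implementation: the shard count, the range boundaries, and the use of MD5 as the hash function are all assumptions picked for illustration.

```python
import bisect
import hashlib

NUM_SHARDS = 3  # assumption: mirrors the "3 shards" diagram

# Hypothetical split points for ranged sharding: keys below "h" land on
# shard 0, keys below "p" on shard 1, everything else on shard 2.
RANGE_BOUNDS = ["h", "p"]


def ranged_shard(key: str) -> int:
    """Route by key order, so neighbouring keys stay on the same shard."""
    return bisect.bisect_right(RANGE_BOUNDS, key)


def hashed_shard(key: str) -> int:
    """Route by a hash of the key, so placement ignores key order."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


keys = ["alice", "alicia", "alina", "bob", "zoe"]
ranged = {k: ranged_shard(k) for k in keys}
hashed = {k: hashed_shard(k) for k in keys}
```

Ranged routing keeps range scans local to one shard but can concentrate load on it; hashed routing spreads keys regardless of order, at the cost of scattering range scans. Either way, something has to hold the routing rule, which is the query router's job.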
ScyllaDB – Sharding & Replication
■ Simplified Architecture
■ Equal Nodes
■ Sharded to the Core
■ Maximizes Infrastructure Power
■ Unlimited Scale
■ Always-On
■ No Limits
■ Fully Autonomous
App
App
Independent Benchmark Results
"ScyllaDB outperforms
MongoDB in 132 of 133
measurements."
– Daniel Seybold,
benchANT CTO
Read the full report
"MongoDB is Web Scale"
"The scalability results
demonstrate that ScyllaDB
achieves up to near linear
scalability, while MongoDB
shows less efficient
horizontal scalability."
– Daniel Seybold,
benchANT CTO
Read the full report
Migration Options
It’s a Simple Matter of Programming(™)
Connectors
Connecting
ScyllaDB Shard-Aware Drivers
Collections vs Tables
Mental Exercise
MemeGroup
[
{ id: simpsons, title: 'The Simpsons', author: uid1, tags: ['homer', 'lisa'], photos: [pid1, pid2], views: 200 },
{ id: comics, title: 'Comics', author: uid2, tags: ['drawing', 'cartoon'], photos: [pid1, pid2, pid3], views: 300 },
{ id: matrix, title: 'Matrix', author: uid1, tags: ['neo', 'morpheus', 'pill'], photos: [pid4, pid5], views: 50 }
]
Memes
[
{ id: homer_grass, title: 'Homer Hiding', location: p1, tags: ['hiding', 'grass'], likes: 30, views: 200 },
{ id: morpheus_pill, title: 'Choose Destiny', location: p2, tags: ['pills'], likes: 99, views: 300 },
{ id: trollface, title: 'Troll face', location: p3, tags: ['troll', 'face'], likes: 9999999, views: NaN }
]
■ Queries:
■ Display all Memes posted by a given Author, sorted by time
■ Retrieve all Groups of Memes matching a particular Tag
Mental Exercise
■ Queries:
■ Display all Memes posted by a given Author, sorted by time
■ Retrieve all Groups of Memes matching a particular Tag
CREATE TABLE MemeByAuthor (
    author uuid,
    meme_id uuid,
    post_time timeuuid,
    title text,
    tags frozen<set<text>>,
    path text,
    PRIMARY KEY (author, post_time)
);
CREATE TABLE MemeGroup (
    group_id uuid,
    author_id uuid,
    title text,
    tags set<text>,
    memes uuid,
    PRIMARY KEY (group_id, memes)
);
CREATE INDEX ON MemeGroup (tags);
Adapted from StackOverflow
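Moving from the MemeGroup collection to the MemeGroup table means turning one document with an embedded array into one row per (group_id, memes) pair. The following is a minimal sketch of that flattening step. It keeps the short string ids from the sample documents instead of real uuid values, and the helper name group_to_rows is hypothetical.

```python
from typing import Any


def group_to_rows(doc: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten one MemeGroup document into one row per referenced meme."""
    return [
        {
            "group_id": doc["id"],
            "author_id": doc["author"],
            "title": doc["title"],
            "tags": set(doc["tags"]),
            "memes": photo_id,  # clustering column: one meme per row
        }
        for photo_id in doc["photos"]
    ]


# Sample document taken from the MemeGroup collection shown earlier.
doc = {
    "id": "comics",
    "title": "Comics",
    "author": "uid2",
    "tags": ["drawing", "cartoon"],
    "photos": ["pid1", "pid2", "pid3"],
    "views": 300,
}
rows = group_to_rows(doc)
```

The views field is dropped because the table definition has no matching column. A frequently updated counter like this would need its own modeling decision.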
ScyllaDB JSON Support
INSERT INTO mytable JSON '{ "\"_id\"": 1,
"title": "Reservoir Dogs", "director": ...}'
SELECT and INSERT statements
Leave JSON “As-Is”
ID #1
document
Hybrid Approach
ID #1
Document
Document
ID #1
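The hybrid approach can be sketched as promoting the fields that queries filter on to real columns and keeping the rest of the document as JSON text. This is an illustration under assumptions: the table name movies_hybrid, its column names, the choice of promoted fields, and the sample field values are all hypothetical.

```python
import json

PROMOTED = ("_id", "title")  # assumption: the fields the queries filter on

# Parameterized CQL for a hypothetical table with columns (id, title, doc).
INSERT_CQL = "INSERT INTO movies_hybrid (id, title, doc) VALUES (?, ?, ?)"


def to_hybrid_row(doc: dict) -> tuple:
    """Return (id, title, remaining fields serialized as JSON text)."""
    rest = {k: v for k, v in doc.items() if k not in PROMOTED}
    return (doc["_id"], doc["title"], json.dumps(rest, sort_keys=True))


def from_hybrid_row(row: tuple) -> dict:
    """Rebuild the original document from a hybrid row."""
    doc_id, title, body = row
    return {"_id": doc_id, "title": title, **json.loads(body)}


movie = {"_id": 1, "title": "Reservoir Dogs", "year": 1992, "genres": ["crime"]}
row = to_hybrid_row(movie)
```

The promoted columns can take part in the primary key or an index, while the JSON column stays opaque to the database and is only interpreted by the application.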
Before Migrating
■ Understand the differences between the document and wide-column models
■ Denormalize and design a query-driven approach
■ Optimize your model for fast retrieval & ingestion
■ Take the opportunity to re-evaluate your access patterns!
■ Get to know the available data types, indexes, collections and UDTs in ScyllaDB
■ Implement highly concurrent asynchronous code in your application
■ BLAST ScyllaDB – It WILL handle it!
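The advice on highly concurrent asynchronous code can be sketched with asyncio and a semaphore that caps the number of requests in flight. This is a minimal sketch: fake_write is a stand-in for a real asynchronous driver call, and the limit of 64 is an arbitrary assumption to be tuned per cluster and driver.

```python
import asyncio

MAX_IN_FLIGHT = 64  # assumption: tune to your cluster and driver


async def fake_write(row: int) -> int:
    """Hypothetical stand-in for an asynchronous driver write."""
    await asyncio.sleep(0.001)
    return row


async def write_all(rows, limit: int = MAX_IN_FLIGHT) -> list:
    """Issue all writes concurrently, with at most `limit` in flight."""
    sem = asyncio.Semaphore(limit)

    async def guarded(row):
        async with sem:
            return await fake_write(row)

    # gather() returns results in the order the awaitables were passed in.
    return await asyncio.gather(*(guarded(r) for r in rows))


results = asyncio.run(write_all(range(1000)))
```

Bounding concurrency keeps many requests outstanding without letting the client queue grow without limit when the database slows down.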
Migration Walkthrough
ETL
Application
Writes & Reads
Reads
Forklift Historical Data
Source Connector
MongoDB
Change Streams
ScyllaDB Sink
Connector
Writes & Reads
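The walkthrough combines live dual writes with a forklift of historical data, so the copy step must not overwrite values the application has already updated. The following sketch models that with in-memory dictionaries and a newest-timestamp-wins rule. The store layout and helper names are hypothetical, and a real migration would rely on the connectors named above rather than on code like this.

```python
# In-memory stand-ins for the two databases: key -> (write_timestamp, value).
source: dict = {}  # plays the role of MongoDB
target: dict = {}  # plays the role of ScyllaDB


def write_newest(store: dict, key, ts: int, value) -> None:
    """Apply a write only if it is newer than what the store already holds."""
    current = store.get(key)
    if current is None or ts > current[0]:
        store[key] = (ts, value)


def dual_write(key, ts: int, value) -> None:
    """Application phase: every write goes to both systems."""
    write_newest(source, key, ts, value)
    write_newest(target, key, ts, value)


def forklift(snapshot: dict) -> None:
    """Copy historical data without clobbering newer dual writes."""
    for key, (ts, value) in snapshot.items():
        write_newest(target, key, ts, value)


# Historical data that exists only in the source.
source["a"] = (1, "old-a")
source["b"] = (1, "old-b")
snapshot = dict(source)  # the forklift reads a point-in-time copy

# The application updates "a" before the forklift reaches it.
dual_write("a", 5, "new-a")
forklift(snapshot)
```

After the forklift, the target holds the newer value for "a" and the historical value for "b", so both stores agree and reads can be switched over.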
Success Stories
Trillions of messages
60% fewer nodes
5 ms p99
"We knew we were
not going to use
MongoDB sharding
because it is
complicated to
use and not known
for stability"
2015 – Million
2016 – Billion
2019 – Trillion
Stanislav Vishnevskiy
How Discord Stores Billions and Trillions of
Unleashing the Power of Data with Scylla
"We were looking for something
really specific. A highly
scalable, and high performance
NoSQL database. The answer
was simple, ScyllaDB is a better
fit for our use case."
João Pedro Voltani
MongoDB Versus ScyllaDB, in Production at Numberly
"We are happy users of both, because we select the use cases based on their strengths"
We use ScyllaDB for real-time
and latency sensitive pipelines,
mixed batch and real-time
workloads, and web backends in
GraphQL (imposing a schema).
We see a trend into ScyllaDB
today.
MongoDB is very strong in
web backends and real time
queries over unpredictable
behavioral data. We need to
query and operate on top of
this data.
This is where MongoDB
flexibility is good for us.
Alexys Jacob
[Timeline figure, 2008 to 2019: release milestones for MongoDB (OSS under AGPL, 1.4 stable, 2.0, 3.0, WiredTiger in 3.2, 3.6, 4.0 under SSPL, 4.2), Cassandra (OSS under Apache, 1.0, 2.0, 3.0) and ScyllaDB (OSS under AGPL, 1.0, 2.0, 3.0), with two "Numberly adoption" markers]
Takeaways
■ ScyllaDB is the database for Performance
■ ScyllaDB and MongoDB are different databases
■ Knowing the differences is key for a successful transition
■ Identify the optimal data model
■ Follow a query-driven & performance-oriented design approach
■ Both have a large driver and connectors ecosystem
■ The path for migration is straightforward, and we're here to help!
■ Customers met their Performance challenges with ScyllaDB
Stay in Touch
Felipe Cardeneti Mendes
felipemendes@scylladb.com
@cardeneti82118
@fee-mendes
Find me on LinkedIn!
Pete Aven
pete.aven@scylladb.com
@peteaven
@wpaven
Find me on LinkedIn!

MongoDB to ScyllaDB: Technical Comparison and the Path to Success

  • 1. MongoDB vs ScyllaDB: Path to Success Felipe Mendes, Solution Architect at ScyllaDB Pete Aven, Solution Architect at ScyllaDB
  • 2. Meet The Team Felipe Cardeneti Mendes ■ Published Author ■ ScyllaDB Committer ■ IT Specialist & Solution Architect Pete Aven ■ Data Engineer ■ Application Development ■ ScyllaDB Solution Architect
  • 6. Lower Node Count Efficient hardware usage results in direct savings Where Do You Wanna Be? Predictable, Low Latencies Consistent single-digit millisecond p99 latencies Autonomous Self-Tuning Self-optimizing, smaller footprint, easy to use Always-On Architecture Active-active deployments for mission-critical workloads
  • 8. Data Model & Query Language ■ MQL – Mongo Query Language ■ Documents stored in BSON format ■ Schemaless ■ Favors Query Flexibility over Performance
  • 9. Data Model & Query Language ■ CQL or DynamoDB API ■ Wide-Column Representation ■ Schema-oriented ■ Favors Performance over Query Flexibility
  • 10. MongoDB – Replica Sets ■ Primary-Secondary ■ Impact on Write Throughput ■ Election on Failure ■ Scaling Limitations ■ Up to 50 members ■ Arbiters ■ Non-voting members ■ Limited Storage/Replica* ■ Affects Read Scaling (* MongoDB Atlas) [Diagram: REPLICA SET, 3x replication; Reads, Reads (optional), Reads (optional)]
  • 11. MongoDB – Sharded Cluster ■ A Shard is a Replica Set ■ Adds Overhead & Complexity ■ Further Availability Concerns ■ Underutilized Capacity ■ Sharding ■ Performed Manually ■ Hashed vs Ranged ■ Still Requires a Query Router [Diagram: SHARDED CLUSTER, 3 shards, 3x replication]
  • 12. ScyllaDB – Sharding & Replication ■ Simplified Architecture ■ Equal Nodes ■ Sharded to the Core ■ Maximizes Infrastructure Power ■ Unlimited Scale ■ Always-On ■ No Limits ■ Fully Autonomous
  • 13. Independent Benchmark Results "ScyllaDB outperforms MongoDB in 132 of 133 measurements." – Daniel Seybold, benchANT CTO Read the full report
  • 14. "MongoDB is Web Scale" "The scalability results demonstrate that ScyllaDB achieves up to near linear scalability, while MongoDB shows less efficient horizontal scalability." – Daniel Seybold, benchANT CTO Read the full report
  • 16. It’s a Simple Matter of Programming(™)
  • 19. Mental Exercise
      MemeGroup [
        { id: simpsons, title: 'The Simpsons', author: uid1, tags: ['homer', 'lisa'], photos: [pid1, pid2], views: 200 },
        { id: comics, title: 'Comics', author: uid2, tags: ['drawing', 'cartoon'], photos: [pid1, pid2, pid3], views: 300 },
        { id: matrix, title: 'Matrix', author: uid1, tags: ['neo', 'morpheus', 'pill'], photos: [pid4, pid5], views: 50 }
      ]
      Memes [
        { id: homer_grass, title: 'Homer Hiding', location: p1, tags: ['hiding', 'grass'], likes: 30, views: 200 },
        { id: morpheus_pill, title: 'Choose Destiny', location: p2, tags: ['pills'], likes: 99, views: 300 },
        { id: trollface, title: 'Troll face', location: p3, tags: ['troll', 'face'], likes: 9999999, views: NaN }
      ]
      ■ Queries:
        ■ Display all Memes posted by a given Author, sorted by time
        ■ Retrieve all Groups of Memes matching a particular Tag
  • 20. Mental Exercise
      ■ Queries:
        ■ Display all Memes posted by a given Author, sorted by time
        ■ Retrieve all Groups of Memes matching a particular Tag
      CREATE TABLE MemeByAuthor (
        author uuid,
        meme_id uuid,
        post_time timeuuid,
        title text,
        tags frozen<set<text>>,
        path text,
        PRIMARY KEY(author, post_time)
      );
      CREATE TABLE MemeGroup (
        group_id uuid,
        author_id uuid,
        title text,
        tags set<text>,
        memes uuid,
        PRIMARY KEY(group_id, memes)
      );
      CREATE INDEX ON MemeGroup(tags);
      Adapted from StackOverflow
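The query-driven layout on slide 20 can be mimicked without a database. The following is only a minimal in-memory sketch, not driver code: the dictionaries, function names, and sample rows are hypothetical, and which author posted which meme is assumed for illustration. It shows why each query becomes a single lookup once data is stored the way it will be read.

```python
from bisect import insort
from collections import defaultdict

# Hypothetical in-memory stand-ins for the two tables (not driver code).
meme_by_author = defaultdict(list)  # author -> rows kept sorted by post_time
groups_by_tag = defaultdict(set)    # tag -> group ids, mimicking the index on tags


def insert_meme(author, post_time, title):
    # Keep each partition ordered by the clustering key (post_time).
    insort(meme_by_author[author], (post_time, title))


def insert_group(group_id, tags):
    # Maintain the tag lookup as groups are written.
    for tag in tags:
        groups_by_tag[tag].add(group_id)


def memes_for_author(author):
    # One partition lookup; rows are already sorted by time.
    return [title for _, title in meme_by_author.get(author, [])]


def groups_for_tag(tag):
    # Lookup by tag, analogous to querying through the index on tags.
    return sorted(groups_by_tag.get(tag, set()))


# Sample rows (assumed for illustration).
insert_meme("uid1", 3, "Troll face")
insert_meme("uid1", 1, "Homer Hiding")
insert_meme("uid2", 2, "Choose Destiny")
insert_group("comics", ["drawing", "cartoon"])
insert_group("matrix", ["neo", "morpheus", "pill"])
```

Calling memes_for_author("uid1") returns that author's titles in time order without any sorting at read time, which is the property the PRIMARY KEY(author, post_time) definition is after.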
  • 21. ScyllaDB JSON Support INSERT INTO mytable JSON '{ "\"_id\"": 1, "title": "Reservoir Dogs", "director": ...}' SELECT and INSERT statements
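As a rough illustration of the INSERT ... JSON form on slide 21, the sketch below serializes a document and wraps it in a statement string. The helper name, the id column, and the sample row are assumptions; the only CQL rule relied on is that a single quote inside a string literal is escaped by doubling it. In application code a prepared statement with a bound JSON value would normally be preferable to string building.

```python
import json


def insert_json_statement(table, row):
    # Hypothetical helper: serialize the row and embed it as a CQL string literal.
    payload = json.dumps(row)
    # CQL string literals escape a single quote by doubling it.
    escaped = payload.replace("'", "''")
    return f"INSERT INTO {table} JSON '{escaped}'"


statement = insert_json_statement("mytable", {"id": 1, "title": "Reservoir Dogs"})
```

The resulting string can be inspected or logged before being handed to whatever client executes it.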
  • 24. Before Migrating ■ Understand the differences between a document vs wide-column ■ Denormalize and design a query-driven approach ■ Optimize your model for fast retrieval & ingestion ■ Take the opportunity to re-evaluate your access patterns! ■ Get to know the available data-types, indexes, collections and UDTs in ScyllaDB ■ Implement highly concurrent asynchronous code in your application ■ BLAST ScyllaDB – It WILL handle it!
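The advice on slide 24 to write highly concurrent asynchronous code could look roughly like the sketch below. It is an assumption-laden illustration: write_row is a hypothetical stand-in for an asynchronous driver call, and the limit of 64 in-flight requests is an arbitrary example value, not a recommendation from the slides.

```python
import asyncio


async def write_row(row):
    # Hypothetical stand-in for an asynchronous driver write.
    await asyncio.sleep(0)
    return row["id"]


async def write_all(rows, max_in_flight=64):
    # The semaphore caps how many writes are outstanding at once.
    limit = asyncio.Semaphore(max_in_flight)

    async def guarded(row):
        async with limit:
            return await write_row(row)

    # gather returns results in the same order as the input rows.
    return await asyncio.gather(*(guarded(row) for row in rows))


rows = [{"id": i} for i in range(1000)]
written = asyncio.run(write_all(rows))
```

Bounding the number of in-flight requests keeps the client from queueing unbounded work while still issuing many writes in parallel.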
  • 25. Migration Walkthrough ETL Application Writes & Reads Reads Forklift Historical Data Source Connector MongoDB Change Streams ScyllaDB Sink Connector Writes & Reads
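For the change-stream leg of the walkthrough on slide 25, a connector has to translate each MongoDB change event into a write on the target side. The sketch below is a simplified assumption of that translation step, not the actual connector logic: the function name and the returned tuple shape are hypothetical, and only the operationType, documentKey, fullDocument, and updateDescription.updatedFields fields of a change event are used.

```python
def to_row_operation(event):
    # Hypothetical translation of one change event into (action, key, fields).
    key = event["documentKey"]["_id"]
    op = event["operationType"]
    if op in ("insert", "replace"):
        # The full document becomes the row; the key is carried separately.
        fields = {k: v for k, v in event["fullDocument"].items() if k != "_id"}
        return ("upsert", key, fields)
    if op == "update":
        # Only changed fields are written; removed fields are ignored in this sketch.
        return ("upsert", key, dict(event["updateDescription"]["updatedFields"]))
    if op == "delete":
        return ("delete", key, None)
    raise ValueError(f"unsupported operation: {op}")


insert_event = {
    "operationType": "insert",
    "documentKey": {"_id": 1},
    "fullDocument": {"_id": 1, "title": "Homer Hiding"},
}
operation = to_row_operation(insert_event)
```

Treating inserts, replaces, and updates uniformly as upserts fits a target where writes to an existing key overwrite the stored values.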
  • 27. Trillions of messages ■ 60% fewer nodes ■ 5 ms p99 "We knew we were not going to use MongoDB sharding because it is complicated to use and not known for stability" Stanislav Vishnevskiy ■ 2015: Million ■ 2016: Billion ■ 2019: Trillion ■ How Discord Stores Billions and Trillions of
  • 28. Unleashing the Power of Data with Scylla "We were looking for something really specific. A highly scalable, and high performance NoSQL database. The answer was simple, ScyllaDB is a better fit for our use case." João Pedro Voltani
  • 29. MongoDB Versus ScyllaDB, in Production at Numberly "We are happy users of both, because we select the use cases based on their strengths" Alexys Jacob ■ MongoDB is very strong in web backends and real time queries over unpredictable behavioral data. We need to query and operate on top of this data. This is where MongoDB flexibility is good for us. ■ We use ScyllaDB for real-time and latency sensitive pipelines, mixed batch and real-time workloads, and web backends in GraphQL (imposing a schema). We see a trend into ScyllaDB today. [Timeline figure, 2008 to 2019, with Numberly adoption markers: MongoDB OSS (AGPL), MongoDB 1.4 stable, MongoDB 2.0, MongoDB 3.0, WiredTiger (3.2), MongoDB 3.6, MongoDB 4.0 (SSPL), MongoDB 4.2; Cassandra OSS (Apache), Cassandra 1.0, Cassandra 2.0, Cassandra 3.0; ScyllaDB OSS (AGPL), ScyllaDB 1.0, ScyllaDB 2.0, ScyllaDB 3.0]
  • 30. Takeaways ■ ScyllaDB is the database for Performance ■ ScyllaDB and MongoDB are different databases ■ Knowing the differences is key for a successful transition ■ Identify the optimal data model ■ Follow a query-driven & performance-oriented design approach ■ Both have large driver and connector ecosystems ■ The path for migration is straightforward, and we're here to help! ■ Customers met their Performance challenges with ScyllaDB
  • 31. Stay in Touch Felipe Cardeneti Mendes felipemendes@scylladb.com @cardeneti82118 @fee-mendes Find me on LinkedIn! Pete Aven pete.aven@scylladb.com @peteaven @wpaven Find me on LinkedIn!