尊敬的 微信汇率:1円 ≈ 0.046166 元 支付宝汇率:1円 ≈ 0.046257元 [退出登录]
SlideShare a Scribd company logo
Kafka Streams
vs
Spark Structured Streaming
Modern Stream Processing Engines Compared
© Jacek Laskowski / @JacekLaskowski / jacek@japila.pl / DataMass Summit 2019
● A freelance IT consultant
● Specializing in Spark, Kafka, Kafka Streams, Scala
● Development | Consulting | Training
● "The Internals Of" online books
● Among contributors to Apache Spark
● Among Confluent Community Catalyst (Class of 2019 - 2020)
● Contact me at jacek@japila.pl
● Follow @JacekLaskowski on twitter for more #ApacheSpark
#ApacheKafka #KafkaStreams
Jacek Laskowski
Friendly reminder
Pictures...take a lot of pictures! 📷
© Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
The Features of Both
1. Stream Processing Engines
2. High-Level DSL for defining processing flow (logic)
a. Topology
b. Dataflow
3. Low-Level API for custom flows
4. Logical and physical plans
a. Logical “what” and executable “how”
© Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
Kafka Streams (1 of 2)
1. Kafka Streams 2.3.0
2. Java and Scala APIs
3. Yet Another Command-Line Application (“YACA”)
a. High-availability and fault tolerance OOTB
b. Creating consumer groups OOTB
4. Support for Apache Kafka only
a. Use Kafka Connect to go beyond Kafka
5. Data Abstractions
a. High-level Streams DSL (KStream, KTable, KGlobalTable)
b. Low-level Processor API
6. One record at a time
© Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
Kafka Streams (2 of 2)
1. ETL only
a. No support for SQL or Machine Learning
b. KSQL
2. Java 11 supported
3. Scala 2.12
4. No interactive shell / REPL for learning and prototyping
5. Rich join support (stream-stream, stream-table,
stream-global table)
6. Reading from and writing to a single Kafka cluster
7. Uses RocksDB for persistent state storage
© Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
Kafka Streams Code / Topology (1 of 2)
© Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
Kafka Streams Code / Execution Env (1 of 2)
© Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
Spark Structured Streaming (1 of 2)
1. Apache Spark 2.4.4
2. Stream Processing API for Scala, Java, Python, SQL
a. Useful for software developers and data scientists
3. Requires cluster manager
a. Apache Hadoop’s YARN / Apache Mesos / DC/OS / Spark
Standalone
4. Lots of data sources
a. Kafka, JSON, parquet, CSV, Avro, ORC, socket
b. Data Source API
5. Data abstraction: streaming Dataset
© Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
Spark Structured Streaming (2 of 2)
1. ETL + Machine Learning
a. Spark MLlib supports streaming Datasets
2. Java 8 only
3. Scala 2.12
4. spark-shell for learning and prototyping
5. Streaming joins and aggregations
6. Reading from one Kafka cluster and writing to another
Kafka cluster
7. Uses Hadoop DFS (HDFS) for checkpointing and
persistent state storage
© Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
Spark “Streams” Code / Loading Data (1 of 3)
© Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
Spark “Streams” Code / Processing (2 of 3)
© Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
Spark “Streams” Code / Saving Data (3 of 3)
© Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
“The Internals Of” Online Books
1. The Internals of Kafka Streams
2. The Internals of Spark Structured Streaming
Questions?
1. Follow @jaceklaskowski on twitter (DMs open)
2. Upvote my questions and answers on StackOverflow
3. Contact me at jacek@japila.pl
4. Connect with me at LinkedIn
© Jacek Laskowski / @JacekLaskowski / jacek@japila.pl

More Related Content

What's hot

Writing Continuous Applications with Structured Streaming PySpark API
Writing Continuous Applications with Structured Streaming PySpark APIWriting Continuous Applications with Structured Streaming PySpark API
Writing Continuous Applications with Structured Streaming PySpark API
Databricks
 
Spark Summit San Francisco 2016 - Matei Zaharia Keynote: Apache Spark 2.0
Spark Summit San Francisco 2016 - Matei Zaharia Keynote: Apache Spark 2.0Spark Summit San Francisco 2016 - Matei Zaharia Keynote: Apache Spark 2.0
Spark Summit San Francisco 2016 - Matei Zaharia Keynote: Apache Spark 2.0
Databricks
 
Whirlpools in the Stream with Jayesh Lalwani
 Whirlpools in the Stream with Jayesh Lalwani Whirlpools in the Stream with Jayesh Lalwani
Whirlpools in the Stream with Jayesh Lalwani
Databricks
 
Zeppelin at twitter (sf data science meetup, july 2016)
Zeppelin at twitter (sf data science meetup, july 2016)Zeppelin at twitter (sf data science meetup, july 2016)
Zeppelin at twitter (sf data science meetup, july 2016)
Prasad Wagle
 
Building Realtime Data Pipelines with Kafka Connect and Spark Streaming
Building Realtime Data Pipelines with Kafka Connect and Spark StreamingBuilding Realtime Data Pipelines with Kafka Connect and Spark Streaming
Building Realtime Data Pipelines with Kafka Connect and Spark Streaming
Jen Aman
 
Keynote at spark summit east anjul
Keynote at spark summit east anjulKeynote at spark summit east anjul
Keynote at spark summit east anjul
Anjul Bhambhri
 
Extending Spark SQL 2.4 with New Data Sources (Live Coding Session)
Extending Spark SQL 2.4 with New Data Sources (Live Coding Session)Extending Spark SQL 2.4 with New Data Sources (Live Coding Session)
Extending Spark SQL 2.4 with New Data Sources (Live Coding Session)
Databricks
 
Tangram: Distributed Scheduling Framework for Apache Spark at Facebook
Tangram: Distributed Scheduling Framework for Apache Spark at FacebookTangram: Distributed Scheduling Framework for Apache Spark at Facebook
Tangram: Distributed Scheduling Framework for Apache Spark at Facebook
Databricks
 
Efficient State Management With Spark 2.0 And Scale-Out Databases
Efficient State Management With Spark 2.0 And Scale-Out DatabasesEfficient State Management With Spark 2.0 And Scale-Out Databases
Efficient State Management With Spark 2.0 And Scale-Out Databases
Jen Aman
 
Databricks with R: Deep Dive
Databricks with R: Deep DiveDatabricks with R: Deep Dive
Databricks with R: Deep Dive
Databricks
 
Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...
Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...
Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...
Databricks
 
Spark Summit EU talk by Stephan Kessler
Spark Summit EU talk by Stephan KesslerSpark Summit EU talk by Stephan Kessler
Spark Summit EU talk by Stephan Kessler
Spark Summit
 
Future of data visualization
Future of data visualizationFuture of data visualization
Future of data visualization
hadoopsphere
 
What to Expect for Big Data and Apache Spark in 2017
What to Expect for Big Data and Apache Spark in 2017 What to Expect for Big Data and Apache Spark in 2017
What to Expect for Big Data and Apache Spark in 2017
Databricks
 
SparkR + Zeppelin
SparkR + ZeppelinSparkR + Zeppelin
SparkR + Zeppelin
felixcss
 
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
Lucidworks
 
Automatic Query-Centric API for Routine Access to Linked Data
Automatic Query-Centric API for Routine Access to Linked DataAutomatic Query-Centric API for Routine Access to Linked Data
Automatic Query-Centric API for Routine Access to Linked Data
Albert Meroño-Peñuela
 
An Introduction to Sparkling Water by Michal Malohlava
An Introduction to Sparkling Water by Michal MalohlavaAn Introduction to Sparkling Water by Michal Malohlava
An Introduction to Sparkling Water by Michal Malohlava
Spark Summit
 
What's New in Spark 2?
What's New in Spark 2?What's New in Spark 2?
What's New in Spark 2?
Eyal Ben Ivri
 
Performance of Spark vs MapReduce
Performance of Spark vs MapReducePerformance of Spark vs MapReduce
Performance of Spark vs MapReduce
Edureka!
 

What's hot (20)

Writing Continuous Applications with Structured Streaming PySpark API
Writing Continuous Applications with Structured Streaming PySpark APIWriting Continuous Applications with Structured Streaming PySpark API
Writing Continuous Applications with Structured Streaming PySpark API
 
Spark Summit San Francisco 2016 - Matei Zaharia Keynote: Apache Spark 2.0
Spark Summit San Francisco 2016 - Matei Zaharia Keynote: Apache Spark 2.0Spark Summit San Francisco 2016 - Matei Zaharia Keynote: Apache Spark 2.0
Spark Summit San Francisco 2016 - Matei Zaharia Keynote: Apache Spark 2.0
 
Whirlpools in the Stream with Jayesh Lalwani
 Whirlpools in the Stream with Jayesh Lalwani Whirlpools in the Stream with Jayesh Lalwani
Whirlpools in the Stream with Jayesh Lalwani
 
Zeppelin at twitter (sf data science meetup, july 2016)
Zeppelin at twitter (sf data science meetup, july 2016)Zeppelin at twitter (sf data science meetup, july 2016)
Zeppelin at twitter (sf data science meetup, july 2016)
 
Building Realtime Data Pipelines with Kafka Connect and Spark Streaming
Building Realtime Data Pipelines with Kafka Connect and Spark StreamingBuilding Realtime Data Pipelines with Kafka Connect and Spark Streaming
Building Realtime Data Pipelines with Kafka Connect and Spark Streaming
 
Keynote at spark summit east anjul
Keynote at spark summit east anjulKeynote at spark summit east anjul
Keynote at spark summit east anjul
 
Extending Spark SQL 2.4 with New Data Sources (Live Coding Session)
Extending Spark SQL 2.4 with New Data Sources (Live Coding Session)Extending Spark SQL 2.4 with New Data Sources (Live Coding Session)
Extending Spark SQL 2.4 with New Data Sources (Live Coding Session)
 
Tangram: Distributed Scheduling Framework for Apache Spark at Facebook
Tangram: Distributed Scheduling Framework for Apache Spark at FacebookTangram: Distributed Scheduling Framework for Apache Spark at Facebook
Tangram: Distributed Scheduling Framework for Apache Spark at Facebook
 
Efficient State Management With Spark 2.0 And Scale-Out Databases
Efficient State Management With Spark 2.0 And Scale-Out DatabasesEfficient State Management With Spark 2.0 And Scale-Out Databases
Efficient State Management With Spark 2.0 And Scale-Out Databases
 
Databricks with R: Deep Dive
Databricks with R: Deep DiveDatabricks with R: Deep Dive
Databricks with R: Deep Dive
 
Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...
Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...
Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...
 
Spark Summit EU talk by Stephan Kessler
Spark Summit EU talk by Stephan KesslerSpark Summit EU talk by Stephan Kessler
Spark Summit EU talk by Stephan Kessler
 
Future of data visualization
Future of data visualizationFuture of data visualization
Future of data visualization
 
What to Expect for Big Data and Apache Spark in 2017
What to Expect for Big Data and Apache Spark in 2017 What to Expect for Big Data and Apache Spark in 2017
What to Expect for Big Data and Apache Spark in 2017
 
SparkR + Zeppelin
SparkR + ZeppelinSparkR + Zeppelin
SparkR + Zeppelin
 
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
 
Automatic Query-Centric API for Routine Access to Linked Data
Automatic Query-Centric API for Routine Access to Linked DataAutomatic Query-Centric API for Routine Access to Linked Data
Automatic Query-Centric API for Routine Access to Linked Data
 
An Introduction to Sparkling Water by Michal Malohlava
An Introduction to Sparkling Water by Michal MalohlavaAn Introduction to Sparkling Water by Michal Malohlava
An Introduction to Sparkling Water by Michal Malohlava
 
What's New in Spark 2?
What's New in Spark 2?What's New in Spark 2?
What's New in Spark 2?
 
Performance of Spark vs MapReduce
Performance of Spark vs MapReducePerformance of Spark vs MapReduce
Performance of Spark vs MapReduce
 

Similar to  Kafka Streams VS Spark Structured Streaming - Modern Stream Processing Engines Compared

Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Helena Edelson
 
Apache Spark Overview
Apache Spark OverviewApache Spark Overview
Apache Spark Overview
Dharmjit Singh
 
Spark streaming state of the union
Spark streaming state of the unionSpark streaming state of the union
Spark streaming state of the union
Databricks
 
실시간 Streaming using Spark and Kafka 강의교재
실시간 Streaming using Spark and Kafka 강의교재실시간 Streaming using Spark and Kafka 강의교재
실시간 Streaming using Spark and Kafka 강의교재
hkyoon2
 
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Helena Edelson
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
Simplilearn
 
Apache Spark vs Apache Flink
Apache Spark vs Apache FlinkApache Spark vs Apache Flink
Apache Spark vs Apache Flink
AKASH SIHAG
 
Deep Dive Into Kafka Streams (and the Distributed Stream Processing Engine) (...
Deep Dive Into Kafka Streams (and the Distributed Stream Processing Engine) (...Deep Dive Into Kafka Streams (and the Distributed Stream Processing Engine) (...
Deep Dive Into Kafka Streams (and the Distributed Stream Processing Engine) (...
confluent
 
Edbt19 paper 329
Edbt19 paper 329Edbt19 paper 329
Apache Spark Streaming
Apache Spark StreamingApache Spark Streaming
Apache Spark Streaming
Bartosz Jankiewicz
 
Introduction to Kafka Streams Presentation
Introduction to Kafka Streams PresentationIntroduction to Kafka Streams Presentation
Introduction to Kafka Streams Presentation
Knoldus Inc.
 
What is Apache Kafka®?
What is Apache Kafka®?What is Apache Kafka®?
What is Apache Kafka®?
Eventador
 
What is apache Kafka?
What is apache Kafka?What is apache Kafka?
What is apache Kafka?
Kenny Gorman
 
Apache Kafka 0.11 の Exactly Once Semantics
Apache Kafka 0.11 の Exactly Once SemanticsApache Kafka 0.11 の Exactly Once Semantics
Apache Kafka 0.11 の Exactly Once Semantics
Yoshiyasu SAEKI
 
Kafka Streams for Java enthusiasts
Kafka Streams for Java enthusiastsKafka Streams for Java enthusiasts
Kafka Streams for Java enthusiasts
Slim Baltagi
 
Spark Streaming and MLlib - Hyderabad Spark Group
Spark Streaming and MLlib - Hyderabad Spark GroupSpark Streaming and MLlib - Hyderabad Spark Group
Spark Streaming and MLlib - Hyderabad Spark Group
Phaneendra Chiruvella
 
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...
Michael Noll
 
Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...
Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...
Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...
Simplilearn
 
Sviluppare applicazioni nell'era dei "Big Data" con Scala e Spark - Mario Car...
Sviluppare applicazioni nell'era dei "Big Data" con Scala e Spark - Mario Car...Sviluppare applicazioni nell'era dei "Big Data" con Scala e Spark - Mario Car...
Sviluppare applicazioni nell'era dei "Big Data" con Scala e Spark - Mario Car...
Codemotion
 
BBL KAPPA Lesfurets.com
BBL KAPPA Lesfurets.comBBL KAPPA Lesfurets.com
BBL KAPPA Lesfurets.com
Cedric Vidal
 

Similar to  Kafka Streams VS Spark Structured Streaming - Modern Stream Processing Engines Compared (20)

Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
 
Apache Spark Overview
Apache Spark OverviewApache Spark Overview
Apache Spark Overview
 
Spark streaming state of the union
Spark streaming state of the unionSpark streaming state of the union
Spark streaming state of the union
 
실시간 Streaming using Spark and Kafka 강의교재
실시간 Streaming using Spark and Kafka 강의교재실시간 Streaming using Spark and Kafka 강의교재
실시간 Streaming using Spark and Kafka 강의교재
 
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
 
Apache Spark vs Apache Flink
Apache Spark vs Apache FlinkApache Spark vs Apache Flink
Apache Spark vs Apache Flink
 
Deep Dive Into Kafka Streams (and the Distributed Stream Processing Engine) (...
Deep Dive Into Kafka Streams (and the Distributed Stream Processing Engine) (...Deep Dive Into Kafka Streams (and the Distributed Stream Processing Engine) (...
Deep Dive Into Kafka Streams (and the Distributed Stream Processing Engine) (...
 
Edbt19 paper 329
Edbt19 paper 329Edbt19 paper 329
Edbt19 paper 329
 
Apache Spark Streaming
Apache Spark StreamingApache Spark Streaming
Apache Spark Streaming
 
Introduction to Kafka Streams Presentation
Introduction to Kafka Streams PresentationIntroduction to Kafka Streams Presentation
Introduction to Kafka Streams Presentation
 
What is Apache Kafka®?
What is Apache Kafka®?What is Apache Kafka®?
What is Apache Kafka®?
 
What is apache Kafka?
What is apache Kafka?What is apache Kafka?
What is apache Kafka?
 
Apache Kafka 0.11 の Exactly Once Semantics
Apache Kafka 0.11 の Exactly Once SemanticsApache Kafka 0.11 の Exactly Once Semantics
Apache Kafka 0.11 の Exactly Once Semantics
 
Kafka Streams for Java enthusiasts
Kafka Streams for Java enthusiastsKafka Streams for Java enthusiasts
Kafka Streams for Java enthusiasts
 
Spark Streaming and MLlib - Hyderabad Spark Group
Spark Streaming and MLlib - Hyderabad Spark GroupSpark Streaming and MLlib - Hyderabad Spark Group
Spark Streaming and MLlib - Hyderabad Spark Group
 
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...
 
Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...
Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...
Apache Spark Architecture | Apache Spark Architecture Explained | Apache Spar...
 
Sviluppare applicazioni nell'era dei "Big Data" con Scala e Spark - Mario Car...
Sviluppare applicazioni nell'era dei "Big Data" con Scala e Spark - Mario Car...Sviluppare applicazioni nell'era dei "Big Data" con Scala e Spark - Mario Car...
Sviluppare applicazioni nell'era dei "Big Data" con Scala e Spark - Mario Car...
 
BBL KAPPA Lesfurets.com
BBL KAPPA Lesfurets.comBBL KAPPA Lesfurets.com
BBL KAPPA Lesfurets.com
 

More from Jacek Laskowski

Opening slides to Warsaw Scala FortyFives on Testing tools
Opening slides to Warsaw Scala FortyFives on Testing toolsOpening slides to Warsaw Scala FortyFives on Testing tools
Opening slides to Warsaw Scala FortyFives on Testing tools
Jacek Laskowski
 
#Be #social #FTW aka Your #Professional #Development with #StackOverflow #Git...
#Be #social #FTW aka Your #Professional #Development with #StackOverflow #Git...#Be #social #FTW aka Your #Professional #Development with #StackOverflow #Git...
#Be #social #FTW aka Your #Professional #Development with #StackOverflow #Git...
Jacek Laskowski
 
StackOverflow, GitHub, twitter, reddit i Twój profesjonalny rozwój
StackOverflow, GitHub, twitter, reddit i Twój profesjonalny rozwójStackOverflow, GitHub, twitter, reddit i Twój profesjonalny rozwój
StackOverflow, GitHub, twitter, reddit i Twój profesjonalny rozwój
Jacek Laskowski
 
Introduction to Web Application Development in Clojure
Introduction to Web Application Development in ClojureIntroduction to Web Application Development in Clojure
Introduction to Web Application Development in Clojure
Jacek Laskowski
 
Introduction to Functional Programming in Scala
Introduction to Functional Programming in ScalaIntroduction to Functional Programming in Scala
Introduction to Functional Programming in Scala
Jacek Laskowski
 
Moje pierwsze kroki w programowaniu funkcyjnym w Scali
Moje pierwsze kroki w programowaniu funkcyjnym w ScaliMoje pierwsze kroki w programowaniu funkcyjnym w Scali
Moje pierwsze kroki w programowaniu funkcyjnym w Scali
Jacek Laskowski
 
Functional web development with Git(Hub), Heroku and Clojure
Functional web development with Git(Hub), Heroku and ClojureFunctional web development with Git(Hub), Heroku and Clojure
Functional web development with Git(Hub), Heroku and Clojure
Jacek Laskowski
 
Praktyczne wprowadzenie do OSGi i Enterprise OSGi
Praktyczne wprowadzenie do OSGi i Enterprise OSGiPraktyczne wprowadzenie do OSGi i Enterprise OSGi
Praktyczne wprowadzenie do OSGi i Enterprise OSGi
Jacek Laskowski
 
Developing modular applications with Java EE 6 and Enterprise OSGi + WebSpher...
Developing modular applications with Java EE 6 and Enterprise OSGi + WebSpher...Developing modular applications with Java EE 6 and Enterprise OSGi + WebSpher...
Developing modular applications with Java EE 6 and Enterprise OSGi + WebSpher...
Jacek Laskowski
 
Apache Tomcat + Java EE = Apache TomEE
Apache Tomcat + Java EE = Apache TomEEApache Tomcat + Java EE = Apache TomEE
Apache Tomcat + Java EE = Apache TomEE
Jacek Laskowski
 
(map Clojure everyday-tasks)
(map Clojure everyday-tasks)(map Clojure everyday-tasks)
(map Clojure everyday-tasks)
Jacek Laskowski
 

More from Jacek Laskowski (11)

Opening slides to Warsaw Scala FortyFives on Testing tools
Opening slides to Warsaw Scala FortyFives on Testing toolsOpening slides to Warsaw Scala FortyFives on Testing tools
Opening slides to Warsaw Scala FortyFives on Testing tools
 
#Be #social #FTW aka Your #Professional #Development with #StackOverflow #Git...
#Be #social #FTW aka Your #Professional #Development with #StackOverflow #Git...#Be #social #FTW aka Your #Professional #Development with #StackOverflow #Git...
#Be #social #FTW aka Your #Professional #Development with #StackOverflow #Git...
 
StackOverflow, GitHub, twitter, reddit i Twój profesjonalny rozwój
StackOverflow, GitHub, twitter, reddit i Twój profesjonalny rozwójStackOverflow, GitHub, twitter, reddit i Twój profesjonalny rozwój
StackOverflow, GitHub, twitter, reddit i Twój profesjonalny rozwój
 
Introduction to Web Application Development in Clojure
Introduction to Web Application Development in ClojureIntroduction to Web Application Development in Clojure
Introduction to Web Application Development in Clojure
 
Introduction to Functional Programming in Scala
Introduction to Functional Programming in ScalaIntroduction to Functional Programming in Scala
Introduction to Functional Programming in Scala
 
Moje pierwsze kroki w programowaniu funkcyjnym w Scali
Moje pierwsze kroki w programowaniu funkcyjnym w ScaliMoje pierwsze kroki w programowaniu funkcyjnym w Scali
Moje pierwsze kroki w programowaniu funkcyjnym w Scali
 
Functional web development with Git(Hub), Heroku and Clojure
Functional web development with Git(Hub), Heroku and ClojureFunctional web development with Git(Hub), Heroku and Clojure
Functional web development with Git(Hub), Heroku and Clojure
 
Praktyczne wprowadzenie do OSGi i Enterprise OSGi
Praktyczne wprowadzenie do OSGi i Enterprise OSGiPraktyczne wprowadzenie do OSGi i Enterprise OSGi
Praktyczne wprowadzenie do OSGi i Enterprise OSGi
 
Developing modular applications with Java EE 6 and Enterprise OSGi + WebSpher...
Developing modular applications with Java EE 6 and Enterprise OSGi + WebSpher...Developing modular applications with Java EE 6 and Enterprise OSGi + WebSpher...
Developing modular applications with Java EE 6 and Enterprise OSGi + WebSpher...
 
Apache Tomcat + Java EE = Apache TomEE
Apache Tomcat + Java EE = Apache TomEEApache Tomcat + Java EE = Apache TomEE
Apache Tomcat + Java EE = Apache TomEE
 
(map Clojure everyday-tasks)
(map Clojure everyday-tasks)(map Clojure everyday-tasks)
(map Clojure everyday-tasks)
 

Recently uploaded

Direct Lake Deep Dive slides from Fabric Engineering Roadshow
Direct Lake Deep Dive slides from Fabric Engineering RoadshowDirect Lake Deep Dive slides from Fabric Engineering Roadshow
Direct Lake Deep Dive slides from Fabric Engineering Roadshow
Gabi Münster
 
Call Girls Hyderabad (india) ☎️ +91-7426014248 Hyderabad Call Girl
Call Girls Hyderabad  (india) ☎️ +91-7426014248 Hyderabad  Call GirlCall Girls Hyderabad  (india) ☎️ +91-7426014248 Hyderabad  Call Girl
Call Girls Hyderabad (india) ☎️ +91-7426014248 Hyderabad Call Girl
sapna sharmap11
 
Ahmedabad Call Girls 7339748667 With Free Home Delivery At Your Door
Ahmedabad Call Girls 7339748667 With Free Home Delivery At Your DoorAhmedabad Call Girls 7339748667 With Free Home Delivery At Your Door
Ahmedabad Call Girls 7339748667 With Free Home Delivery At Your Door
Russian Escorts in Delhi 9711199171 with low rate Book online
 
Mumbai Central Call Girls ☑ +91-9833325238 ☑ Available Hot Girls Aunty Book Now
Mumbai Central Call Girls ☑ +91-9833325238 ☑ Available Hot Girls Aunty Book NowMumbai Central Call Girls ☑ +91-9833325238 ☑ Available Hot Girls Aunty Book Now
Mumbai Central Call Girls ☑ +91-9833325238 ☑ Available Hot Girls Aunty Book Now
radhika ansal $A12
 
AI WITH THE HELP OF NAGALAND CAN WIN. DOWNLOAD NOW
AI WITH THE HELP OF NAGALAND CAN WIN. DOWNLOAD NOWAI WITH THE HELP OF NAGALAND CAN WIN. DOWNLOAD NOW
AI WITH THE HELP OF NAGALAND CAN WIN. DOWNLOAD NOW
arash10gamer
 
🔥College Call Girls Kolkata 💯Call Us 🔝 8094342248 🔝💃Top Class Call Girl Servi...
🔥College Call Girls Kolkata 💯Call Us 🔝 8094342248 🔝💃Top Class Call Girl Servi...🔥College Call Girls Kolkata 💯Call Us 🔝 8094342248 🔝💃Top Class Call Girl Servi...
🔥College Call Girls Kolkata 💯Call Us 🔝 8094342248 🔝💃Top Class Call Girl Servi...
rukmnaikaseen
 
Hyderabad Call Girls Service 🔥 9352988975 🔥 High Profile Call Girls Hyderabad
Hyderabad Call Girls Service 🔥 9352988975 🔥 High Profile Call Girls HyderabadHyderabad Call Girls Service 🔥 9352988975 🔥 High Profile Call Girls Hyderabad
Hyderabad Call Girls Service 🔥 9352988975 🔥 High Profile Call Girls Hyderabad
2004kavitajoshi
 
Hyderabad Call Girls 7339748667 With Free Home Delivery At Your Door
Hyderabad Call Girls 7339748667 With Free Home Delivery At Your DoorHyderabad Call Girls 7339748667 With Free Home Delivery At Your Door
Hyderabad Call Girls 7339748667 With Free Home Delivery At Your Door
Russian Escorts in Delhi 9711199171 with low rate Book online
 
❣VIP Call Girls Chennai 💯Call Us 🔝 7737669865 🔝💃Independent Chennai Escorts S...
❣VIP Call Girls Chennai 💯Call Us 🔝 7737669865 🔝💃Independent Chennai Escorts S...❣VIP Call Girls Chennai 💯Call Us 🔝 7737669865 🔝💃Independent Chennai Escorts S...
❣VIP Call Girls Chennai 💯Call Us 🔝 7737669865 🔝💃Independent Chennai Escorts S...
jasodak99
 
🔥Mature Women / Aunty Call Girl Chennai 💯Call Us 🔝 8094342248 🔝💃Top Class Cal...
🔥Mature Women / Aunty Call Girl Chennai 💯Call Us 🔝 8094342248 🔝💃Top Class Cal...🔥Mature Women / Aunty Call Girl Chennai 💯Call Us 🔝 8094342248 🔝💃Top Class Cal...
🔥Mature Women / Aunty Call Girl Chennai 💯Call Us 🔝 8094342248 🔝💃Top Class Cal...
shivangimorya083
 
Mumbai Call Girls service 9920874524 Call Girl service in Mumbai Mumbai Call ...
Mumbai Call Girls service 9920874524 Call Girl service in Mumbai Mumbai Call ...Mumbai Call Girls service 9920874524 Call Girl service in Mumbai Mumbai Call ...
Mumbai Call Girls service 9920874524 Call Girl service in Mumbai Mumbai Call ...
hanshkumar9870
 
SAP BW4HANA Implementagtion Content Document
SAP BW4HANA Implementagtion Content DocumentSAP BW4HANA Implementagtion Content Document
SAP BW4HANA Implementagtion Content Document
newdirectionconsulta
 
IBM watsonx.data - Seller Enablement Deck.PPTX
IBM watsonx.data - Seller Enablement Deck.PPTXIBM watsonx.data - Seller Enablement Deck.PPTX
IBM watsonx.data - Seller Enablement Deck.PPTX
EbtsamRashed
 
CAP Excel Formulas & Functions July - Copy (4).pdf
CAP Excel Formulas & Functions July - Copy (4).pdfCAP Excel Formulas & Functions July - Copy (4).pdf
CAP Excel Formulas & Functions July - Copy (4).pdf
frp60658
 
Optimizing Feldera: Integrating Advanced UDFs and Enhanced SQL Functionality ...
Optimizing Feldera: Integrating Advanced UDFs and Enhanced SQL Functionality ...Optimizing Feldera: Integrating Advanced UDFs and Enhanced SQL Functionality ...
Optimizing Feldera: Integrating Advanced UDFs and Enhanced SQL Functionality ...
mparmparousiskostas
 
一比一原版(sfu学位证书)西蒙弗雷泽大学毕业证如何办理
一比一原版(sfu学位证书)西蒙弗雷泽大学毕业证如何办理一比一原版(sfu学位证书)西蒙弗雷泽大学毕业证如何办理
一比一原版(sfu学位证书)西蒙弗雷泽大学毕业证如何办理
gebegu
 
Call Girls Goa👉9024918724👉Low Rate Escorts in Goa 💃 Available 24/7
Call Girls Goa👉9024918724👉Low Rate Escorts in Goa 💃 Available 24/7Call Girls Goa👉9024918724👉Low Rate Escorts in Goa 💃 Available 24/7
Call Girls Goa👉9024918724👉Low Rate Escorts in Goa 💃 Available 24/7
nitachopra
 
saps4hanaandsapanalyticswheretodowhat1565272000538.pdf
saps4hanaandsapanalyticswheretodowhat1565272000538.pdfsaps4hanaandsapanalyticswheretodowhat1565272000538.pdf
saps4hanaandsapanalyticswheretodowhat1565272000538.pdf
newdirectionconsulta
 
Salesforce AI + Data Community Tour Slides - Canarias
Salesforce AI + Data Community Tour Slides - CanariasSalesforce AI + Data Community Tour Slides - Canarias
Salesforce AI + Data Community Tour Slides - Canarias
davidpietrzykowski1
 
Do People Really Know Their Fertility Intentions? Correspondence between Sel...
Do People Really Know Their Fertility Intentions?  Correspondence between Sel...Do People Really Know Their Fertility Intentions?  Correspondence between Sel...
Do People Really Know Their Fertility Intentions? Correspondence between Sel...
Xiao Xu
 

Recently uploaded (20)

Direct Lake Deep Dive slides from Fabric Engineering Roadshow
Direct Lake Deep Dive slides from Fabric Engineering RoadshowDirect Lake Deep Dive slides from Fabric Engineering Roadshow
Direct Lake Deep Dive slides from Fabric Engineering Roadshow
 
Call Girls Hyderabad (india) ☎️ +91-7426014248 Hyderabad Call Girl
Call Girls Hyderabad  (india) ☎️ +91-7426014248 Hyderabad  Call GirlCall Girls Hyderabad  (india) ☎️ +91-7426014248 Hyderabad  Call Girl
Call Girls Hyderabad (india) ☎️ +91-7426014248 Hyderabad Call Girl
 
Ahmedabad Call Girls 7339748667 With Free Home Delivery At Your Door
Ahmedabad Call Girls 7339748667 With Free Home Delivery At Your DoorAhmedabad Call Girls 7339748667 With Free Home Delivery At Your Door
Ahmedabad Call Girls 7339748667 With Free Home Delivery At Your Door
 
Mumbai Central Call Girls ☑ +91-9833325238 ☑ Available Hot Girls Aunty Book Now
Mumbai Central Call Girls ☑ +91-9833325238 ☑ Available Hot Girls Aunty Book NowMumbai Central Call Girls ☑ +91-9833325238 ☑ Available Hot Girls Aunty Book Now
Mumbai Central Call Girls ☑ +91-9833325238 ☑ Available Hot Girls Aunty Book Now
 
AI WITH THE HELP OF NAGALAND CAN WIN. DOWNLOAD NOW
AI WITH THE HELP OF NAGALAND CAN WIN. DOWNLOAD NOWAI WITH THE HELP OF NAGALAND CAN WIN. DOWNLOAD NOW
AI WITH THE HELP OF NAGALAND CAN WIN. DOWNLOAD NOW
 
🔥College Call Girls Kolkata 💯Call Us 🔝 8094342248 🔝💃Top Class Call Girl Servi...
🔥College Call Girls Kolkata 💯Call Us 🔝 8094342248 🔝💃Top Class Call Girl Servi...🔥College Call Girls Kolkata 💯Call Us 🔝 8094342248 🔝💃Top Class Call Girl Servi...
🔥College Call Girls Kolkata 💯Call Us 🔝 8094342248 🔝💃Top Class Call Girl Servi...
 
Hyderabad Call Girls Service 🔥 9352988975 🔥 High Profile Call Girls Hyderabad
Hyderabad Call Girls Service 🔥 9352988975 🔥 High Profile Call Girls HyderabadHyderabad Call Girls Service 🔥 9352988975 🔥 High Profile Call Girls Hyderabad
Hyderabad Call Girls Service 🔥 9352988975 🔥 High Profile Call Girls Hyderabad
 
Hyderabad Call Girls 7339748667 With Free Home Delivery At Your Door
Hyderabad Call Girls 7339748667 With Free Home Delivery At Your DoorHyderabad Call Girls 7339748667 With Free Home Delivery At Your Door
Hyderabad Call Girls 7339748667 With Free Home Delivery At Your Door
 
❣VIP Call Girls Chennai 💯Call Us 🔝 7737669865 🔝💃Independent Chennai Escorts S...
❣VIP Call Girls Chennai 💯Call Us 🔝 7737669865 🔝💃Independent Chennai Escorts S...❣VIP Call Girls Chennai 💯Call Us 🔝 7737669865 🔝💃Independent Chennai Escorts S...
❣VIP Call Girls Chennai 💯Call Us 🔝 7737669865 🔝💃Independent Chennai Escorts S...
 
🔥Mature Women / Aunty Call Girl Chennai 💯Call Us 🔝 8094342248 🔝💃Top Class Cal...
🔥Mature Women / Aunty Call Girl Chennai 💯Call Us 🔝 8094342248 🔝💃Top Class Cal...🔥Mature Women / Aunty Call Girl Chennai 💯Call Us 🔝 8094342248 🔝💃Top Class Cal...
🔥Mature Women / Aunty Call Girl Chennai 💯Call Us 🔝 8094342248 🔝💃Top Class Cal...
 
Mumbai Call Girls service 9920874524 Call Girl service in Mumbai Mumbai Call ...
Mumbai Call Girls service 9920874524 Call Girl service in Mumbai Mumbai Call ...Mumbai Call Girls service 9920874524 Call Girl service in Mumbai Mumbai Call ...
Mumbai Call Girls service 9920874524 Call Girl service in Mumbai Mumbai Call ...
 
SAP BW4HANA Implementagtion Content Document
SAP BW4HANA Implementagtion Content DocumentSAP BW4HANA Implementagtion Content Document
SAP BW4HANA Implementagtion Content Document
 
IBM watsonx.data - Seller Enablement Deck.PPTX
IBM watsonx.data - Seller Enablement Deck.PPTXIBM watsonx.data - Seller Enablement Deck.PPTX
IBM watsonx.data - Seller Enablement Deck.PPTX
 
CAP Excel Formulas & Functions July - Copy (4).pdf
CAP Excel Formulas & Functions July - Copy (4).pdfCAP Excel Formulas & Functions July - Copy (4).pdf
CAP Excel Formulas & Functions July - Copy (4).pdf
 
Optimizing Feldera: Integrating Advanced UDFs and Enhanced SQL Functionality ...
Optimizing Feldera: Integrating Advanced UDFs and Enhanced SQL Functionality ...Optimizing Feldera: Integrating Advanced UDFs and Enhanced SQL Functionality ...
Optimizing Feldera: Integrating Advanced UDFs and Enhanced SQL Functionality ...
 
一比一原版(sfu学位证书)西蒙弗雷泽大学毕业证如何办理
一比一原版(sfu学位证书)西蒙弗雷泽大学毕业证如何办理一比一原版(sfu学位证书)西蒙弗雷泽大学毕业证如何办理
一比一原版(sfu学位证书)西蒙弗雷泽大学毕业证如何办理
 
Call Girls Goa👉9024918724👉Low Rate Escorts in Goa 💃 Available 24/7
Call Girls Goa👉9024918724👉Low Rate Escorts in Goa 💃 Available 24/7Call Girls Goa👉9024918724👉Low Rate Escorts in Goa 💃 Available 24/7
Call Girls Goa👉9024918724👉Low Rate Escorts in Goa 💃 Available 24/7
 
saps4hanaandsapanalyticswheretodowhat1565272000538.pdf
saps4hanaandsapanalyticswheretodowhat1565272000538.pdfsaps4hanaandsapanalyticswheretodowhat1565272000538.pdf
saps4hanaandsapanalyticswheretodowhat1565272000538.pdf
 
Salesforce AI + Data Community Tour Slides - Canarias
Salesforce AI + Data Community Tour Slides - CanariasSalesforce AI + Data Community Tour Slides - Canarias
Salesforce AI + Data Community Tour Slides - Canarias
 
Do People Really Know Their Fertility Intentions? Correspondence between Sel...
Do People Really Know Their Fertility Intentions?  Correspondence between Sel...Do People Really Know Their Fertility Intentions?  Correspondence between Sel...
Do People Really Know Their Fertility Intentions? Correspondence between Sel...
 

 Kafka Streams VS Spark Structured Streaming - Modern Stream Processing Engines Compared

  • 1. Kafka Streams vs Spark Structured Streaming Modern Stream Processing Engines Compared © Jacek Laskowski / @JacekLaskowski / jacek@japila.pl / DataMass Summit 2019
  • 2. ● A freelance IT consultant ● Specializing in Spark, Kafka, Kafka Streams, Scala ● Development | Consulting | Training ● "The Internals Of" online books ● Among contributors to Apache Spark ● Among Confluent Community Catalyst (Class of 2019 - 2020) ● Contact me at jacek@japila.pl ● Follow @JacekLaskowski on twitter for more #ApacheSpark #ApacheKafka #KafkaStreams Jacek Laskowski
  • 3. Friendly reminder Pictures...take a lot of pictures! 📷 © Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
  • 4. The Features of Both 1. Stream Processing Engines 2. High-Level DSL for defining processing flow (logic) a. Topology b. Dataflow 3. Low-Level API for custom flows 4. Logical and physical plans a. Logical “what” and executable “how” © Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
  • 5. Kafka Streams (1 of 2) 1. Kafka Streams 2.3.0 2. Java and Scala APIs 3. Yet Another Command-Line Application (“YACA”) a. High-availability and fault tolerance OOTB b. Creating consumer groups OOTB 4. Support for Apache Kafka only a. Use Kafka Connect to go beyond Kafka 5. Data Abstractions a. High-level Streams DSL (KStream, KTable, KGlobalTable) b. Low-level Processor API 6. One record at a time © Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
  • 6. Kafka Streams (2 of 2) 1. ETL only a. No support for SQL or Machine Learning b. KSQL 2. Java 11 supported 3. Scala 2.12 4. No interactive shell / REPL for learning and prototyping 5. Rich join support (stream-stream, stream-table, stream-global table) 6. Reading from and writing to a single Kafka cluster 7. Uses RocksDB for persistent state storage © Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
  • 7. Kafka Streams Code / Topology (1 of 2) © Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
  • 8. Kafka Streams Code / Execution Env (1 of 2) © Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
  • 9. Spark Structured Streaming (1 of 2) 1. Apache Spark 2.4.4 2. Stream Processing API for Scala, Java, Python, SQL a. Useful for software developers and data scientists 3. Requires cluster manager a. Apache Hadoop’s YARN / Apache Mesos / DC/OS / Spark Standalone 4. Lots of data sources a. Kafka, JSON, parquet, CSV, Avro, ORC, socket b. Data Source API 5. Data abstraction: streaming Dataset © Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
  • 10. Spark Structured Streaming (2 of 2) 1. ETL + Machine Learning a. Spark MLlib supports streaming Datasets 2. Java 8 only 3. Scala 2.12 4. spark-shell for learning and prototyping 5. Streaming joins and aggregations 6. Reading from one Kafka cluster and writing to another Kafka cluster 7. Uses Hadoop DFS (HDFS) for checkpointing and persistent state storage © Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
  • 11. Spark “Streams” Code / Loading Data (1 of 3) © Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
  • 12. Spark “Streams” Code / Processing (2 of 3) © Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
  • 13. Spark “Streams” Code / Saving Data (3 of 3) © Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
  • 14. “The Internals Of” Online Books 1. The Internals of Kafka Streams 2. The Internals of Spark Structured Streaming
  • 15. Questions? 1. Follow @jaceklaskowski on twitter (DMs open) 2. Upvote my questions and answers on StackOverflow 3. Contact me at jacek@japila.pl 4. Connect with me at LinkedIn © Jacek Laskowski / @JacekLaskowski / jacek@japila.pl
  翻译: