An introduction to Apache Kafka® - Jfokus
© 2019 IBM Corporation
An introduction to Apache Kafka®
Kate Stanley
This is event-streaming, not just messaging
IBM Event Streams | Apache Kafka | Jfokus 2019
87% of companies are transforming to be more customer-centric
Source: A commissioned study conducted by Forrester Consulting on behalf of IBM, September 2016
Typical Event-driven Use Case | Customer Satisfaction
‘Zoom Air’ is a commercial airline
Re-accommodate passengers before they realize their journey has been disrupted
Data-centric to event-centric
Source: Gartner, May 2017, “CIO Challenge: Adopt Event-Centric IT for Digital Business Success”
Event-Driven in Action
Getting data to where it’s needed, before it’s needed
Respond to events before the moment passes
Responsive & personalised customer experiences
Bring real time intelligence to your apps
Components of an event-streaming application
[Diagram: apps connected through an Event Backbone]
Building blocks:
1 :: Event Sources
2 :: Stream Processing
3 :: Event Archive
4 :: Notifications
Key Use Cases
Event input buffer for data analytics
Bridge to cloud-native apps
Event-driven microservices
[Diagram: apps and analytics services connected through an Event Backbone]
Events vs Messaging
EVENT STREAMING: stream history, immutable data, scalable consumption
MESSAGE QUEUING: request/reply, transient data persistence, targeted reliable delivery
Properties of the central event backbone: stream history, immutable data, highly available, scalable, many consumers
Apache Kafka is an open source, distributed streaming platform
Publish and subscribe to streams of events
Store events in a durable way
Process streams of events as they occur
Why is Apache Kafka so popular?
Business Trends
Engaging apps
Decisions driven by data
Apps that derive insight from large volumes of data
Apps that react to changing events

Technology Trends
Stateless apps
Runs natively on cloud
Immutable event stream history
Event stream replay
Replication for HA
Naturally scales horizontally
Kafka arrived at the right time, captured mindshare among developers and exploded in popularity
Producers
[Diagram: a topic made up of three partitions; each partition is an append-only, ordered log of records numbered by offset, and producers append new records to the end of a partition]
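The way a record's key picks its partition can be sketched as below. Kafka's default partitioner actually hashes the key with murmur2; this sketch uses Java's `hashCode()` as a stand-in, so the partition numbers it produces will not match a real broker's, but the key property holds: the same key always maps to the same partition.

```java
public class PartitionSketch {
    // Stand-in for Kafka's default partitioner: hash the key, then take it
    // modulo the number of partitions (masking off the sign bit first).
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // Records with the same key always land in the same partition,
        // which is what preserves per-key ordering.
        System.out.println("key \"order-42\" -> partition " + partitionFor("order-42", 3));
    }
}
```

Because ordering is only guaranteed within a partition, this key-to-partition mapping is what lets, say, all events for one customer be consumed in order.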
Producers
Producer can choose acknowledgement level:
0   Fire-and-forget. Fast, but risky
1   Waits for 1 broker to acknowledge
ALL Waits for all replica brokers to acknowledge
Producer can choose whether to retry:
0   Do not retry. Loses messages on error
>0  Retry. Might result in duplicates on error
Producer can also choose idempotence: retries cannot introduce duplicates
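A minimal sketch of producer settings that combine these choices, assuming a broker on localhost:9092. The keys are the standard Kafka producer configuration names; plain `java.util.Properties` is used so the snippet stands alone without the kafka-clients jar on the classpath.

```java
import java.util.Properties;

public class ProducerConfigSketch {
    // Configuration aiming at "no lost messages, no duplicates".
    static Properties reliableProducerConfig() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption: local broker
        props.put("acks", "all");                // wait for all in-sync replicas
        props.put("retries", "2147483647");      // retry transient errors indefinitely
        props.put("enable.idempotence", "true"); // retries cannot create duplicates
        return props;
    }

    public static void main(String[] args) {
        System.out.println(reliableProducerConfig());
    }
}
```

These properties would be passed to `new KafkaProducer<>(props)` together with key and value serializers; with `acks=0` instead, sends return immediately but errors go unnoticed.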
Consumers
[Diagram: a partition with offsets 0–6; Consumer A has reached offset 2, Consumer B offset 5]
Consumer can choose how to commit offsets:
Automatic: commits might go faster than processing
Manual, asynchronous: fairly safe, but could re-process messages
Manual, synchronous: safe, but slows down processing
A common pattern is to commit offsets on a timer
Exactly once semantics: can group sending messages and committing offsets into transactions; primarily aimed at stream processing applications
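A sketch of the "manual, synchronous" style, again assuming a local broker; the group id `my-group` is an example, and the poll loop is shown as comments because running it needs a live broker and the kafka-clients jar.

```java
import java.util.Properties;

public class ConsumerConfigSketch {
    // Configuration for manual offset commits: the consumer only commits
    // after it has actually processed the records.
    static Properties manualCommitConfig() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption: local broker
        props.put("group.id", "my-group");                // assumption: example group id
        props.put("enable.auto.commit", "false");         // commit manually, not on a timer
        return props;
    }

    // With a real KafkaConsumer the loop would look like:
    //   while (true) {
    //       ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    //       for (ConsumerRecord<String, String> record : records) process(record);
    //       consumer.commitSync(); // safe, but blocks until the broker confirms
    //   }

    public static void main(String[] args) {
        System.out.println(manualCommitConfig());
    }
}
```

Swapping `commitSync()` for `commitAsync()` gives the "manual, asynchronous" trade-off: faster, but a failed commit can mean re-processing after a restart.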
Consumer Groups
[Diagram: Consumer Group A consuming a three-partition topic; the group tracks a committed offset per partition (p0 offset 7, p1 offset 3, p2 offset 5), and each partition is assigned to at most one consumer in the group]
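How partitions get spread across the consumers in a group can be sketched with a simplified version of Kafka's range-style assignment; the real assignors also handle multiple topics and rebalancing as consumers join or leave, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RangeAssignSketch {
    // Simplified range-style assignment: consecutive partitions per consumer,
    // with the first consumers taking one extra partition when the counts
    // do not divide evenly.
    static Map<Integer, List<Integer>> assign(int numPartitions, int numConsumers) {
        Map<Integer, List<Integer>> assignment = new LinkedHashMap<>();
        int base = numPartitions / numConsumers;
        int extra = numPartitions % numConsumers;
        int next = 0;
        for (int consumer = 0; consumer < numConsumers; consumer++) {
            int count = base + (consumer < extra ? 1 : 0);
            List<Integer> partitions = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                partitions.add(next++);
            }
            assignment.put(consumer, partitions);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Three partitions shared between two consumers:
        System.out.println(assign(3, 2)); // {0=[0, 1], 1=[2]}
    }
}
```

Note what falls out of this model: a group with more consumers than partitions leaves some consumers idle, which is why partition count caps a group's parallelism.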
Kafka Streams
Client library for processing and analyzing data stored in Kafka
Processing happens in the app
Supports per-record processing – no batching
KStream<String, String> source = builder.stream("my-input");
source.through("my-logger")
      .filter((key, value) -> key.equals("bingo"))
      .map((key, value) -> KeyValue.pair(key, value.toUpperCase()))
      .to("my-output");
[Diagram: topology my-input → THROUGH (my-logger) → FILTER → MAP → my-output]
Kafka Connect
Over 80 connectors: HDFS, Elasticsearch, MySQL, JDBC, IBM MQ, MQTT, CoAP, and many others
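As a sketch of what configuring a connector looks like, here is a standalone FileStreamSource configuration of the kind that ships with Kafka; the file path and topic name are placeholders.

```properties
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=/tmp/input.txt
topic=connect-test
```

Kafka Connect reads such a file, tails `/tmp/input.txt`, and publishes each line as a record to the `connect-test` topic, with no producer code written.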
Kafka console scripts
> bin/zookeeper-server-start.sh config/zookeeper.properties
[2018-09-22 15:01:37,495] INFO Reading configuration from: config/zookeeper.properties (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
...
> bin/kafka-server-start.sh config/server.properties
[2018-09-22 15:01:47,028] INFO Verifying properties (kafka.utils.VerifiableProperties)
[2018-09-22 15:01:47,051] INFO Property socket.send.buffer.bytes is overridden to 1048576 (kafka.utils.VerifiableProperties)
...
> bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
Kafka console scripts
> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
This is a message
This is another message
> bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
This is a message
This is another message