Apache Kafka Essentials Cheatsheet
1. Introduction
Apache Kafka is a distributed event streaming platform designed to handle large-scale real-time data streams. It was originally developed by LinkedIn and later open-sourced as an Apache project. Kafka is known for its high-throughput, fault-tolerance, scalability, and low-latency characteristics, making it an excellent choice for various use cases, such as real-time data pipelines, stream processing, log aggregation, and more.
Kafka follows a publish-subscribe messaging model, where producers publish messages to topics, and consumers subscribe to those topics to receive and process the messages.
2. Installing and Configuring Kafka
To get started with Apache Kafka, you need to download and set up the Kafka distribution. Here’s how you can do it:
2.1 Downloading Kafka
Visit the Apache Kafka website (https://kafka.apache.org/downloads) and download the latest stable version.
2.2 Extracting the Archive
After downloading the Kafka archive, extract it to your desired location using the following commands:
# Replace kafka_version with the version you downloaded
tar -xzf kafka_version.tgz
cd kafka_version
2.3 Configuring Kafka
Navigate to the config directory and modify the following configuration files as needed:
- server.properties: Main Kafka broker configuration.
- zookeeper.properties: ZooKeeper configuration for Kafka.
3. Starting Kafka and ZooKeeper
To run Kafka, you need to start ZooKeeper first, as Kafka depends on ZooKeeper for maintaining its cluster state. Here’s how to do it:
3.1 Starting ZooKeeper
bin/zookeeper-server-start.sh config/zookeeper.properties
3.2 Starting Kafka Broker
To start the Kafka broker, use the following command:
bin/kafka-server-start.sh config/server.properties
4. Creating and Managing Topics
Topics in Kafka are logical channels where messages are published and consumed. Let’s learn how to create and manage topics:
4.1 Creating a Topic
To create a topic, use the following command:
bin/kafka-topics.sh --create --topic my_topic --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1
In this example, we create a topic named my_topic with three partitions and a replication factor of 1.
4.2 Listing Topics
To list all the topics in the Kafka cluster, use the following command:
bin/kafka-topics.sh --list --bootstrap-server localhost:9092
4.3 Describing a Topic
To get detailed information about a specific topic, use the following command:
bin/kafka-topics.sh --describe --topic my_topic --bootstrap-server localhost:9092
5. Producing and Consuming Messages
Now that we have a topic, let’s explore how to produce and consume messages in Kafka.
5.1 Producing Messages
To produce messages to a Kafka topic, use the following command:
bin/kafka-console-producer.sh --topic my_topic --bootstrap-server localhost:9092
After running this command, you can start typing your messages. Press Enter to send each message.
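The console producer is handy for quick tests; from application code the same thing is done with the Kafka client library. Below is a minimal sketch in Java, assuming the kafka-clients dependency is on the classpath and a broker is running on localhost:9092; the class name, key, and value are illustrative.

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class SimpleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // try-with-resources closes the producer, which flushes any pending records
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my_topic", "key1", "hello kafka"));
        }
    }
}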
5.2 Consuming Messages
To consume messages from a Kafka topic, use the following command:
bin/kafka-console-consumer.sh --topic my_topic --bootstrap-server localhost:9092
This will start consuming messages from the specified topic in the console.
5.3 Consumer Groups
Consumer groups allow multiple consumers to work together to read from a topic. Each consumer in a group will get a subset of the messages. To use consumer groups, provide a group id when consuming messages:
bin/kafka-console-consumer.sh --topic my_topic --bootstrap-server localhost:9092 --group my_consumer_group
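The same group behavior applies to applications built on the Java client. Here is a minimal sketch, assuming the kafka-clients dependency is available; the group id matches the console example above and the class name is illustrative.

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class GroupConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "my_consumer_group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my_topic"));
            while (true) {
                // poll() returns records only from the partitions assigned to this group member
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }
}

Running two copies of this program with the same group.id splits the topic's partitions between them; each message is delivered to only one member of the group.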
6. Configuring Kafka Producers and Consumers
Kafka provides various configurations for producers and consumers to optimize their behavior. Here are some essential configurations:
6.1 Producer Configuration
To configure a Kafka producer, create a producer.properties file and set properties like bootstrap.servers, key.serializer, and value.serializer.
# producer.properties
bootstrap.servers=localhost:9092
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=org.apache.kafka.common.serialization.StringSerializer
Use the following command to run the producer with the specified configuration:
bin/kafka-console-producer.sh --topic my_topic --producer.config path/to/producer.properties
6.2 Consumer Configuration
For consumer configuration, create a consumer.properties file with properties like bootstrap.servers, key.deserializer, and value.deserializer.
# consumer.properties
bootstrap.servers=localhost:9092
key.deserializer=org.apache.kafka.common.serialization.StringDeserializer
value.deserializer=org.apache.kafka.common.serialization.StringDeserializer
group.id=my_consumer_group
Run the consumer using the configuration file:
bin/kafka-console-consumer.sh --topic my_topic --consumer.config path/to/consumer.properties
7. Kafka Connect
Kafka Connect is a powerful framework that allows you to easily integrate Apache Kafka with external systems. It is designed to provide scalable and fault-tolerant data movement between Kafka and other data storage systems or data processing platforms. Kafka Connect is ideal for building data pipelines and transferring data to and from Kafka without writing custom code for each integration.
Kafka Connect consists of two main components: Source Connectors and Sink Connectors.
7.1 Source Connectors
Source Connectors allow you to import data from various external systems into Kafka. They act as producers, capturing data from the source and writing it to Kafka topics. Some popular source connectors include:
- JDBC Source Connector: Captures data from relational databases using JDBC.
- FileStream Source Connector: Reads data from files in a specified directory and streams them to Kafka.
- Debezium Connectors: Provides connectors for capturing changes from various databases like MySQL, PostgreSQL, MongoDB, etc.
7.2 Sink Connectors
Sink Connectors allow you to export data from Kafka to external systems. They act as consumers, reading data from Kafka topics and writing it to the target systems. Some popular sink connectors include:
- JDBC Sink Connector: Writes data from Kafka topics to relational databases using JDBC.
- HDFS Sink Connector: Stores data from Kafka topics in Hadoop Distributed File System (HDFS).
- Elasticsearch Sink Connector: Indexes data from Kafka topics into Elasticsearch for search and analysis.
7.3 Configuration
To configure Kafka Connect, you typically use a properties file for each connector. The properties file contains essential information like the connector name, Kafka brokers, topic configurations, and connector-specific properties. Each connector may have its own set of required and optional properties.
Here’s a sample configuration for the FileStream Source Connector:
name=my-file-source-connector
connector.class=org.apache.kafka.connect.file.FileStreamSourceConnector
tasks.max=1
file=/path/to/inputfile.txt
topic=my_topic
7.4 Running Kafka Connect
To run Kafka Connect, you can use the connect-standalone.sh or connect-distributed.sh scripts that come with Kafka.
Standalone Mode
In standalone mode, Kafka Connect runs on a single machine, with all connectors and their tasks executing in a single worker process. Use the connect-standalone.sh script to run connectors in standalone mode:
bin/connect-standalone.sh config/connect-standalone.properties config/your-connector.properties
Distributed Mode
In distributed mode, Kafka Connect runs as a cluster of workers, providing better scalability and fault tolerance. Use the connect-distributed.sh script to run connectors in distributed mode:
bin/connect-distributed.sh config/connect-distributed.properties
7.5 Monitoring Kafka Connect
Kafka Connect exposes several metrics that can be monitored for understanding the performance and health of your connectors. You can use tools like JConsole, JVisualVM, or integrate Kafka Connect with monitoring systems like Prometheus and Grafana to monitor the cluster.
8. Kafka Streams
Kafka Streams is a client library in Apache Kafka that enables real-time stream processing of data. It allows you to build applications that consume data from Kafka topics, process the data, and produce the results back to Kafka or other external systems. Kafka Streams provides a simple and lightweight approach to stream processing, making it an attractive choice for building real-time data processing pipelines.
8.1 Key Concepts
Before diving into the details of Kafka Streams, let’s explore some key concepts:
- Stream: A continuous flow of data records in Kafka is represented as a stream. Each record in the stream consists of a key, a value, and a timestamp.
- Processor: A processor is a fundamental building block in Kafka Streams that processes incoming data records and produces new output records.
- Topology: A topology defines the stream processing flow by connecting processors together to form a processing pipeline.
- Windowing: Kafka Streams supports windowing operations, allowing you to group records within specified time intervals for processing.
- Stateful Processing: Kafka Streams supports stateful processing, where the processing logic considers historical data within a specified window.
8.2 Kafka Streams Application
To create a Kafka Streams application, you need to set up a Kafka Streams topology and define the processing steps. Here’s a high-level overview of the steps involved:
Create a Properties Object
Start by creating a Properties object to configure your Kafka Streams application. This includes properties like the Kafka broker address, application ID, and default serializers and deserializers.
Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
Define the Topology
Next, define the topology of your Kafka Streams application. This involves creating processing steps and connecting them together.
StreamsBuilder builder = new StreamsBuilder();

// Create a stream from a Kafka topic
KStream<String, String> inputStream = builder.stream("input_topic");

// Perform processing operations
KStream<String, String> processedStream = inputStream
        .filter((key, value) -> value.startsWith("important_"))
        .mapValues(value -> value.toUpperCase());

// Send the processed data to another Kafka topic
processedStream.to("output_topic");

// Build the topology
Topology topology = builder.build();
Create and Start the Kafka Streams Application
Once the topology is defined, create a KafkaStreams object with the defined properties and topology, and start the application:
KafkaStreams streams = new KafkaStreams(topology, props);
streams.start();
8.3 Stateful Processing with Kafka Streams
Kafka Streams provides state stores that allow you to maintain stateful processing across data records. You can define a state store and use it within your processing logic to maintain state information.
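As an illustration, the sketch below counts events per key with the Streams DSL; the count is backed by a local state store, which Kafka Streams keeps fault-tolerant through an internal changelog topic. It reuses the builder and serde setup from section 8.2; the topic names and the store name event-counts-store are made up for the example, and the usual Kafka Streams imports (KTable, Materialized, Produced) are assumed.

// Count events per key; the running counts are kept in a local state store
KStream<String, String> events = builder.stream("input_topic");

KTable<String, Long> counts = events
        .groupByKey()
        .count(Materialized.as("event-counts-store")); // named store, restorable from its changelog

// Forward the running counts to an output topic
counts.toStream().to("counts_topic", Produced.with(Serdes.String(), Serdes.Long()));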
8.4 Windowing Operations
Kafka Streams supports windowing operations, allowing you to group data records within specific time windows for aggregation or processing. Windowing is essential for time-based operations and calculations.
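For example, a sketch of a tumbling-window count, again building on the section 8.2 setup; it assumes a Kafka Streams version where TimeWindows.ofSizeWithNoGrace is available (3.0+), and the topic name is illustrative.

KStream<String, String> clicks = builder.stream("clicks_topic");

// Count records per key in 5-minute tumbling windows
KTable<Windowed<String>, Long> clicksPerWindow = clicks
        .groupByKey()
        .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(5)))
        .count();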
8.5 Interactive Queries
Kafka Streams also enables interactive queries, allowing you to query the state stores used in your stream processing application.
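For instance, a running application could look up the current value for a key from the store defined in the sketch above; this assumes the streams instance has started and the store name matches the one used in section 8.3.

// Query the local state store by name
ReadOnlyKeyValueStore<String, Long> store = streams.store(
        StoreQueryParameters.fromNameAndType("event-counts-store", QueryableStoreTypes.keyValueStore()));

Long count = store.get("some-key"); // null if the key has not been seen yet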
8.6 Error Handling and Fault Tolerance
Kafka Streams applications are designed to be fault-tolerant. They automatically handle and recover from failures, ensuring continuous data processing.
8.7 Integration with Kafka Connect and Kafka Producer/Consumer
Kafka Streams can easily integrate with Kafka Connect to move data between Kafka topics and external systems. Additionally, you can use Kafka producers and consumers within Kafka Streams applications to interact with external systems and services.
9. Kafka Security
Ensuring the security of your Apache Kafka cluster is critical to protecting sensitive data and preventing unauthorized access. Kafka provides various security features and configurations to safeguard your data streams. Let’s explore some essential aspects of Kafka security:
9.1 Authentication and Authorization
Kafka supports both authentication and authorization mechanisms to control access to the cluster.
Authentication
Kafka offers several authentication options, including:
- SSL Authentication: Secure Sockets Layer (SSL) enables encrypted communication between clients and brokers, ensuring secure authentication.
- SASL Authentication: Simple Authentication and Security Layer (SASL) provides pluggable authentication mechanisms, such as PLAIN, SCRAM, and GSSAPI (Kerberos).
Authorization
Kafka allows fine-grained control over access to topics and operations using Access Control Lists (ACLs). With ACLs, you can define which users or groups are allowed to read, write, or perform other actions on specific topics.
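ACLs are typically managed with the kafka-acls.sh tool, but they can also be created programmatically through the Admin API. A sketch of the latter is shown below; it assumes an authorizer is enabled on the brokers, and the principal name and class name are illustrative.

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.common.acl.AccessControlEntry;
import org.apache.kafka.common.acl.AclBinding;
import org.apache.kafka.common.acl.AclOperation;
import org.apache.kafka.common.acl.AclPermissionType;
import org.apache.kafka.common.resource.PatternType;
import org.apache.kafka.common.resource.ResourcePattern;
import org.apache.kafka.common.resource.ResourceType;
import java.util.Collections;
import java.util.Properties;

public class AllowRead {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");

        try (Admin admin = Admin.create(props)) {
            // Allow the (illustrative) user "alice" to read my_topic from any host
            AclBinding binding = new AclBinding(
                    new ResourcePattern(ResourceType.TOPIC, "my_topic", PatternType.LITERAL),
                    new AccessControlEntry("User:alice", "*", AclOperation.READ, AclPermissionType.ALLOW));
            admin.createAcls(Collections.singleton(binding)).all().get();
        }
    }
}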
9.2 Encryption
Kafka provides data encryption to protect data while it’s in transit between clients and brokers.
SSL Encryption
SSL encryption, when combined with authentication, ensures secure communication between clients and brokers by encrypting the data transmitted over the network.
Encryption at Rest
To protect data at rest, you can enable disk-level encryption on the Kafka brokers.
Secure ZooKeeper
As Kafka relies on ZooKeeper for cluster coordination, securing ZooKeeper is also crucial.
Chroot
Kafka allows you to isolate the ZooKeeper instance used by Kafka by using a chroot path. This helps prevent other applications from accessing Kafka’s ZooKeeper instance.
Secure ACLs
Ensure that the ZooKeeper instance used by Kafka has secure ACLs set up to restrict access to authorized users and processes.
9.3 Secure Replication
If you have multiple Kafka brokers, securing replication between them is essential.
Inter-Broker Encryption
Enable SSL encryption for inter-broker communication to ensure secure data replication.
Controlled Shutdown
Configure controlled shutdown to ensure brokers shut down gracefully without causing data loss or inconsistency during replication.
9.4 Security Configuration
To enable security features in Kafka, you need to modify the Kafka broker configuration and adjust the client configurations accordingly.
Broker Configuration
In the server.properties file, you can configure the following security-related properties:
listeners=PLAINTEXT://:9092,SSL://:9093
security.inter.broker.protocol=SSL
ssl.keystore.location=/path/to/keystore.jks
ssl.keystore.password=keystore_password
ssl.key.password=key_password
Client Configuration
In the client applications, you need to set the security properties to match the broker configuration:
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9093");
props.put("security.protocol", "SSL");
props.put("ssl.keystore.location", "/path/to/client_keystore.jks");
props.put("ssl.keystore.password", "client_keystore_password");
props.put("ssl.key.password", "client_key_password");
10. Replication Factor
Replication factor is a crucial concept in Apache Kafka that ensures data availability and fault tolerance within a Kafka cluster. It defines the number of copies, or replicas, of each Kafka topic partition that should be maintained across the brokers in the cluster. By having multiple replicas of each partition, Kafka ensures that even if some brokers or machines fail, the data remains accessible and the cluster remains operational.
10.1 How Replication Factor Works
When a new topic is created or when an existing topic is configured to have a specific replication factor, Kafka automatically replicates each partition across multiple brokers. The partition leader is the primary replica responsible for handling read and write requests for that partition, while the other replicas are called follower replicas.
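The replication factor is fixed per topic at creation time, either on the command line (as in section 4.1) or programmatically. Below is a minimal sketch using the Admin API; the topic name and settings are illustrative, and a cluster with at least two brokers is assumed.

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.NewTopic;
import java.util.Collections;
import java.util.Properties;

public class CreateReplicatedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");

        try (Admin admin = Admin.create(props)) {
            // 3 partitions, each replicated to 2 brokers
            NewTopic topic = new NewTopic("replicated_topic", 3, (short) 2);
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}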
10.2 Modifying Replication Factor
Changing the replication factor of an existing topic involves reassigning partitions and adding or removing replicas. This process should be performed carefully, as it may impact the performance of the cluster during rebalancing.
To increase the replication factor, you need to add new brokers and then reassign the partitions with the new replication factor using the kafka-reassign-partitions.sh tool.
To decrease the replication factor, you need to reassign the partitions and remove replicas before removing the brokers from the cluster.
11. Partitions
Partitions are a fundamental concept in Apache Kafka that allows data to be distributed and parallelized across multiple brokers in a Kafka cluster. A topic in Kafka is divided into one or more partitions, and each partition is a linearly ordered sequence of messages. Understanding partitions is crucial for optimizing data distribution, load balancing, and managing data retention within Kafka.
11.1 How Partitions Work
When a topic is created, it is divided into a configurable number of partitions. Each partition is hosted on a specific broker in the Kafka cluster. The number of partitions in a topic can be set when creating the topic, and the partitions remain fixed after creation. Messages produced to a topic are written to one of its partitions based on the message’s key or using a round-robin mechanism if no key is provided.
11.2 Benefits of Partitions
Partitioning provides several advantages:
| Benefit | Description |
| --- | --- |
| Scalability | Partitions enable horizontal scaling of Kafka, as data can be distributed across multiple brokers. This allows Kafka to handle large volumes of data and high-throughput workloads. |
| Parallelism | With multiple partitions, Kafka can process and store messages in parallel. Each partition acts as an independent unit, allowing multiple consumers to process data simultaneously, which improves overall system performance. |
| Load Balancing | Kafka can distribute partitions across brokers, which balances the data load and prevents any single broker from becoming a bottleneck. |
11.3 Partition Key
When producing messages to a Kafka topic, you can specify a key for each message. The key is optional, and if not provided, messages are distributed to partitions using a round-robin approach. When a key is provided, Kafka uses the key to determine the partition to which the message will be written.
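Building on the producer sketch from section 5.1, the snippet below shows the effect of a key: the default partitioner hashes the key, so records with the same key always land in the same partition. The keys and values here are illustrative, and producer refers to the instance created in that sketch.

// With a key: both records hash to the same partition, preserving their relative order
producer.send(new ProducerRecord<>("my_topic", "customer-42", "order created"));
producer.send(new ProducerRecord<>("my_topic", "customer-42", "order shipped"));

// Without a key: records are spread across partitions (round-robin or sticky batching, depending on version)
producer.send(new ProducerRecord<>("my_topic", "no key for this one"));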
11.4 Choosing the Number of Partitions
The number of partitions for a topic is an important consideration and should be chosen carefully based on your use case and requirements.
| Consideration | Description |
| --- | --- |
| Concurrency and Throughput | A higher number of partitions allows for more parallelism and concurrency during message production and consumption. It is particularly useful when you have multiple producers or consumers and need to achieve high throughput. |
| Balanced Workload | The number of partitions should be greater than or equal to the number of consumers in a consumer group. This ensures a balanced workload distribution among consumers, avoiding idle consumers and improving overall consumption efficiency. |
| Resource Considerations | Keep in mind that increasing the number of partitions increases the number of files and resources needed to manage them. Thus, it can impact disk space and memory usage on the brokers. |
11.5 Modifying Partitions
Once a topic is created, its partition count can be increased but never decreased in place. Changing it still requires careful planning, because existing messages are not redistributed and key-based ordering can be affected:
Increasing Partitions
To increase the number of partitions, use the kafka-topics.sh tool with the --alter option (for example, bin/kafka-topics.sh --alter --topic my_topic --partitions 6 --bootstrap-server localhost:9092). Existing messages stay in their original partitions, so records with the same key may land in a different partition than before the change.
Decreasing Partitions
Decreasing the number of partitions is not supported directly; it requires creating a new topic with fewer partitions and re-publishing the data from the old topic to the new one.
12. Batch Size
Batch size in Apache Kafka refers to the number of messages that are accumulated and sent together as a batch from producers to brokers. By sending messages in batches instead of individually, Kafka can achieve better performance and reduce network overhead. Configuring an appropriate batch size is essential for optimizing Kafka producer performance and message throughput.
12.1 How Batch Size Works
When a Kafka producer sends messages to a broker, it can choose to batch multiple messages together before sending them over the network. The producer collects messages until the batch size reaches a configured limit or until a certain time period elapses. Once the batch size or time limit is reached, the producer sends the entire batch to the broker in a single request.
12.2 Configuring Batch Size
In Kafka, you can configure the batch size for a producer using the batch.size property. This property specifies the maximum number of bytes that a batch for a single partition can contain. The default value is 16384 bytes (16 KB).
You can adjust the batch size based on your use case, network conditions, and message size. Setting a larger batch size can improve throughput, but it might also increase the latency for individual messages within the batch. Conversely, a smaller batch size may reduce latency but could result in a higher number of requests and increased network overhead.
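As an illustration, a producer could be tuned as follows: batch.size caps the batch in bytes, while linger.ms controls the "certain time period" mentioned above. The values are examples only and should be validated against your own message sizes and latency requirements.

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("batch.size", "32768"); // allow up to 32 KB per partition batch (default is 16 KB)
props.put("linger.ms", "10");     // wait up to 10 ms for a batch to fill before sending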
12.3 Monitoring Batch Size
Monitoring the batch size is crucial for optimizing producer performance. You can use Kafka’s built-in metrics and monitoring tools to track batch size-related metrics, such as average batch size, maximum batch size, and batch send time.
13. Compression
Compression in Apache Kafka is a feature that allows data to be compressed before it is stored on brokers or transmitted between producers and consumers. Kafka supports various compression algorithms to reduce data size, improve network utilization, and enhance overall system performance. Understanding compression options in Kafka is essential for optimizing storage and data transfer efficiency.
13.1 How Compression Works
When a producer sends messages to Kafka, it can choose to compress the messages before transmitting them to the brokers. Similarly, when messages are stored on the brokers, Kafka can apply compression to reduce the storage footprint. On the consumer side, messages can be decompressed before being delivered to consumers.
13.2 Compression Algorithms in Kafka
Kafka supports the following compression algorithms:
| Compression Algorithm | Description |
| --- | --- |
| Gzip | Gzip is a widely used compression algorithm that provides good compression ratios. It is suitable for text-based data, such as logs or JSON messages. |
| Snappy | Snappy is a fast and efficient compression algorithm that offers lower compression ratios compared to Gzip but with reduced processing overhead. It is ideal for scenarios where low latency is critical, such as real-time stream processing. |
| LZ4 | LZ4 is another fast compression algorithm that provides even lower compression ratios than Snappy but with even lower processing overhead. Like Snappy, it is well-suited for low-latency use cases. |
| Zstandard (Zstd) | Zstd is a more recent addition to Kafka’s compression options. It provides a good balance between compression ratios and processing speed, making it a versatile choice for various use cases. |
13.3 Configuring Compression in Kafka
To enable compression in Kafka, you need to configure the producer and broker properties.
Producer Configuration
In the producer configuration, you can set the compression.type property to specify the compression algorithm to use. For example:
compression.type=gzip
Broker Configuration
In the broker configuration, you can set the compression.type property to control the compression type used for data stored on the broker (the default value, producer, keeps whatever codec the producer used). For example:
compression.type=gzip
13.4 Compression in Kafka Streams
When using Apache Kafka Streams, you can also configure compression for the state stores used in your stream processing application. This can help reduce storage requirements for stateful data in the Kafka Streams application.
13.5 Considerations for Compression
While compression offers several benefits, it is essential to consider the following factors when deciding whether to use compression:
| Consideration | Description |
| --- | --- |
| Compression Overhead | Applying compression and decompression adds some processing overhead, so it’s essential to evaluate the impact on producer and consumer performance. |
| Message Size | Compression is more effective when dealing with larger message sizes. For very small messages, the overhead of compression might outweigh the benefits. |
| Latency | Some compression algorithms, like Gzip, might introduce additional latency due to the compression process. Consider the latency requirements of your use case. |
| Monitoring Compression Efficiency | Monitoring compression efficiency is crucial to understand how well compression is working for your Kafka cluster. You can use Kafka’s built-in metrics to monitor the compression rate and the size of compressed and uncompressed messages. |
14. Retention Policy
Retention policy in Apache Kafka defines how long data is retained on brokers within a Kafka cluster. Kafka allows you to set different retention policies at both the topic level and the broker level. The retention policy determines when Kafka will automatically delete old data from topics, helping to manage storage usage and prevent unbounded data growth.
14.1 How Retention Policy Works
When a message is produced to a Kafka topic, it is written to a partition on the broker. The retention policy defines how long messages within a partition are kept before they are eligible for deletion. Kafka uses a combination of time-based and size-based retention to determine which messages to retain and which to delete.
14.2 Configuring Retention Policy
The retention policy can be set at both the topic level and the broker level.
Topic-level Retention Policy
When creating a Kafka topic, you can specify the retention policy using the retention.ms property. This property sets the maximum time, in milliseconds, that a message can be retained in the topic.
For example, to set a retention policy of 7 days for a topic:
bin/kafka-topics.sh --create --topic my_topic --bootstrap-server localhost:9092 --partitions 3 --replication-factor 2 --config retention.ms=604800000
Broker-level Retention Policy
You can also set a default retention policy at the broker level in the server.properties file. The log.retention.hours property specifies the default retention time for topics that don’t have a specific retention policy set.
For example, to set a default retention policy of 7 days at the broker level:
log.retention.hours=168
14.3 Size-based Retention
In addition to time-based retention, Kafka also supports size-based retention. With size-based retention, you can set a maximum size for the partition log. Once the log size exceeds the specified value, the oldest messages in the log are deleted to make space for new messages.
To enable size-based retention, you can use the log.retention.bytes property. For example:
log.retention.bytes=1073741824
14.4 Log Compaction
In addition to time and size-based retention, Kafka also provides a log compaction feature. Log compaction retains only the latest message for each unique key in a topic, ensuring that the most recent value for each key is always available. This feature is useful for maintaining the latest state of an entity or for storing changelog-like data.
To enable log compaction for a topic, you can use the cleanup.policy property. For example:
cleanup.policy=compact
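Topic-level settings such as retention.ms and cleanup.policy can also be changed on an existing topic. Below is a sketch using the Admin API's incremental config update (available in recent Kafka versions); the topic name is the one used throughout this cheatsheet and the class name is illustrative.

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;
import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class EnableCompaction {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");

        try (Admin admin = Admin.create(props)) {
            // Switch my_topic to log compaction instead of time/size-based deletion
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my_topic");
            AlterConfigOp op = new AlterConfigOp(
                    new ConfigEntry("cleanup.policy", "compact"), AlterConfigOp.OpType.SET);
            Map<ConfigResource, Collection<AlterConfigOp>> updates =
                    Collections.singletonMap(topic, Collections.singletonList(op));
            admin.incrementalAlterConfigs(updates).all().get();
        }
    }
}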
14.5 Considerations for Retention Policy
When configuring the retention policy, consider the following factors:
| Consideration | Description |
| --- | --- |
| Data Requirements | Choose a retention period that aligns with your data retention requirements. Consider the business needs and any regulatory or compliance requirements for data retention. |
| Storage Capacity | Ensure that your Kafka cluster has sufficient storage capacity to retain data for the desired retention period, especially if you are using size-based retention or log compaction. |
| Message Consumption Rate | Consider the rate at which messages are produced and consumed. If the consumption rate is slower than the production rate, you might need a longer retention period to allow consumers to catch up. |
| Message Importance | For some topics, older messages might become less important over time. In such cases, you can use a shorter retention period to reduce storage usage. |
15. Kafka Monitoring and Management
Monitoring Kafka is essential to ensure its smooth operation. Here are some tools and techniques for effective Kafka monitoring:
| Monitoring Tool | Description |
| --- | --- |
| JMX Metrics | Kafka exposes various metrics through Java Management Extensions (JMX). Tools like JConsole and JVisualVM can help monitor Kafka’s internal metrics. |
| Kafka Manager | Kafka Manager is a web-based tool that provides a graphical user interface for managing and monitoring Kafka clusters. It offers features like topic management, consumer group monitoring, and partition reassignment. |
| Prometheus & Grafana | Integrate Kafka with Prometheus, a monitoring and alerting toolkit, and Grafana, a data visualization tool, to build custom dashboards for in-depth monitoring and analysis. |
| Logging | Configure Kafka’s logging to capture relevant information for troubleshooting and performance analysis. Proper logging enables easier identification of issues. |
16. Handling Data Serialization
Kafka allows you to use different data serializers for your messages. Here’s how you can handle data serialization in Apache Kafka:
| Data Serialization | Description |
| --- | --- |
| Avro | Apache Avro is a popular data serialization system. You can use Avro with Kafka to enforce schema evolution and provide a compact, efficient binary format for messages. |
| JSON | Kafka supports JSON as a data format for messages. JSON is human-readable and easy to work with, making it suitable for many use cases. |
| String | Kafka allows data to be serialized as plain strings. In this method, the data is sent as strings without any specific data structure or schema. |
| Bytes | The Bytes serialization is a generic way to handle arbitrary binary data. With this method, users can manually serialize their data into bytes and send it to Kafka as raw binary data. |
| Protobuf | Google Protocol Buffers (Protobuf) offer an efficient binary format for data serialization. Using Protobuf can reduce message size and improve performance. |
17. Kafka Ecosystem: Additional Components
Kafka’s ecosystem offers various additional components that extend its capabilities. Here are some essential ones:
| Tool/Component | Description |
| --- | --- |
| Kafka MirrorMaker | Kafka MirrorMaker is a tool for replicating data between Kafka clusters, enabling data synchronization across different environments. |
| Kafka Connect Converters | Kafka Connect Converters handle data format conversion between Kafka and other systems when using Kafka Connect. |
| Kafka REST Proxy | Kafka REST Proxy allows clients to interact with Kafka using HTTP/REST calls, making it easier to integrate with non-Java applications. |
| Schema Registry | Schema Registry manages Avro schemas for Kafka messages, ensuring compatibility and versioning. |
18. Conclusion
This was the Apache Kafka Essentials Cheatsheet, providing you with a quick reference to the fundamental concepts and commands for using Apache Kafka. As you delve deeper into the world of Kafka, remember to explore the official documentation and community resources to gain a more comprehensive understanding of this powerful event streaming platform.