
Introduction to Redpanda

Redpanda is a source-available streaming platform built to be fast, scalable, and reliable for modern data-intensive applications. It offers an Apache Kafka-compatible API, which makes it easier to migrate existing applications. This article explores Redpanda’s fundamentals and demonstrates how to use it from Java.

1. What is Redpanda?

Redpanda is a high-performance streaming data platform designed for storing and processing real-time data streams. It helps build event-driven architectures, where applications communicate by exchanging messages (events). Redpanda decouples producers (data publishers) from consumers (data subscribers), enabling asynchronous communication and scalability.

1.1 Key Components of Redpanda

  • Storage Engine: Redpanda utilizes a high-performance storage engine optimized for modern hardware, enabling efficient storage and retrieval of massive volumes of data.
  • Stream Processing: Because Redpanda speaks the Kafka protocol, data can be processed and analyzed in real time with Kafka-ecosystem frameworks such as Kafka Streams or KSQL (now ksqlDB), which run alongside the cluster rather than inside it.
  • Distributed Consensus: Redpanda employs a distributed consensus protocol called Raft, ensuring strong consistency and fault tolerance across nodes in the cluster.
  • Compatibility with Kafka: Redpanda implements the Apache Kafka protocol, so existing Kafka clients and tools can usually connect to it without code changes, which simplifies migration and integration.
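
The fault tolerance that Raft provides comes down to majority arithmetic: an entry counts as committed once a majority of replicas have stored it, so a group keeps working as long as a majority of its replicas survive. The sketch below works through that arithmetic. The class and method names are made up for illustration and are not part of Redpanda or the Kafka client library.

```java
// Illustrative only: majority-quorum arithmetic as used by Raft-style consensus.
// RaftQuorumSketch is a hypothetical name, not a Redpanda or Kafka API.
final class RaftQuorumSketch {

    // Smallest number of replicas that forms a majority of the group.
    static int majority(int replicas) {
        if (replicas < 1) {
            throw new IllegalArgumentException("replicas must be >= 1");
        }
        return replicas / 2 + 1;
    }

    // Number of replicas that can fail while a majority is still available.
    static int tolerableFailures(int replicas) {
        return replicas - majority(replicas);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 3, 4, 5}) {
            System.out.println(n + " replicas: majority=" + majority(n)
                    + ", tolerable failures=" + tolerableFailures(n));
        }
    }
}
```

Three replicas tolerate one failure and five tolerate two. A fourth replica does not raise the number beyond one, which is why odd group sizes are the usual choice.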

1.2 Features of Redpanda

  • Simplicity: Redpanda is lightweight and requires no external dependencies. It operates as a single binary, simplifying deployment and management.
  • Operational Simplicity: Redpanda offers a user-friendly web console and a command-line interface, rpk (Redpanda Keeper), for cluster management and monitoring. Together with its automation tools, this reduces operational overhead for DevOps teams.
  • High Performance: Redpanda is engineered for high throughput and low latency, making it suitable for demanding real-time applications such as real-time analytics, fraud detection, and financial trading.
  • Scalability: It scales horizontally with ease, allowing users to seamlessly expand their cluster to accommodate growing workloads and data volumes.
  • Reliability: With its distributed architecture and built-in fault tolerance mechanisms, Redpanda ensures data durability and availability even in the face of node failures.
  • Kafka API Compatibility: Redpanda seamlessly integrates with existing Kafka tools and applications due to its compatibility with the Kafka API, making it easier for Kafka users to transition.

2. Installation and Setup

Before setting up Redpanda, it helps to understand the concept of a broker. In Redpanda, a broker is the component responsible for storing data streams and serving them to clients. Each Redpanda node functions as a broker within the cluster.

2.1 The Broker

A broker in Redpanda facilitates communication between producers and consumers of data. It acts as a mediator, receiving data from producers, storing it durably on disk, and making it available to consumers. Brokers also handle replication, partitioning, and message routing to keep the system fault tolerant, highly available, and scalable.

In essence, each broker acts as a hub for the data it hosts. By distributing data across multiple brokers in a cluster, Redpanda stays resilient to failures and uses resources efficiently.
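
How a record ends up on a particular partition can be sketched in a few lines. This is a simplified illustration, not the real algorithm: the Kafka Java client’s default partitioner hashes the serialized key bytes with murmur2, whereas the sketch below substitutes String.hashCode() to stay dependency-free. The class and method names are hypothetical.

```java
// Illustrative only: maps a record key to a partition index.
// Stand-in hashing (String.hashCode) is an assumption made for brevity;
// the real Kafka client uses a murmur2 hash of the serialized key.
final class PartitionSketch {

    static int partitionFor(String key, int numPartitions) {
        if (numPartitions < 1) {
            throw new IllegalArgumentException("numPartitions must be >= 1");
        }
        // floorMod keeps the result in [0, numPartitions) even for negative hashes.
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 3;
        for (String key : new String[] {"order-1", "order-2", "order-3", "order-1"}) {
            System.out.println(key + " -> partition " + partitionFor(key, numPartitions));
        }
    }
}
```

The property that matters is that the same key always maps to the same partition, which preserves per-key ordering while different keys spread across the partitions that the brokers host.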

2.2 Installation

Before proceeding, ensure that Docker is installed and operational on your machine.

To start a Redpanda cluster with just one broker, the simplest method is to use rpk, the command-line utility for configuring and administering Redpanda clusters: rpk container start -n 1.

Alternatively, download the docker-compose.yml file from the Redpanda documentation website and run docker-compose up -d in the directory where you saved it.
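
Once the cluster is started, a quick way to confirm that something is listening on the broker’s Kafka API port is a plain TCP connection attempt. The sketch below assumes the address localhost:9092 used by the examples later in this article; substitute whatever address your setup reports. It only shows that the port accepts connections, not that the broker is healthy, and the class name is hypothetical.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Illustrative only: checks whether a TCP port accepts connections.
// localhost:9092 is an assumed default; pass host and port as arguments to override.
final class BrokerPortCheck {

    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host could not be resolved.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9092;
        boolean up = isReachable(host, port, 2000);
        System.out.println(host + ":" + port
                + (up ? " is accepting connections" : " is not reachable"));
    }
}
```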

3. Java Programs for Publishing and Consuming Messages

Below are simple Java programs demonstrating how to publish messages to Redpanda topics and then read messages from them.

3.1 Java Client Setup

We will use the Kafka Java client library to interact with Redpanda from a Java application. You can add it to your project using Maven or Gradle. Below is the Maven dependency:

<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>3.7.0</version>
</dependency>

3.2 Creating Topics with AdminClient

In addition to publishing and consuming messages, managing topics is a crucial aspect of working with Redpanda. The Kafka client library provides an AdminClient class that allows users to create, delete, and manage topics programmatically.

Below is an example Java program demonstrating how to create a topic using the AdminClient class from the Kafka client library:

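import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.ExecutionException;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class TopicCreator {

    public static void main(String[] args) {
        // Redpanda broker address (Kafka API listener)
        String bootstrapServers = "localhost:9092";
        // Topic name
        String topicName = "new-topic";
        // Number of partitions for the topic
        int numPartitions = 3;
        // Replication factor for the topic (1 is enough for a single-broker cluster)
        short replicationFactor = 1;

        // Configure AdminClient properties
        Properties properties = new Properties();
        properties.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);

        // Create AdminClient; try-with-resources closes it when done
        try (AdminClient adminClient = AdminClient.create(properties)) {
            // Create a new topic
            NewTopic newTopic = new NewTopic(topicName, numPartitions, replicationFactor);
            // all().get() blocks until the broker confirms or rejects the request
            adminClient.createTopics(Collections.singletonList(newTopic)).all().get();
            System.out.println("Topic created successfully: " + topicName);
        } catch (ExecutionException e) {
            // The request failed on the broker side, e.g. the topic already exists
            System.err.println("Error creating topic: " + e.getCause());
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can see it
            Thread.currentThread().interrupt();
            System.err.println("Interrupted while creating topic");
        }
    }
}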

In this program:

  • localhost:9092 is the address of the Redpanda broker.
  • The AdminClient is created with the provided configuration properties.
  • A NewTopic object is instantiated with the specified topic name, number of partitions, and replication factor.
  • The createTopics method of AdminClient is invoked to create the topic.

3.3 Publishing Messages to Redpanda Topic (Producer Example)

The following code demonstrates how to create a producer and send messages to a topic in Redpanda:

import java.util.Properties;
import java.util.concurrent.ExecutionException;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class RedpandaProducer {

    public static void main(String[] args) throws ExecutionException, InterruptedException {
        // Redpanda broker address
        String bootstrapServers = "localhost:9092";
        // Topic name
        String topic = "new-topic";

        // Configure producer properties
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // Create KafkaProducer; try-with-resources closes it even if a send fails
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Publish messages to the topic
            for (int i = 0; i < 10; i++) {
                ProducerRecord<String, String> record = new ProducerRecord<>(topic, "Message " + i);
                // send() is asynchronous; get() blocks until the broker acknowledges the record
                producer.send(record).get();
            }
        }
    }
}

This code defines a Java class named RedpandaProducer, which serves as a Kafka producer for publishing messages to a Redpanda topic. Let’s break down and explain the code block:

  • Configuration: Properties for configuring the producer are set up using a Properties object. Key configurations include:
    • BOOTSTRAP_SERVERS_CONFIG: Specifies the list of host/port pairs to use for establishing the initial connection to Redpanda. In this case, it’s set to localhost:9092.
    • KEY_SERIALIZER_CLASS_CONFIG and VALUE_SERIALIZER_CLASS_CONFIG: Specify the serializer classes for keys and values. Here, both are set to StringSerializer because we are dealing with strings.
  • Producer Initialization: An instance of KafkaProducer is created with the configured properties.
  • Message Publication: Inside a loop, the producer publishes ten messages to the specified topic. For each iteration, a new ProducerRecord is created with the topic name and a message. The producer.send(record) method sends the record asynchronously and returns a Future; calling get() on that Future blocks until the broker acknowledges the record, so this example effectively sends one message at a time.
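
The difference between waiting on every send and letting sends overlap can be sketched with plain java.util.concurrent, with no broker involved. In the sketch below, fakeSend is a hypothetical stand-in for producer.send(record): it returns a Future that completes with a pretend offset after a short delay. It is not part of the Kafka client API, and the thread pool merely simulates the broker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: contrasts blocking on each send with collecting the
// acknowledgements afterwards. All names here are hypothetical.
final class SendModesSketch {

    // Stand-in for producer.send(record): completes with a pretend offset
    // once the simulated broker has "acknowledged" the record.
    static Future<Long> fakeSend(ExecutorService broker, long offset) {
        Callable<Long> ack = () -> {
            Thread.sleep(10); // simulated network round trip
            return offset;
        };
        return broker.submit(ack);
    }

    // Wait for each acknowledgement before sending the next record.
    static List<Long> sendBlocking(ExecutorService broker, int count)
            throws ExecutionException, InterruptedException {
        List<Long> offsets = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            offsets.add(fakeSend(broker, i).get());
        }
        return offsets;
    }

    // Hand over all records first, then collect the acknowledgements in order.
    static List<Long> sendPipelined(ExecutorService broker, int count)
            throws ExecutionException, InterruptedException {
        List<Future<Long>> pending = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            pending.add(fakeSend(broker, i));
        }
        List<Long> offsets = new ArrayList<>();
        for (Future<Long> future : pending) {
            offsets.add(future.get());
        }
        return offsets;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService broker = Executors.newFixedThreadPool(4);
        try {
            System.out.println("blocking:  " + sendBlocking(broker, 5));
            System.out.println("pipelined: " + sendPipelined(broker, 5));
        } finally {
            broker.shutdown();
        }
    }
}
```

Both variants end up with the same acknowledgements. The blocking variant pays one round trip per record, while the pipelined variant lets round trips overlap, which is the usual reason to avoid calling get() after every send when throughput matters.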

3.4 Reading Messages from Redpanda Topic (Consumer Example)

Here is how you can consume messages from a Redpanda topic using the Kafka Java client:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class RedpandaConsumer {

    public static void main(String[] args) {
        String bootstrapServers = "localhost:9092"; // Redpanda broker address
        String topic = "new-topic"; // Topic name

        // Configure the consumer
        Properties props = new Properties();
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "consumer-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        // try-with-resources closes the consumer if the loop exits with an exception
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Subscribe to the topic
            consumer.subscribe(Collections.singletonList(topic));

            // Poll for new messages until the process is stopped
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.println("Received message: " + record.value());
                }
            }
        }
    }
}

This code defines a Java class named RedpandaConsumer, which serves as a Kafka consumer for reading messages from our Redpanda topic. Here is a breakdown of the code:

  • Configuration: Properties for configuring the consumer are set up using a Properties object. Key configurations include:
    • AUTO_OFFSET_RESET_CONFIG: Controls where the consumer starts when its group has no committed offset. Here, it’s set to “earliest”, meaning the consumer will start reading from the earliest available offset.
    • BOOTSTRAP_SERVERS_CONFIG: Specifies the list of host/port pairs to use for establishing the initial connection to Redpanda. In this case, it’s set to localhost:9092.
    • GROUP_ID_CONFIG: Specifies the consumer group id. Consumers that share a group id belong to the same group and divide the topic’s partitions among themselves. Here, it’s set to “consumer-group”.
    • KEY_DESERIALIZER_CLASS_CONFIG and VALUE_DESERIALIZER_CLASS_CONFIG: Specify the deserializer classes for keys and values. Here, both are set to StringDeserializer because we are dealing with strings.
  • Consumer Initialization: An instance of KafkaConsumer is created with the configured properties.
  • Topic Subscription: The consumer subscribes to the specified topic using consumer.subscribe(Collections.singletonList(topic)).
  • Message Consumption: Inside a continuous while loop, the consumer polls for new messages from the subscribed topic. It uses consumer.poll(Duration.ofMillis(100)) to fetch records, where Duration.ofMillis(100) specifies the maximum time to wait for records if none are available immediately. Once records are retrieved, the consumer iterates over them, printing the value of each message to the console.
Fig 1: Output from running the Redpanda Consumer Java example
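
Consumers that share a group id divide the topic’s partitions among themselves, so each partition is read by exactly one consumer in the group. The sketch below illustrates the idea with a simple round-robin split. It is a simplification under stated assumptions: in a real cluster the assignment is negotiated through the group coordinator using a configurable assignor, and the class, method, and consumer names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: splits partitions across the members of one consumer group.
// Not the Kafka client's assignor; a plain round-robin stand-in.
final class GroupAssignmentSketch {

    static Map<String, List<Integer>> assignRoundRobin(List<String> consumers, int numPartitions) {
        if (consumers.isEmpty()) {
            throw new IllegalArgumentException("at least one consumer is required");
        }
        // LinkedHashMap keeps consumers in the order they were given.
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        // Hand out partitions one at a time, cycling through the consumers.
        for (int partition = 0; partition < numPartitions; partition++) {
            String owner = consumers.get(partition % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Three partitions, as in the topic created earlier, and two group members.
        System.out.println(assignRoundRobin(Arrays.asList("consumer-a", "consumer-b"), 3));
        // prints {consumer-a=[0, 2], consumer-b=[1]}
    }
}
```

With more consumers than partitions, the extra consumers receive an empty list and sit idle, which is why the partition count caps the useful parallelism of a single group.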

4. Some Common Use Cases of Redpanda

  • Real-time Analytics: Redpanda can power real-time analytics platforms, enabling organizations to gain actionable insights from streaming data sources such as IoT devices, sensors, and application logs.
  • Log Aggregation: Redpanda can serve as a centralized log aggregation platform, collecting and storing logs from distributed applications and systems for monitoring, troubleshooting, and analysis.
  • Microservices Communication: Redpanda facilitates event-driven communication between microservices in modern cloud-native architectures, enabling seamless integration and decoupling of services.

5. Conclusion

In this article, we introduced Redpanda, a streaming platform designed for modern data architectures. We covered its key features and installation, and demonstrated how to interact with it using Java code examples. With its high performance, scalability, and compatibility with existing Kafka tools, Redpanda is a strong option for building real-time streaming applications.

6. Download the Source Code

This was an article on an Introduction to Redpanda.

Download
You can download the full source code of this example here: Introduction to Redpanda

Omozegie Aziegbe

Omos holds a Master’s degree in Information Engineering with Network Management from the Robert Gordon University, Aberdeen. Omos is a freelance web/application developer currently focused on developing Java enterprise applications with the Jakarta EE framework.