Apache Pulsar: Distributed Pub-Sub Messaging System
Apache Pulsar is an open-source distributed pub-sub messaging system originally created at Yahoo and now part of the Apache Software Foundation.
Pulsar is a multi-tenant, high-performance solution for server-to-server messaging.
Pulsar’s key features include [4]:
- Native support for multiple clusters in a Pulsar instance, with seamless geo-replication of messages across clusters
- Very low publish and end-to-end latency
- Seamless scalability out to over a million topics
- A simple client API with bindings for Java, Python, and C++
- Multiple subscription modes for topics (exclusive, shared, and failover)
- Guaranteed message delivery with persistent message storage provided by Apache BookKeeper
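To make the three subscription modes from the list above more concrete, here is a minimal sketch in plain Python. It is a toy model, not the Pulsar client API: the `Subscription` class, its methods, and the rule that the first attached consumer is the active one are all assumptions made for illustration.

```python
class Subscription:
    """Toy model of a topic subscription; not the Pulsar client API."""

    def __init__(self, mode):
        if mode not in ("exclusive", "shared", "failover"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.consumers = []  # consumer names, in attach order
        self._next = 0       # round-robin cursor used by shared mode

    def attach(self, name):
        # Exclusive: a second consumer is rejected.
        if self.mode == "exclusive" and self.consumers:
            raise RuntimeError("subscription already has a consumer")
        self.consumers.append(name)

    def detach(self, name):
        self.consumers.remove(name)

    def dispatch(self, message):
        """Return the name of the consumer that receives `message`."""
        if not self.consumers:
            raise RuntimeError("no consumers attached")
        if self.mode == "shared":
            # Shared: messages are spread round-robin across consumers.
            target = self.consumers[self._next % len(self.consumers)]
            self._next += 1
            return target
        # Exclusive and failover: the first attached consumer gets everything.
        # In failover mode the next consumer takes over once it detaches.
        return self.consumers[0]


shared = Subscription("shared")
shared.attach("a")
shared.attach("b")
shared_targets = [shared.dispatch(m) for m in range(4)]

failover = Subscription("failover")
failover.attach("a")
failover.attach("b")
before_failure = failover.dispatch("m1")
failover.detach("a")  # simulate the active consumer going away
after_failure = failover.dispatch("m2")
```

In this model a shared subscription behaves like a work queue, while exclusive and failover subscriptions keep a single reader at a time, which preserves ordering for streaming use cases.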
Architecture overview
At the highest level, a Pulsar instance is composed of one or more Pulsar clusters. Clusters within an instance can replicate data amongst themselves [4].
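The instance and cluster relationship can be sketched as a small data model. This is a simplified illustration under stated assumptions: the `Instance` and `Cluster` classes, the `replicate_to` parameter, and the cluster names are hypothetical, and replication is shown as an immediate copy, which a real deployment would not necessarily do.

```python
class Cluster:
    """One cluster with its own copy of each topic's messages."""

    def __init__(self, name):
        self.name = name
        self.topics = {}  # topic name -> list of messages

    def store(self, topic, message):
        self.topics.setdefault(topic, []).append(message)


class Instance:
    """A Pulsar instance modeled as a named set of clusters."""

    def __init__(self, cluster_names):
        self.clusters = {name: Cluster(name) for name in cluster_names}

    def publish(self, origin, topic, message, replicate_to=()):
        # The message is stored in the cluster where it was published ...
        self.clusters[origin].store(topic, message)
        # ... and then copied to every other cluster listed for replication.
        for name in replicate_to:
            if name != origin:
                self.clusters[name].store(topic, message)


instance = Instance(["us-west", "us-east", "eu"])
instance.publish("us-west", "orders", "m1", replicate_to=["us-east"])
```

After the call, `us-west` and `us-east` both hold `m1` for the `orders` topic, while `eu` holds nothing because it was not listed for replication.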
The diagram below provides an illustration of a Pulsar cluster:
Pulsar Comparison with Apache Kafka
The table below lists the similarities and differences between Apache Pulsar and Apache Kafka [5]:
Feature | Kafka | Pulsar |
---|---|---|
Concepts | Producer-topic-consumer group-consumer | Producer-topic-subscription-consumer |
Consumption | More focused on streaming, exclusive messaging on partitions. No shared consumption. | Unified messaging model and API. |
Acking | Simple offset management | Unified messaging model and API. |
Retention | Messages are deleted based on retention. If a consumer doesn’t read messages before the retention period ends, it loses data. | Messages are only deleted after all subscriptions have consumed them. No data is lost even if the consumers of a subscription are down for a long time. Messages can also be kept for a configured retention period after all subscriptions have consumed them. |
TTL | No TTL support | Supports message TTL |
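The Retention and TTL rows for Pulsar can be illustrated with a toy backlog model. This is only a sketch of the rules as the table describes them, not Pulsar's implementation: the `Topic` class, its parameters, and the choice to measure both retention and TTL from publish time are assumptions.

```python
class Topic:
    """Toy backlog model: subscription-driven deletion, retention, and TTL."""

    def __init__(self, retention_s=0, ttl_s=None):
        self.retention_s = retention_s  # keep fully consumed messages this long
        self.ttl_s = ttl_s              # None means messages never expire
        self.messages = {}              # message id -> publish time (seconds)
        self.acked = {}                 # subscription name -> set of acked ids

    def subscribe(self, name):
        self.acked.setdefault(name, set())

    def publish(self, msg_id, now):
        self.messages[msg_id] = now

    def ack(self, name, msg_id):
        self.acked[name].add(msg_id)

    def cleanup(self, now):
        """Drop messages no subscription still needs; return the ids left."""
        for msg_id, published in list(self.messages.items()):
            age = now - published
            if self.ttl_s is not None and age >= self.ttl_s:
                # TTL: treat the message as acknowledged by every subscription.
                for ids in self.acked.values():
                    ids.add(msg_id)
            consumed = all(msg_id in ids for ids in self.acked.values())
            if consumed and age >= self.retention_s:
                del self.messages[msg_id]
        return sorted(self.messages)


topic = Topic()
topic.subscribe("billing")
topic.subscribe("audit")
topic.publish("m1", now=0)
topic.ack("billing", "m1")
left_while_audit_down = topic.cleanup(now=10_000)  # audit has not acked yet
topic.ack("audit", "m1")
left_after_all_acked = topic.cleanup(now=10_001)
```

The message survives for as long as one subscription has not acknowledged it, however much time passes. Setting `ttl_s` bounds that wait, and `retention_s` keeps messages around after every subscription has consumed them.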
Conclusion
Apache Pulsar is an effort undergoing incubation at The Apache Software Foundation (ASF) [3], sponsored by the Apache Incubator PMC. Features such as built-in geo-replication, multiple subscription modes, and message TTL make it a competitive alternative to Apache Kafka.
Resources:
[1] https://pulsar.apache.org/
[2] https://developer.yahoo.com/open-source/
[3] https://apache.org/
[4] https://pulsar.apache.org/docs/latest/getting-started/ConceptsAndArchitecture/
[5] https://streaml.io/blog/pulsar-streaming-queuing/
Published on Java Code Geeks with permission by Furkan Kamaci, partner at our JCG program. See the original article here: Apache Pulsar: Distributed Pub-Sub Messaging System. Opinions expressed by Java Code Geeks contributors are their own.