Apache Pulsar Becomes Top-Level Project
September 25, 2018
Pulsar is a next-generation, Open Source distributed
publish-and-subscribe messaging system designed for scalability,
flexibility, and no data loss.
"We are very proud of Pulsar reaching this important milestone. This is
the testament to all work done over the years by all the contributors,
before and after starting our journey within The Apache Software
Foundation," said Matteo Merli, Vice President of Apache Pulsar. "During
the incubation process, it has been amazing to see the community grow
and the project mature at such a high pace. The last year has seen the
evolution of Pulsar from its original messaging core into an
integrated platform for data in motion. We are thrilled to continue
innovation in this exciting and fast-moving space."
Pulsar is a highly scalable, low latency messaging platform running on
commodity hardware. It provides simple pub-sub and queue semantics over
topics, a lightweight compute framework, automatic cursor management for
subscribers, and cross-datacenter replication. The project was
originally developed at Yahoo (now part of Oath) and was submitted to
the Apache Incubator in June 2017.
The initial goal for Pulsar was to create a multi-tenant scalable
messaging system that could serve as a unified platform for a wide set
of very demanding use cases. Thanks to the original design we have been
able to iterate and expand the scope of the project, adding lightweight
compute and connector frameworks that allow users to process data and
integrate with external systems, all from within Pulsar.
The unique architecture of Pulsar, which separates the serving and
storage layers, leveraging Apache BookKeeper as the storage component,
has proven to be a key strength. The two-layer architecture enables
Pulsar to offer a vastly simplified approach to cluster operations,
allowing operators to easily expand clusters and replace failed nodes,
and to provide much higher write and read availability.
Apache Pulsar is in use at MercadoLibre, Oath, One Click Retail, STICorp,
TaxiStartup, Yahoo Japan Corporation, and Zhaopin.com, among others.
"Launching Pulsar at Yahoo in 2015, our goal has always been to make
Pulsar widely used and well-integrated with other large-scale open
source software," said Joe Francis, Director, Storage and Messaging,
Oath. "We are excited for Pulsar's graduation and to see the growth of
its vibrant open-source developer community within The Apache Software
Foundation. At Oath we run Apache Pulsar at scale across many major
products — including Yahoo Mail, Yahoo Finance, Yahoo Sports and Oath Ad
Platforms — and in multiple data centers across the globe, with full
mesh replication. Pulsar will continue to be an integral part of our
tech stack, in streaming and also as a bridge between public and private
clouds in our hybrid cloud strategy."
"At Zhaopin.com, we have used Apache Pulsar to build our enterprise
event bus, because it has many enterprise features to address the
shortcomings of existing messaging systems, such as message durability
and low latency," said Hui Li, Director of Infrastructure Group at Zhaopin.
"We also contributed a few exciting
features to Pulsar, and are planning to work with the community to
contribute more. It's been thrilling to watch the community grow, and
I'm very proud and excited to see that the project is graduating. Pulsar
has a bright future, and I'm looking forward to what's to come."
"We have used Apache Pulsar as a centralized pub-sub messaging platform
for many of our services/applications.
It has remarkable features: multi-tenancy and horizontal scalability
that allow us to deal with a large number of services/applications on a
single system, and durability, high throughput and low latency bring
reliable real-time pub-sub messaging to users," said Nozomi Kurihara,
Manager of Messaging Platform team at Yahoo Japan Corporation. "We're
very excited with the graduation of Pulsar, and strongly believe it will
play a significant role in the next generation of stream processing. We
will continue to contribute by sending more pull-requests and by
holding community events etc. in our aspirations for continued growth."
"After two years of struggling with the complexity of other technologies, we
turned to Apache Pulsar for a new platform that would simplify our data
pipeline," said Jowanza Joseph, Principal Software Engineer at One Click
Retail. "Because of Pulsar's unique combination of messaging and stream
processing, we've been able to replace multiple systems with one
solution that works seamlessly in our Kubernetes environment. Pulsar
Functions has allowed us to dramatically simplify our stream processing
pipeline and to reduce the cost associated with production-grade stream
processing systems. Seeing Pulsar become a top-level Apache project is a
great milestone that validates our confidence in the current and future
innovations of Pulsar and the Pulsar community."
"With the graduation, we hope to take the Apache Pulsar project and
community to the next level and to reach a wider set of users and
contributors, with the ultimate goal of building a strong ecosystem,"
added Merli. "We welcome anyone to join our efforts by helping with
code, documentation or technical discussions in our forums."
“The fact that Apache Pulsar has gone
from incubator project to top-level in two short years is a testament to
the community growth around the project,” said Matteo Merli, co-founder
of Streamlio, architect, original lead developer of Pulsar while at
Yahoo! and recently named vice president of Apache Pulsar.
“Organizations are rapidly adopting Pulsar and it has become
instrumental in a broad range of modern data-driven applications.”
In numerous tests utilizing the real-world OpenMessaging benchmark,
Pulsar scored the highest for performance, scalability and durability
compared to existing messaging platforms. In fact, a recent test by
Streamlio found Pulsar had 7x greater throughput than another popular
open source messaging project, a benefit that is critical to the
immediacy of fast data processing.
Messaging solutions for streaming data are a critical and central
component of software infrastructure for modern digital companies,
providing the glue that connects diverse data with users and
applications. From real-time customer interaction to fraud detection,
logistics and autonomous vehicles, the need to act on data quickly
rather than waiting on slow batch processing is everywhere. Pulsar
provides the enterprise-class technology needed to enable companies to
move beyond the limits of traditional batch-centric approaches to the
data-driven future where they can immediately process and act on
fast-moving data as quickly as it arrives.
Since the submission of Pulsar to the Apache Incubator by Yahoo!,
Streamlio has witnessed and helped drive Pulsar’s growing adoption. For
example, one of the world’s largest industrial companies is using Pulsar
to help them process and analyze industrial IoT (Internet of Things)
sensor data from power generation equipment in real-time, and a large
media company is using Pulsar to help them handle and track distribution
of their digital assets.