- [Feature] Allow for tagging of producer instances similar to how consumers can be tagged.
- [Enhancement] Raise `WaterDrop::ProducerNotTransactionalError` when attempting to use transactions on a non-transactional producer.
- [Fix] Disallow closing a producer from within a transaction.
- [Fix] WaterDrop should prevent opening a transaction using a closed producer.
This release contains BREAKING changes. Make sure to read and apply upgrade notes.
- [Breaking] Require Ruby `3.1+`.
- [Breaking] Remove ability to abort transactions using `throw(:abort)`. Please use `raise WaterDrop::Errors::AbortTransaction`.
- [Breaking] Disallow (similar to ActiveRecord) exiting transactions with `return`, `break` or `throw`.
- [Breaking] License changed from MIT to LGPL with an additional commercial option. Note: there is no commercial code in this repository. The commercial license is available for companies unable to use LGPL-licensed software for legal reasons.
- [Enhancement] Make variants fiber safe.
- [Enhancement] In transactional mode do not return any `dispatched` messages as none will be dispatched due to rollback.
- [Enhancement] Align the `LoggerListener` async messages to reflect that messages are delegated to the internal queue and not dispatched.
- [Fix] Ensure that the `:dispatched` key for `#produce_many_sync` always contains delivery handles (final) and not delivery reports.
PLEASE MAKE SURE TO READ AND APPLY THEM!
Replace:

```ruby
producer.transaction do
  messages.each do |message|
    # Pipe all events
    producer.produce_async(topic: 'events', payload: message.raw_payload)
  end

  # And abort if more events are no longer needed
  throw(:abort) if KnowledgeBase.more_events_needed?
end
```

With:

```ruby
producer.transaction do
  messages.each do |message|
    # Pipe all events
    producer.produce_async(topic: 'events', payload: message.raw_payload)
  end

  # And abort if more events are no longer needed
  raise(WaterDrop::AbortTransaction) if KnowledgeBase.more_events_needed?
end
```
Previously, transactions would abort if you exited early using `return`, `break`, or `throw`. This could create unexpected behavior, where users might not notice the rollback or have different intentions. For example, the following would trigger a rollback:

```ruby
MAX = 10

def process(messages)
  count = 0

  producer.transaction do
    messages.each do |message|
      count += 1

      producer.produce_async(topic: 'events', payload: message.raw_payload)

      # This would trigger a rollback.
      return if count >= MAX
    end
  end
end
```

This is a source of errors, hence such exits are no longer allowed. You can implement similar flow control inside of your methods that are wrapped in a WaterDrop transaction:

```ruby
MAX = 10

def process(messages)
  producer.transaction do
    # Early return from this method will not affect the transaction.
    # It will be committed
    insert_with_limit(messages)
  end
end

def insert_with_limit(messages)
  count = 0

  messages.each do |message|
    count += 1

    producer.produce_async(topic: 'events', payload: message.raw_payload)

    # This return only exits insert_with_limit, so it does not trigger a rollback.
    return if count >= MAX
  end
end
```

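To make the distinction concrete, here is a minimal plain-Ruby sketch of how a block-based wrapper can tell a normal finish from an early `return`, `break` or `throw`. The names `guarded_transaction` and `EarlyExitError` are hypothetical and exist only for this illustration. This is not WaterDrop's implementation, just one way such exits can be detected.

```ruby
# Hypothetical error, standing in for whatever a library raises on an early exit.
class EarlyExitError < StandardError; end

# Yields to the block and records the outcome in +log+ (an Array).
def guarded_transaction(log)
  completed = false
  failed = false
  result = yield
  completed = true
  log << :committed
  result
rescue Exception # re-raised right away, only used to record the failure
  failed = true
  log << :rolled_back
  raise
ensure
  # Neither flag is set only when the block left via return, break or throw.
  unless completed || failed
    log << :rolled_back
    raise EarlyExitError, 'transaction block exited early'
  end
end

# Exits the block with `return`, which the wrapper turns into an error.
def early_return(log)
  guarded_transaction(log) { return :skipped }
end

# Delegates the flow control to a separate method, so the block finishes normally.
def delegated(log)
  guarded_transaction(log) { limited_work }
end

def limited_work
  [1, 2, 3].each { |n| return n if n >= 2 }
end
```

In this sketch `early_return` raises `EarlyExitError`, while `delegated` finishes normally because its `return` leaves only `limited_work`.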
- [Maintenance] Alias `WaterDrop::Errors::AbortTransaction` with `WaterDrop::AbortTransaction`.
- [Maintenance] Lower the precision reporting to 100 microseconds in the logger listener.
- [Fix] Consumer consuming error: Local: Erroneous state (state) post break flow in transaction.
- [Change] Require `karafka-core` `>= 2.4.3`.
- [Enhancement] Introduce `reload_on_transaction_fatal_error` to reload the librdkafka after transactional failures.
- [Enhancement] Flush on fatal transactional errors.
- [Enhancement] Add topic scope to `report_metric` (YadhuPrakash).
- [Enhancement] Cache middleware reference saving 1 object allocation on each message dispatch.
- [Enhancement] Provide `#idempotent?` similar to `#transactional?`.
- [Enhancement] Provide alias to `#with` named `#variant`.
- [Fix] Prevent from creating `acks` altering variants on idempotent producers.
- [Fix] Fix missing requirement of `delegate` for non-Rails use-cases. Always require delegate for variants usage (samsm).
- [Feature] Support context-based configuration with low-level topic settings alterations through producer variants.
- [Enhancement] Prefix random default `SecureRandom.hex(6)` producers ids with `waterdrop-hex` to indicate type of object.
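As a rough illustration of the id shape described above, the sketch below builds a prefixed random id with the standard library. The method name `default_producer_id` is hypothetical, and the exact format WaterDrop uses internally is an assumption based on the changelog wording.

```ruby
require 'securerandom'

# Builds an id in the assumed "waterdrop-<hex>" shape.
# SecureRandom.hex(6) returns 12 hexadecimal characters.
def default_producer_id
  "waterdrop-#{SecureRandom.hex(6)}"
end
```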
This release contains BREAKING changes. Make sure to read and apply upgrade notes.
- [Feature] Support custom OAuth providers.
- [Breaking] Drop Ruby `2.7` support.
- [Breaking] Change default timeouts so final delivery `message.timeout.ms` is less than `max_wait_timeout`, so we do not end up without a final verdict.
- [Breaking] Update all the time related configuration settings to be in `ms` and not mixed.
- [Breaking] Remove no longer needed `wait_timeout` configuration option.
- [Breaking] Do not validate or morph (via middleware) messages added to the buffer prior to `flush_sync` or `flush_async`.
- [Enhancement] Provide `WaterDrop::Producer#transaction?` that returns true only when the producer has an active transaction running.
- [Enhancement] Introduce `instrument_on_wait_queue_full` flag (defaults to `true`) to be able to configure whether non critical (retryable) queue full errors should be instrumented in the error pipeline. Useful when building high-performance pipes with WaterDrop queue retry backoff as a throttler.
- [Enhancement] Protect critical `rdkafka` thread executable code sections.
- [Enhancement] Treat the queue size as a gauge rather than a cumulative stat (isturdy).
- [Fix] Fix a case where purge on non-initialized client would crash.
- [Fix] Middlewares run twice when using buffered produce.
- [Fix] Validations run twice when using buffered produce.
PLEASE MAKE SURE TO READ AND APPLY THEM!
The `wait_timeout` WaterDrop configuration option is no longer needed. You can safely remove it.

```ruby
producer = WaterDrop::Producer.new

producer.setup do |config|
  # Other config...

  # Remove this, no longer needed
  config.wait_timeout = 30
end
```

All time-related values are now configured in milliseconds instead of some being in seconds and some in milliseconds.
The values that were changed from seconds to milliseconds are:

- `max_wait_timeout`
- `wait_backoff_on_queue_full`
- `wait_timeout_on_queue_full`
- `wait_backoff_on_transaction_command`

If you have configured any of those yourself, please replace the seconds representation with milliseconds:

```ruby
producer = WaterDrop::Producer.new

producer.setup do |config|
  config.deliver = true

  # Replace this:
  config.max_wait_timeout = 30

  # With
  config.max_wait_timeout = 30_000
  # ...
end
```

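If several of these values live in a settings hash, the conversion can be done in one pass. The following is a minimal sketch in plain Ruby. `SECONDS_BASED_KEYS` and `seconds_to_ms` are hypothetical helpers and not part of WaterDrop, and the sketch assumes your legacy values are expressed in seconds.

```ruby
# Settings that moved from seconds to milliseconds, as listed in these upgrade notes.
SECONDS_BASED_KEYS = %i[
  max_wait_timeout
  wait_backoff_on_queue_full
  wait_timeout_on_queue_full
  wait_backoff_on_transaction_command
].freeze

# Returns a copy of +settings+ with the legacy seconds values expressed in
# milliseconds. Keys that are not time related are left untouched.
def seconds_to_ms(settings)
  settings.to_h do |key, value|
    converted = SECONDS_BASED_KEYS.include?(key) ? (value * 1_000).round : value
    [key, converted]
  end
end

legacy = { max_wait_timeout: 30, wait_backoff_on_queue_full: 0.1, deliver: true }
migrated = seconds_to_ms(legacy)
```

Run the conversion only once: applying it to values that are already in milliseconds would inflate them by another factor of 1000.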
In this release, we've updated our default settings to address a crucial issue: previous defaults could lead to inconclusive outcomes in synchronous operations due to wait timeout errors. Users often mistakenly believed that a message dispatch was halted because of these errors when, in fact, the timeout was related to awaiting the final dispatch verdict, not the dispatch action itself.
The new defaults in WaterDrop 2.7.0 eliminate this confusion by ensuring synchronous operation results are always transparent and conclusive. This change aims to provide a straightforward understanding of wait timeout errors, reinforcing that they reflect the wait state, not the dispatch success.
Below is a table with what has changed: the previous defaults, in case you want to retain the previous behavior, and the new ones:

| Config | Previous Default | New Default |
|---|---|---|
| root `max_wait_timeout` | 5000 ms (5 seconds) | 60000 ms (60 seconds) |
| kafka `message.timeout.ms` | 300000 ms (5 minutes) | 50000 ms (50 seconds) |
| kafka `transaction.timeout.ms` | 60000 ms (1 minute) | 55000 ms (55 seconds) |

This alignment ensures that when using sync operations or invoking `#wait`, any exception you get should give you a conclusive and final delivery verdict.
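The relation behind these defaults can be checked with a few lines of plain Ruby. This is a sketch: `conclusive_wait?` is a hypothetical helper, and the hashes simply restate the values from the table in milliseconds.

```ruby
# Values from the table above, all in milliseconds.
PREVIOUS_DEFAULTS = {
  max_wait_timeout: 5_000,
  'message.timeout.ms': 300_000,
  'transaction.timeout.ms': 60_000
}.freeze

NEW_DEFAULTS = {
  max_wait_timeout: 60_000,
  'message.timeout.ms': 50_000,
  'transaction.timeout.ms': 55_000
}.freeze

# True when the delivery timeout expires before the producer stops waiting,
# so a wait never ends while the delivery verdict is still pending.
def conclusive_wait?(settings)
  settings[:'message.timeout.ms'] < settings[:max_wait_timeout]
end
```

With these values, `conclusive_wait?(NEW_DEFAULTS)` holds and `conclusive_wait?(PREVIOUS_DEFAULTS)` does not. The same check may be useful if you override either setting yourself.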
As of version `2.7.0`, WaterDrop has changed how message buffering works. Previously, messages underwent validation and middleware processing when they were buffered. Now, these steps are deferred until just before dispatching the messages. The buffer functions strictly as a thread-safe storage area without performing any validations or middleware operations until the messages are ready to be sent.
This adjustment was made primarily to ensure that middleware runs and validations are applied when most relevant—shortly before message dispatch. This approach addresses potential issues with buffers that might hold messages for extended periods:
- Temporal Relevance: Validating and processing messages near their dispatch time helps ensure that actions such as partition assignments reflect the current system state. This is crucial in dynamic environments where system states are subject to rapid changes.
- Stale State Management: By delaying validations and middleware to the dispatch phase, the system minimizes the risk of acting on outdated information, which could lead to incorrect processing or partitioning decisions.

```ruby
# Prior to 2.7.0 this would raise an error
producer.buffer(topic: nil, payload: '')
# => WaterDrop::Errors::MessageInvalidError

# After 2.7.0 buffer will not, but flush_async will
producer.buffer(topic: nil, payload: '')
# => all good here
producer.flush_async
# => WaterDrop::Errors::MessageInvalidError
```

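The deferred behavior can be pictured with a toy buffer in plain Ruby. `ToyBuffer` and `InvalidMessage` are hypothetical names for this sketch, which only mirrors the idea of storing first and validating at flush time. It is not how WaterDrop implements its buffer.

```ruby
# Hypothetical stand-in for a validation error.
class InvalidMessage < StandardError; end

class ToyBuffer
  def initialize
    @messages = []
    @mutex = Mutex.new
  end

  # Stores the message as-is. No validation happens at this point.
  def buffer(message)
    @mutex.synchronize { @messages << message }
    self
  end

  # Drains the buffer and validates each message right before "dispatch".
  # In this toy version a failing batch is dropped after being drained.
  def flush
    batch = @mutex.synchronize { @messages.shift(@messages.size) }
    batch.each { |message| validate!(message) }
    batch
  end

  private

  def validate!(message)
    raise InvalidMessage, 'topic is required' if message[:topic].nil?
  end
end
```

Buffering an invalid message succeeds here and the error surfaces only on `flush`, which matches the ordering described in these notes.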
The timing of middleware execution has been adjusted. Middleware, which was previously run when messages were added to the buffer, will now only execute immediately before the messages are flushed from the buffer and dispatched. This change is similar to the validation-related changes.
- [Enhancement] Instrument `producer.connected` and `producer.closing` lifecycle events.
- [Enhancement] Expose `#partition_count` for building custom partitioners that need to be aware of number of partitions on a given topic.
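A custom partitioner of this kind usually maps a key onto the available partitions. The sketch below shows one common approach in plain Ruby. `partition_for` is a hypothetical helper, and the count passed in is assumed to come from the producer. The exact signature of `#partition_count` is not shown in this changelog, so it is not called here.

```ruby
require 'zlib'

# Maps a key to a partition in 0...partition_count using a stable CRC32 hash,
# so the same key always lands on the same partition for a given count.
def partition_for(key, partition_count)
  raise ArgumentError, 'partition_count must be positive' unless partition_count.positive?

  Zlib.crc32(key.to_s) % partition_count
end
```

Note that the mapping changes when the number of partitions changes, which is one reason a partitioner needs the current count.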
- [Enhancement] Provide ability to label message dispatches for increased observability.
- [Enhancement] Provide ability to commit offset during the transaction with a consumer provided.
- [Change] Change transactional message purged error type from `message.error` to `librdkafka.dispatch_error` to align with the non-transactional error type.
- [Change] Remove usage of concurrent ruby.
- [Enhancement] Return delivery handles and delivery reports for both dummy and buffered clients with proper topics, partitions and offsets assigned, and auto-increment offsets per partition.
- [Fix] Fix a case where buffered test client would not accumulate messages on failed transactions
- [Improvement] Introduce `message.purged` event to indicate that a message that was not delivered to Kafka was purged. Most of the time this refers to messages that were part of a transaction and were not yet dispatched to Kafka. It always means that the given message was not delivered, but in the case of transactions this is expected. In the non-transactional case it usually means `#purge` usage or exceeding `message.timeout.ms`, so `librdkafka` removes this message from its internal queue. Non-transactional producers do not use this and pipe purges to `error.occurred`.
- [Fix] Fix a case where `message.acknowledged` would not have `caller` key.
- [Fix] Fix a bug where critical errors (like `IRB::Abort`) would not abort the ongoing transaction.
- [Improvement] Introduce a `transaction.finished` event to indicate that transaction has finished whether it was aborted or committed.
- [Improvement] Use `transaction.committed` event to indicate that transaction has been committed.
- [Feature] Introduce transactions support.
- [Improvement] Expand `LoggerListener` to inform about transactions (info level).
- [Improvement] Allow waterdrop to use topic as a symbol or a string.
- [Improvement] Enhance both `message.acknowledged` and `error.occurred` (for `librdkafka.dispatch_error`) with full delivery_report.
- [Improvement] Provide `#close!` that will force producer close even with outgoing data after the max wait timeout.
- [Improvement] Provide `#purge` that will purge any outgoing data and data from the internal queues (both WaterDrop and librdkafka).
- [Fix] Fix the `librdkafka.dispatch_error` error dispatch for errors with negative code.
- [Improvement] Early flush data from `librdkafka` internal buffer before closing.
- [Maintenance] Update the signing cert as the old one expired.
- [Improvement] Provide `log_messages` option to `LoggerListener` so the extensive messages data logging can be disabled.
- [Fix] Add cause to the errors that are passed into instrumentation (konalegi)
- [Improvement] Use original error `#inspect` for `WaterDrop::Errors::ProduceError` and `WaterDrop::Errors::ProduceManyError` instead of the current empty string.
- [Change] Use `Concurrent::AtomicFixnum` to track operations in progress to prevent potential race conditions on JRuby and TruffleRuby (not yet supported but this is for future usage).
- [Change] Require `karafka-rdkafka` `>= 0.13.2`.
- [Change] Require `karafka-core` `>= 2.1.1`.
- [Refactor] Introduce a counter-based locking approach to make sure, that we close the producer safely but at the same time not to limit messages production with producing lock.
- [Refactor] Make private methods private.
- [Refactor] Validate that producer is not closed only when attempting to produce.
- [Refactor] Improve one 5 minute long spec to run in 10 seconds.
- [Refactor] clear client assignment after closing.
- [Refactor] Remove no longer needed patches.
- [Fix] Fork detection on a short lived processes seems to fail. Clear the used parent process client reference not to close it in the finalizer (#356).
- [Change] Require `karafka-rdkafka` `>= 0.13.0`.
- [Change] Require `karafka-core` `>= 2.1.0`.
- [Improvement] Introduce `client_class` setting for ability to replace underlying client with anything specific to a given env (dev, test, etc).
- [Improvement] Introduce `Clients::Buffered` useful for writing specs that do not have to talk with Kafka (id-ilych).
- [Improvement] Make `#produce` method private to avoid confusion and make sure it is not used directly (it is not part of the official API).
- [Change] Change `wait_on_queue_full` from `false` to `true` as a default.
- [Change] Rename `wait_on_queue_full_timeout` to `wait_backoff_on_queue_full` to match what it actually does.
- [Enhancement] Introduce `wait_timeout_on_queue_full` with proper meaning. That is, this represents the time after which the error will be raised despite the backoff. This allows raising an error in case the backoff attempts were insufficient, and prevents an infinite loop on messages that can never be delivered.
- [Fix] Provide `type` for queue full errors that references the appropriate public API method correctly.
- Rename `wait_on_queue_full_timeout` to `wait_backoff_on_queue_full`.
- Set `wait_on_queue_full` to `false` if you did not use it and do not want the new default of `true`.
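The interplay between the backoff and the timeout can be sketched in plain Ruby. `QueueFull` and `with_queue_full_backoff` are hypothetical names for this illustration, and the sketch takes both values in milliseconds for simplicity. It shows the general retry-until-deadline pattern, not WaterDrop's internal code.

```ruby
# Hypothetical stand-in for a "queue full" error.
class QueueFull < StandardError; end

# Current monotonic time in milliseconds.
def now_ms
  Process.clock_gettime(Process::CLOCK_MONOTONIC, :millisecond)
end

# Retries the block on QueueFull, sleeping +backoff_ms+ between attempts.
# Once +timeout_ms+ has elapsed since the first attempt, the error is re-raised
# so a message that can never be enqueued does not cause an infinite loop.
def with_queue_full_backoff(backoff_ms:, timeout_ms:)
  started_at = now_ms

  begin
    yield
  rescue QueueFull
    raise if now_ms - started_at >= timeout_ms

    sleep(backoff_ms / 1_000.0)
    retry
  end
end
```

Here the backoff controls how long to pause between attempts, while the timeout bounds the total time spent retrying.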
- [Enhancement] Include topic name in the `error.occurred` notification payload.
- [Enhancement] Include topic name in the `message.acknowledged` notification payload.
- [Maintenance] Require `karafka-core` `2.0.13`.
- [Fix] Require missing Pathname (#345)
- [Feature] Introduce a configurable backoff upon `librdkafka` queue full (false by default).
- [Feature] Pipe all the errors including synchronous errors via the `error.occurred`.
- [Improvement] Pipe delivery errors that occurred not via the error callback using the `error.occurred` channel.
- [Improvement] Introduce `WaterDrop::Errors::ProduceError` and `WaterDrop::Errors::ProduceManyError` for any inline raised errors that occur. You can get the original error by using the `#cause`.
- [Improvement] Include `#dispatched` messages handler in the `WaterDrop::Errors::ProduceManyError` error, to be able to understand which of the messages were delegated to `librdkafka` prior to the failure.
- [Maintenance] Remove the `WaterDrop::Errors::FlushFailureError` in favour of correct error that occurred to unify the error handling.
- [Maintenance] Rename `Datadog::Listener` to `Datadog::MetricsListener` to align with Karafka (#329).
- [Fix] Do not flush when there is no data to flush in the internal buffer.
- [Fix] Wait on the final data flush for short-lived producers to make sure that the message is actually dispatched by `librdkafka` or timeout.
Please note, this is a breaking release, hence `2.5.0`.
- If you used to catch `WaterDrop::Errors::FlushFailureError` now you need to catch `WaterDrop::Errors::ProduceError`. `WaterDrop::Errors::ProduceManyError` is based on the `ProduceError`, hence it should be enough.
- Prior to `2.5.0` there was always a chance of partial dispatches via `produce_many_` methods. Now you can get the info on all the errors via `error.occurred`.
- Inline `Rdkafka::RdkafkaError` are now re-raised via `WaterDrop::Errors::ProduceError` and available under `#cause`. Async `Rdkafka::RdkafkaError` errors are still directly available and you can differentiate between errors using the event `type`.
- If you are using the Datadog listener, you need to:

```ruby
# Replace require:
require 'waterdrop/instrumentation/vendors/datadog/listener'
# With
require 'waterdrop/instrumentation/vendors/datadog/metrics_listener'

# Replace references of
::WaterDrop::Instrumentation::Vendors::Datadog::Listener.new
# With
::WaterDrop::Instrumentation::Vendors::Datadog::MetricsListener.new
```
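
The error wrapping described in these notes relies on Ruby's built-in `#cause`. The sketch below reproduces the pattern with stand-in classes inside a hypothetical `Sketch` module, so it runs without WaterDrop or rdkafka installed. `Sketch::LowLevelError` plays the role of `Rdkafka::RdkafkaError`.

```ruby
module Sketch
  class ProduceError < StandardError; end
  # Based on ProduceError, so rescuing ProduceError covers both.
  class ProduceManyError < ProduceError; end
  # Stand-in for a low-level driver error.
  class LowLevelError < StandardError; end
end

# Simulates an inline failure that gets re-raised as a higher-level error.
def dispatch
  raise Sketch::LowLevelError, 'queue full'
rescue Sketch::LowLevelError => e
  # Raising inside a rescue block sets #cause to the rescued error automatically.
  raise Sketch::ProduceManyError, e.inspect
end

# Rescues the base class and returns the original error.
def original_error
  dispatch
rescue Sketch::ProduceError => e
  e.cause
end
```

Rescuing the base class is enough to handle both error types, and `#cause` gives access to the original low-level error.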
- Replace the local rspec locator with generalized core one.
- Make `::WaterDrop::Instrumentation::Notifications::EVENTS` list public for anyone wanting to re-bind those into a different notification bus.
- Include `caller` in the error instrumentation to align with Karafka.
- Remove empty debug logging out of `LoggerListener`.
- Do not lock Ruby version in Karafka in favour of `karafka-core`.
- Make sure `karafka-core` version is at least `2.0.9` to make sure we run `karafka-rdkafka`.
- Use monotonic time from Karafka core.
- Add support to customizable middlewares that can modify message hash prior to validation and dispatch.
- Fix a case where upon not-available leader, metadata request would not be retried
- Require `karafka-core` 2.0.7.
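The idea of middlewares that modify the message hash before validation and dispatch can be illustrated with plain callables. This is a generic sketch: `run_middlewares` and the two lambdas are hypothetical, and WaterDrop's own registration API is not shown in this changelog, so it is not used here.

```ruby
# Applies each middleware in order. Every step receives the message hash
# and returns the (possibly modified) hash for the next step.
def run_middlewares(message, middlewares)
  middlewares.reduce(message) { |current, middleware| middleware.call(current) }
end

# Hypothetical middlewares for the illustration.
STRINGIFY_PAYLOAD = ->(message) { message.merge(payload: message[:payload].to_s) }
ADD_SOURCE_HEADER = lambda do |message|
  message.merge(headers: (message[:headers] || {}).merge('source' => 'sketch'))
end

prepared = run_middlewares(
  { topic: 'events', payload: 42 },
  [STRINGIFY_PAYLOAD, ADD_SOURCE_HEADER]
)
```

Because each step returns a new hash, the original message is left unchanged and the steps stay easy to test in isolation.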
- Set `statistics.interval.ms` to 5 seconds by default, so the defaults cover all the instrumentation out of the box.

If you want to disable `librdkafka` statistics because you do not use them at all, update the `kafka` `statistics.interval.ms` setting and set it to `0`:

```ruby
producer = WaterDrop::Producer.new

producer.setup do |config|
  config.deliver = true
  config.kafka = {
    'bootstrap.servers': 'localhost:9092',
    'statistics.interval.ms': 0
  }
end
```

- Fix invalid error scope visibility.
- Cache partition count to improve messages production and lower stress on Kafka when `partition_key` is on.
- Add temporary patch on top of `rdkafka-ruby` to mitigate metadata fetch timeout failures.
- Support for librdkafka 0.13
- Update Github Actions
- Change auto-generated id from `SecureRandom#uuid` to `SecureRandom#hex(6)`
- Remove shared components that were moved to `karafka-core` from WaterDrop
- Allow sending tombstone messages (#267)
- Replace local statistics decorator with the one extracted to `karafka-core`.
- Small refactor of the DataDog/Statsd listener to align for future extraction to `karafka-common`.
- Replace `dry-monitor` with home-brew notification layer (API compatible) and allow for usage with `ActiveSupport::Notifications`.
- Remove all the common code into `karafka-core` and add it as a dependency.
- Replace `dry-validation` with home-brew validation layer and drop direct dependency on `dry-validation`.
- Remove indirect dependency on dry-configurable from DataDog listener (no changes required).
- Replace `dry-configurable` with home-brew config and drop direct dependency on `dry-configurable`.
- Update rdkafka patches to align with `0.12.0` and `0.11.1` support.
- Rename StdoutListener to LoggerListener (#240)
- Add Datadog listener for metrics + errors publishing
- Add Datadog example dashboard template
- Update Readme to show Dd instrumentation usage
- Align the directory namespace convention with gem name (waterdrop => WaterDrop)
- Introduce a common base for validation contracts
- Drop CI support for ruby 2.6
- Require all `kafka` settings to have symbol keys (compatibility with Karafka 2.0 and rdkafka)
- Ruby 3.1 support
- Change the error notification key from `error.emitted` to `error.occurred`.
- Normalize error tracking and make all the places publish errors into the same notification endpoint (`error.occurred`).
- Start semantic versioning WaterDrop.
- Source code metadata url added to the gemspec
- Replace `:producer` with `:producer_id` in events and update `StdoutListener` accordingly. This change aligns all the events in terms of not publishing the whole producer object in the events.
- Add `error.emitted` into the `StdoutListener`.
- Enable `StdoutLogger` in specs for additional integration coverage.
- #218 - Fixes a case, where dispatch of callbacks the same moment a new producer was created could cause a concurrency issue in the manager.
- Fix some unstable specs.
- Fixes an issue where multiple producers would emit stats of other producers causing the same stats to be published several times (as many times as a number of producers). This could cause invalid reporting for multi-kafka setups.
- Fixes a bug where emitted statistics would contain their first value as the first delta value for first stats emitted.
- Fixes a bug where decorated statistics would include a delta for a root field with non-numeric values.
- Introduces support for error callbacks instrumentation notifications with `error.emitted` monitor emitted key for tracking background errors that would occur on the producer (disconnects, etc).
- Removes the `:producer` key from `statistics.emitted` and replaces it with `:producer_id` not to inject whole producer into the payload.
- Removes the `:producer` key from `message.acknowledged` and replaces it with `:producer_id` not to inject whole producer into the payload.
- Cleanup and refactor of callbacks support to simplify the API and make it work with Rdkafka way of things.
- Introduces a callbacks manager concept that will also be used in Karafka `2.0` for both statistics and errors tracking per client.
- Sets default Kafka `client.id` to `waterdrop` when not set.
- Updates specs to always emit statistics for better test coverage.
- Adds statistics and errors integration specs running against Kafka.
- Replaces direct `RSpec.describe` reference with auto-discovery.
- Patches `rdkafka` to provide functionalities that are needed for granular callback support.
- Update `dry-*` to the recent versions and update settings syntax to match it
- Update Zeitwerk requirement
- Remove rdkafka patch in favour of spec topic pre-creation
- Do not close client that was never used upon closing producer
- Add support for `partition_key`
- Switch license from `LGPL-3.0` to `MIT`
- Switch flushing on close to sync
- Remove Ruby 2.5 support and update minimum Ruby requirement to 2.6
- Fix the `finalizer references object to be finalized` warning issued with 3.0
- Redesign of the whole API (see `README.md` for the use-cases and the current API)
- Replace `ruby-kafka` with `rdkafka`
- Switch license from `MIT` to `LGPL-3.0`
- #113 - Add some basic validations of the kafka scope of the config (Azdaroth)
- Global state removed
- Redesigned metrics that use `rdkafka` internal data + custom diffing
- Restore JRuby support
- Release to match Karafka 1.4 versioning.
- Support for new `dry-configurable`
- #119 - Support exactly once delivery and transactional messaging (kylekthompson)
- #119 - Support delivery_boy 1.0 (kylekthompson)
- Ruby 2.7.0 support
- Fix missing `delegate` dependency on `ruby-kafka`
- Ruby 2.6.5 support
- Expose setting to optionally verify hostname on ssl certs #109 (tabdollahi)
- Drop Ruby 2.4 support
- Drop Ruby 2.3 support
- Drop support for Kafka 0.10 in favor of native support for Kafka 0.11.
- Ruby 2.6.3 support
- Support message headers
- `sasl_over_ssl` support
- Unlock Ruby Kafka + provide support for 0.7 only
- #60 - Rename listener to StdoutListener
- Drop support for Kafka 0.10 in favor of native support for Kafka 0.11.
- Support ruby-kafka 0.7
- Support message headers
- `sasl_over_ssl` support
- `ssl_client_cert_key_password` support
- #87 - Make stdout listener as instance
- Use Zeitwerk for gem code loading
- #93 - zstd compression support
- #99 - schemas are renamed to contracts
- Bump delivery_boy (0.2.7 => 0.2.8)
- Bump dependencies to match Karafka
- drop jruby support
- drop ruby 2.2 support
- Due to multiple requests, unlock of 0.7 with an additional post-install message
- Lock ruby-kafka to 0.6 (0.7 support targeted for WaterDrop 1.3)
- #55 - Codec settings unification and config applier
- #54 - compression_codec api sync with king-konf requirements
- #45 - Allow specifying a create time for messages
- #47 - Support SCRAM once released
- #49 - Add lz4 support once merged and released
- #50 - Potential message loss in async mode
- Ruby 2.5.0 support
- Gem bump to match Karafka framework versioning
- #48 - ssl_ca_certs_from_system
- #52 - Use instrumentation compatible with Karafka 1.2
- Added high level retry on connection problems
- #37 - ack level for producer
- Gem bump
- Ruby 2.4.2 support
- Raw ruby-kafka driver is now replaced with delivery_boy
- Sync and async producers
- Complete update of the API
- Much better validations for config details
- Complete API remodel - please read the new README
- Renamed send_messages to deliver
- Bump to match Karafka
- Renamed `hosts` to `seed_brokers`
- Removed the `ssl` scoping for `kafka` config namespace to better match Karafka conventions
- Added `client_id` option on a root config level
- Added `logger` option on a root config level
- Auto Propagation of config down to ruby-kafka
- Removed support for Ruby 2.1.*
- Ruby 2.3.3 as default
- Ruby 2.4.0 as default
- Gem bump x2
- Dry configurable config (#20)
- added .rspec for default spec helper require
- Added SSL capabilities
- Coditsu instead of PG dev tools for quality control
- Dev tools update
- Gem update
- Specs updates
- File naming convention fix from waterdrop to water_drop + compatibility file
- Additional params (partition, etc) that can be passed into producer
- Driver change from Poseidon (not maintained) to Ruby-Kafka
- Version bump - this WaterDrop version no longer relies on Aspector to work
- #17 - Logger for Aspector - WaterDrop no longer depends on Aspector
- #8 - add send date as a default value added to a message - wont-fix. Should be implemented on a message level since WaterDrop just transports messages without adding additional stuff.
- #11 - same as above
- Resolved bug #15. When you use WaterDrop in the aspect way, the message will be automatically parsed to JSON.
- Removed default to_json casting because of binary/other data types incompatibility. This is an incompatibility. If you use WaterDrop, please add a proper casting method to places where you use it.
- Gem bump
- Poseidon options extractions and tweaks
- Switched raise_on_failure to ignore all StandardError failures (or not to), not just specific ones
- Reloading inside connection pool a connection that seems to be broken (one that failed) - this should prevent multiple issues (but not a single one) that are related to the connection
- Required acks and set to -1 (most secure but slower)
- Added a proxy layer to the producer so we could replace Kafka with other messaging systems
- Gem bump
- proper poseidon clients names (not duplicated)
- kafka_host, kafka_hosts and kafka_ports settings consistency fix
- Added null-logger gem
- raise_on_failure flag to ignore (if false) that message was not sent
- Renamed WaterDrop::Event to WaterDrop::Message to follow Apache Kafka naming convention
- Gems cleanup
- Requirements fix
- Initial gem release