
Could you provide the log entries that show messages being missed?

It seems that the consumer commits the current offset, and then logs that it has consumed a message from the partition.

What could be happening is that, after the commit, the pod is terminated (by Kubernetes, let's say) before your program has had time to log that it consumed the message. The message was handled and its offset committed, but the log line that would prove it was never written.

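You can configure terminationGracePeriodSeconds in the pod spec of your deployment (spec.template.spec.terminationGracePeriodSeconds). It defaults to 30 seconds, which is how long Kubernetes waits between sending SIGTERM and forcibly killing the container.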

In your Python program, you can also handle the SIGTERM signal that is sent when your pod is asked to stop.

signal.signal(signal.SIGTERM, graceful_shutdown)

graceful_shutdown would be a function that instructs your consumer to handle any messages it has already received from Kafka, commit its offsets, log that it has handled those messages, and finally stop the Kafka consumer gracefully.

At that point it can then exit cleanly.
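A minimal sketch of that shutdown flow, not a definitive implementation. It assumes a consumer object with kafka-python style poll(timeout_ms=...), commit() and close() methods, created with auto-commit disabled; run, install_signal_handlers, stop_requested and the handle callback are hypothetical names used only for illustration. The handler just sets a flag, and the loop finishes the batch it already holds before exiting.

```python
import logging
import signal
import threading

log = logging.getLogger("consumer")

# Set by the signal handler; checked by the consume loop between batches.
stop_requested = threading.Event()


def graceful_shutdown(signum, frame):
    """Signal handler: only record the request; the loop does the cleanup."""
    stop_requested.set()


def install_signal_handlers():
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, graceful_shutdown)


def run(consumer, handle):
    """Consume until a stop is requested, then finish the batch in hand.

    Assumption: `consumer` exposes kafka-python style poll(timeout_ms=...),
    commit() and close(); `handle` is a hypothetical per-message callback.
    """
    try:
        while not stop_requested.is_set():
            # poll() is assumed to return {partition: [records]}.
            batches = consumer.poll(timeout_ms=1000)
            for partition, records in batches.items():
                for record in records:
                    handle(record)
                    log.info("consumed offset %s from %s",
                             record.offset, partition)
            if batches:
                # Commit only after the batch is handled and logged.
                consumer.commit()
    finally:
        # Leave the consumer group cleanly, even if handling raised.
        consumer.close()
```

You would call install_signal_handlers() once at startup and then run(consumer, handle). Note that this sketch logs each message before committing, so a pod killed between the two steps leaves a log line without a commit (the message is redelivered) rather than a commit without a log line. Keep the poll timeout and per-batch work well inside terminationGracePeriodSeconds so the loop can notice the flag in time.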
