Python Kafka Consumer Doesn't Receive The Message From Beginning?
Solution 1:
TL;DR
You need to provide a new group_id (Kafka's group.id setting) every time you want to read the topic from the beginning, while keeping the setting auto_offset_reset='earliest':
from kafka import KafkaConsumer
consumer = KafkaConsumer('quickstart-events', bootstrap_servers=['localhost:9092'], auto_offset_reset='earliest', group_id='newGroup')
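Picking a new group name by hand on every run gets tedious, so one option is to generate it. The following is a minimal sketch, assuming the kafka-python client and the broker address from the line above; the helper names fresh_consumer_config and print_topic_from_beginning and the "replay" prefix are made up for illustration:

```python
import uuid


def fresh_consumer_config(prefix="replay"):
    """Build KafkaConsumer keyword arguments with a never-used group id.

    The prefix and the helper itself are hypothetical, not part of kafka-python.
    """
    return {
        "bootstrap_servers": ["localhost:9092"],
        # Only applied when the group has no committed offset yet.
        "auto_offset_reset": "earliest",
        # A random suffix makes the group unknown to the broker on every run.
        "group_id": f"{prefix}-{uuid.uuid4().hex}",
    }


def print_topic_from_beginning(topic="quickstart-events"):
    # Imported here so the helper above works without kafka-python installed.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(topic, **fresh_consumer_config())
    for message in consumer:
        print(message.offset, message.value)
```

Calling print_topic_from_beginning() against a running broker should then start at the earliest available offset on each run, since every call joins a group that has never committed anything.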
If your code prints the output on the first run but not on subsequent runs, and the problem also goes away after restarting Kafka (your PC), you are running into Kafka's concept of the consumer group. As this is an essential concept, I highly recommend getting familiar with it.
The consumer group of an application ensures that the application does not read a message twice. Each consumer has a consumer group name (even though you might not see it directly in your code). The offset position of the consumer group is stored in an internal Kafka topic (__consumer_offsets).
Now, when you run the code for the first time after restarting Kafka, Kafka does not yet know the consumer group and applies the policy given in the auto_offset_reset configuration. In your case, it reads from the earliest available offset. The second time you run your code, Kafka does not need to consult this policy because it already knows the consumer group, so the consumer resumes after its last committed offset instead of consuming the messages again.
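The decision described above can be illustrated with a toy model. This is not Kafka's actual implementation, just a sketch of the rule for a single partition; the function starting_offset, the dictionary standing in for the internal offsets topic, and the group name 'myGroup' are all assumptions made for the example:

```python
def starting_offset(committed, group_id, policy, log_start, log_end):
    """Return the offset a consumer in `group_id` starts reading from.

    committed: dict mapping group id -> last committed offset (a toy
               stand-in for Kafka's internal offsets topic, one partition).
    """
    if group_id in committed:
        # Known group: resume where it left off; the policy is ignored.
        return committed[group_id]
    # Unknown group: fall back to the auto_offset_reset policy.
    if policy == "earliest":
        return log_start
    if policy == "latest":
        return log_end
    raise ValueError(f"unsupported policy: {policy!r}")


committed = {}
log_start, log_end = 0, 3  # three messages in the partition

first_run = starting_offset(committed, "myGroup", "earliest", log_start, log_end)
committed["myGroup"] = log_end  # the first run consumed and committed everything

second_run = starting_offset(committed, "myGroup", "earliest", log_start, log_end)
new_group = starting_offset(committed, "newGroup", "earliest", log_start, log_end)

print(first_run, second_run, new_group)  # 0 3 0
```

The first run and the run with a new group both start at offset 0, while the second run with the same group starts at offset 3, where nothing is left to read. That matches the behaviour in the question.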
Therefore, if you restart Kafka in a way that discards its stored data, as happens in your setup, this internal knowledge of the consumer group is also gone and the auto_offset_reset policy is applied again.
Just keep in mind that using a new group id on every run is rather a hack and should not be done too often on production systems, as the abandoned consumer groups will sit idle.
As a side note: the console consumer creates a new consumer group every single time you run it. The "--from-beginning" flag just ensures that auto_offset_reset is set to 'earliest'.