Kafka Streams at Scale (Deepak Goyal, Walmart Labs) Kafka Summit NYC 2019

Walmart.com generates millions of events per second. At WalmartLabs, I work on a team called the Customer Backbone (CBB), where we wanted to upgrade to a platform capable of processing this event volume in real time and storing the state/knowledge of potentially every Walmart customer produced by that processing. Kafka Streams' event-driven architecture seemed like the obvious choice.
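For context, here is a minimal sketch of the kind of stateful, event-driven topology this implies (the topic names, application id, and broker address are hypothetical, not the actual CBB pipeline): it counts events per customer key and keeps the counts in a local RocksDB state store backed by a changelog topic.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;

public class CustomerEventCounts {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "cbb-demo");          // hypothetical
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> events = builder.stream("customer-events"); // hypothetical topic
        // Count events per customer key; the counts live in a local RocksDB
        // state store that is backed by a compacted changelog topic.
        KTable<String, Long> counts = events
                .groupByKey()
                .count(Materialized.as("customer-event-counts"));
        counts.toStream().to("customer-event-counts-out",
                Produced.with(Serdes.String(), Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```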
However, Walmart's scale poses a few challenges:
• the clusters need to be large, with all the operational problems that entails;
• infinite retention on changelog topics wastes valuable disk;
• standby task recovery after a node failure is slow (changelog topics hold GBs of data);
• Kafka Streams has no support for repartitioning.
To address these challenges as part of our event-driven development, I'm going to talk about some bold new ideas we developed as features/patches to Kafka Streams to handle the scale required at Walmart.
• Cold Bootstrap: when a Kafka Streams node fails, instead of recovering from the changelog topic, we bootstrap the standby from the active's RocksDB store using JSch, with zero event loss thanks to careful offset management (see the first sketch after this list).
• Dynamic Repartitioning: we added support for repartitioning in Kafka Streams, with state redistributed among the new partitions. We can now elastically scale to any number of partitions and any number of nodes.
• Cloud/Rack/AZ-aware task assignment: no active and standby tasks of the same partition are ever assigned to the same rack (see the second sketch after this list).
• Decreased Partition Assignment Size: with large clusters like ours (over 400 nodes and 3 stream threads per node), the partition assignment of the KS cluster grows to a few hundred MBs, so a rebalance takes a long time to settle; shrinking the assignment addresses this.
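A minimal sketch of the cold-bootstrap idea, using JSch's SFTP channel to copy the active task's RocksDB directory instead of replaying the changelog (host, user, key file, and paths are illustrative, not the actual patch; the real implementation also has to quiesce writes and reconcile offsets before the standby goes live):

```java
import com.jcraft.jsch.ChannelSftp;
import com.jcraft.jsch.JSch;
import com.jcraft.jsch.Session;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ColdBootstrap {
    /**
     * Copy the active task's RocksDB directory to this (standby) node over
     * SFTP, rather than rebuilding the store from the changelog topic.
     */
    public static void copyStateFromActive(String activeHost, String user,
                                           String keyFile, String remoteStateDir,
                                           String localStateDir) throws Exception {
        JSch jsch = new JSch();
        jsch.addIdentity(keyFile);                       // ssh private key
        Session session = jsch.getSession(user, activeHost, 22);
        session.setConfig("StrictHostKeyChecking", "no");
        session.connect();
        ChannelSftp sftp = (ChannelSftp) session.openChannel("sftp");
        sftp.connect();
        try {
            Path localDir = Paths.get(localStateDir);
            Files.createDirectories(localDir);
            // Fetch every file in the store directory: SSTs, MANIFEST,
            // and the .checkpoint file recording the changelog offset
            // this snapshot corresponds to, so restoration can resume
            // from exactly that offset with no event loss.
            for (Object o : sftp.ls(remoteStateDir)) {
                ChannelSftp.LsEntry entry = (ChannelSftp.LsEntry) o;
                if (entry.getAttrs().isDir()) continue;  // skip "." and ".."
                sftp.get(remoteStateDir + "/" + entry.getFilename(),
                         localDir.resolve(entry.getFilename()).toString());
            }
        } finally {
            sftp.disconnect();
            session.disconnect();
        }
    }
}
```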
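And a sketch of the rack/AZ-aware placement constraint (the data model here is simplified and hypothetical): when picking a standby node for a task, any candidate on the same rack as the task's active node is filtered out.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class RackAwareAssignor {
    /**
     * Pick a standby node for a task such that it never lands on the
     * same rack/AZ as the task's active node.
     */
    static Optional<String> pickStandby(String activeNode,
                                        List<String> candidates,
                                        Map<String, String> nodeToRack) {
        String activeRack = nodeToRack.get(activeNode);
        return candidates.stream()
                .filter(node -> !node.equals(activeNode))
                .filter(node -> !nodeToRack.get(node).equals(activeRack))
                .findFirst();  // a real assignor would also balance load
    }
}
```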
Key Takeaways:
• Basic understanding of Kafka Streams.
• Productionizing Kafka Streams at scale.
• Using Kafka Streams as a distributed NoSQL DB.