Experiences Operating Apache Kafka at Scale (Noa Resare, Apple, Inc.) Kafka Summit SF 2019

Running Apache Kafka sometimes presents interesting challenges, especially when operating at scale. In this talk, we share some of our experiences operating Apache Kafka as a service across a large company. What happens when you create a lot of partitions and then need to restart brokers? What if you find yourself needing to reassign almost all partitions in all of your clusters? How do you track progress on large-scale reassignments? How do you make sure that moving data between nodes in a cluster does not impact producers and consumers connected to the cluster? We invite you to dive into a few of the issues we have encountered and share debugging and mitigation strategies.