How To Monitor Important Performance Metrics in Kafka?



In this post, we will look at the most important metrics to monitor in Kafka and how to monitor them. Monitoring is a crucial part of running Kafka. Because Kafka's architecture is large and complex, finding the root cause when something goes down can be a head-scratching task for developers. Having a handy list of metrics to check first helps in this regard.

However, since Kafka has a long list of settings and metrics working under the hood, it is challenging to look at them all. That makes a short list of primary metrics all the more useful. It is also good practice to check these metrics regularly to confirm that the Kafka system is healthy. Hence, we have compiled the list below of metrics that should be on the radar at all times in a Kafka system. We will cover the important metrics for the producer, the broker, and the consumer.

1. Metrics for Producer:

Producers are client applications that sit outside the Kafka cluster itself. Nonetheless, certain producer metrics need to be monitored, because producers have to keep publishing data to the broker(s).

  • Rate of Response from Brokers - The acknowledgement a producer waits for depends on its acks setting (named request.required.acks in older clients). The scenarios are -
    • acks = 0 - The producer does not wait for any acknowledgement. The message is sent, but nothing confirms that it was written.
    • acks = 1 - The partition leader has written the message to its own log. Followers may not have copied it yet.
    • acks = all - All in-sync replicas have written the message.
So, depending on the acknowledgement setting used, the response rate from the brokers can be low: stricter settings make each response take longer, and with acks = 0 the broker sends no produce responses at all.


response-rate
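The three acknowledgement modes can be summarised in a few lines of Python. This is only an illustrative sketch, not broker code: the function name `acks_needed` and its parameters `isr_size` and `min_insync` are hypothetical, and the check against `min_insync` models the broker's min.insync.replicas rule in a simplified way.

```python
def acks_needed(acks: str, isr_size: int, min_insync: int = 1) -> int:
    """Return how many replica writes must succeed before the broker replies.

    acks is the producer setting ("0", "1", "all" or "-1").
    isr_size is the current number of in-sync replicas (hypothetical input).
    """
    if acks == "0":
        return 0  # fire-and-forget: the producer does not wait for a reply
    if acks == "1":
        return 1  # only the partition leader has to write the message
    if acks in ("all", "-1"):
        if isr_size < min_insync:
            # Simplified model: the write is rejected when too few
            # replicas are in sync.
            raise ValueError("not enough in-sync replicas")
        return isr_size  # every in-sync replica must have the message
    raise ValueError(f"unknown acks value: {acks!r}")


if __name__ == "__main__":
    for setting in ("0", "1", "all"):
        print(setting, acks_needed(setting, isr_size=3, min_insync=2))
```

The more writes a response has to wait for, the longer each response takes, which is why the response rate is worth reading together with the acks setting in use.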

  • Request Rate - The rate at which the producer sends requests to the brokers. It should stay in step with the rate the brokers can absorb, so that data keeps being committed.

request-rate

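  • Batch Size - It is efficient to group a bunch of messages into a batch and send them together. The default batch.size is 16 KB (16384 bytes). A batch is sent once it is full or once linger.ms (the time the producer waits for more messages) has elapsed, whichever comes first. The batch-size-avg producer metric reports the average size of the batches actually sent.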

batch.size
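The interaction between batch.size and linger.ms can be sketched as a simple "full or waited long enough" rule. The snippet below is a deliberately simplified model and assumes nothing about the real producer internals; the function name `should_send` and its parameters are hypothetical.

```python
def should_send(batch_bytes: int, waited_ms: float,
                batch_size: int, linger_ms: float) -> bool:
    """Return True when a pending batch would be handed to the sender.

    batch_bytes: bytes accumulated in the batch so far
    waited_ms:   time since the first message entered the batch
    batch_size:  value of the batch.size setting, in bytes
    linger_ms:   value of the linger.ms setting
    """
    is_full = batch_bytes >= batch_size
    lingered = waited_ms >= linger_ms
    # Either condition on its own is enough to trigger a send.
    return is_full or lingered


if __name__ == "__main__":
    print(should_send(16384, 0, batch_size=16384, linger_ms=5))  # full batch
    print(should_send(100, 2, batch_size=16384, linger_ms=5))    # still waiting
    print(should_send(100, 5, batch_size=16384, linger_ms=5))    # linger elapsed
```

A batch-size-avg value far below batch.size therefore suggests that most batches leave because linger.ms expired rather than because they filled up.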

2. Metrics for Broker:

Below are some of the important metrics with respect to the Kafka Broker. Some of the metrics are available through JMX.

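  • Number of Active Controllers - Exactly ONE broker per cluster should be the active controller. The controller elects partition leaders and handles partition reassignment. The ActiveControllerCount JMX metric is 1 on the controller and 0 on every other broker. On a ZooKeeper-based cluster, use the ZooKeeper shell to find out which broker is the active controller.

$ ./bin/zookeeper-shell.sh localhost:2181 get /controller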

  • Max Size of the Request to Broker - The maximum size of any request sent to a broker in the metrics window.

request-size-max


  • Average Size of the Request to Broker - The average size of all requests sent to a broker in the window. Compare it with the maximum above to keep track of request sizes.

request-size-avg

  • Average Count of Requests to the Broker - The average number of requests sent to the broker per second.

request-rate

  • Average Response from the Broker - The average number of responses received from the broker per second.

response-rate

  • Number of Under-Replicated Partitions - Ideally this should always be ZERO. A non-zero value means some follower replicas are lagging behind their leaders.

UnderReplicatedPartitions

  • Number of Offline Partitions - Should always be ZERO. A non-zero number means some partitions have no active leader, so the affected topics may be unavailable.

OfflinePartitionsCount

  • Total Broker Partitions - The number of partitions a broker is managing. Keep this as even as possible across all brokers.

PartitionCount


  • Under Minimum ISR Partition Count - The number of partitions whose in-sync replica count is below the configured minimum (min.insync.replicas).

UnderMinIsrPartitionCount

  • Lag in number of messages per follower replica - This helps to tell whether a follower replica is slow or has stopped replicating from the leader.

ConsumerLag,clientId=([-.\w]+),topic=([-.\w]+),partition=([0-9]+)

  • Active Connections - The current number of active connections from the producer.

connection-count

  • In-Sync Replicas - Ideally the number of in-sync replicas (ISRs) for a given partition stays mostly static. It changes when you expand the Kafka cluster or reassign or remove partitions, and it shrinks when a broker goes down or falls behind. A shrink that is not followed by a matching expansion deserves investigation.

IsrShrinksPerSec/IsrExpandsPerSec

  • Total Time To Service a Request - This metric measures how long the broker takes to serve a request: a produce request from a producer, a fetch request from a consumer, or a fetch request from a follower broker. The value should stay fairly stable most of the time. If it rises sharply, requests are being served more slowly. It is then a good idea to cross-check the queue, local, remote, and response times, since the total is the sum of these components.

TotalTimeMs
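The rules of thumb in this section (one active controller, zero under-replicated, offline, and under-min-ISR partitions, balanced partition counts) can be turned into a small check. The sketch below is a minimal example and makes several assumptions: the metric values have already been collected into plain dictionaries (how they are read from JMX is not shown), and the function name `broker_alerts`, the broker names, and the `max_skew` threshold are all hypothetical.

```python
from typing import Dict, List


def broker_alerts(snapshots: Dict[str, Dict[str, int]], max_skew: int = 10) -> List[str]:
    """Check per-broker metric snapshots against the rules of thumb above.

    snapshots maps a broker name to its metric values.
    max_skew is an arbitrary, illustrative limit on partition imbalance.
    """
    alerts: List[str] = []

    # Exactly one broker in the cluster should report itself as controller.
    controllers = sum(m.get("ActiveControllerCount", 0) for m in snapshots.values())
    if controllers != 1:
        alerts.append(f"expected 1 active controller, found {controllers}")

    # These counters should be zero on every broker.
    must_be_zero = ("UnderReplicatedPartitions",
                    "OfflinePartitionsCount",
                    "UnderMinIsrPartitionCount")
    for broker in sorted(snapshots):
        metrics = snapshots[broker]
        for name in must_be_zero:
            value = metrics.get(name, 0)
            if value > 0:
                alerts.append(f"{broker}: {name} = {value}")

    # Partition counts should be roughly even across brokers.
    counts = [m.get("PartitionCount", 0) for m in snapshots.values()]
    if counts and max(counts) - min(counts) > max_skew:
        alerts.append(f"PartitionCount is unbalanced: min {min(counts)}, max {max(counts)}")
    return alerts


if __name__ == "__main__":
    sample = {
        "broker-1": {"ActiveControllerCount": 1, "UnderReplicatedPartitions": 3, "PartitionCount": 90},
        "broker-2": {"ActiveControllerCount": 0, "PartitionCount": 10},
    }
    for alert in broker_alerts(sample):
        print(alert)
```

Summing ActiveControllerCount across brokers catches both failure modes at once: a sum of 0 means no controller, and a sum above 1 means more than one broker believes it is the controller.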

3. Metrics for Consumer:

 

  • Bytes Per Sec - The average number of bytes consumed per second, for a specific topic or across all topics.

bytes-consumed-rate

  • Records Per Sec - The average number of records consumed per second, for a specific topic or across all topics.

records-consumed-rate

  • Fetch Request Count - The number of fetch requests per second sent by the consumer.

fetch-rate
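The three consumer rates above become more telling when combined. As a rough sketch, dividing one rate by another gives an approximate average record size and an approximate number of records per fetch. The function name `consumer_summary` and the sample values are hypothetical, and the results are only estimates because each rate is an average over its own window.

```python
def consumer_summary(bytes_rate: float, records_rate: float, fetch_rate: float) -> dict:
    """Derive rough per-record and per-fetch figures from consumer rates.

    bytes_rate:   value of bytes-consumed-rate (bytes per second)
    records_rate: value of records-consumed-rate (records per second)
    fetch_rate:   value of fetch-rate (fetch requests per second)
    """
    # Guard against division by zero when the consumer is idle.
    avg_record_bytes = bytes_rate / records_rate if records_rate else 0.0
    records_per_fetch = records_rate / fetch_rate if fetch_rate else 0.0
    return {
        "avg_record_bytes": avg_record_bytes,
        "records_per_fetch": records_per_fetch,
    }


if __name__ == "__main__":
    # Illustrative values only, not measurements.
    print(consumer_summary(bytes_rate=1_000_000.0, records_rate=2000.0, fetch_rate=10.0))
```

A high fetch-rate combined with very few records per fetch usually means the consumer is polling faster than data arrives.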

4. Performance Measuring Tools:

Apache Kafka has some out-of-the-box performance testing tools. These are available as

  • bin/kafka-producer-perf-test.sh

When you run the command below, it reports metrics for producing a fixed number of messages (5000 in this example). Depending on the Kafka version, the report includes details such as -

  • The start time
  • The end time
  • Compression
  • Message size
  • Batch size
  • Total data sent in MB (for the total 5000 messages in this case)
  • MB/sec
  • Total number of messages sent (5000 for this example)
  • Total time in seconds for producing the 5000 messages

bin/kafka-producer-perf-test.sh --topic <TOPIC_NAME> --num-records 5000 --record-size <RECORD_SIZE_BYTES> --throughput -1 \
  --producer-props bootstrap.servers=<LIST OF BROKERS> --producer.config config.properties --print-metrics

Current releases of this tool take --num-records, --record-size and --throughput (-1 means no throttling) in place of the older --messages and --broker-list options.

  • bin/kafka-consumer-perf-test.sh

When you run the command below, it reports metrics for consuming a fixed number of messages (5000 in this example). The report includes details such as -

  • The start time
  • The end time
  • Total data consumed in MB (for the total 5000 messages in this case)
  • MB/sec
  • Total number of messages consumed (5000 for this example)
  • Total time in seconds for consuming the 5000 messages

bin/kafka-consumer-perf-test.sh --broker-list <LIST OF BROKERS> --messages 5000 --topic <TOPIC_NAME> \
  --consumer.config config.properties --print-metrics

Newer releases accept --bootstrap-server in place of --broker-list.
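The size and rate figures in these reports are related by simple arithmetic, which is handy for sanity-checking a run. The following is a minimal Python sketch. The function name `throughput` is hypothetical, and the message size of 1024 bytes and the 2-second duration are made-up inputs, not output from the tools.

```python
def throughput(num_messages: int, message_size_bytes: int, elapsed_seconds: float) -> dict:
    """Compute total MB, MB/sec and messages/sec for a test run."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    total_mb = num_messages * message_size_bytes / (1024 * 1024)
    return {
        "total_mb": total_mb,
        "mb_per_sec": total_mb / elapsed_seconds,
        "msgs_per_sec": num_messages / elapsed_seconds,
    }


if __name__ == "__main__":
    # 5000 messages as in the examples above; size and duration are assumed.
    print(throughput(num_messages=5000, message_size_bytes=1024, elapsed_seconds=2.0))
```

If the MB/sec figure reported by a tool differs a lot from this estimate, the assumed message size or the measured duration is probably not what you expected.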

These are some of the commonly used metrics. Note, however, that this list is not the be-all and end-all. Depending on your Kafka setup, there will be other relevant metrics to monitor as well. This was meant as a quick, handy list. Hope this was useful.
