Kafka Monitoring Technical Reference

General information

Overview

Kafka monitoring is a Gateway configuration file that enables monitoring of Kafka Brokers through a set of samplers with customised JMX plug-in settings.

Kafka is a distributed streaming platform that allows you to:

  • Publish and subscribe to streams of records.
  • Store streams of records in a fault-tolerant way.
  • Process streams of records as they occur.

It is important to monitor Kafka because it carries crucial data that many applications rely on. Geneos provides a JMX server sampler configuration to monitor Kafka.

This technical reference provides information on the metrics and dataviews for the samplers available through the Kafka integration. If you are setting up the Kafka integration for the first time, see the Kafka Monitoring User Guide.

Metrics and dataviews

Kafka monitoring dataviews

The JMX Server sampler configurations are used to monitor Kafka.

Kafka broker (per broker metrics)

This provides the state of the Kafka broker:

Column Name Description
Kafka Row name.
Version Kafka binary version.
State State of the Kafka broker.
Kafka Status Readable status derived from the Kafka state value. The following are available:
  • Broker State = 0: Not Running
  • Broker State = 1: Starting
  • Broker State = 2: Recovering from Unclean Shutdown
  • Broker State = 3: Running as Broker
  • Broker State = 4: Running as Controller
  • Broker State = 5: Pending Controlled Shutdown
  • Broker State = 6: Broker Shutting Down
PartitionCount Total number of partitions for all topics in the broker, which is usually even across all brokers.
LeaderCount Leader Replica Count. The Leader is the node responsible for all reads and writes for the given partition. Each node will be the leader for a randomly selected portion of the partitions.
UnderReplicatedPartitions Number of under-replicated partitions per broker. Replicas are the list of nodes that replicate the log for a partition, regardless of whether they are the leader or whether they are currently active.
ActiveControllerCount The number of active controllers in the cluster. One of the brokers is elected as the controller for the whole cluster. It will be responsible for:
  • leadership change of a partition (each leader can independently update ISR).
  • new topics.
  • deleted topics.
  • replica reassignment.
OfflinePartitionsCount The number of partitions that do not have an active leader and are hence not writable or readable.
PreferredReplicaImbalanceCount The imbalance count in the preferred replica.
IsrExpand If a broker goes down, the ISR for some partitions will shrink. When that broker is up again, the ISR will be expanded once the replicas are fully caught up. Other than that, the expected value for both the ISR shrink and expansion rates is 0.
IsrShrink When a broker is brought up after a failure, it starts syncing by reading from the leader. Once synced, it gets added back to the ISR.
   

ISR is the set of in-sync replicas. This is the subset of the replicas list that is currently alive and synced with the leader.

MBeans for Kafka Broker

  • kafka.server:type=KafkaServer,name=BrokerState
  • kafka.server:type=app-info,id=0
  • kafka.server:type=ReplicaManager,name=PartitionCount
  • kafka.server:type=ReplicaManager,name=LeaderCount
  • kafka.controller:type=KafkaController,name=ActiveControllerCount
  • kafka.controller:type=KafkaController,name=OfflinePartitionsCount
  • kafka.controller:type=KafkaController,name=PreferredReplicaImbalanceCount
  • kafka.server:type=ReplicaManager,name=IsrShrinksPerSec
  • kafka.server:type=ReplicaManager,name=IsrExpandsPerSec
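
The relationship between the State and Kafka Status columns can be illustrated with a small lookup. The following is a minimal sketch only: the labels are copied from the list in this reference, while the names BROKER_STATES and kafka_status, and the fallback text for unknown codes, are hypothetical and not part of the Geneos sampler configuration.

```python
# Labels copied from the Kafka Status list in this reference.
# BROKER_STATES and kafka_status are illustrative names, not Geneos settings.
BROKER_STATES = {
    0: "Not Running",
    1: "Starting",
    2: "Recovering from Unclean Shutdown",
    3: "Running as Broker",
    4: "Running as Controller",
    5: "Pending Controlled Shutdown",
    6: "Broker Shutting Down",
}


def kafka_status(state):
    """Return the label for a BrokerState value.

    Unknown codes fall back to a generic label (an assumption made here
    so that the sketch never raises on unexpected input).
    """
    return BROKER_STATES.get(int(state), f"Unknown ({state})")
```

For example, kafka_status(3) returns "Running as Broker". A value read as text, such as "6", is converted to an integer before the lookup.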

Kafka topics (per topic metrics)

This provides all the metrics available for a topic in the broker:

Column Description
ID Topic and metric names.
Topic Topic name.
Name Metric name.
Count / EventType / MeanRate / RateUnit Attribute values.
   

MBeans for Kafka-Topics

  • kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=*
  • kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,topic=*
  • kafka.server:type=BrokerTopicMetrics,name=BytesRejectedPerSec,topic=*
  • kafka.server:type=BrokerTopicMetrics,name=FailedFetchRequestsPerSec,topic=*
  • kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,topic=*
  • kafka.server:type=BrokerTopicMetrics,name=TotalFetchRequestsPerSec,topic=*

Kafka cluster

This shows, for each topic and partition in the entire cluster, whether the partition is under-replicated.

Column Description
Name Topic name and partition number.
Topic Topic name.
Partition Partition number.
UnderReplicatedPartition Number of under-replicated partitions.
   

MBeans for Kafka-Cluster Metrics

  • kafka.cluster:type=Partition,name=UnderReplicated,topic=*,partition=*
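
Because this MBean is reported once per topic and partition, a per-topic total has to be derived by aggregation. The sketch below shows one way to do that. It assumes the per-partition values have already been collected into a dictionary keyed by (topic, partition), and that a non-zero value means the partition is under-replicated. The function name and the sample data are hypothetical.

```python
from collections import defaultdict


def under_replicated_by_topic(samples):
    """Count under-replicated partitions per topic.

    samples maps (topic, partition) to the UnderReplicated value;
    any non-zero value is treated as under-replicated (assumption).
    """
    counts = defaultdict(int)
    for (topic, _partition), value in samples.items():
        counts[topic] += 1 if value else 0
    return dict(counts)


# Made-up sample values for two topics with two partitions each.
example = {
    ("orders", 0): 1,
    ("orders", 1): 0,
    ("payments", 0): 1,
    ("payments", 1): 1,
}
```

With the sample data, under_replicated_by_topic(example) reports one under-replicated partition for "orders" and two for "payments".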

Kafka heap memory usage

Column Description
Committed Amount of memory in bytes that is committed for the Java virtual machine to use.
UsageInit Amount of memory in bytes that the Java virtual machine initially requests from the operating system for memory management.
UsageMax Maximum amount of memory in bytes that can be used for memory management.
UsageUsed Amount of used memory in bytes.
PercentageUsed Percentage of maximum usable memory currently used.
   

MBeans for Kafka-HeapMemoryUsage

  • java.lang:type=Memory
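
The PercentageUsed column can be understood as the used heap divided by the maximum heap. The following sketch shows that arithmetic. How the sampler itself computes the value is not stated in this reference, so treat this as an assumption. The function name is hypothetical. The guard exists because the JVM reports a maximum of -1 when no maximum is defined.

```python
def percentage_used(used, maximum):
    """Return used heap as a percentage of the maximum heap.

    Returns None when the maximum is undefined (the JVM reports -1)
    or zero, because no meaningful percentage exists in that case.
    """
    if maximum <= 0:
        return None
    return 100.0 * used / maximum
```

For example, 512 bytes used out of a 2048-byte maximum gives 25.0.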