State investigation

Value and Problem Solved

The "State Insights" feature provides users with comprehensive insights into the state and checkpoints of their Flink jobs. It helps users:

Identify bottlenecks, data skew, and resource utilization issues.
Understand how state size, checkpoint status, and operator throughput behave over time.
Correlate performance metrics with checkpoint status and logs during pipeline execution.

This feature simplifies debugging and optimization by presenting detailed state and checkpoint information, enabling users to react to issues and optimize their pipelines effectively.

Functionality and How It Works

Overview Boxes

Summarize checkpoint-related details and metrics such as:

The number of checkpoints, successful and failed
The number of restarts
CPU and Memory size of

State Size

Tracks the job's state size over time to help detect and monitor issues.
Shows the current state size.

Operator throughput

Enables the user to view the throughput of any operator in order to detect spikes during checkpoints.

Checkpoints and Job Performance Over Time

A correlated graph that shows a job's CPU and memory usage alongside its checkpoints. Every bar is a bucket of checkpoints, where:

A successful checkpoint is colored green
A failed checkpoint is colored red

Hovering over any bucket displays additional information such as:

Time frame
Checkpoint status
CPU and memory
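
The bucketing described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the product's implementation: the `Checkpoint` record, its field names, the status strings, and the fixed bucket width are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Checkpoint:
    # Hypothetical record; field names are illustrative, not the product's schema.
    timestamp_ms: int  # when the checkpoint was triggered
    status: str        # assumed values: "COMPLETED" or "FAILED"


def bucket_checkpoints(checkpoints, bucket_ms):
    """Group checkpoints into fixed-width time buckets and count each status."""
    buckets = {}
    for cp in checkpoints:
        # Align the timestamp to the start of its bucket.
        start = (cp.timestamp_ms // bucket_ms) * bucket_ms
        buckets.setdefault(start, Counter())[cp.status] += 1
    # One entry per bar, in chronological order.
    return [
        {
            "start_ms": start,
            "end_ms": start + bucket_ms,
            "completed": counts["COMPLETED"],  # drawn in green
            "failed": counts["FAILED"],        # drawn in red
        }
        for start, counts in sorted(buckets.items())
    ]


checkpoints = [
    Checkpoint(1_000, "COMPLETED"),
    Checkpoint(31_000, "COMPLETED"),
    Checkpoint(61_000, "FAILED"),
    Checkpoint(95_000, "COMPLETED"),
]
bars = bucket_checkpoints(checkpoints, bucket_ms=60_000)
for bar in bars:
    print(bar)
```

Each dictionary corresponds to one bar; the time frame and status counts are the kind of details a hover tooltip would show.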

Checkpoints Table

A detailed list of all checkpoints with their duration, size, and status.

Users can sort and filter by any column to find issues faster.
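
As a rough illustration of why sorting and filtering speeds up troubleshooting, the sketch below narrows a small set of checkpoint rows to the failed ones and orders them by duration. The row fields (`id`, `duration_ms`, `size_bytes`, `status`) and the helper `filter_and_sort` are hypothetical names chosen for the example.

```python
def filter_and_sort(rows, column, descending=False, **filters):
    """Keep rows matching every column=value filter, then sort by one column."""
    matching = [
        row for row in rows
        if all(row[key] == value for key, value in filters.items())
    ]
    return sorted(matching, key=lambda row: row[column], reverse=descending)


# Hypothetical table rows; the values are made up for illustration.
rows = [
    {"id": 1, "duration_ms": 850, "size_bytes": 2_000_000, "status": "COMPLETED"},
    {"id": 2, "duration_ms": 4_200, "size_bytes": 2_100_000, "status": "FAILED"},
    {"id": 3, "duration_ms": 900, "size_bytes": 2_050_000, "status": "COMPLETED"},
    {"id": 4, "duration_ms": 9_700, "size_bytes": 2_300_000, "status": "FAILED"},
]

# Show failed checkpoints first, slowest at the top.
slowest_failures = filter_and_sort(
    rows, "duration_ms", descending=True, status="FAILED"
)
print([row["id"] for row in slowest_failures])
```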

State related logs

A detailed list of all state-related logs, sorted by time.
Users can sort and filter by any column.

Interaction with the Timeline

Users can drag and drop the timeline brush to navigate to checkpoints.
All graphs and logs update accordingly.
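
Conceptually, a brush selection is a time window applied to every dataset on the page at once. The following sketch shows that idea under stated assumptions: the dataset names, the `timestamp_ms` field, and the helpers `select_window` and `apply_brush` are hypothetical and do not describe the product's internals.

```python
def select_window(records, start_ms, end_ms):
    """Keep records whose timestamp falls inside [start_ms, end_ms)."""
    return [r for r in records if start_ms <= r["timestamp_ms"] < end_ms]


def apply_brush(datasets, start_ms, end_ms):
    """Apply the same time window to every dataset so all views stay in sync."""
    return {
        name: select_window(records, start_ms, end_ms)
        for name, records in datasets.items()
    }


# Hypothetical data backing the graphs and the logs list.
datasets = {
    "checkpoints": [
        {"timestamp_ms": 10_000, "status": "COMPLETED"},
        {"timestamp_ms": 70_000, "status": "FAILED"},
    ],
    "logs": [
        {"timestamp_ms": 69_500, "message": "checkpoint declined"},
        {"timestamp_ms": 130_000, "message": "job restarted"},
    ],
}

# Simulate the user brushing the second minute of the timeline.
visible = apply_brush(datasets, start_ms=60_000, end_ms=120_000)
print(visible)
```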
