Metrics
Diffusion™ metrics provide information about the server, client sessions, topics and log events. Diffusion can provide metrics in three main ways: via the web console, via JMX-compatible MBeans and via Prometheus.
Methods of accessing metrics
There are multiple ways to access the metrics. As of Diffusion 6.3, the same information is available through each access method.
- Web console metrics
- The metrics are available through the Diffusion web console. This is the most convenient way to access metrics for development and testing purposes, but it does not support aggregating metrics across multiple servers or recording and retrieving historical data. JMX or Prometheus access is more suitable for production systems.
- MBeans for JMX
- Diffusion registers MBeans with the Java Management Extensions (JMX) service. This enables monitoring of the metrics using the JMX tools that are available from a range of vendors.
- Prometheus
- Diffusion provides endpoints for the Prometheus monitoring system. To use Prometheus, your Diffusion server needs to have a Commercial with Scale & Availability license, or an evaluation license such as the Community Evaluation license. See License types for more information.
Accessing metrics
The metrics can be accessed in the following recommended ways:
- As MBeans, using a JMX tool, such as VisualVM or JConsole. See the table below for MBean interfaces. For more information, see Using Java VisualVM or Using JConsole.
- Using the Diffusion management console. For more information, see Diffusion management console.
- As Prometheus endpoints at http://localhost:8080/metrics, provided you have a suitable license. If not accessing from the same machine as the Diffusion server, replace localhost with the IP address or hostname.
You can change the request path or disable the Prometheus service by changing the http-service binding in the webserver configuration. See Configuring the Diffusion web server.
The Prometheus service respects the Accept request header to determine whether to use the legacy Prometheus 0.0.4 or the OpenMetrics 1.0.0 text format. See https://prometheus.io/docs/instrumenting/exposition_formats/.
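As a minimal sketch of selecting the format from a client, the following builds an HTTP request for the metrics endpoint with an explicit Accept header. The media type strings are the ones published for the Prometheus exposition formats; the function name, its parameters, and the assumption that Diffusion matches on these exact strings are illustrative only.

```python
from urllib.request import Request

# Media types as documented for the Prometheus exposition formats.
OPENMETRICS_ACCEPT = "application/openmetrics-text; version=1.0.0; charset=utf-8"
LEGACY_ACCEPT = "text/plain; version=0.0.4"


def build_metrics_request(host="localhost", port=8080, path="/metrics",
                          openmetrics=True):
    """Build (but do not send) a request for the metrics endpoint.

    The default host, port and path mirror the example URL above; adjust
    them if the http-service binding has been changed.
    """
    url = f"http://{host}:{port}{path}"
    accept = OPENMETRICS_ACCEPT if openmetrics else LEGACY_ACCEPT
    return Request(url, headers={"Accept": accept})


request = build_metrics_request(openmetrics=False)
```

The returned object can be passed to `urllib.request.urlopen` to fetch the metrics text from a running server.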
Collecting custom metrics using metric collectors
A metric collector is a way to collect metrics for a particular set of topics or sessions, configured by you.
You can use the Diffusion web console or JMX to define metric collectors. See Configuring metrics for details.
Collected metrics are published to the console, JMX and optionally via Prometheus.
Metric types
Metrics are divided into counters, gauges, and info metrics. These have the same meaning as OpenMetrics types used by Prometheus.
- Counter metric
- A counter metric has a cumulative value. The value is initialized to zero when the server is started and will only increase over a server's lifetime. For example, the total number of bytes received by the server is reported using a counter metric.
- Gauge metric
- A gauge metric has a value that can increase and decrease. For example, the number of connected sessions is reported using a gauge metric.
- Info metric
- An info metric reports informational text about the server. For example, the release version of the server is reported using an info metric.
Built-in metrics
This section describes the built-in metrics that are always available, aside from any metric collectors you may have created.
Metrics are not persisted between server restarts. Restarting the server will set all counter metrics back to zero.
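Because counters restart from zero, a tool that computes the increase between two readings needs to tolerate a reset. The following is a minimal sketch of one common way to do this; the function name is hypothetical, and monitoring systems such as Prometheus apply their own reset handling.

```python
def counter_increase(previous, current):
    """Return the increase in a counter between two readings.

    If the current reading is lower than the previous one, the server is
    assumed to have restarted, so the whole current value is treated as
    having accumulated since the restart.
    """
    if current >= previous:
        return current - previous
    return current


# Normal case: the counter grew from 100 to 150.
normal = counter_increase(100, 150)
# Restart case: the counter dropped from 100 to 30.
after_restart = counter_increase(100, 30)
```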
The following is a list of all the top-level statistics and their attributes.
- The metric name may differ between the OpenMetrics and legacy export formats. Where it does, the legacy name is listed in the 'Legacy export' column.
- The Info type does not exist in the legacy format. If using the legacy format, 'Info' should be read as 'Gauge'.
Metric name | Type | Description | OpenMetrics export | Legacy export |
---|---|---|---|---|
Log metrics | MBean | |||
count | Counter | Number of log events for a given ID code and severity level (levels are error, warn, info, debug, trace). | diffusion_log_events_total{code="PUSH-12345",level="warn"} | diffusion_log_events{code="PUSH-12345",level="warn"} |
Network metrics | MBean | |||
inbound_bytes | Counter | Data received from the network in bytes. | diffusion_network_inbound_bytes_total | diffusion_network_inbound_bytes_count |
outbound_bytes | Counter | Data sent to the network in bytes. | diffusion_network_outbound_bytes_total | diffusion_network_outbound_bytes |
Remote server metrics | MBean | |||
bytes | Gauge | Stored data replicated from remote servers, in bytes. | diffusion_remote_server_bytes | |
Session metrics | MBean | |||
connected | Gauge | Number of connected sessions. | diffusion_sessions_connected | |
inbound_bytes | Counter | Session data received from the network in bytes. | diffusion_sessions_inbound_bytes_total | diffusion_sessions_inbound_bytes |
inbound_messages | Counter | Session data received from the network in messages. | diffusion_sessions_inbound_messages_total | diffusion_sessions_inbound_messages |
open | Gauge | Number of open sessions. | diffusion_sessions_open | |
outbound_bytes | Counter | Session data sent to the network in bytes. | diffusion_sessions_outbound_bytes_total | diffusion_sessions_outbound_bytes |
outbound_messages | Counter | Session data sent to the network in messages. | diffusion_sessions_outbound_messages_total | diffusion_sessions_outbound_messages |
peak | Gauge | Peak number of sessions. | diffusion_sessions_peak | |
total | Counter | Total sessions opened. | diffusion_sessions_total | |
Topic metrics | MBean | |||
count | Gauge | Current number of topics. | diffusion_topics_current | diffusion_topics_count |
total | Counter | Total number of topics. | diffusion_topics_total | |
bytes | Gauge | The value data stored by the topics, in bytes. | diffusion_topics_bytes | |
persistence_bytes | Gauge | The value data stored to persist the topic, in bytes. | diffusion_topics_persistence_bytes | |
subscriptions | Gauge | Number of direct subscriptions to the topics. | diffusion_topics_subscriptions | |
subscribers | Gauge | Number of sessions subscribed to one or more topics. | diffusion_topics_subscribers | |
subscriber_updates | Counter | Number of updates sent to subscribers. | diffusion_topics_subscriber_updates_total | diffusion_topics_subscriber_updates |
subscriber_update_bytes | Counter | Data sent to subscribers, before message compression, in bytes. | diffusion_topics_subscriber_update_bytes_total | diffusion_topics_subscriber_update_bytes |
subscriber_update_compressed_bytes | Counter | Data sent to subscribers, after message compression, in bytes. | diffusion_topics_subscriber_update_compressed_bytes_total | diffusion_topics_subscriber_update_compressed_bytes |
value_updates | Counter | Number of updates to a topic that provide a full value. | diffusion_topics_value_updates_total | diffusion_topics_value_updates |
value_count | Gauge | Number of values held by topics. | diffusion_topics_value_count | |
delta_updates | Counter | Number of updates to a topic that provide a partial value. | diffusion_topics_delta_updates_total | diffusion_topics_delta_updates |
value_bytes | Counter | On each change of topic value, this metric increases by the size of the new value. | diffusion_topics_value_bytes_total | diffusion_topics_value_bytes |
delta_bytes | Counter | On each change of topic value, this metric increases by the size of an internal delta representing the difference between the previous and new values. | diffusion_topics_delta_bytes_total | diffusion_topics_delta_bytes |
Server metrics | MBean | |||
release | Info | Diffusion release information. | diffusion_release_info | diffusion_release |
license_properties | Info | Diffusion license information. | diffusion_license_info | diffusion_license |
free_memory | Gauge | Free memory available in the Java heap. | diffusion_server_free_memory_bytes | |
license_expiry_date | Info | License expiry date. | diffusion_server_license_expiry_date | |
max_memory | Gauge | Maximum Java heap memory that can be allocated. | diffusion_server_max_memory_bytes | |
number_of_topics | Gauge | Number of topics hosted by this server. | diffusion_server_number_of_topics | |
session_locks | Info | Allocated session locks. | diffusion_server_session_locks | |
start_date | Info | Date and time at which this server was started. | diffusion_server_start_date | |
start_date_millis | Gauge | Time at which this server was started, as milliseconds since the epoch. | diffusion_server_start_date_millis | |
time_zone | Info | Time zone this server is using. | diffusion_server_time_zone | |
total_memory | Gauge | Total memory allocated to the Java heap. | diffusion_server_total_memory_bytes | |
uptime | Info | Time this server has been running, as a formatted string, for example, "3 hours 4 minutes 23 seconds". | diffusion_server_uptime | |
uptime_millis | Gauge | Time this server has been running, in milliseconds. | diffusion_server_uptime_millis | |
used_physical_memory_size | Gauge | Used physical memory, in bytes. | diffusion_server_used_physical_memory_size_bytes | |
used_swap_space_size | Gauge | Used swap space, in bytes. | diffusion_server_used_swap_space_size_bytes | |
user_directory | Info | Directory in which this server was started. | diffusion_server_user_directory | |
user_name | Info | User account under which this server is running. | diffusion_server_user_name | |
Operating System Metrics | OperatingSystem MBean | |||
os_architecture | Info | Operating system architecture. | os_architecture | |
os_name | Info | Operating system name. | os_name | |
os_version | Info | Operating system version. | os_version | |
os_max_file_descriptors | Gauge | Maximum number of open file descriptors. | os_max_file_descriptors | |
os_available_processors | Gauge | Available processors. | os_available_processors | |
os_physical_memory_bytes | Gauge | Physical memory size in bytes. | os_physical_memory_bytes | |
os_process_cpu_load | Gauge | Server CPU utilization. | os_process_cpu_load | |
os_system_cpu_load | Gauge | System CPU utilization. | os_system_cpu_load | |
os_system_load_average | Gauge | System load average. | os_system_load_average | |
Memory Metrics | Memory MBean | |||
java_memory_heap_usage | Gauge | JVM heap memory usage. | java_memory_heap_usage | |
java_memory_non_heap_usage | Gauge | JVM non-heap memory usage. | java_memory_non_heap_usage | |
Connector Metrics | MBean | |||
keep_alive_queue_maximum_depth | Gauge | The maximum queue depth used for clients in the keep-alive state. | diffusion_connector_keep_alive_queue_maximum_depth | |
keep_alive_time | Gauge | The time in milliseconds that an unexpectedly disconnected client is kept alive before closing. | diffusion_connector_keep_alive_time | |
number_of_acceptors | Gauge | The number of acceptors. | diffusion_connector_number_of_acceptors | |
queue_definition | Info | The queue definition. | diffusion_connector_queue_definition | |
total_number_of_connections | Counter | The number of connections accepted since the connector was started. | diffusion_connector_total_number_of_connections_total | |
uptime | Info | The time this connector has been running as a formatted string, or 0 if the connector is not running. | diffusion_connector_uptime | |
uptime_millis | Gauge | The time this connector has been running in milliseconds, or 0 if the connector is not running. | diffusion_connector_uptime_millis | |
Client Statistics Metrics | MBean | |||
client_output_frequency | Gauge | Statistics output frequency in milliseconds. | diffusion_client_statistics_client_output_frequency | |
client_reset_frequency | Gauge | The frequency at which the counters are reset. | diffusion_client_statistics_client_reset_frequency | |
concurrent_client_count | Gauge | The current client session count. | diffusion_client_statistics_concurrent_client_count | |
connection_counts | Info | The current client session count, broken down by client type. | diffusion_client_statistics_connection_counts | |
maximum_concurrent_client_count | Gauge | The maximum number of concurrent client sessions. | diffusion_client_statistics_maximum_concurrent_client_count | |
maximum_daily_client_count | Gauge | The count of client sessions started in a day. | diffusion_client_statistics_maximum_daily_client_count | |
Multiplexer Manager Metrics | MBean | |||
number_of_multiplexers | Gauge | The number of multiplexers. | diffusion_multiplexer_manager_number_of_multiplexers | |
System Properties Metrics | ||||
java_version | Info | Java version. | java_version | |
java_vendor | Info | Java vendor. | java_vendor | |
java_vm_name | Info | Java VM name. | java_vm_name |
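As an illustration of how the exported names and labels in the table appear to a consumer, the following sketch parses a single sample line of the text exposition format into a name, a label dictionary and a value. The sample line and its value of 3 are made up for the example, and the parser is deliberately simplified: it does not unescape label values or handle timestamps. A production system would normally use an established Prometheus client library instead.

```python
import re

# metric_name, optional {labels}, whitespace, value
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)'
)
# key="value" pairs inside the braces
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one exposition sample line into (name, labels, value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


# Hypothetical sample line; the value is invented for illustration.
name, labels, value = parse_sample(
    'diffusion_log_events_total{code="PUSH-12345",level="warn"} 3'
)
```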
Delta compression ratio
value_bytes and delta_bytes can be used to capture the theoretical delta compression ratio of the application data flowing through the topics. Both the console and the JMX MBean perform this calculation. The ratio is a value between 0 and 1. The closer the ratio is to 1, the more benefit the application data will obtain from delta streaming. If value_bytes is 0, there have been no updates, so the delta compression ratio is reported as zero. Otherwise it is calculated as:
1 - delta_bytes / value_bytes
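For readers who want to reproduce the calculation from exported values, here is a minimal sketch of the formula, including the zero case described above. The function name is hypothetical; the inputs are the value_bytes and delta_bytes counters.

```python
def delta_compression_ratio(value_bytes, delta_bytes):
    """Theoretical delta compression ratio, between 0 and 1.

    Reported as zero when value_bytes is 0, because there have been
    no updates.
    """
    if value_bytes == 0:
        return 0.0
    return 1 - delta_bytes / value_bytes


# 1000 bytes of full values that could have been sent as 250 bytes of deltas.
ratio = delta_compression_ratio(1000, 250)
```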
Delta streaming is enabled for subscriptions by default, but can be disabled on a per-topic basis using the PUBLISH_VALUES_ONLY topic property. If delta streaming is enabled, a stable set of subscribers remain connected, and no session has a significant backlog (so conflation is not applied), the following relationship should hold:
subscriber_update_bytes ≅ delta_bytes x subscribers
Delta streaming can also be used to update topic values. If the delta compression ratio is high, but delta_updates is zero (or low, relative to value_updates), consider whether your application can use the stateful update stream API to take advantage of delta streaming.
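The check described in the previous paragraph could be automated along the following lines. This is only a sketch: the function name and both thresholds are hypothetical, since the text says "high" and "low" without fixing numbers, and should be tuned for your application.

```python
def may_benefit_from_update_streams(ratio, delta_updates, value_updates,
                                    min_ratio=0.5, max_delta_share=0.1):
    """Flag topics data that compresses well but is rarely updated by delta.

    ratio is the delta compression ratio; min_ratio and max_delta_share
    are assumed thresholds, not values defined by Diffusion.
    """
    total_updates = delta_updates + value_updates
    if total_updates == 0:
        return False
    delta_share = delta_updates / total_updates
    return ratio >= min_ratio and delta_share <= max_delta_share


# High ratio, every update sent as a full value: worth investigating.
flagged = may_benefit_from_update_streams(0.8, delta_updates=0, value_updates=1000)
```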
Log metrics
Log metrics record information about server log events. Separate metrics are kept for each unique pair of log code and log severity level that has been logged.
The log severity levels are: error, warn, info, debug, trace.
A JMX MBean is created for each pair of log code and log severity that has been logged at least once.
Here is an example MBean name: com.pushtechnology.diffusion:type=LogMetrics,server="server_name",level=warn,code=PUSH-12345
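When querying these MBeans from a script, the name can be assembled from its parts. The sketch below follows the pattern of the example name, with JMX key properties separated by commas; the function name is hypothetical, and the key order and quoting are assumed to match the example.

```python
LOG_LEVELS = ("error", "warn", "info", "debug", "trace")


def log_metrics_mbean_name(server_name, level, code):
    """Build the MBean name for one log code and severity level pair."""
    if level not in LOG_LEVELS:
        raise ValueError(f"unknown log level: {level!r}")
    return (
        "com.pushtechnology.diffusion:type=LogMetrics,"
        f'server="{server_name}",level={level},code={code}'
    )


mbean_name = log_metrics_mbean_name("server_name", "warn", "PUSH-12345")
```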
Session metrics versus network metrics
The network inbound_bytes and outbound_bytes metrics include bytes that are not counted by the equivalent session metrics.
The session metrics include bytes from transport framing and all session traffic (including additional HTTP traffic from long polling).
The network metrics additionally include:
- TLS overhead
- Web server traffic (for example, browsers downloading the web console pages)
- Rejected connection attempts
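One way to estimate how much traffic falls into these extra categories is to subtract a session counter from the matching network counter. The sketch below does this for one direction; the function name is hypothetical, and the result is clamped at zero because the two counters may be read at slightly different moments.

```python
def non_session_bytes(network_bytes, session_bytes):
    """Estimate bytes counted by the network metric but not the session metric."""
    return max(network_bytes - session_bytes, 0)


# For example, network inbound_bytes of 12000 and session inbound_bytes of 10000.
overhead = non_session_bytes(12000, 10000)
```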