Advanced Metrics
Advanced metrics in BlazeBee are designed for environments where basic system visibility is insufficient. These collectors expose low-level kernel, hardware, and service-specific signals that are critical for troubleshooting performance degradation, capacity issues, and hardware anomalies.
Advanced collectors are disabled by default and must be explicitly enabled at build time and in configuration.
Purpose
This page describes:
- What advanced collectors are available
- What subsystems they observe
- When and why they should be used
- How they integrate into the standard metrics pipeline
These metrics are intended for experienced operators and diagnostic workflows.
Available Advanced Collectors
Thermal
Collector name: thermal
- Reads temperature data from thermal zones
- Sources include /sys/class/thermal
- Reports per-sensor temperatures in degrees
Use cases:
- Detect overheating CPUs or SoCs
- Monitor passive cooling efficiency
- Prevent thermal throttling
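To make the data source concrete, here is a minimal sketch of reading thermal zones directly from sysfs. It is not BlazeBee's implementation; the function names `parse_millidegrees` and `read_thermal_zones` are hypothetical. It relies on the kernel convention that each `thermal_zone*/temp` file holds an integer in millidegrees Celsius.

```rust
use std::fs;
use std::path::Path;

/// Parse the contents of a thermal zone `temp` file.
/// The kernel reports millidegrees Celsius, e.g. "45000\n" means 45.0 C.
fn parse_millidegrees(raw: &str) -> Option<f64> {
    raw.trim().parse::<i64>().ok().map(|m| m as f64 / 1000.0)
}

/// Read every `thermal_zone*/temp` file under `root`.
/// Returns (zone name, degrees Celsius) pairs; unreadable zones are skipped.
fn read_thermal_zones(root: &Path) -> Vec<(String, f64)> {
    let mut out = Vec::new();
    let entries = match fs::read_dir(root) {
        Ok(entries) => entries,
        Err(_) => return out, // non-Linux host or sysfs not mounted
    };
    for entry in entries.flatten() {
        let name = entry.file_name().to_string_lossy().into_owned();
        if !name.starts_with("thermal_zone") {
            continue; // skip cooling_device* and other entries
        }
        if let Ok(raw) = fs::read_to_string(entry.path().join("temp")) {
            if let Some(celsius) = parse_millidegrees(&raw) {
                out.push((name, celsius));
            }
        }
    }
    out.sort_by(|a, b| a.0.cmp(&b.0));
    out
}

fn main() {
    for (zone, celsius) in read_thermal_zones(Path::new("/sys/class/thermal")) {
        println!("{}: {:.1} C", zone, celsius);
    }
}
```

On hosts without thermal zones (many virtual machines, for example) the program prints nothing, which is also what an operator should expect from a thermal collector there.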
Power
Collector name: power
- Exposes battery and power-supply information
- Reads from /sys/class/power_supply
- Includes charge level, voltage, current, and status
Use cases:
- Edge devices
- Battery-backed systems
- Power consumption diagnostics
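As an illustration of what these fields look like at the source, the sketch below parses a power supply's `uevent` file, which lists `POWER_SUPPLY_*` key/value pairs. This is an assumption-laden example rather than BlazeBee's own code: the helper names are hypothetical, and the device name `BAT0` is a guess that varies between machines. Voltage and current values in sysfs are expressed in microvolts and microamperes.

```rust
use std::collections::HashMap;
use std::fs;

/// Parse a power-supply `uevent` file into lowercase keys without the
/// `POWER_SUPPLY_` prefix, e.g. "POWER_SUPPLY_CAPACITY=87" -> ("capacity", "87").
fn parse_uevent(raw: &str) -> HashMap<String, String> {
    raw.lines()
        .filter_map(|line| line.split_once('='))
        .map(|(key, value)| {
            let key = key.trim_start_matches("POWER_SUPPLY_").to_lowercase();
            (key, value.trim().to_string())
        })
        .collect()
}

/// Convert a micro-unit field (microvolts or microamperes) to volts or amperes.
fn micro_to_base(fields: &HashMap<String, String>, key: &str) -> Option<f64> {
    fields.get(key)?.parse::<f64>().ok().map(|v| v / 1_000_000.0)
}

fn main() {
    // "BAT0" is an assumption; list /sys/class/power_supply to find real names.
    let path = "/sys/class/power_supply/BAT0/uevent";
    match fs::read_to_string(path) {
        Ok(raw) => {
            let fields = parse_uevent(&raw);
            println!("status: {:?}", fields.get("status"));
            println!("capacity %: {:?}", fields.get("capacity"));
            println!("voltage V: {:?}", micro_to_base(&fields, "voltage_now"));
            println!("current A: {:?}", micro_to_base(&fields, "current_now"));
        }
        Err(err) => eprintln!("cannot read {}: {}", path, err),
    }
}
```

Not every driver exposes every field, so each lookup returns an `Option` instead of failing the whole read.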
Pressure (PSI)
Collector name: pressure
- Reports Linux PSI (Pressure Stall Information)
- Covers CPU, memory, and IO pressure
- Indicates how often tasks are stalled due to resource contention
Use cases:
- Diagnosing latency spikes
- Capacity planning
- Identifying hidden resource saturation
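The kernel exposes PSI under /proc/pressure/cpu, /proc/pressure/memory, and /proc/pressure/io. Each file contains lines such as `some avg10=0.00 avg60=0.00 avg300=0.00 total=0`, where the averages are the percentage of time tasks were stalled over 10, 60, and 300 second windows, and `total` is the cumulative stall time in microseconds. The following is a minimal parsing sketch; the `PsiLine` type and `parse_psi_line` function are hypothetical names, not part of BlazeBee.

```rust
use std::fs;

/// One line of a /proc/pressure/* file.
#[derive(Debug, PartialEq)]
struct PsiLine {
    kind: String, // "some" or "full"
    avg10: f64,   // % of time stalled, 10 s window
    avg60: f64,   // % of time stalled, 60 s window
    avg300: f64,  // % of time stalled, 300 s window
    total_us: u64, // cumulative stall time in microseconds
}

/// Parse a line like "some avg10=0.12 avg60=0.05 avg300=0.01 total=123456".
/// Returns None if any expected field is missing or malformed.
fn parse_psi_line(line: &str) -> Option<PsiLine> {
    let mut parts = line.split_whitespace();
    let kind = parts.next()?.to_string();
    let mut avg10: Option<f64> = None;
    let mut avg60: Option<f64> = None;
    let mut avg300: Option<f64> = None;
    let mut total_us: Option<u64> = None;
    for field in parts {
        let (key, value) = field.split_once('=')?;
        match key {
            "avg10" => avg10 = value.parse().ok(),
            "avg60" => avg60 = value.parse().ok(),
            "avg300" => avg300 = value.parse().ok(),
            "total" => total_us = value.parse().ok(),
            _ => {} // ignore fields added by newer kernels
        }
    }
    Some(PsiLine {
        kind,
        avg10: avg10?,
        avg60: avg60?,
        avg300: avg300?,
        total_us: total_us?,
    })
}

fn main() {
    for resource in ["cpu", "memory", "io"] {
        let path = format!("/proc/pressure/{}", resource);
        match fs::read_to_string(&path) {
            Ok(raw) => {
                for line in raw.lines().filter_map(parse_psi_line) {
                    println!("{} {:?}", resource, line);
                }
            }
            // PSI requires kernel support; the files may be absent.
            Err(err) => eprintln!("cannot read {}: {}", path, err),
        }
    }
}
```

A sustained non-zero `avg60` on the `some` line means tasks are regularly waiting for that resource, even when utilization graphs look healthy.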
Systemd
Collector name: systemd
- Queries systemd unit states
- Tracks service health and activation status
- Requires systemd-based Linux distributions
Use cases:
- Service availability monitoring
- Detecting failed or flapping units
- Infrastructure observability without agents
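For a sense of what "unit state" means in practice, the sketch below shells out to `systemctl show` and parses the `ActiveState` and `SubState` properties. This is purely illustrative: how BlazeBee actually queries systemd is not stated here and may differ (for example, over D-Bus). The unit name `ssh.service` and both helper functions are hypothetical.

```rust
use std::process::Command;

/// Parse `KEY=VALUE` lines as printed by `systemctl show`.
/// Returns (ActiveState, SubState); either may be missing.
fn parse_unit_state(raw: &str) -> (Option<String>, Option<String>) {
    let mut active = None;
    let mut sub = None;
    for line in raw.lines() {
        match line.split_once('=') {
            Some(("ActiveState", v)) => active = Some(v.trim().to_string()),
            Some(("SubState", v)) => sub = Some(v.trim().to_string()),
            _ => {}
        }
    }
    (active, sub)
}

/// A unit is treated as unhealthy when systemd reports it as failed.
fn is_failed(active_state: Option<&str>) -> bool {
    active_state == Some("failed")
}

fn main() {
    let unit = "ssh.service"; // hypothetical unit name
    let output = Command::new("systemctl")
        .args(["show", "--property=ActiveState,SubState", unit])
        .output();
    match output {
        Ok(out) => {
            let text = String::from_utf8_lossy(&out.stdout);
            let (active, sub) = parse_unit_state(&text);
            println!(
                "{}: active={:?} sub={:?} failed={}",
                unit,
                active,
                sub,
                is_failed(active.as_deref())
            );
        }
        Err(err) => eprintln!("systemctl not available: {}", err),
    }
}
```

Detecting flapping requires comparing states across successive samples, which this single-shot example does not attempt.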
NTP
Collector name: ntp
- Reports clock offset and synchronization state
- Uses system time sources
- Indicates drift relative to reference clocks
Use cases:
- Distributed systems
- Time-sensitive workloads
- Debugging clock skew issues
Hardware Monitoring (HWMON)
Collector name: hwmon
- Reads hardware sensors via /sys/class/hwmon
- Includes voltages, fan speeds, temperatures
Use cases:
- Bare-metal monitoring
- Detecting failing components
- Environmental diagnostics
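Raw hwmon values need scaling before they are useful. By kernel convention, `temp*_input` files hold millidegrees Celsius, `in*_input` files hold millivolts, and `fan*_input` files hold RPM. The sketch below applies those conversions; the `Reading` enum and `scale_reading` function are hypothetical and only cover the three sensor classes named above.

```rust
use std::fs;
use std::path::Path;

/// A sensor value converted to a conventional unit.
#[derive(Debug, PartialEq)]
enum Reading {
    Celsius(f64),
    Volts(f64),
    Rpm(f64),
}

/// Scale a raw hwmon value based on its file name.
/// Returns None for files this sketch does not understand.
fn scale_reading(file_name: &str, raw: &str) -> Option<Reading> {
    let stem = file_name.strip_suffix("_input")?;
    let value: f64 = raw.trim().parse().ok()?;
    if stem.starts_with("temp") {
        Some(Reading::Celsius(value / 1000.0)) // millidegrees C -> degrees C
    } else if stem.starts_with("in") {
        Some(Reading::Volts(value / 1000.0)) // millivolts -> volts
    } else if stem.starts_with("fan") {
        Some(Reading::Rpm(value)) // already in RPM
    } else {
        None
    }
}

fn main() {
    let chips = match fs::read_dir(Path::new("/sys/class/hwmon")) {
        Ok(chips) => chips,
        Err(_) => return, // no hwmon support on this host
    };
    for chip in chips.flatten() {
        let files = match fs::read_dir(chip.path()) {
            Ok(files) => files,
            Err(_) => continue,
        };
        for file in files.flatten() {
            let name = file.file_name().to_string_lossy().into_owned();
            if !name.ends_with("_input") {
                continue; // only read sensor value files
            }
            if let Ok(raw) = fs::read_to_string(file.path()) {
                if let Some(reading) = scale_reading(&name, &raw) {
                    println!("{}/{}: {:?}", chip.file_name().to_string_lossy(), name, reading);
                }
            }
        }
    }
}
```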
Network Statistics (Extended)
Collector names: arp, netstat, conntrack
- ARP cache statistics
- Kernel network counters
- Connection tracking table usage
Use cases:
- Network debugging
- Detecting connection leaks
- Firewall and NAT diagnostics
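Connection tracking table usage is a good example of a derived signal. When the nf_conntrack module is loaded, the kernel publishes the current entry count and the table limit in /proc/sys/net/netfilter/nf_conntrack_count and nf_conntrack_max. A rough sketch of turning those into a fill ratio follows; `conntrack_usage` is a hypothetical helper, not a BlazeBee API.

```rust
use std::fs;

/// Compute the conntrack table fill ratio (0.0 to 1.0) from the raw
/// contents of nf_conntrack_count and nf_conntrack_max.
fn conntrack_usage(count_raw: &str, max_raw: &str) -> Option<f64> {
    let count: u64 = count_raw.trim().parse().ok()?;
    let max: u64 = max_raw.trim().parse().ok()?;
    if max == 0 {
        return None; // avoid division by zero on a misconfigured limit
    }
    Some(count as f64 / max as f64)
}

fn main() {
    let base = "/proc/sys/net/netfilter";
    let count = fs::read_to_string(format!("{}/nf_conntrack_count", base));
    let max = fs::read_to_string(format!("{}/nf_conntrack_max", base));
    match (count, max) {
        (Ok(count), Ok(max)) => match conntrack_usage(&count, &max) {
            Some(ratio) => println!("conntrack table {:.1}% full", ratio * 100.0),
            None => eprintln!("unexpected conntrack file contents"),
        },
        // The files exist only while the nf_conntrack module is loaded.
        _ => eprintln!("conntrack statistics not available on this host"),
    }
}
```

A ratio that climbs steadily toward 1.0 is the typical signature of a connection leak; once the table is full, the kernel starts dropping new connections.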
File Descriptors
Collector name: filefd
- Reports file descriptor limits and usage
- Reads from /proc/sys/fs and process tables
Use cases:
- Preventing FD exhaustion
- Debugging connection-heavy services
- Capacity tuning
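The system-wide numbers come from /proc/sys/fs/file-nr, which holds three whitespace-separated integers: allocated file handles, allocated but unused handles, and the system maximum. The sketch below parses that file and also counts the current process's descriptors via /proc/self/fd. The `FileNr` struct and `parse_file_nr` function are hypothetical names used for illustration only.

```rust
use std::fs;

/// Contents of /proc/sys/fs/file-nr.
#[derive(Debug, PartialEq)]
struct FileNr {
    allocated: u64, // file handles currently allocated
    unused: u64,    // allocated but unused handles
    max: u64,       // system-wide limit (fs.file-max)
}

/// Parse "allocated unused max" separated by whitespace.
fn parse_file_nr(raw: &str) -> Option<FileNr> {
    let mut fields = raw.split_whitespace().map(|f| f.parse::<u64>());
    let allocated = fields.next()?.ok()?;
    let unused = fields.next()?.ok()?;
    let max = fields.next()?.ok()?;
    Some(FileNr { allocated, unused, max })
}

fn main() {
    match fs::read_to_string("/proc/sys/fs/file-nr") {
        Ok(raw) => match parse_file_nr(&raw) {
            Some(nr) => println!(
                "system: {} of {} handles allocated ({} unused)",
                nr.allocated, nr.max, nr.unused
            ),
            None => eprintln!("unexpected file-nr format"),
        },
        Err(err) => eprintln!("cannot read file-nr: {}", err),
    }
    // Per-process usage: each entry in /proc/self/fd is one open descriptor.
    if let Ok(entries) = fs::read_dir("/proc/self/fd") {
        println!("this process: {} open descriptors", entries.count());
    }
}
```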
Additional Advanced Collectors
Depending on build features, BlazeBee may also support:
- mdraid: RAID array health
- edac: memory error detection
- schedstat: scheduler statistics
- entropy: kernel entropy pool
- filesystem: detailed filesystem statistics
- powercap: RAPL power limits
Enabling Advanced Metrics
Build-Time Requirement
Advanced collectors are compiled in only when BlazeBee is built with the large feature set or with explicit collector features.
cargo build --release --features "blazebee-mqtt-v3 large"
Runtime Configuration
Each advanced collector must be enabled in config.toml:
[[metrics.collectors.enabled]]
name = "pressure"
[metrics.collectors.enabled.metadata]
topic = "metrics/pressure"
qos = 1
retain = true
- One block per collector
- Topic structure is fully user-defined
- QoS and retain flags apply per collector
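To illustrate the one-block-per-collector rule, a configuration enabling two advanced collectors might look like the fragment below. This is a sketch that reuses only the keys shown above; the `metrics/thermal` topic and the second block's `qos` and `retain` values are arbitrary choices for illustration, not recommended defaults.

```toml
# Hypothetical example: two collectors, each with its own block and topic.
[[metrics.collectors.enabled]]
name = "pressure"
[metrics.collectors.enabled.metadata]
topic = "metrics/pressure"
qos = 1
retain = true

[[metrics.collectors.enabled]]
name = "thermal"
[metrics.collectors.enabled.metadata]
topic = "metrics/thermal"   # assumed topic name; any structure is allowed
qos = 0
retain = false
```

Because each `[metrics.collectors.enabled.metadata]` table attaches to the `[[metrics.collectors.enabled]]` entry directly above it, the order of the blocks matters within a pair but not between collectors.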
Operational Considerations
- Some collectors require elevated privileges
- Linux-only functionality
- Increased I/O and parsing overhead
- Recommended for diagnostic or targeted deployments
Summary
Advanced metrics provide deep visibility into system internals. They are powerful but should be enabled selectively. When used appropriately, they enable early detection of failures, performance bottlenecks, and hardware issues that are invisible to standard monitoring.