Statistical Anomaly Detection

Detect hidden patterns and deviations using statistical algorithms without setting hard thresholds.

Hard thresholds (e.g., "Alert if Temp > 100") are useful but limited. They fail to detect subtle issues, like a machine vibrating slightly more than usual but still within "safe" limits.

For standard threshold-based logic, see the Visual Rule Editor.

Proxus Anomaly Detection uses statistical algorithms to establish the "normal" baseline of your data stream in real time and flag deviations automatically.

Filtering & Scope (Criteria)

Anomaly Detection works in conjunction with the Criteria Expression field to define the scope of analysis.

  • No Criteria (Empty): The algorithm analyzes all numeric metrics in the incoming data stream for the devices this rule targets.
  • With Criteria: The algorithm acts only on data packets that match the criteria, which serve as a pre-filter:
    • If a packet does not match the criteria, it is ignored and does not affect the statistical model.
    • If it matches, its payload values are fed into the algorithm.

Example Use Case: To detect anomalies only on a specific sensor (e.g., "Vibration"), use a criteria filter such as:

[Payload][Key = 'Vibration']
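The pre-filter behaviour described above can be sketched in a few lines of Python. This is only an illustration: the packet shape (a dict with `Key` and `Value` fields) and the `prefilter` helper are assumptions made for the example, not the Proxus data model or API.

```python
from typing import Iterable, Iterator


def prefilter(packets: Iterable[dict], key: str) -> Iterator[float]:
    """Yield only the values of packets whose key matches the criteria.

    Non-matching packets are skipped, so they never reach the
    statistical model and cannot shift its baseline.
    """
    for packet in packets:
        if packet.get("Key") == key:
            yield float(packet["Value"])


# Hypothetical packets; only the "Vibration" readings pass the filter.
stream = [
    {"Key": "Vibration", "Value": 0.42},
    {"Key": "Temperature", "Value": 71.3},
    {"Key": "Vibration", "Value": 0.45},
]
model_input = list(prefilter(stream, "Vibration"))
```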

Algorithms & Use Cases

Z-Score (Standard Deviation)

Detects outliers based on how many standard deviations a value lies from the mean (average). Best for stable processes with normally distributed data (a bell curve).

  • Use Case: Monitoring a stable temperature in a controlled clean room.
  • Scenario: Room temperature is normally 22°C ±0.5°C. A sudden spike to 24°C is statistically significant even if it doesn't breach a high limit.
  • Recommended Settings:
    • AnalysisWindow: 100
    • ZScoreThreshold: 3.0 (Triggers at > 3 Sigma deviation)
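As a rough illustration of the idea, the following Python sketch keeps a sliding window and flags a value whose distance from the window mean exceeds the threshold in standard deviations. The class name `ZScoreDetector` and its method are hypothetical; only the parameter names mirror the settings above, and the real engine may compute the baseline differently.

```python
from collections import deque
from statistics import mean, pstdev


class ZScoreDetector:
    """Hypothetical sliding-window Z-score check (not the Proxus implementation)."""

    def __init__(self, analysis_window: int = 100, zscore_threshold: float = 3.0):
        self.window = deque(maxlen=analysis_window)  # oldest points drop off automatically
        self.threshold = zscore_threshold

    def update(self, value: float) -> bool:
        is_anomaly = False
        if len(self.window) >= 2:
            mu = mean(self.window)
            sigma = pstdev(self.window)
            if sigma > 0:  # a perfectly flat window has no usable spread
                is_anomaly = abs(value - mu) / sigma > self.threshold
        # The new value joins the baseline after it has been scored.
        self.window.append(value)
        return is_anomaly


detector = ZScoreDetector()
for reading in [21.5, 22.5, 22.0] * 10:  # clean-room readings around 22 °C
    detector.update(reading)
spike_flagged = detector.update(24.0)
```

With readings spread roughly ±0.5°C around 22°C, the jump to 24°C sits several standard deviations from the mean, so it is flagged even though no fixed limit was configured.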

Rolling Median

Detects deviation from the median value over a window. More robust against random noise/spikes than Z-Score.

  • Use Case: Sensor data with occasional electrical noise or glitches.
  • Scenario: A pressure sensor reads 50, 51, 50, 200 (noise), 50. The median remains ~50, so the 200 spike is instantly flagged without skewing the baseline.
  • Recommended Settings:
    • AnalysisWindow: 50
    • RollingMedianMultiplier: 2.5
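The text does not say what the multiplier is applied to. The sketch below assumes it scales the median absolute deviation (MAD) of the window, which is a common pairing with a rolling median; treat that choice, and the `RollingMedianDetector` name, as assumptions rather than the documented behaviour.

```python
from collections import deque
from statistics import median


class RollingMedianDetector:
    """Hypothetical rolling-median check using MAD as the spread (an assumption)."""

    def __init__(self, analysis_window: int = 50, multiplier: float = 2.5):
        self.window = deque(maxlen=analysis_window)
        self.multiplier = multiplier

    def update(self, value: float) -> bool:
        is_anomaly = False
        if len(self.window) >= 3:
            med = median(self.window)
            # Median absolute deviation: typical distance from the median.
            mad = median(abs(x - med) for x in self.window)
            if mad > 0:
                is_anomaly = abs(value - med) > self.multiplier * mad
        self.window.append(value)
        return is_anomaly


detector = RollingMedianDetector()
for reading in [50, 51, 50, 49, 51, 50, 52, 49]:
    detector.update(reading)
spike_flagged = detector.update(200)   # noise spike
next_flagged = detector.update(50)     # normal reading right after the spike
```

Because the median barely moves when a single 200 enters the window, the following reading of 50 is still judged against a baseline of about 50.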

Rate of Change (Velocity)

Detects how fast a value is changing between two consecutive data points.

  • Use Case: Detecting leaks or sudden mechanical failures.
  • Scenario: A fuel tank level decreases slowly over days. If it suddenly drops by 5% in 1 second, it indicates a rupture, even if the tank is still half full.
  • Recommended Settings:
    • RateOfChangeThreshold: 10.0 (Max allowed change per update)
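A minimal sketch of this check, assuming the threshold is an absolute difference between consecutive updates (as the "per update" wording suggests) and not a time-normalized rate. The `RateOfChangeDetector` name is made up for the example.

```python
from typing import Optional


class RateOfChangeDetector:
    """Hypothetical check on the change between two consecutive values."""

    def __init__(self, rate_of_change_threshold: float = 10.0):
        self.threshold = rate_of_change_threshold
        self.previous: Optional[float] = None

    def update(self, value: float) -> bool:
        # The first value has nothing to be compared with, so it is never flagged.
        is_anomaly = (
            self.previous is not None
            and abs(value - self.previous) > self.threshold
        )
        self.previous = value
        return is_anomaly


detector = RateOfChangeDetector()
tank_levels = [80.0, 79.9, 79.8, 60.0]  # slow decline, then a sudden drop
flags = [detector.update(level) for level in tank_levels]
```

Only the last update is flagged: the level is still well above empty, but it moved by more than the allowed change in a single step.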

Exponential Moving Average (EMA)

Weights recent data more heavily than older data. Good for detecting trend shifts (e.g., drift).

  • Use Case: Detecting wear and tear that causes gradual performance degradation.
  • Scenario: A motor's current draw slowly creeps up from 10A to 12A over an hour due to friction. EMA tracks this drift better than a simple average.
  • Recommended Settings:
    • EmaAlpha: 0.2 (Weight for new data)
    • EmaDifferenceMultiplier: 2.0 (Sensitivity)
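The update rule for an EMA is standard: new_ema = alpha * value + (1 - alpha) * old_ema. How the difference multiplier is applied is not stated, so the sketch below makes one plausible assumption: it also keeps an EMA of the absolute deviation and flags a value whose distance from the EMA exceeds the multiplier times that typical deviation. The `EmaDetector` class is hypothetical.

```python
from typing import Optional


class EmaDetector:
    """Hypothetical EMA-based check; the deviation scale is an assumption."""

    def __init__(self, ema_alpha: float = 0.2, ema_difference_multiplier: float = 2.0):
        self.alpha = ema_alpha
        self.multiplier = ema_difference_multiplier
        self.ema: Optional[float] = None
        self.ema_dev = 0.0  # EMA of the absolute deviation from the baseline

    def update(self, value: float) -> bool:
        if self.ema is None:
            self.ema = value  # the first value seeds the baseline
            return False
        diff = abs(value - self.ema)
        is_anomaly = self.ema_dev > 0 and diff > self.multiplier * self.ema_dev
        # Recent data gets weight alpha, the accumulated history gets 1 - alpha.
        self.ema = self.alpha * value + (1 - self.alpha) * self.ema
        self.ema_dev = self.alpha * diff + (1 - self.alpha) * self.ema_dev
        return is_anomaly


detector = EmaDetector()
for amps in [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.2, 9.8]:
    detector.update(amps)
jump_flagged = detector.update(12.0)
```

A sketch this simple is jumpy during the first few updates, while the deviation estimate is still settling. A larger EmaAlpha makes the baseline follow new data faster; a smaller one makes it smoother.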

Interquartile Range (IQR)

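A robust method for outlier detection that builds its baseline from the middle 50% of values (between the first and third quartiles), so the top and bottom 25% do not distort the normal range. Ideal for non-normal distributions.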

  • Use Case: Process data with frequent, expected operational spikes (e.g., machine startup cycles).
  • Scenario: A packaging machine consumes high power during sealing (expected). IQR learns this "normal range" including the spikes, but flags a sustained over-current.
  • Recommended Settings:
    • AnalysisWindow: 200
    • IqrMultiplier: 1.5
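One common way to turn the IQR into a test is the classic fence rule: flag values below Q1 - k * IQR or above Q3 + k * IQR, where k is the multiplier. The sketch below assumes that rule and uses a hypothetical `IqrDetector` class; the product may compute quartiles or fences differently.

```python
from collections import deque
from statistics import quantiles


class IqrDetector:
    """Hypothetical IQR fence check over a sliding window."""

    def __init__(self, analysis_window: int = 200, iqr_multiplier: float = 1.5):
        self.window = deque(maxlen=analysis_window)
        self.multiplier = iqr_multiplier

    def update(self, value: float) -> bool:
        is_anomaly = False
        if len(self.window) >= 4:
            q1, _, q3 = quantiles(self.window, n=4)  # quartile cut points
            spread = q3 - q1                         # interquartile range
            lower = q1 - self.multiplier * spread
            upper = q3 + self.multiplier * spread
            is_anomaly = value < lower or value > upper
        self.window.append(value)
        return is_anomaly


detector = IqrDetector()
for reading in range(1, 21):  # widely spread but expected values
    detector.update(reading)
typical_flagged = detector.update(10)
extreme_flagged = detector.update(100)
```

Because the fences are derived from quartiles, not from the mean and standard deviation, a wide but regular spread of values is accepted as normal while a value far outside it is still flagged.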

Moving Average

Flags values that deviate from the simple (unweighted) average of the window by more than a fixed amount.

  • Use Case: General-purpose smoothing and deviation detection.
  • Scenario: Ensuring a conveyor belt speed stays consistent.
  • Recommended Settings:
    • AnalysisWindow: 60
    • MovingAverageThreshold: 5.0 (Absolute allowed deviation)
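A minimal sketch of this check, assuming the threshold is compared against the absolute difference between the new value and the window average. The `MovingAverageDetector` name is invented for the example.

```python
from collections import deque


class MovingAverageDetector:
    """Hypothetical check against the unweighted average of a sliding window."""

    def __init__(self, analysis_window: int = 60, moving_average_threshold: float = 5.0):
        self.window = deque(maxlen=analysis_window)
        self.threshold = moving_average_threshold

    def update(self, value: float) -> bool:
        is_anomaly = False
        if self.window:
            average = sum(self.window) / len(self.window)
            # Absolute deviation, in the same unit as the metric itself.
            is_anomaly = abs(value - average) > self.threshold
        self.window.append(value)
        return is_anomaly


detector = MovingAverageDetector()
for speed in [100.0] * 10:  # conveyor running at a steady speed
    detector.update(speed)
small_change_flagged = detector.update(103.0)
large_change_flagged = detector.update(110.0)
```

Unlike the Z-score threshold, this one is expressed in the metric's own unit, so it has to be chosen per signal.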

Key Parameters Explained

Analysis Window

Defines how many historical data points to keep in memory to calculate the statistical baseline.

  • Default: 100 points.

  • Trade-off:
    • Small Window (<50): Reacts fast to new norms, but might miss slow-developing anomalies.
    • Large Window (>200): Very stable baseline, good for detecting long-term drift, but consumes more memory per rule.
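The trade-off can be seen with a small experiment. The sketch below, using a hypothetical `baseline_after` helper, computes the average that a small and a large window would hold after a signal shifts from 10 to 20.

```python
from collections import deque


def baseline_after(values, analysis_window: int) -> float:
    """Return the window average after all values have been fed in."""
    window = deque(maxlen=analysis_window)  # keeps only the newest points
    for value in values:
        window.append(value)
    return sum(window) / len(window)


# 200 points at the old level, then 20 points at a new level.
stream = [10.0] * 200 + [20.0] * 20
small_window_baseline = baseline_after(stream, 10)
large_window_baseline = baseline_after(stream, 200)
```

The 10-point window has already adopted 20 as its baseline, so the shift now looks normal to it. The 200-point window is still close to the old level (11.0 here) and keeps treating the new values as deviations, at the cost of holding 200 points in memory.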

Thresholds (Sensitivity)

Each algorithm has a sensitivity parameter (for example, ZScoreThreshold or IqrMultiplier).

  • Lower Value: More sensitive. May trigger alerts on minor deviations.
  • Higher Value: Less sensitive. Triggers only on significant anomalies.

Real-Time Calculation

These algorithms operate in-memory on the Edge Gateway. When a Gateway restarts, the statistical baseline is lost and is rebuilt from the first packets received after the restart, up to the Analysis Window size.