
Jan 10, 2025
Designing a Rule Engine for Industrial Automation
Master industrial rule engines: from simple alerts to complex multi-device workflows. Learn how Proxus combines visual rules with C# scripting to handle 10,000+ rules per second with zero allocation overhead.
Industrial automation projects typically start with a simple requirement:
"Send an alarm if this tag exceeds that value."
But real plants quickly outgrow simple threshold checks. Within weeks, customers ask for:
- Multi‑step workflows (IF X THEN Y, THEN Z)
- Time-based dependencies (IF X for >5 minutes THEN...)
- Dependencies across multiple lines or sites (IF Line1.OEE < 60% AND Line2.Running THEN...)
- Integration with ERP, CMMS, and ticketing systems (THEN create SAP maintenance order)
- Feedback loops and state machines (Rule A triggers Rule B, which triggers Rule A again?)

Most off-the-shelf rule engines falter here. They either offer simplicity (and shatter under complex logic) or complexity (and require a PhD to understand).

Proxus takes a different approach: combining visual rules for the 80% case with code for the 20% case, all on a single, unified execution runtime that scales to 10,000+ rules per second.

## The Problem with Traditional Rule Engines

Before diving into Proxus's solution, let's understand why rule engines are so hard:

### Challenge #1: Scope Creep

A simple threshold rule grows into a state machine. What started as "alert when temp > 80°C" becomes "alert when temp > 80°C for >2 minutes, but not if we're in cooldown mode, but do alert if temp > 85°C immediately."

### Challenge #2: Multi-Device Correlation

"Trigger alarm when Robot A is running AND Pressure Sensor B shows declining pressure AND not in maintenance window."

This requires:

- Reading multiple devices
- Correlating their states
- Checking context (time windows, override flags)
- Triggering actions conditionally

### Challenge #3: Performance at Scale

A plant with 5,000 devices and 2,000 rules needs to evaluate millions of conditions per second. A naive approach (query database for each rule, parse JSON) hits performance limits fast.

### Challenge #4: Versioning & Deployment

When you update a rule, how do you:

- Test it without affecting production?
- Roll back if it causes issues?
- Track which version is running where?

Most rule engines offer poor answers.

## Proxus Rule Engine: Visual + Code Unified

Proxus solves these by offering two complementary approaches on a single runtime:

## 1. Visual Rule Builder (No-Code)

For engineers without a programming background, the visual rule builder provides an intuitive drag-and-drop interface. Think of it as "Excel for automation."

### Building Blocks

#### Triggers (When to evaluate)

- Tag change: Rule triggers when a specific tag changes value
- Schedule: Rule triggers on time-based patterns (daily 8 AM, every hour, etc.)
- Event from external system: Rule triggers when MES publishes a message or webhook fires
- Manual trigger: Operators can trigger rules via UI for testing

#### Conditions (What to check)

- Simple comparisons: `Temperature > 80`
- Range checks: `Humidity between 40% and 60%`
- Boolean logic: `(A > 10) AND (B < 5) OR (C == "Fault")`
- Time windows: `During 06:00-18:00` or `Not during maintenance_window`
- State checks: `If device status == "Running"`
- Thresholds with hysteresis: Prevents flickering (trigger at >80, clear at <75)

#### Actions (What to do)

- Notifications: Email, SMS, Slack, Teams
- PLC writes: Write values back to devices (turn on pump, close valve)
- MQTT publish: Send messages to the UNS (Unified Namespace) for other systems to consume
- HTTP calls: Webhook integration (trigger webhook on external API)
- Create tickets: Automatic CMMS/SAP integration (create maintenance order, change request)
- Log event: Record to local database or ClickHouse
- Escalation: If not acknowledged in X minutes, escalate to higher tier

### Example Visual Rule

```
TRIGGER: Tag "Line1.Robot.Temperature" changes

CONDITION:
  IF Temperature > 85°C
  AND Robot.Running == true
  AND NOT (MaintenanceMode == true)
  AND (Alert_Temperature_Cooldown < 5 minutes ago)

ACTION:
  - Stop Robot (write 0 to "Line1.Robot.MotorCommand")
  - Publish to MQTT: "ProxusMfg/Istanbul/Line1/alerts/TemperatureEmergency"
  - Create SAP notification: "Emergency stop triggered - high temperature"
  - Set flag: Alert_Temperature_Cooldown = now()
  - Send Slack: "@production_team Robot stopped due to temperature"
```

## 2. C# Scripting for Power Users

When visual rules cannot express the logic, drop down to **C# code**. Proxus provides a well-defined SDK:

### Why C#?

- **Familiar**: C# is widely used in industry
- **Safe**: Compiled, typed (no runtime interpretation errors)
- **Fast**: JIT-compiled to machine code
- **Batteries included**: LINQ, async/await, memory pools

### Example: Multi-Device Correlation with ML

```csharp
using System;
using System.Threading.Tasks;

public class PredictiveMaintenanceRule : RuleScript
{
    public async Task EvaluateAsync(RuleContext context)
    {
        // Get latest values for multiple devices
        var vibration = context.GetTag("Production/Motor/Vibration");
        var temperature = context.GetTag("Production/Motor/Temperature");
        var ageHours = context.GetTag("Production/Motor/AgeHours");

        // Complex logic: apply ML model
        var anomalyScore = await context.AI.ScoreAsync(new[] { vibration, temperature, ageHours });

        if (anomalyScore > 0.85)
        {
            // Multi-step action: publish an alert first
            await context.Publish("alert/motor_failure_risk", new
            {
                score = anomalyScore,
                timestamp = DateTime.UtcNow,
                recommended_action = "Schedule maintenance within 48 hours"
            });

            // Create CMMS ticket
            await context.External.CreateTicket(new TicketRequest
            {
                System = "SAP",
                Type = "Maintenance",
                Priority = "High",
                Description = $"Motor failure predicted (ML score: {anomalyScore:P})"
            });
        }
    }
}
```
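
Smaller pieces of stateful logic are also easy to express in code. The hysteresis condition from the Conditions list (trigger at >80, clear at <75) could be sketched roughly as follows. This is a minimal, SDK-independent illustration: `HysteresisAlarm` and its members are hypothetical names, not part of any Proxus API.

```csharp
using System;

// Hypothetical helper for illustration only: a two-threshold (hysteresis) alarm.
public sealed class HysteresisAlarm
{
    private readonly double _triggerAbove;
    private readonly double _clearBelow;

    public HysteresisAlarm(double triggerAbove, double clearBelow)
    {
        if (clearBelow >= triggerAbove)
            throw new ArgumentException("clearBelow must be lower than triggerAbove.");
        _triggerAbove = triggerAbove;
        _clearBelow = clearBelow;
    }

    // True while the alarm is latched.
    public bool IsActive { get; private set; }

    // Feeds one sample and returns the alarm state after processing it.
    public bool Update(double value)
    {
        if (!IsActive && value > _triggerAbove)
            IsActive = true;   // crossing the upper threshold latches the alarm
        else if (IsActive && value < _clearBelow)
            IsActive = false;  // only dropping below the lower threshold clears it
        return IsActive;
    }
}
```

With `new HysteresisAlarm(80, 75)`, the samples 79, 81, 78, 76, 74 produce inactive, active, active, active, inactive: readings between the two thresholds never flip the state, which is what prevents flickering.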
### Auto-Injected Safety
Proxus automatically wraps your code with:
```csharp
// Your code is wrapped like this:
try
{
    // Your code executes here
    // With automatic disposal of resources
}
catch (Exception ex)
{
    logger.LogError($"Rule failed: {ex.Message}");
    // Automatic alerting
}
```

This means you cannot accidentally crash the platform or leave resources open.
### Zero-Allocation Patterns
For rules that execute 1,000+ times per second, Proxus recommends pooling:
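```csharp
using System.Buffers;
using System.Threading.Tasks;

// Allocate once, reuse forever
private static readonly ArrayPool<byte> _pool = ArrayPool<byte>.Shared;

public async Task EvaluateAsync(RuleContext context)
{
    // Rent from the pool instead of allocating.
    // Note: Rent may return a buffer larger than the requested 1024 bytes.
    var buffer = _pool.Rent(1024);
    try
    {
        // Use buffer
        ProcessData(buffer);
    }
    finally
    {
        // Always return the buffer, even if processing throws
        _pool.Return(buffer);
    }
}
```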
This minimizes garbage collection pressure (critical for high-performance rules).
## 3. A Single Execution Model
Here is the magic: both visual rules and C# scripts are compiled and then executed by the same actor-based runtime. This means:
### Unified Performance

```
Visual rule: Compiled to bytecode → Actor execution engine
C# script:   Compiled to IL       → Actor execution engine

Result: Same performance profile, same execution semantics
```
### Horizontal Scalability

```
Single gateway: 10,000 rules/second ✓
2 gateways:     20,000 rules/second ✓
10 gateways:    100,000 rules/second ✓

Rules execute in parallel across gateways with zero coordination needed
```
### Clear Evolution Path
```
Start:  Simple visual rule
Grow:   Add conditions and time windows (still visual)
Scale:  Extract complex logic to C# (reuse visual rule as scaffold)
Evolve: Full C# engine with caching and optimization
```

No rewriting. No rip-and-replace. Just smooth evolution.
## Advanced Use Cases
### Use Case 1: OEE Calculation

```
Trigger: Every minute (on schedule)

Action: Compute OEE for each production line
  Availability = (Planned Time - Downtime) / Planned Time
  Performance  = (Ideal Cycle Time × Pieces Produced) / Run Time
  Quality      = Good Pieces / Total Pieces
  OEE          = Availability × Performance × Quality

Result: Publish OEE to UNS for dashboards
```
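
The OEE formulas above translate almost directly into code. The sketch below is only an illustration: `OeeCalculator` is a hypothetical helper, Run Time is assumed to equal Planned Time minus Downtime, and Pieces Produced is assumed to equal Total Pieces. In a real rule the inputs would come from tags rather than arguments.

```csharp
using System;

// Hypothetical helper for illustration; not part of any Proxus SDK.
public static class OeeCalculator
{
    public static double Availability(double plannedMinutes, double downtimeMinutes)
        => (plannedMinutes - downtimeMinutes) / plannedMinutes;

    public static double Performance(double idealCycleMinutes, int piecesProduced, double runMinutes)
        => idealCycleMinutes * piecesProduced / runMinutes;

    public static double Quality(int goodPieces, int totalPieces)
        => (double)goodPieces / totalPieces;

    // Assumption: run time = planned time - downtime.
    public static double Oee(double plannedMinutes, double downtimeMinutes,
                             double idealCycleMinutes, int goodPieces, int totalPieces)
    {
        if (plannedMinutes <= 0)
            throw new ArgumentOutOfRangeException(nameof(plannedMinutes));

        double runMinutes = plannedMinutes - downtimeMinutes;
        if (runMinutes <= 0 || totalPieces <= 0)
            return 0.0; // nothing was produced in this window

        return Availability(plannedMinutes, downtimeMinutes)
             * Performance(idealCycleMinutes, totalPieces, runMinutes)
             * Quality(goodPieces, totalPieces);
    }
}
```

With made-up inputs of 500 planned minutes, 100 minutes of downtime, an ideal cycle time of 1 minute, and 342 good pieces out of 360, the factors are 0.8, 0.9, and 0.95, giving an OEE of about 0.684 (68.4%).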
### Use Case 2: Dynamic Threshold Based on Context

```
Trigger: Tag "Production.Temperature" changes

Condition (C# logic):
  - If night shift: Temperature threshold = 75°C
  - If day shift: Temperature threshold = 78°C
  - If production line warming up: Threshold = 85°C (first 10 min)
  - If maintenance mode: No alert (bypass rule)

Action: Alert only if exceeds context-aware threshold
```
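
One way the context-aware threshold could look in C# is sketched below. Several details are assumptions, since the use case does not specify them: the day shift is taken to run 06:00-18:00, and the precedence is maintenance mode first, then warm-up, then shift. `LineContext` and `DynamicThreshold` are hypothetical names, not Proxus SDK types.

```csharp
using System;

// Hypothetical context snapshot; in a real rule these values would be read from tags.
public sealed record LineContext(TimeSpan TimeOfDay, TimeSpan SinceLineStart, bool MaintenanceMode);

public static class DynamicThreshold
{
    // Assumed shift boundaries: day shift 06:00-18:00, night shift otherwise.
    private static readonly TimeSpan DayStart = TimeSpan.FromHours(6);
    private static readonly TimeSpan DayEnd = TimeSpan.FromHours(18);

    // Returns the threshold in °C, or null when alerting is bypassed.
    public static double? ThresholdFor(LineContext ctx)
    {
        if (ctx.MaintenanceMode) return null;                            // bypass rule
        if (ctx.SinceLineStart < TimeSpan.FromMinutes(10)) return 85.0;  // warming up
        bool dayShift = ctx.TimeOfDay >= DayStart && ctx.TimeOfDay < DayEnd;
        return dayShift ? 78.0 : 75.0;
    }

    // Alerts only when a threshold applies and the reading exceeds it.
    public static bool ShouldAlert(double temperature, LineContext ctx)
    {
        double? threshold = ThresholdFor(ctx);
        return threshold.HasValue && temperature > threshold.Value;
    }
}
```

Under these assumptions a reading of 77°C alerts on the night shift but not on the day shift, and no reading alerts while maintenance mode is set.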
### Use Case 3: Multi-Site Correlation

```
Trigger: Any line reports downtime

Action (C#):
  1. Query all lines: Are 2+ lines down simultaneously?
  2. If yes: Check if it's a utility failure (power, compressed air, water)
  3. If utility failure confirmed: Escalate to site director
  4. If specific to one line: Alert line supervisor
  5. Log correlation for analytics
```
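
The decision part of this workflow (steps 1 to 4) can be sketched as a pure function. The types below are hypothetical, and one branch is an assumption: the use case does not say what happens when several lines are down without a confirmed utility failure, so the sketch falls back to alerting the line supervisors.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum Escalation { None, LineSupervisor, SiteDirector }

// Hypothetical snapshot of one production line.
public sealed record LineStatus(string Line, bool IsDown);

public static class DowntimeCorrelation
{
    public static Escalation Classify(IEnumerable<LineStatus> lines, bool utilityFailureConfirmed)
    {
        int down = lines.Count(l => l.IsDown);

        if (down == 0) return Escalation.None;
        if (down == 1) return Escalation.LineSupervisor; // specific to one line

        // Two or more lines down at once: escalate only if a shared utility failed.
        // Assumption: without a confirmed utility failure, treat as per-line issues.
        return utilityFailureConfirmed ? Escalation.SiteDirector : Escalation.LineSupervisor;
    }
}
```

Keeping the classification free of I/O makes it straightforward to unit-test with replayed line states before it is wired to real notifications.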
## Performance: 10,000+ Rules Per Second
How does Proxus achieve this?
### 1. Compiled Execution
Rules are compiled to native code, not interpreted.

### 2. Actor-Based Concurrency
Each rule runs in isolation. No locks, no contention.

### 3. Smart Caching
```
First evaluation: Read tags from device (slow)
Cached result:    Read from in-memory cache (fast)
Invalidation:     Cache cleared only when tag actually changes
```

### 4. Lazy Evaluation
If Condition A in an AND chain is false, Conditions B and C are never evaluated.
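
Lazy evaluation of an AND chain amounts to stopping at the first false condition. A minimal sketch, using a hypothetical `LazyConditions` helper that also reports how many conditions actually ran:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of any Proxus SDK.
public static class LazyConditions
{
    // Evaluates conditions in order and stops at the first one that is false.
    public static bool AllTrue(IReadOnlyList<Func<bool>> conditions, out int evaluated)
    {
        evaluated = 0;
        foreach (var condition in conditions)
        {
            evaluated++;
            if (!condition())
                return false; // remaining conditions are skipped
        }
        return true;
    }
}
```

Because evaluation stops early, it generally pays to put cheap, frequently false conditions (such as a maintenance-mode flag) ahead of expensive ones.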

### 5. Batch Processing
Multiple rules that depend on the same device are evaluated together, with a single device read.
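
The batching idea can be illustrated by grouping rules by device before reading. The sketch below is an assumption about how such grouping might look, not Proxus internals: `DeviceRule`, `BatchEvaluator`, and the `readDevice` delegate are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rule shape: one device, one predicate over that device's reading.
public sealed record DeviceRule(string Name, string DeviceId, Func<double, bool> Condition);

public static class BatchEvaluator
{
    // Groups rules by device so that each device is read once per batch.
    public static Dictionary<string, bool> Evaluate(
        IEnumerable<DeviceRule> rules, Func<string, double> readDevice)
    {
        var results = new Dictionary<string, bool>();
        foreach (var group in rules.GroupBy(r => r.DeviceId))
        {
            double value = readDevice(group.Key); // single read shared by the whole group
            foreach (var rule in group)
                results[rule.Name] = rule.Condition(value);
        }
        return results;
    }
}
```

For three rules spread over two devices, `readDevice` is invoked twice rather than three times; the saving grows with the number of rules per device.
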
Taken together, these techniques let a modern laptop evaluate 10,000 rules/second with <1 ms latency.

## Security & Code Injection Prevention

Allowing users to upload C# code is risky. Proxus mitigates this via:

### 1. Namespace Blacklist

Dangerous namespaces are blocked:

- ❌ `System.Reflection` (reflection and runtime code generation)
- ❌ `System.Net` (external network access)
- ❌ `System.IO` (file system access)
- ✅ Whitelisted namespaces: `System.Linq`, `System.Collections`, `System.Text`

### 2. Code Analysis

Before compilation, the C# code is analyzed for suspicious patterns:

- Infinite loops
- Allocations exceeding limits
- Calls to blacklisted APIs

### 3. Sandboxing

Each C# rule runs in its own AppDomain with resource limits:

- Memory: 512 MB max
- Execution time: 5 seconds max (timeout)
- CPU affinity: Constrained to specific cores

### 4. Audit Trail

All rule uploads, modifications, and executions are logged with user attribution.

## Versioning & Deployment Strategy

Rules evolve. The question is: how do you manage versions safely?

### Proxus Versioning Model

```
Rule: "Emergency Stop - Temperature"

Version 1 (2025-01-10): Threshold = 85°C
  Status: ACTIVE (all gateways)

Version 2 (2025-01-15): Threshold = 82°C (more conservative)
  Status: STAGING (test gateway only)
  Testing for 7 days...

Version 3 (2025-01-22): Threshold = 80°C (even more conservative)
  Status: APPROVED (but not deployed yet)
  Ready to deploy on user approval

Version 2 (2025-01-20):
  Status: ROLLED_BACK (bug found, reverted to v1)
```

### Safe Deployment

1. Write rule on central server
2. Test on designated gateway
3. Approve after validation
4. Deploy to production gateways (can schedule for off-hours)
5. Monitor execution for anomalies
6. Roll back instantly if issues are detected

## Integration with External Systems

Rules do not live in isolation.
They trigger actions in:

### Webhooks

```
Action: HTTP POST to external API

Example: When production exceeds target
  POST https://api.supplier.com/orders/create
  Body: { "product_id": "X", "quantity": 100, "due_date": "2025-01-20" }
```

### Message Brokers

```
Action: Publish to Kafka topic

Example: Real-time anomaly detection
  Topic: "manufacturing/anomalies"
  Message: { "device_id": "motor_1", "anomaly_type": "vibration", "score": 0.92 }
  Consumers: ML models, dashboards, archival systems
```

### ERP Integration

```
Action: Create SAP maintenance order

Proxus automatically:
  - Authenticates to SAP (OAuth/SAML)
  - Creates notification
  - Links to equipment master data
  - Assigns to maintenance team
```

## FAQ: Rule Engine Myths

### Q: Can I use visual rules for everything?
A: ~80% of use cases. The remaining 20% (complex correlations, ML) need C#. Start visual; evolve to code only where needed.

### Q: What if my rule has a bug?
A: Rules execute in sandboxes with timeouts. A runaway rule cannot crash the platform. Plus, you can roll back instantly to a previous version.

### Q: Can I run thousands of rules?
A: Yes. Proxus can handle 10,000+ rules/second per gateway. Distribute across gateways for higher throughput.

### Q: How do I test rules?
A: Deploy to a test gateway, execute with replay data, validate results. Rules are versioned, so you can test in parallel with production.

### Q: Can rules call external APIs?
A: Yes, via webhook actions. But be cautious: external APIs can fail or be slow. Use timeouts and circuit breakers.
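
For readers unfamiliar with the circuit breaker pattern mentioned in the last answer, here is a minimal sketch. It is a generic illustration rather than a Proxus feature: `CircuitBreaker` is a hypothetical class, it is not thread-safe, and the caller passes in the current time so the behavior is easy to test.

```csharp
using System;

// Hypothetical, single-threaded circuit breaker for illustration only.
public sealed class CircuitBreaker
{
    private readonly int _failureThreshold;
    private readonly TimeSpan _openDuration;
    private int _consecutiveFailures;
    private DateTime _openedAtUtc;
    private bool _isOpen;

    public CircuitBreaker(int failureThreshold, TimeSpan openDuration)
    {
        if (failureThreshold < 1)
            throw new ArgumentOutOfRangeException(nameof(failureThreshold));
        _failureThreshold = failureThreshold;
        _openDuration = openDuration;
    }

    // True when a call to the external API may be attempted.
    public bool AllowRequest(DateTime nowUtc)
    {
        if (!_isOpen) return true;
        // After the open period has elapsed, allow a trial call ("half-open").
        return nowUtc - _openedAtUtc >= _openDuration;
    }

    public void RecordSuccess()
    {
        _consecutiveFailures = 0;
        _isOpen = false;
    }

    public void RecordFailure(DateTime nowUtc)
    {
        _consecutiveFailures++;
        if (_consecutiveFailures >= _failureThreshold)
        {
            _isOpen = true;        // stop calling the failing API
            _openedAtUtc = nowUtc; // (re)start the open period
        }
    }
}
```

A rule would check `AllowRequest` before each webhook call and report the outcome with `RecordSuccess` or `RecordFailure`. Pairing it with a per-call timeout (for example `HttpClient.Timeout`) keeps a slow endpoint from holding a rule until the platform's own timeout fires.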

## Migration from Legacy Rule Engines

If you are migrating from Wonderware, FactoryTalk, or other platforms:

### Phase 1: Audit Existing Rules (1 week)

- Export all rules from legacy system
- Document logic and actions
- Identify patterns

### Phase 2: Re-implement in Proxus (2-4 weeks)

- Start with most-used rules (80/20 principle)
- Implement in visual builder first
- Migrate complex rules to C# as needed

### Phase 3: Validation (1-2 weeks)

- Run both systems in parallel
- Compare outputs
- Validate edge cases

### Phase 4: Cutover (1 day)

- Switch to Proxus
- Rollback procedure ready
- Monitor closely

## Best Practices

1. Start simple: Use visual rules for initial implementations
2. Version everything: Never modify rules; create new versions
3. Test rigorously: Use test gateways before production rollout
4. Monitor executions: Track rule latency, errors, and trigger frequency
5. Document logic: Add comments explaining why, not just what
6. Use timeouts: All C# rules have built-in timeouts; respect them
7. Iterate based on data: Tune thresholds based on production telemetry
8. Use sandboxing: Contain complex rules to prevent cascading failures

## Getting Started

### Simple Rule (5 minutes)

1. Open Proxus UI
2. Click "New Rule"
3. Select trigger: Tag change
4. Select condition: `Temperature > 80`
5. Select action: Send email
6. Deploy to edge gateway

### Advanced Rule (30 minutes)

1. Create C# rule script
2. Implement multi-device correlation
3. Test locally
4. Deploy to test gateway
5. Monitor for 24 hours
6. Promote to production

## Conclusion: Rules as First-Class Citizens
In Proxus, rules are not an afterthought bolted onto the platform. They are first-class citizens: versioned, tested, monitored, and optimized. Whether you need simple alerts or complex industrial workflows, the unified visual + code approach scales from hobby projects to mission-critical automation running thousands of rules per second.
Ready to automate? Deploy your first rule or explore complex rule engine patterns in our guide.
For advanced consultation on rule design and optimization, contact our automation team.