A scene every integrator has lived through
Monday morning, 6 a.m. The phone rings.
It’s the production supervisor: “Line 3 is down, the MES is showing an alarm but I don’t know why, and we’ve already lost 40 minutes of the shift.”
You remote into the SCADA. You see the alarm. But the fault code doesn’t tell you much — FaultCode: 47. You open the PLC manual. Generic description. You call the maintenance tech. He’s already there, but also lacks context: he doesn’t know which production order was running, what the target was, or how much this has already hit the shift OEE.
That’s the gap. The PLC knows what happened to the machine. The MES knows what was happening in production. But nobody connected those two sources in real time, with language the tech understands without consulting three systems.
That’s exactly what we’re going to build in this series.
What major manufacturers are already doing
Before touching any code, it’s worth placing this project in context.
Siemens launched the Industrial Copilot in 2023: an LLM-powered assistant integrated into TIA Portal and SIMATIC. It answers questions about ladder logic, suggests fault diagnostics, and generates automation code. Siemens has presented it at Hannover Messe as the future of automation engineering.
Rockwell Automation announced a partnership with Microsoft to integrate Azure OpenAI into FactoryTalk. The stated goal is exactly what we’re building: turning factory floor data into natural language for operators and technicians.
Bosch uses LLMs internally to cross-reference maintenance histories with OPC-UA data and suggest predictive maintenance windows before failures occur.
The pattern all of them follow is the same: AI in the information loop, never in the control loop. The model suggests, the human decides and acts. We’ll follow exactly that principle here.
What we’ll build in this series
A Node.js process that:
- Reads PLC tags via OPC-UA in real time (EP01 — this episode)
- Queries the MES API to enrich data with production order context (EP02)
- Calculates shift OEE (Availability, Performance, Quality) (EP03)
- Sends everything to Claude via API and receives a natural-language diagnosis (EP04–06)
- Sends an Adaptive Card on Microsoft Teams to the maintenance tech (EP07–08)
The final result looks like this on the tech’s phone:
🔴 LINE 3 — DOWN · 18 min
Diagnosis: Position sensor fault (Code 47).
Same fault occurred 3x this week.
Suggested action: replace sensor before night shift.
Shift OEE dropped from 91% → 72%.
[Confirm] [Escalate] [Dismiss]
In this EP01, we build the foundation: OPC-UA connection and tag reading.
Why Node.js and not Python?
The most common question. Direct answer:
- node-opcua is the most mature OPC-UA library in the JavaScript ecosystem — actively maintained, with full support for all protocol security modes
- Node.js runs well on industrial edge hardware (Raspberry Pi, Moxa gateways, Advantech)
- The same process that reads OPC-UA will call the Claude API and send the Teams card — JavaScript unifies everything without switching languages
- Python also works for OPC-UA, but the integration ecosystem for Teams and REST APIs is more verbose
If your environment already has consolidated Python, the logic we’re building translates directly.
Prerequisites
- Node.js ≥ 18
- Access to an OPC-UA server (real PLC, Prosys Simulation Server, or the simulation server we’ll create at the end of this tutorial)
- Basic JavaScript async/await knowledge
Installation
mkdir oee-agent && cd oee-agent
npm init -y
# The examples use ES module syntax (import, top-level await)
npm pkg set type=module
npm install node-opcua
The tags we’ll monitor
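Availability itself only needs MachineState. We also read FaultCode and GoodPartCount from the PLC now, because the diagnosis and the rest of the OEE calculation depend on them in later episodes: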
| Tag | Type | Meaning |
|---|---|---|
| MachineState | Int32 | 0 = Stopped · 1 = Running · 2 = Alarm |
| FaultCode | Int32 | Active alarm code (0 = none) |
| GoodPartCount | Int32 | Good parts produced in the shift |
NodeIds vary by vendor. Examples below use ns=2;s=PLC1.MachineState — adjust for your environment.
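Since the NodeIds change from one server to the next, it can help to build them from a single config object instead of scattering string literals across files. The sketch below assumes your server uses string identifiers in the `ns=<index>;s=<device>.<tag>` pattern shown above; `buildNodeIds` is a hypothetical helper, not part of node-opcua.

```javascript
// Hypothetical helper (not part of node-opcua): builds string NodeIds
// of the form "ns=<index>;s=<device>.<tag>" from one config object.
function buildNodeIds({ namespaceIndex, device, tags }) {
  if (!Number.isInteger(namespaceIndex) || namespaceIndex < 0) {
    throw new RangeError(`Invalid namespace index: ${namespaceIndex}`);
  }
  const entries = Object.entries(tags).map(([key, tagName]) => [
    key,
    `ns=${namespaceIndex};s=${device}.${tagName}`,
  ]);
  return Object.fromEntries(entries);
}

// Adjust namespaceIndex and device for your environment.
const TAGS = buildNodeIds({
  namespaceIndex: 2,
  device: 'PLC1',
  tags: {
    machineState: 'MachineState',
    faultCode: 'FaultCode',
    partCount: 'GoodPartCount',
  },
});

console.log(TAGS.machineState); // ns=2;s=PLC1.MachineState
```

Servers that expose numeric identifiers (`i=...`) or a different path separator would need a different template, so treat the pattern as a starting point.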
Step 1 — Connect to the OPC-UA server
// opc-client.js
import { OPCUAClient, MessageSecurityMode, SecurityPolicy } from 'node-opcua';

const endpointUrl = 'opc.tcp://192.168.1.100:4840';

const client = OPCUAClient.create({
  applicationName: 'OEEAgent',
  connectionStrategy: {
    initialDelay: 1000,
    maxRetry: 5,
  },
  securityMode: MessageSecurityMode.None,
  securityPolicy: SecurityPolicy.None,
  endpointMustExist: false,
});

export async function connect() {
  await client.connect(endpointUrl);
  console.log('[OPC-UA] Connected to', endpointUrl);
  const session = await client.createSession();
  console.log('[OPC-UA] Session created');
  return session;
}

export async function disconnect(session) {
  await session.close();
  await client.disconnect();
  console.log('[OPC-UA] Disconnected');
}
Note: MessageSecurityMode.None is for development on isolated networks only. In production, use SignAndEncrypt with certificates, the same requirement Siemens Industrial Copilot enforces on its OPC-UA integrations.
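One way to keep the insecure settings from leaking into production is to choose them from the environment and default to the secure pair. This is a minimal sketch: `pickSecurity` and the environment names are assumptions for illustration, and the returned strings are meant to be used as keys into node-opcua's `MessageSecurityMode` and `SecurityPolicy` enums.

```javascript
// Hypothetical helper: only an explicit 'development' environment gets
// the unsecured settings. Anything else falls back to SignAndEncrypt.
function pickSecurity(envName) {
  if (envName === 'development') {
    return { securityMode: 'None', securityPolicy: 'None' };
  }
  return { securityMode: 'SignAndEncrypt', securityPolicy: 'Basic256Sha256' };
}

const cfg = pickSecurity(process.env.NODE_ENV);
console.log(`[OPC-UA] security: ${cfg.securityMode} / ${cfg.securityPolicy}`);

// In opc-client.js the names would map onto the enums, for example:
//   securityMode: MessageSecurityMode[cfg.securityMode],
//   securityPolicy: SecurityPolicy[cfg.securityPolicy],
```

With SignAndEncrypt the server also has to trust the client's certificate, and how you do that depends on the PLC or gateway vendor.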
Step 2 — Validate tags with a single read
Before subscribing, confirm the NodeIds are correct:
// read-once.js
import { AttributeIds } from 'node-opcua';
import { connect, disconnect } from './opc-client.js';

const NODE_IDS = [
  'ns=2;s=PLC1.MachineState',
  'ns=2;s=PLC1.FaultCode',
  'ns=2;s=PLC1.GoodPartCount',
];

async function main() {
  const session = await connect();
  try {
    // AttributeIds.Value (13) selects the tag's current value
    const dataValues = await session.read(
      NODE_IDS.map((nodeId) => ({ nodeId, attributeId: AttributeIds.Value }))
    );
    dataValues.forEach((dv, i) => {
      console.log(`${NODE_IDS[i]} = ${dv.value.value} (${dv.statusCode.name})`);
    });
  } finally {
    // Close the session even if the read throws
    await disconnect(session);
  }
}

main().catch(console.error);
If all three tags return status: Good, the foundation is ready.
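If you'd rather have the script tell you which tags failed than scan the output by eye, a small check over the results is enough. The sketch below is a hypothetical helper that relies only on `statusCode.name`, the same property the read script prints; the stub data stands in for a real `session.read()` result.

```javascript
// Hypothetical helper: returns the NodeIds whose status is not 'Good'.
function findBadTags(nodeIds, dataValues) {
  return nodeIds
    .map((nodeId, i) => ({ nodeId, status: dataValues[i].statusCode.name }))
    .filter(({ status }) => status !== 'Good');
}

// Stub result shaped like the DataValues used above (illustration only).
const sampleResult = [
  { statusCode: { name: 'Good' } },
  { statusCode: { name: 'BadNodeIdUnknown' } },
];
const bad = findBadTags(
  ['ns=2;s=PLC1.MachineState', 'ns=2;s=PLC1.FaultCodee'],
  sampleResult
);
console.log(bad);
// [ { nodeId: 'ns=2;s=PLC1.FaultCodee', status: 'BadNodeIdUnknown' } ]
```

A `BadNodeIdUnknown` status usually points to a typo in the NodeId or the wrong namespace index.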
Step 3 — Real-time subscription
A one-off read doesn’t capture the exact instant the machine stopped — and that timestamp is the most critical data point for accurately calculating Availability.
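ClientSubscription solves this: the OPC-UA server samples each tag and sends the agent a notification when the value changes, so the agent never polls. With the settings below, tags are sampled every 500 ms and notifications are published every 1000 ms, and each notification carries the source timestamp of the change.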
// opc-subscription.js
import { connect } from './opc-client.js';
import {
  ClientSubscription,
  ClientMonitoredItem,
  TimestampsToReturn,
  AttributeIds,
} from 'node-opcua';

const TAGS = {
  machineState: 'ns=2;s=PLC1.MachineState',
  faultCode: 'ns=2;s=PLC1.FaultCode',
  partCount: 'ns=2;s=PLC1.GoodPartCount',
};

export const state = {
  machineState: null,
  faultCode: 0,
  partCount: 0,
  lastStateChange: null,
  downtimeAccumMs: 0,
  shiftStartMs: Date.now(),
  _downtimeStart: null,
};

function onStateChange(tag, newValue, sourceTimestamp) {
  state[tag] = newValue;
  // Prefer the PLC's source timestamp; fall back to the local clock
  state.lastStateChange = sourceTimestamp ?? new Date();
  if (tag !== 'machineState') return;

  const label = ['Stopped', 'Running', 'Alarm'][newValue] ?? `State ${newValue}`;
  console.log(`[${state.lastStateChange.toISOString()}] MachineState → ${label}`);

  // Machine is not running and no stop is being tracked yet. This also
  // covers the first notification when the agent starts during a stop.
  if (newValue !== 1 && !state._downtimeStart) {
    // Never count downtime from before the shift started
    state._downtimeStart = new Date(Math.max(state.lastStateChange, state.shiftStartMs));
    console.log('[OEE] Downtime started');
  }

  // Machine resumed
  if (newValue === 1 && state._downtimeStart) {
    const durationMs = state.lastStateChange - state._downtimeStart;
    state.downtimeAccumMs += durationMs;
    state._downtimeStart = null;
    console.log(`[OEE] Downtime ended — ${(durationMs / 60000).toFixed(1)} min`);
  }
}

export async function startSubscription() {
  const session = await connect();
  const subscription = ClientSubscription.create(session, {
    requestedPublishingInterval: 1000,
    requestedMaxKeepAliveCount: 10,
    maxNotificationsPerPublish: 100,
    publishingEnabled: true,
    priority: 10,
  });
  subscription.on('started', () =>
    console.log(`[OPC-UA] Subscription active (id: ${subscription.subscriptionId})`)
  );
  for (const [key, nodeId] of Object.entries(TAGS)) {
    const item = ClientMonitoredItem.create(
      subscription,
      { nodeId, attributeId: AttributeIds.Value },
      { samplingInterval: 500, discardOldest: true, queueSize: 10 },
      TimestampsToReturn.Both
    );
    item.on('changed', (dv) => onStateChange(key, dv.value.value, dv.sourceTimestamp));
  }
  return { session, subscription, state };
}
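To sanity-check the downtime bookkeeping without a PLC, you can replay a list of state changes through a pure function. The sketch below is a simplified stand-in for the handler above, not the handler itself: `accumulateDowntime` and the event shape are assumptions, and the states follow the tag table (0 = Stopped, 1 = Running, 2 = Alarm).

```javascript
// Simplified, self-contained version of the downtime bookkeeping.
function accumulateDowntime(events) {
  let downtimeStart = null; // ms timestamp when the current stop began
  let downtimeMs = 0;
  for (const { state, atMs } of events) {
    if (state !== 1 && downtimeStart === null) {
      downtimeStart = atMs; // machine left Running
    } else if (state === 1 && downtimeStart !== null) {
      downtimeMs += atMs - downtimeStart; // machine resumed
      downtimeStart = null;
    }
  }
  return { downtimeMs, stillDown: downtimeStart !== null };
}

const MINUTE = 60_000;
const replay = accumulateDowntime([
  { state: 1, atMs: 0 },
  { state: 2, atMs: 10 * MINUTE }, // alarm raised
  { state: 0, atMs: 12 * MINUTE }, // alarm cleared, machine still stopped
  { state: 1, atMs: 28 * MINUTE }, // running again
]);
console.log(replay); // { downtimeMs: 1080000, stillDown: false }
```

The Alarm to Stopped transition does not restart the clock, so the whole stop counts as one 18-minute downtime.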
Step 4 — Calculate Availability
// oee-calculator.js
/**
 * Availability = Productive Time / Planned Time
 * Planned Time here is the shift time elapsed so far, capped at shiftMins.
 * @param {object} state Current agent state
 * @param {number} shiftMins Planned shift duration (default: 480min = 8h)
 */
export function calcAvailability(state, shiftMins = 480) {
  const nowMs = Date.now();
  const elapsedMs = Math.min(nowMs - state.shiftStartMs, shiftMins * 60 * 1000);
  const currentDowntimeMs = state._downtimeStart ? (nowMs - state._downtimeStart) : 0;
  // Downtime can never exceed the elapsed shift time
  const totalDowntimeMs = Math.min(state.downtimeAccumMs + currentDowntimeMs, elapsedMs);
  const uptimeMs = Math.max(elapsedMs - totalDowntimeMs, 0);
  // Avoid NaN if this runs in the same millisecond the shift starts
  const availabilityPct = elapsedMs > 0 ? (uptimeMs / elapsedMs) * 100 : 0;
  return {
    availability: parseFloat(availabilityPct.toFixed(2)),
    plannedMins: parseFloat((elapsedMs / 60000).toFixed(1)),
    downtimeMins: parseFloat((totalDowntimeMs / 60000).toFixed(1)),
    uptimeMins: parseFloat((uptimeMs / 60000).toFixed(1)),
  };
}
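A quick worked example makes the formula concrete. Two hours into the shift with one 18-minute stop, uptime is 120 − 18 = 102 minutes, so Availability is 102 / 120 = 85%. The function below is a stripped-down sketch of the same arithmetic in minutes; `availabilityPct` is a hypothetical name used only for this illustration.

```javascript
// Availability in percent from elapsed and downtime minutes.
function availabilityPct(elapsedMin, downtimeMin) {
  if (elapsedMin <= 0) return 0; // nothing elapsed yet, avoid dividing by zero
  const uptimeMin = Math.max(elapsedMin - downtimeMin, 0);
  return Number(((uptimeMin / elapsedMin) * 100).toFixed(2));
}

console.log(availabilityPct(120, 18)); // 85
```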
Step 5 — Putting it all together
// index.js
import { startSubscription } from './opc-subscription.js';
import { calcAvailability } from './oee-calculator.js';

async function main() {
  console.log('[OEE Agent] EP01 started\n');
  const { state } = await startSubscription();
  setInterval(() => {
    const oee = calcAvailability(state, 480);
    console.log('\n--- OEE Report ---');
    console.log(`Availability : ${oee.availability}%`);
    console.log(`Planned time : ${oee.plannedMins} min`);
    console.log(`Downtime : ${oee.downtimeMins} min`);
    console.log(`Uptime : ${oee.uptimeMins} min`);
    console.log('------------------\n');
  }, 60_000);
}

main().catch(console.error);
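The agent above runs until you kill it, which leaves the session open on the server until it times out. A small shutdown helper avoids that. This is a sketch, not part of the original agent: `shutdown` is a hypothetical helper that runs each close step in order and keeps going if one fails.

```javascript
// Hypothetical helper: closes resources in order and collects errors,
// so a broken subscription does not block the disconnect.
async function shutdown(steps) {
  const errors = [];
  for (const [name, close] of steps) {
    try {
      await close();
    } catch (err) {
      errors.push(`${name}: ${err.message}`);
    }
  }
  return errors;
}

// Possible wiring in index.js, assuming main() also keeps `session` and
// `subscription` from startSubscription() and imports disconnect():
//
// process.once('SIGINT', async () => {
//   const errors = await shutdown([
//     ['subscription', () => subscription.terminate()],
//     ['session', () => disconnect(session)],
//   ]);
//   process.exit(errors.length ? 1 : 0);
// });
```

Terminating the subscription first and the session second mirrors the order in which they were created.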
Testing without a real PLC
No PLC access right now? Spin up a local simulation server and set endpointUrl in opc-client.js to opc.tcp://localhost:4840:
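// sim-server.js
import { OPCUAServer, Variant, DataType } from 'node-opcua';

const server = new OPCUAServer({ port: 4840 });
await server.initialize();

const addressSpace = server.engine.addressSpace;
// node-opcua's own namespace has index 1. Registering one more namespace
// (the URI is an arbitrary choice) should yield index 2, which matches the
// ns=2 NodeIds the client uses.
const ns = addressSpace.registerNamespace('urn:oee-agent:sim');
const device = ns.addObject({
  organizedBy: addressSpace.rootFolder.objects,
  browseName: 'PLC1',
});

// Simulated tag values, keyed by tag name
const tags = { MachineState: 1, FaultCode: 0, GoodPartCount: 0 };
for (const name of Object.keys(tags)) {
  ns.addVariable({
    componentOf: device,
    browseName: name,
    nodeId: `ns=${ns.index};s=PLC1.${name}`,
    dataType: DataType.Int32,
    value: { get: () => new Variant({ dataType: DataType.Int32, value: tags[name] }) },
  });
}

await server.start();
console.log(`OPC-UA server running at opc.tcp://localhost:4840 (tags in ns=${ns.index})`);

// Simulate a 5 s alarm with fault code 47 every 30 s
setInterval(() => {
  tags.MachineState = 2;
  tags.FaultCode = 47;
  setTimeout(() => {
    tags.MachineState = 1;
    tags.FaultCode = 0;
  }, 5000);
}, 30_000);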
Using AI in industrial environments safely
In upcoming episodes we’ll connect this data to Claude to generate natural-language diagnostics. Before we get there, it’s important to understand the security model we’ll follow — the same one adopted by Siemens, Rockwell, and Bosch.
The core principle: AI in the information loop, never in the control loop.
| The agent DOES | The agent DOES NOT |
|---|---|
| Suggest diagnostics | Trigger PLC commands |
| Alert the tech on Teams | Stop or start machines |
| Recommend actions | Make autonomous decisions |
| Summarize the shift | Modify process parameters |
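One way to make the right-hand column harder to violate by accident is to hand the agent's own modules a guarded session that refuses the calls that change things. The sketch below wraps any session-like object in a `Proxy` and blocks `write` and `call`, the session methods used to write values and invoke server-side methods. `readOnly` is a hypothetical helper, and passing the wrapped object into node-opcua's own classes is untested here, so treat it as a guard for your code paths only.

```javascript
// Method names the agent must never use on a session.
const BLOCKED = new Set(['write', 'call']);

// Hypothetical guard: returns a view of the session without write access.
function readOnly(session) {
  return new Proxy(session, {
    get(target, prop) {
      if (BLOCKED.has(prop)) {
        return () => {
          throw new Error(`Blocked: the agent may not use session.${String(prop)}()`);
        };
      }
      const value = Reflect.get(target, prop);
      // Bind methods so they keep working on the real session object
      return typeof value === 'function' ? value.bind(target) : value;
    },
  });
}

// Stub session for illustration; a real one would come from connect().
const guarded = readOnly({ read: () => 'value', write: () => 'written' });
console.log(guarded.read()); // value
try {
  guarded.write();
} catch (err) {
  console.log(err.message); // Blocked: the agent may not use session.write()
}
```

A client-side guard is only a second line of defense. The stronger control is an OPC-UA user on the server whose role has no write permission.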
Other safeguards we’ll implement throughout the series:
- Fallback without AI — if the Claude API is unavailable, the agent keeps running and sends the alarm without the diagnosis
- Anonymized data — the LLM receives FaultCode: 47 and metrics, not customer names, product names, or strategic company data
- Latency — Claude responses take 1–3 seconds. Appropriate for alerts, inappropriate for real-time control
- Human in the loop — the Teams card always requires a technician action before anything happens
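Two of these safeguards are easy to sketch ahead of the later episodes. The first function keeps only an allowlisted set of fields before anything goes to the LLM. The second builds the alert text and still produces a usable message when no diagnosis came back. The field list, function names and message wording are assumptions for illustration, not the final implementation.

```javascript
// Fields the LLM is allowed to see (an assumed allowlist).
const LLM_FIELDS = ['machineState', 'faultCode', 'partCount', 'availability', 'downtimeMins'];

// Drops everything that is not on the allowlist.
function buildLlmPayload(snapshot) {
  return Object.fromEntries(
    LLM_FIELDS.filter((key) => key in snapshot).map((key) => [key, snapshot[key]])
  );
}

// `diagnosis` is null when the LLM call failed or timed out.
function buildAlertText(line, faultCode, diagnosis) {
  const header = `LINE ${line} DOWN (FaultCode ${faultCode})`;
  return diagnosis
    ? `${header}\nDiagnosis: ${diagnosis}`
    : `${header}\nDiagnosis unavailable`;
}

const payload = buildLlmPayload({
  machineState: 2,
  faultCode: 47,
  availability: 72,
  customerName: 'ACME', // never leaves the agent
});
console.log(payload); // { machineState: 2, faultCode: 47, availability: 72 }
console.log(buildAlertText(3, 47, null));
```

An allowlist fails closed: a field added to the agent's state later stays private until someone adds it to `LLM_FIELDS` on purpose.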
For data-restricted environments: in upcoming episodes we’ll also cover how to run a local model (Llama/Mistral via Ollama) for scenarios where data cannot leave the factory network — which is the approach Bosch uses in some of their plants.
What’s coming in EP02
Now that the agent is reading the PLC, in the next episode we’ll connect to the MES REST API to enrich that data with production context: which product is running, what the parts/hour target is, who the shift operator is.
With that, the agent will be able to say not just “the machine stopped”, but “Order 4471 is at risk of a 2-hour delay”.