From device manual to live cloud data, with no firmware engineer in the loop
A walkthrough of what actually happens between dropping a device datasheet PDF onto SCADABLE and the first event landing in your cloud. The drag-and-drop is real. The mechanism underneath is a five-stage pipeline we built so the customer doesn't have to know it exists.
The shortest path from "I have a connected device" to "live data is in my cloud" used to be a six-week firmware project. The version we ship today is a drag-and-drop, and the live data lands within a few minutes. We get asked, often, what is actually happening in the middle. This post is the answer.
The customer does three things. They drag a device datasheet PDF onto our live demo. They pick a gateway target (Linux x86, ARM64, ARM32, or ESP32). They click deploy. Two to four minutes later, the dashboard shows the first reading. The "no firmware engineer in the loop" part is literal: we do not require the customer to write a driver, configure a Modbus register table, set up TLS, or think about MQTT topics. They never see the layer underneath.
The layer underneath has five stages.
Stage 1: PDF parsing
The device manual is a PDF. Sometimes it is structured, with proper register tables in actual table elements. More often it is a scanned document from 2003 that became a PDF by way of a fax machine. Both are inputs we accept.
The first pass is layout-aware text extraction. We use a vision-language model to identify the structural blocks: register tables, command tables, parameter tables, modes, units, scaling factors. The output is not raw text. It is a structured intermediate representation that says "this section is a register table with columns address, name, type, unit, scale" along with the rows.
We then run a separate verification pass that compares the extracted IR to known patterns for the device's protocol family. A Modbus register table has predictable shapes: addresses in 4xxxx or 3xxxx, types from a known set (uint16, int16, float32, swapped variants), units from a controlled vocabulary. If the extracted IR does not match, we flag it for human review immediately, before we generate any code. We covered the verification logic in detail in another post.
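To make the shape-matching idea concrete, here is a minimal sketch of the kind of check this pass performs. The field names, type vocabulary, and review threshold are illustrative, not our actual verifier:

```python
# Illustrative shape check for extracted Modbus register rows.
# Field names, vocabularies, and the review threshold are hypothetical,
# not the actual SCADABLE verifier.

KNOWN_DTYPES = {"uint16", "int16", "uint32", "int32", "float32", "float32_swapped"}
KNOWN_UNITS = {"L/min", "kPa", "degC", "%", "Hz", ""}

def check_register_row(row: dict) -> list[str]:
    """Return a list of problems; an empty list means the row looks plausible."""
    problems = []
    addr = row.get("address")
    # Modbus input registers live in 3xxxx, holding registers in 4xxxx.
    if not isinstance(addr, int) or not (30000 <= addr <= 49999):
        problems.append(f"address {addr!r} outside 3xxxx/4xxxx ranges")
    if row.get("dtype") not in KNOWN_DTYPES:
        problems.append(f"unknown dtype {row.get('dtype')!r}")
    if row.get("unit") not in KNOWN_UNITS:
        problems.append(f"unit {row.get('unit')!r} not in controlled vocabulary")
    return problems

def needs_human_review(rows: list[dict], max_bad_fraction: float = 0.1) -> bool:
    """Flag the whole table if too many rows fail the shape check."""
    bad = sum(1 for r in rows if check_register_row(r))
    return bad / max(len(rows), 1) > max_bad_fraction
```

The point of checking shapes rather than values is that it catches extraction errors (a column shifted by one, a unit read as a type) without needing to know anything about the specific device.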
About 70% of datasheets parse cleanly on the first try. The remainder go through what we call the engineer-in-the-loop fallback, which we wrote about separately.
Stage 2: DeviceSpec generation
The structured IR becomes a DeviceSpec JSON object. This is our canonical internal representation of "what this device is and how to talk to it." It is protocol-agnostic at this layer. A Modbus device, a BLE device, an OPC-UA device, and an HTTP-polled device all become a DeviceSpec with the same shape, just with different protocol tags inside.
The DeviceSpec includes:
- Identity (manufacturer, model, firmware version range)
- Transport configuration (protocol, host, port, polling rates, retries)
- Register or characteristic table (addresses, types, units, scaling, sane ranges, alert thresholds)
- Command table (writable registers, write-only commands)
- Behavioral metadata (which registers are safe to read on a hot device, which require a specific mode)
We documented how this format works and why we made it the canonical surface for both code generation and verification.
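For a sense of the shape, here is a hypothetical DeviceSpec for a Modbus flow meter. The exact field names are invented for this example; the real schema is SCADABLE-internal:

```python
import json

# A hypothetical DeviceSpec for a Modbus flow meter. The field names
# are illustrative; the real schema is SCADABLE-internal.
device_spec = {
    "identity": {"manufacturer": "Acme", "model": "FlowMeterX42",
                 "firmware_range": ">=2.1,<3.0"},
    "transport": {"protocol": "modbus_tcp", "host": "192.168.1.42",
                  "port": 502, "poll_interval_s": 2.0, "retries": 3},
    "registers": [
        {"name": "flow_rate_lpm", "address": 40020, "dtype": "float32",
         "unit": "L/min", "scale": 1.0, "sane_range": [0.0, 500.0]},
    ],
    "commands": [
        {"name": "reset_totalizer", "address": 40100, "write_only": True},
    ],
    "behavior": {"hot_safe_registers": ["flow_rate_lpm"]},
}

# Protocol-agnostic shape: swapping "modbus_tcp" for a BLE or OPC-UA tag
# changes the transport block and register fields, not the overall layout.
spec_json = json.dumps(device_spec, indent=2)
```

The uniform top-level shape is what lets one code generator and one verifier serve every protocol family.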
Stage 3: Code generation
The DeviceSpec compiles to a Python device class. The Python class is the customer-facing artifact, and it is what they see if they ever want to inspect or override anything. It looks like this for a typical Modbus TCP device:
```python
from scadable import Device, ModbusTCP, register

class FlowMeterX42(Device):
    transport = ModbusTCP(
        host="192.168.1.42", port=502, unit_id=1,
        timeout_ms=2000, reconnect_backoff_s=(1, 2, 4, 8, 16),
    )
    flow_rate_lpm = register(
        address=40020, kind="holding", dtype="float32",
        scale=1.0, unit="L/min", poll_interval_s=2.0,
        sane_range=(0.0, 500.0),
    )
    pressure_kpa = register(
        address=40022, kind="holding", dtype="int16",
        scale=0.1, unit="kPa", poll_interval_s=2.0,
        sane_range=(0.0, 1000.0),
    )
```

The customer never has to write this file. But it is human-readable, version-controlled, and editable, which matters when something goes wrong on a Tuesday and a real engineer wants to look at what is happening. This is one of the reasons we picked Python over a JSON or YAML config, even though Python adds a compile step.
Stage 4: Compilation to gateway driver
The Python device class gets compiled to a YAML driver, which is the artifact the gateway runtime actually consumes. The YAML is small, deterministic, and parses in microseconds on a memory-constrained device. The compiler emits one YAML file per device, plus a manifest that the gateway uses to load them on boot.
This is also where target-specific decisions happen. A device that compiles for a Linux gateway gets a different generated YAML than the same device compiled for ESP32, because the ESP32 cannot afford a 10 KB JSON parser at runtime. We covered the architectural reasons in the ESP32 launch post. The customer does not see this difference. They picked their gateway target at the start and the compiler did the rest.
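A heavily simplified sketch of the compile step, to show the direction of the transformation. The YAML schema here is invented for the example; the real driver format is SCADABLE-internal, and the output is simple enough to emit without a YAML library:

```python
# Illustrative Python-to-YAML compile step. The driver schema shown
# here is invented for the example, not the real SCADABLE format.

def compile_registers(device_name: str, registers: dict) -> str:
    """Emit a flat, deterministic YAML driver for one device."""
    lines = [f"device: {device_name}", "registers:"]
    for name, reg in sorted(registers.items()):  # sorted => deterministic output
        lines.append(f"  - name: {name}")
        lines.append(f"    address: {reg['address']}")
        lines.append(f"    dtype: {reg['dtype']}")
        lines.append(f"    poll_interval_s: {reg['poll_interval_s']}")
    return "\n".join(lines) + "\n"

yaml_text = compile_registers(
    "FlowMeterX42",
    {"flow_rate_lpm": {"address": 40020, "dtype": "float32",
                       "poll_interval_s": 2.0}},
)
```

Determinism matters here: the same DeviceSpec must always produce byte-identical YAML, so the gateway can cache and diff drivers cheaply.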
Stage 5: Deploy and first event
The compiled YAML, plus an mTLS certificate provisioned through our EST endpoint, plus a manifest entry, gets pushed to the gateway. The gateway runtime (Rust binary, around 18 MB on Linux, 380 KB on ESP32) reads the manifest, instantiates the driver, opens the protocol connection, and starts polling.
The first reading hits the SCADABLE cloud through our unified API endpoint. We translate it into the customer's preferred shape (one of MQTT, HTTPS webhook, Kafka, S3, or a direct database insert). The customer sees the value on the dashboard.
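To illustrate the translation step, here is a sketch of fanning one internal reading out to an MQTT-shaped topic and payload. The topic layout and payload fields are invented for this example:

```python
import json
from datetime import datetime, timezone

# Illustrative fan-out step: one internal reading, translated into an
# MQTT-shaped (topic, payload) pair. Topic and payload layouts are
# invented for this example, not the SCADABLE wire format.

def to_mqtt(reading: dict) -> tuple[str, str]:
    topic = f"scadable/{reading['device']}/{reading['register']}"
    payload = json.dumps({
        "value": reading["value"],
        "unit": reading["unit"],
        "ts": reading["ts"],
    })
    return topic, payload

reading = {
    "device": "FlowMeterX42",
    "register": "flow_rate_lpm",
    "value": 42.5,
    "unit": "L/min",
    "ts": datetime.now(timezone.utc).isoformat(),
}
topic, payload = to_mqtt(reading)
```

The webhook, Kafka, S3, and database shapes are analogous translations of the same internal reading; only the last hop differs.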
End to end, this whole sequence runs in two to four minutes for a device whose datasheet parses cleanly. For one that does not, the engineer-in-the-loop adds a few hours to a few days, depending on what the failure is.
What this is and is not
What this is: a working pipeline today. The five stages are shipped. We run them in production for Corvita Biomedical's neonatal device fleet, where every register read is also an audit event, and we have the Modbus-to-MQTT bridge running underneath the same architecture for the industrial customers.
What this is not: a magic AI that handles every device on the open market. The 30% of datasheets that need human review need it because the source PDF was ambiguous in a way no current model can disambiguate without context that lives in someone's head. The engineer-in-the-loop is not a workaround. It is part of the design.
What this is becoming: a thinner and thinner middle. Every device we onboard makes the next one of the same family parse faster, because the device library grows. Every customer-specific override gets folded back into the verification layer. The pipeline gets shorter every month.
See it work
The full drag-and-drop is on the live demo. You can drop your own datasheet, watch the stages run, and see the generated Python device class. If you want to talk about how this maps to your specific device fleet, let us talk.
The pipeline is the part of SCADABLE we are most proud of. It is also the part we are most aware needs to keep getting better. The mark of a good middle layer is that the customer eventually forgets it exists. We are still earning that.