We shipped ESP32 support, and it took a year longer than Linux

A year ago we shipped the Linux gateway and started telling people SCADABLE supports any device with a TCP/IP stack and 50 MB of RAM. Privately, we were worried about that bound. Public statement: the platform is generic. Private reality: every customer call had a slide where someone asked about ESP32, and we said "soon."

ESP32 shipped this month. It took thirteen months from "let us start" to "a Schneider PLC is feeding our cloud through a $5 microcontroller." Here is what those thirteen months were actually about, because the Twitter version ("we ported it to ESP32") is missing the four problems that make ESP32 an entirely different platform from Linux.

The shape of the problem

A Raspberry Pi 4 (the lower bound of our Linux target) has 4 GB of RAM, a multi-gigabyte SD card, a real filesystem, a stable Rust toolchain, and Linux scheduling. An ESP32 (the lower bound of our embedded target) has 520 KB of SRAM, 4 MB of SPI flash, no filesystem you would call by that name, FreeRTOS instead of Linux, and a Rust toolchain that is alive but on a different planet.

Saying "we support ESP32" is not the same kind of claim as saying "we support a Raspberry Pi 4." The first claim is engineering. The second claim is just compilation.

Problem 1: 520 KB of RAM, no stdlib, no `cargo install`

Our Linux gateway is 18 MB compiled, runs in about 40 MB of resident memory, and pulls in around 90 transitive crates from crates.io. None of that survives on an ESP32. We rewrote the gateway core as a no_std Rust binary that compiles to around 380 KB and runs in under 200 KB of working memory at peak.

The cost of that rewrite was almost entirely in the dependency graph. Every crate we relied on for the Linux gateway (rumqttc, tokio, reqwest, serde, tracing) had to be evaluated for no_std support. About a third of them have first-class support, a third have it behind a feature flag with rough edges, and a third do not have it at all and we wrote the equivalent ourselves. The ones we wrote ourselves were the ones that needed network buffers smaller than 16 KB, which is to say, all of the network primitives.

The mental model shift is the part that takes the longest. On Linux, you allocate when you want to. On ESP32, every allocation is a question about whether you will fragment the heap, and whether the worst-case payload from your protocol layer fits in the buffer you can statically reserve. Rust's borrow checker is a help here, not a hindrance: a lot of the patterns we wrote on Linux ("clone the buffer, send it through a channel, drop the original") become "pass a slice with a lifetime that spans the operation" on ESP32. The code is uglier and faster.
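As an illustration of that shift, here is a minimal sketch of the "pass a slice" pattern. It is not the gateway's actual code: `encode_frame`, `EncodeError`, the 2-byte length prefix, and the 256-byte `FRAME_CAP` are all hypothetical names and sizes chosen for the example. The function uses only `core` features, so the same shape should work in a no_std build.

```rust
/// Hypothetical capacity for a statically reserved frame buffer.
pub const FRAME_CAP: usize = 256;

#[derive(Debug, PartialEq)]
pub enum EncodeError {
    /// The payload plus its header does not fit in the reserved buffer.
    BufferTooSmall,
}

/// Writes a length-prefixed frame into `out` and returns the used part.
/// Nothing is allocated or cloned: the caller owns the storage and the
/// returned slice borrows it for the lifetime `'a`.
pub fn encode_frame<'a>(payload: &[u8], out: &'a mut [u8]) -> Result<&'a [u8], EncodeError> {
    let total = payload.len() + 2;
    if payload.len() > u16::MAX as usize || total > out.len() {
        return Err(EncodeError::BufferTooSmall);
    }
    // 2-byte big-endian length prefix (an assumption for this sketch).
    let len = (payload.len() as u16).to_be_bytes();
    out[..2].copy_from_slice(&len);
    out[2..total].copy_from_slice(payload);
    Ok(&out[..total])
}
```

A caller would reserve `let mut buf = [0u8; FRAME_CAP];` once and reuse it for every frame, so the question "does the worst-case payload fit" is answered by a single bounds check rather than by the heap.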

Problem 2: No filesystem you would recognize

The Linux gateway stores its config, its mTLS cert, and its rolling buffer in /etc/scadable/ and /var/lib/scadable/ like any well-behaved daemon. The ESP32 has SPI flash partitioned through the ESP32 partition table, with NVS (non-volatile storage) for key-value pairs and SPIFFS or LittleFS for file-shaped data. Neither of these is POSIX. Both of them have wear-leveling concerns we had to learn the hard way.

The ESP32 storage layer in our gateway is now its own component, with three explicit modes: ephemeral (RAM only, lost on reboot, used for the buffer cache), persistent (NVS, used for cert and config), and image (the OTA partition pair, A/B, used for firmware itself). Switching between them is not a matter of "where do I write this file"; it is a matter of which storage primitive can survive a sudden power loss in the middle of a write, which the chip's flash controller has documented opinions about.
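One way to picture the three modes is as an explicit mapping from "what kind of data is this" to "which primitive holds it." The sketch below is only that picture: `StorageMode`, `DataKind`, `mode_for`, and `survives_reboot` are hypothetical names, and the mapping simply restates the uses listed above.

```rust
/// The three storage modes described in the post.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StorageMode {
    /// RAM only, lost on reboot.
    Ephemeral,
    /// NVS key-value storage.
    Persistent,
    /// The A/B OTA partition pair.
    Image,
}

/// Hypothetical classification of what the gateway stores.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DataKind {
    BufferCache,
    Certificate,
    Config,
    Firmware,
}

/// Picks the storage mode for a kind of data. The match is exhaustive, so
/// adding a new `DataKind` forces a decision about where it lives.
pub const fn mode_for(kind: DataKind) -> StorageMode {
    match kind {
        DataKind::BufferCache => StorageMode::Ephemeral,
        DataKind::Certificate | DataKind::Config => StorageMode::Persistent,
        DataKind::Firmware => StorageMode::Image,
    }
}

/// Whether data in this mode is still there after a reboot.
pub const fn survives_reboot(mode: StorageMode) -> bool {
    !matches!(mode, StorageMode::Ephemeral)
}
```

The point of making the choice a type rather than a path is that the compiler, not a code reviewer, catches data that was never assigned a power-loss story.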

We lost about three weeks figuring out that NVS commits are not atomic the way the docs imply. The fix is a small write-ahead log we maintain in a separate NVS namespace, which the runtime replays on boot if it sees a partial commit. Every embedded engineer reading this post is nodding because they wrote the same thing.
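For readers who have not written one, the general shape of such a log looks roughly like the sketch below. It is a host-side simulation, not the ESP-IDF NVS API and not our on-device code: `FakeNvs`, `WalRecord`, and the `fail_after` knob that simulates power loss are all hypothetical, and the redo-log design (record the intent, apply, clear) is an assumption about one reasonable way to do it.

```rust
use std::collections::BTreeMap;

/// Intent record for one multi-key commit.
pub struct WalRecord {
    pub entries: Vec<(String, Vec<u8>)>,
    /// True only once the whole record has been written to the log.
    pub complete: bool,
}

/// In-memory stand-in for a data namespace plus a separate log namespace.
#[derive(Default)]
pub struct FakeNvs {
    pub data: BTreeMap<String, Vec<u8>>,
    pub wal: Option<WalRecord>,
}

impl FakeNvs {
    /// Records the intent, applies it, then clears the log.
    /// `fail_after = Some(n)` simulates power loss after n applied entries.
    /// Returns true if the commit ran to completion.
    pub fn commit(&mut self, entries: Vec<(String, Vec<u8>)>, fail_after: Option<usize>) -> bool {
        self.wal = Some(WalRecord { entries: entries.clone(), complete: true });
        for (i, (key, value)) in entries.into_iter().enumerate() {
            if fail_after == Some(i) {
                return false; // power lost mid-commit; the log is still in place
            }
            self.data.insert(key, value);
        }
        self.wal = None;
        true
    }

    /// Boot-time replay. A complete record is re-applied, which is safe
    /// because inserts are idempotent; an incomplete record is discarded.
    /// Returns true if a partial commit was repaired.
    pub fn replay_on_boot(&mut self) -> bool {
        match self.wal.take() {
            Some(rec) if rec.complete => {
                for (key, value) in rec.entries {
                    self.data.insert(key, value);
                }
                true
            }
            _ => false,
        }
    }
}
```

On real flash the log write itself can be torn, which is why the record carries a `complete` flag that is checked before anything is replayed.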

Problem 3: Real-time scheduling and the FreeRTOS task model

Linux schedules our gateway as a single-process daemon with a tokio async runtime inside it. The kernel handles preemption, the runtime handles cooperative multitasking, and we never think about either.

The ESP32 runs FreeRTOS, a real-time scheduler spread across the chip's two cores, with a task model that wants you to declare your tasks explicitly, each with its own stack size and priority. There is no async runtime in the Rust ecosystem that will save you from this. The naive port of our gateway runs as a single FreeRTOS task with a 32 KB stack, which works for a hello-world, blows up the moment you have three concurrent network operations, and runs out of stack the first time the TLS handshake nests its callbacks too deep.

The right answer turned out to be three tasks: a network task pinned to core 0 (Wi-Fi and TCP), a protocol task pinned to core 1 (Modbus framing, MQTT codec), and a runtime task that owns the SCADABLE control loop and uses lock-free queues to talk to the other two. Each task has an explicitly sized stack, derived empirically by running the worst-case operation under a stack-watermark instrumented build and adding 25%.
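The sizing rule is simple enough to write down. The sketch below shows the arithmetic and a declarative task table of the kind described above; it does not call FreeRTOS. `stack_budget`, `TaskSpec`, `Core`, and `task_table` are hypothetical names, the peak values are inputs you would measure yourself, and leaving the runtime task unpinned (`Core::Any`) is an assumption, since only the network and protocol tasks are described as pinned.

```rust
/// Stack budget: measured worst-case usage plus 25% headroom, rounded up.
pub const fn stack_budget(peak_bytes: u32) -> u32 {
    peak_bytes + (peak_bytes + 3) / 4
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Core {
    Core0,
    Core1,
    /// Not pinned (an assumption for the runtime task).
    Any,
}

/// Declarative description of one task.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TaskSpec {
    pub name: &'static str,
    pub core: Core,
    pub stack_bytes: u32,
}

/// Builds the three-task layout from measured stack watermarks (in bytes).
pub fn task_table(net_peak: u32, proto_peak: u32, runtime_peak: u32) -> [TaskSpec; 3] {
    [
        // Wi-Fi and TCP.
        TaskSpec { name: "network", core: Core::Core0, stack_bytes: stack_budget(net_peak) },
        // Modbus framing and MQTT codec.
        TaskSpec { name: "protocol", core: Core::Core1, stack_bytes: stack_budget(proto_peak) },
        // Control loop; talks to the other two over queues.
        TaskSpec { name: "runtime", core: Core::Any, stack_bytes: stack_budget(runtime_peak) },
    ]
}
```

For example, a measured peak of 8,192 bytes yields a 10,240-byte stack. Keeping the table as data means a watermark regression shows up as a one-line diff in review.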

This is not a thing you can hide from the device author. The Python device class running on top of all of this is unchanged between Linux and ESP32, but the runtime had to be re-architected from "one big async runtime" to "three constrained tasks with declared stacks."

Problem 4: Per-device cert provisioning at factory programming time

The hardest of the four. On Linux, we provision the gateway by SSHing in, dropping a registration code, and letting the EST flow do its work. None of that exists on ESP32. The device gets one shot to be programmed at the factory line, and after that you are not getting back into it without a serial connection and a benevolent technician.

The pattern we landed on is what AWS calls "Fleet Provisioning by Claim", with a SCADABLE-shaped variation. At factory programming time, the ESP32 receives a single shared "claim certificate" baked into firmware, plus a per-device serial that lives in NVS. On first boot, the device authenticates to our EST endpoint with the claim cert, hands over the serial, and receives a unique mTLS certificate that the claim cert can no longer access. The claim cert is rotated quarterly across the entire production run; the per-device cert is rotated every 24 hours by the runtime, which we documented in the ESP32 to AWS IoT Core mTLS post.

The version of this that took us six months was the wrong version: per-device cert baked into firmware at the factory, which means a unique firmware image per device, which means the contract manufacturer's flashing tool has to know how to inject per-device data into the binary, which they will get wrong, and which makes recall a nightmare. The version we ship is one image per fleet, one shared claim cert, one per-device serial in NVS. The contract manufacturer flashes a CSV of serials. The chip provisions itself on first boot.
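The first-boot flow can be sketched as a small state transition. This is an illustration of the claim pattern, not our firmware or our endpoint: `Identity`, `Enroller`, `EnrollError`, `first_boot`, and the `FakeEst` test double are hypothetical, the certificate bytes are placeholders, and the idea that the server checks the serial against the list the factory flashed is an assumption. A real EST exchange happens over HTTPS and involves a CSR, none of which is modeled here.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Identity {
    /// Fresh from the factory: shared claim cert in firmware, serial in NVS.
    Claim { serial: String },
    /// After first boot: a unique per-device certificate.
    Provisioned { serial: String, device_cert: Vec<u8> },
}

#[derive(Debug, PartialEq, Eq)]
pub enum EnrollError {
    BadClaim,
    UnknownSerial,
}

/// Stand-in for the enrollment endpoint.
pub trait Enroller {
    fn enroll(&mut self, claim_cert: &[u8], serial: &str) -> Result<Vec<u8>, EnrollError>;
}

/// Exchanges the claim identity for a per-device one. A device that is
/// already provisioned is returned unchanged without contacting the server.
pub fn first_boot<E: Enroller>(
    id: Identity,
    claim_cert: &[u8],
    est: &mut E,
) -> Result<Identity, EnrollError> {
    match id {
        Identity::Claim { serial } => {
            let device_cert = est.enroll(claim_cert, &serial)?;
            Ok(Identity::Provisioned { serial, device_cert })
        }
        done @ Identity::Provisioned { .. } => Ok(done),
    }
}

/// Test double: knows one claim cert and the serials from the factory CSV.
pub struct FakeEst {
    pub claim: Vec<u8>,
    pub serials: Vec<String>,
}

impl Enroller for FakeEst {
    fn enroll(&mut self, claim_cert: &[u8], serial: &str) -> Result<Vec<u8>, EnrollError> {
        if claim_cert != self.claim.as_slice() {
            return Err(EnrollError::BadClaim);
        }
        if !self.serials.iter().any(|s| s.as_str() == serial) {
            return Err(EnrollError::UnknownSerial);
        }
        // Placeholder bytes standing in for an issued certificate.
        Ok(format!("cert-for-{}", serial).into_bytes())
    }
}
```

Because the only per-device input is the serial, the firmware image stays identical across the fleet, which is the property the six-month detour lacked.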

The "library not framework" payoff

The reason we did all of this, instead of accepting "Linux only" as a permanent constraint, is the economics of bill of materials. A Linux-class gateway (Raspberry Pi CM4, industrial enclosure, power supply, basic IO) is roughly $40 to $80 of BoM at small volume. An ESP32-class gateway is $5 to $12. For a hardware company shipping 10,000 units, that is the difference between integration costing 5% of unit margin and integration costing 30% of unit margin.

The promise we make customers is that the same Python device class runs on both, with no rewrites. That promise was a lie until this month. It is now true, and the proof is that for our YC AI-native demo, we wrote a Modbus TCP device class for a Schneider PLC, deployed it to a Linux gateway (RPi 4) for the first take, and to an ESP32 for the second. Same source. Same generated YAML. Different runtime.

The ESP32 take is the one we are going to use, because once the conversation turns to BoM and margin, nobody on the planet wants to argue about whether the platform supports embedded. They want to see a $5 chip publishing a register read into the cloud.

What is still left

We are not done. Three things on the immediate roadmap:

OTA at scale. A/B partition swap on a single device is shipped. Staged rollout (5%, 25%, 100%) across a fleet, with rollback triggers, is in flight. We will write that post when it ships, not before.
ESP32-S3 and ESP32-C6 targets. The S3 has a vector instruction set we can use for cheap on-device signal processing. The C6 has Wi-Fi 6 and Thread, which opens the door to direct integration with industrial wireless meshes. Both are compile-target work; the runtime is already there.
The factory provisioning UX. Today the contract manufacturer flashes a CSV. This works at the scale of our pilot customers. At the scale of a real consumer product run, it needs to be a tool with a barcode scanner and a pass-fail light. That is product work, not platform work.

Why this matters for buyers

If you are evaluating SCADABLE and you are shipping a connected hardware product at any volume above 1,000 units, the question is not whether we support Linux. It is whether the cloud-side platform you choose forces you to ship a $50 BoM gateway on every device. The answer for SCADABLE is now: no, $5 is fine, here is the device class, here is the deploy command, let us talk.
