SCADABLE

Embedded engineering is the next AI-native service

YC named insurance, accounting, compliance, and healthcare admin as the verticals AI is now ready to replace. We think embedded engineering belongs on that list, and the platform underneath the AI is what makes it possible.


In Fall 2025, Gustaf Alströmer published Y Combinator's "AI-Native Service Companies" Request for Startups. The piece named four verticals where AI has crossed the line from augmenting skilled human services to replacing them: insurance brokerage; accounting, tax, and audit; compliance; and healthcare administration. The framing was unusually direct.

"AI models are improving really fast, and the total spend on services is many times larger than the spend on software."

That is the entire investment thesis behind the wave of companies you have already heard of. Crosby in legal. Eve in litigation. Sully.ai in medical admin. Decagon in customer experience. Each one picks a bounded, repetitive, high-margin service category, points an LLM at it, and wraps the model in just enough domain-specific runtime to make the output trustworthy. The model does the work. The platform makes the work safe. The human sells the outcome.

We believe embedded engineering belongs on that list, and that nobody has built the company yet.

This post is the argument for why. It is also a marker we are willing to be measured against.

The category nobody has named

Embedded engineering is the work of making a piece of physical hardware talk to software. Read the device manual. Understand the register map or the BLE characteristic table or the Modbus slave layout. Write a driver. Integrate it with the rest of the system. Deploy it. Verify it on the real device. Maintain it when the firmware version bumps and a register address shifts by two bytes.

This work happens in three places, and you pay for it in all three.

You pay for it as employees. Embedded engineers and firmware contractors are some of the most expensive technical hires in the market right now, partly because the labor pool is small and partly because the cost of getting it wrong is high. A bad driver does not throw a 500. It bricks a device in a hospital, silently misreports a pressure reading on a piece of process equipment, or triggers a recall.

You pay for it as integration consulting. There is a class of firms whose entire revenue model is "we will write the firmware integration for your medical device, your industrial controller, your fleet of meters." They charge by the engagement, the timelines stretch, and the deliverable is code that the customer often cannot meaningfully maintain after the contract ends.

You pay for it as opportunity cost. The single most common pattern we see in hardware startups is a six to twelve month delay between "the device works on the bench" and "the device talks to our backend in production." That delay is almost always firmware integration work. It is the gap where startups die.

Add it up across the OEM, medical device, industrial automation, building automation, energy, and consumer hardware sectors and the global spend on this kind of work runs into the tens of billions of dollars per year. It is, in the language of the YC RFS, a service market many times larger than the software market that sits next to it.

It is also exactly the shape of work that AI is now ready to absorb.

Why embedded passes the AI-native test

Sarah Tavel laid out the test in her August 2023 essay "AI Startups: Sell Work, Not Software." Her argument was that the AI-native opportunity is not selling tools to professionals; it is selling the output of the professional's job back to the buyer, priced against the labor it replaces. She called it "selling the work."

Julien Bek's 2026 follow-up at Sequoia, "Services: The New Software," extended the thesis with a precision the original lacked. The market a model can address is not the entire profession. It is the bounded, repetitive subset of the profession where the input is structured enough for a model to read and the output is verifiable enough for a runtime to check. Bek's framing: AI-native services companies win when the work is "high frequency, narrow scope, and machine-checkable."

Embedded engineering passes all three.

The work is high frequency. Every new sensor on every new gateway in every new product is a fresh integration. The world ships millions of new device variants per year. Each one needs a driver.

The scope is narrow. A device manual is a structured PDF with a register table, a function code list, a wire diagram, and a few pages of operating notes. A Bluetooth peripheral has a GATT service definition that is, by spec, a structured tree. An OPC-UA server publishes a node hierarchy that is itself a self-describing schema. The input is closer to a database than a novel.
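
To make that concrete: here is what a few rows of that input look like once they are lifted out of the PDF and into code. This is a minimal sketch; every name, address, and scale below is illustrative, not taken from any real manual.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class RegisterSpec:
        """One row of a device manual's register table."""
        address: int     # register address on the bus
        width_bits: int  # 16, 32, ...
        signed: bool
        scale: float     # raw_value * scale = engineering units
        unit: str

    # A small excerpt of the kind of table a manual publishes. All names,
    # addresses, and scales here are illustrative, not from a real device.
    REGISTER_MAP = {
        "temperature": RegisterSpec(address=0x100, width_bits=16, signed=False, scale=0.1, unit="degC"),
        "pressure":    RegisterSpec(address=0x102, width_bits=16, signed=False, scale=0.01, unit="bar"),
        "status":      RegisterSpec(address=0x104, width_bits=16, signed=False, scale=1.0, unit="bitfield"),
    }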

And the output is machine-checkable. A driver either reads the right value from the right register at the right scale, or it does not. You can test it against the physical device. You can run it through a sandbox that compares its output against a known-good fixture. You can replay captured traffic and check that the parser produces the same values it produced last time. The verification problem is not "does this read like a human wrote it." The verification problem is "does this return 23.4 when the device is at 23.4 degrees."
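
The replay check in particular is cheap to sketch. Assuming a hypothetical parse_frame function produced by the generated driver, the whole regression test is a few lines (the captured frame and golden values here are made up for illustration):

    # A hypothetical captured response frame carrying the raw value 0x00EA (234),
    # and the values a known-good run of the parser produced from it.
    CAPTURED_FRAME = bytes.fromhex("01030200ea")
    GOLDEN = {"temperature": 23.4}

    def replay_check(parse_frame):
        """Replay captured traffic through the driver's parser and confirm it
        still produces the values it produced last time."""
        decoded = parse_frame(CAPTURED_FRAME)
        for name, expected in GOLDEN.items():
            assert abs(decoded[name] - expected) < 1e-6, (
                f"{name}: got {decoded[name]}, expected {expected}"
            )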

That last point is the one most analysts miss. Embedded is not a soft domain. It is one of the hardest possible AI domains because a wrong answer has physical consequences, and it is also one of the most testable because the physical consequence is also the test signal. You can build a runtime that catches the model's mistakes because the device itself is the ground truth.

What changed in 2025

Until 2025, this argument did not work in practice. The reason is not that the LLMs were too weak. The reason is that the runtime layer underneath them did not exist.

Imagine you are running an LLM that emits Python driver code from a manual. Even at 95% accuracy on the first pass, the 5% failure mode is not a typo. It is a register read that returns the wrong value, or a unit conversion that is off by a factor of 1000, or a polling loop that hammers a device until its watchdog reboots it. You cannot ship that into production by hand-reviewing every output. The economics collapse.

What you need is a layer underneath the model that catches these failures automatically. That layer has three pieces.

First, sandboxing. The driver code runs inside a constrained runtime that cannot crash the gateway, cannot escape its memory budget, cannot saturate the network, cannot block the main control loop. If the model emits something pathological, the runtime contains the blast radius.
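
A minimal sketch of the containment idea, using nothing more exotic than POSIX resource limits. A production sandbox layers on network and filesystem isolation as well; this only shows the shape of the blast-radius argument.

    import resource
    import subprocess

    def _apply_limits():
        # Runs in the child process just before the driver code starts.
        resource.setrlimit(resource.RLIMIT_CPU, (2, 2))                   # 2 seconds of CPU
        resource.setrlimit(resource.RLIMIT_AS, (64 * 2**20, 64 * 2**20))  # 64 MiB address space

    def run_driver_sandboxed(driver_path):
        """Run generated driver code in a child process with hard resource caps,
        so a pathological output cannot take the gateway down with it."""
        return subprocess.run(
            ["python3", driver_path],
            preexec_fn=_apply_limits,  # POSIX only
            timeout=10,                # wall-clock cap, independent of the CPU cap
            capture_output=True,
            text=True,
        )

The wall-clock timeout matters as much as the CPU cap: a driver blocked on a dead serial port consumes no CPU at all.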

Second, automated test harnesses. When the driver is generated, the platform also generates a fixture suite from the device manual: the register map says address 0x100 is a 16-bit unsigned integer holding temperature in tenths of a degree Celsius, so the harness writes 234 to that address and asserts the driver returns 23.4. Most of the LLM's mistakes get caught here, before the driver ever touches a real device.
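
That paragraph is almost a test already. Written out, with FakeBus and make_driver as illustrative stand-ins for what the platform generates (make_driver here plays the role of a pytest fixture):

    class FakeBus:
        """In-memory stand-in for the physical device: a dict of register values."""
        def __init__(self):
            self.registers = {}

        def read_holding_register(self, address):
            return self.registers[address]

    def test_temperature_scaling(make_driver):
        # The manual says 0x100 is a u16 holding temperature in tenths of a degree C.
        bus = FakeBus()
        bus.registers[0x100] = 234
        driver = make_driver(bus)  # make_driver: hypothetical factory for the generated driver
        assert abs(driver.read("temperature") - 23.4) < 1e-6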

Third, observable deployment. When the driver does ship, every read, every write, every parse error, every retry is instrumented and streamed back to the platform. If the model got something subtly wrong (a unit conversion, a sign bit, an endianness assumption) the platform sees it within minutes of the device coming online, not months later when an operator notices the chart looks weird.
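
The instrumentation itself is not complicated. A sketch of the idea, with the event names and field layout as placeholders rather than a real wire format:

    import json
    import logging
    import time

    log = logging.getLogger("driver.telemetry")

    def instrumented_read(driver, point):
        """Wrap a driver read so every call, value, and failure becomes a
        structured event the platform can stream, chart, and alert on."""
        started = time.monotonic()
        try:
            value = driver.read(point)
            log.info(json.dumps({
                "event": "read_ok",
                "point": point,
                "value": value,
                "latency_ms": round((time.monotonic() - started) * 1000, 2),
            }))
            return value
        except Exception as exc:
            log.error(json.dumps({"event": "read_error", "point": point, "error": repr(exc)}))
            raise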

Those three pieces (sandbox, harness, observability) are not separately new. What is new is that they have been productized and integrated tightly enough that an LLM can sit on top of them and the loop closes. The model emits code. The harness checks it. The sandbox isolates it. The observability layer feeds the failures back into the prompt for the next attempt. The platform makes the AI usable.
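
Stripped of everything proprietary, the loop looks like this. Both callables are placeholders for the model call and the sandboxed harness run described above, not real APIs:

    from typing import Callable, List

    def generate_until_green(
        manual_text: str,
        llm_generate: Callable[[str, str], str],        # (manual, feedback) -> driver source
        run_fixture_suite: Callable[[str], List[str]],  # driver source -> list of failure messages
        max_attempts: int = 5,
    ) -> str:
        """Generate a driver, check it against the harness, and feed failures
        back into the next attempt."""
        feedback = ""
        for _ in range(max_attempts):
            code = llm_generate(manual_text, feedback)
            failures = run_fixture_suite(code)
            if not failures:
                return code  # green across the harness: safe to stage for deployment
            feedback = "The previous attempt failed these checks:\n" + "\n".join(failures)
        raise RuntimeError("driver never passed the harness; escalate to a human engineer")

The RuntimeError at the bottom is the honest part: when the loop does not converge, the failure is loud and a human picks it up. That is what being accountable for the outcome means in practice.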

This is the pattern in every AI-native services company that has worked. Crosby is not just an LLM that drafts contracts. It is a contract-drafting LLM wrapped in a redlining engine, a clause library, and an audit trail. Sully.ai is not just a transcription model. It is a transcription model wrapped in a coding ontology, an EHR write path, and a compliance check. The model is the thin part. The runtime is the thick part. The runtime is the moat.

The pattern, stated cleanly

The AI-native services pattern, applied to embedded:

  1. The AI does the bounded work. Read the manual. Generate the driver. Generate the test fixture. Suggest the deployment configuration.
  2. The runtime catches the errors. Sandboxed execution, automated property tests against the device, observability that surfaces silent failures fast.
  3. The human sells the outcome. Not "here is a tool to write your firmware." But "your device will be talking to your backend in 48 hours, and we are accountable for it."

That third line is the one that makes this a services company and not a developer tools company. The customer does not buy a code generator. They buy "my device works in production." The pricing reflects the labor it replaces, not the tokens it consumed.

This is the cleanest articulation of the AI-native services thesis we have found, and it maps onto embedded with no translation loss.

Why hardware was the gap

The AI-native services map already has flags planted in legal, medical admin, customer experience, accounting, compliance, and insurance. The companies are funded, the categories are named, the LPs are paying attention. Hardware is conspicuously empty.

The reason hardware was empty until now is the same reason it is the largest remaining opportunity. The trust bar is higher. A wrong contract is embarrassing. A wrong driver is a recall. The runtime layer required to make AI usable in this domain had to be built before the AI itself could ship, and almost nobody was patient enough to build it.

We have spent the last eighteen months building exactly that runtime. The sandbox. The harness. The observability layer. The cert lifecycle. The deployment pipeline. The platform that catches what the model gets wrong. We did this before we put a generative model on top, on purpose, because the model is the easy part and the runtime is the hard part, and the order matters.

Our first customer is Corvita Biomedical, a neonatal medical device startup. Neonatal because the trust bar in this domain is approximately as high as it gets in commercial hardware. If the runtime is good enough for an incubator that holds a premature infant, the runtime is good enough for everything downstream of that.

What we are building, in one sentence

SCADABLE is the embedded engineering equivalent of Crosby for legal, Sully.ai for medical admin, and Decagon for customer experience. The AI generates the integration code from the manual. The runtime makes the integration code production-safe. The hardware company stops paying for the bottleneck.

We believe this is a category. We believe the category will have a winner. And we believe the winner is whoever builds the runtime first, because the runtime is what makes the AI usable, and the AI is what makes the economics work.

The YC RFS named four verticals. The framework names a fifth.


If you are a hardware company staring at the firmware-integration bottleneck and wondering whether AI is real enough to use, we run 30-minute architecture reviews for founders evaluating exactly that question. Book at https://cal.com/rahbaral/quick-chat.