At Aarno Labs we have developed automated binary and firmware analysis technologies across multiple DARPA (HACCS and AMP) and ARPA-H (DIGIHEALS and UPGRADE) programs. Along the way we have assembled something useful in its own right: a corpus of roughly 1,300 binaries of open-source software, extracted from the real-world firmware of consumer routers, gateways, and IP cameras.
We have made this corpus available to the wider research community at https://github.com/Aarno-Labs/iot-binary-dataset/.
This post describes what the corpus contains and a study we performed with it to answer a simple but uncomfortable question: when an IoT device ships, how current is the open-source software inside it? The popular mental model is that a device is “secure when it leaves the factory” and only drifts into risk as vulnerabilities are discovered over the years that follow. Our data says the opposite. The typical device is outdated from the start: it ships on day one with open-source libraries that are already several years old and already carry publicly disclosed CVEs that anyone could have looked up before the firmware was released.
Why this matters
Aarno Labs builds tooling to discover, understand, and remediate vulnerabilities in production binaries—including end-of-life devices whose vendors will never ship another update. That work is motivated directly by the finding below: the dominant source of risk in deployed IoT is not exotic zero-days, it is known vulnerabilities in stale dependencies that were never current to begin with and were never refreshed. If a device launches five years behind upstream, every one of those five years of public security fixes is missing on the first day a customer plugs it in. Understanding the scale of that gap is the first step toward closing it.
The dataset
The collection contains ~1,300 distinct 32-bit stripped ELF binaries spanning three architectures (MIPS, ARM, and PowerPC) and five widely deployed open-source projects: OpenSSL (as libcrypto, libssl, and the openssl tool), BusyBox, dnsmasq, lighttpd, and curl (as curl and libcurl). The binaries were extracted from publicly available firmware images from 17 vendors across roughly 250 distinct device product lines. Each binary is paired with a JSON metadata file giving its architecture, the program version recovered from the binary, the upstream release date of that version, and the vendor, product, and release date of every firmware image in which it appeared. The repository also includes the corresponding source code for each of the software releases represented in the dataset.
The date metadata is what makes the corpus suited to this analysis, and we want to be transparent about where it comes from. We recovered the version of each stripped binary ourselves, through a mostly automated binary-analysis pipeline: harvesting strings from the binary, locating embedded version banners (for example, “OpenSSL 0.9.8p 16 Nov 2010” or “dnsmasq-2.15”), and resolving them to a specific upstream release and its publication date.
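To make the banner-harvesting step concrete, here is a minimal sketch in Python. It is not our production pipeline: the function names and the two regular expressions are illustrative assumptions, written only to match the two example banners quoted above.

```python
import re

def harvest_strings(blob, min_len=4):
    """Return printable-ASCII runs of at least min_len bytes, in the spirit of strings(1)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(blob)]

# Hypothetical banner patterns, covering only the two examples quoted in the text.
BANNERS = {
    "openssl": re.compile(r"OpenSSL (\d+\.\d+\.\d+[a-z]*) +\d{1,2} \w{3} \d{4}"),
    "dnsmasq": re.compile(r"dnsmasq-(\d+\.\d+)"),
}

def find_banners(blob):
    """Return (project, version) pairs for every banner found in the binary's strings."""
    found = []
    for text in harvest_strings(blob):
        for project, rx in BANNERS.items():
            match = rx.search(text)
            if match:
                found.append((project, match.group(1)))
    return found

# Synthetic input: two banners separated by non-printable bytes.
sample = b"\x00\x01OpenSSL 0.9.8p 16 Nov 2010\x00\xffdnsmasq-2.15\x00"
print(find_banners(sample))  # [('openssl', '0.9.8p'), ('dnsmasq', '2.15')]
```

A real pipeline also has to handle binaries with no banner, several conflicting banners, or vendor-modified version strings, which is why ours is only mostly automated.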
We released the dataset—binaries, metadata, and source tarballs for the matching versions—so that other researchers can reproduce these results and perform their own studies. Firmware corpora that are both labeled and openly redistributable are scarce, and we hope this one is useful for work on software composition analysis, N-day detection, and firmware supply-chain security.
How we measured outdated and vulnerable
We computed two quantities for each binary, both anchored to the firmware's own release date so that nothing depends on when we happened to collect the sample.
Library age at firmware release is simply the firmware's release date minus the bundled library version's upstream release date. A 2012 firmware carrying a 2007 OpenSSL is five years stale, full stop.
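As a small worked sketch of that subtraction (the helper name, the 365.25-day year, and the example dates are assumptions chosen for illustration, not values from the corpus):

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # assumed convention for converting days to years

def library_age_years(firmware_release, library_release):
    """Age of the bundled library on the day the firmware was released."""
    return (firmware_release - library_release).days / DAYS_PER_YEAR

# Hypothetical dates: a library released mid-2007 inside firmware released mid-2012.
age = library_age_years(date(2012, 6, 1), date(2007, 6, 1))
print(round(age, 1))  # 5.0
```

A negative result would mean the recovered library version postdates the firmware, which is a useful sanity check on the metadata.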
Known CVEs at firmware release is the number of distinct CVEs that (a) apply to the shipped library version and (b) were published on or before the firmware's release date—that is, vulnerabilities a diligent engineer could already have found the day they shipped. Version applicability and publication dates came from authoritative sources: the National Vulnerability Database's CPE version ranges for OpenSSL, BusyBox, dnsmasq, and lighttpd, and curl's own first-party machine-readable advisory feed. We built a version comparator that understands OpenSSL's letter-suffix scheme (so that 0.9.8 < 0.9.8a < … < 0.9.8za) and validated it against every distinct version string in the corpus. Using the NVD/vendor publication date as the “known” date is mildly conservative—real disclosure sometimes precedes NVD publication—so if anything we undercount.
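The letter-suffix ordering and the counting rule can be sketched as follows. This is an illustrative reimplementation, not our validated comparator: the advisory tuple shape, the inclusive version bounds, and the placeholder CVE identifiers are all assumptions made for the example.

```python
import re
from datetime import date

_VERSION_RX = re.compile(r"^(\d+)\.(\d+)\.(\d+)([a-z]*)$")

def openssl_key(version):
    """Sort key for letter-suffixed OpenSSL versions such as 0.9.8, 0.9.8a, 0.9.8za."""
    match = _VERSION_RX.match(version)
    if match is None:
        raise ValueError(f"unrecognized OpenSSL version: {version!r}")
    major, minor, fix, letters = match.groups()
    # Comparing suffixes by (length, text) gives "" < "a" < ... < "z" < "za" < "zb".
    return (int(major), int(minor), int(fix), len(letters), letters)

def known_cves_at_release(version, firmware_release, advisories):
    """Count distinct CVEs that affect `version` and were published by `firmware_release`.

    `advisories` is a hypothetical iterable of
    (cve_id, first_affected, last_affected, published) with inclusive bounds.
    """
    key = openssl_key(version)
    return len({
        cve_id
        for cve_id, first, last, published in advisories
        if openssl_key(first) <= key <= openssl_key(last)
        and published <= firmware_release
    })

# Placeholder records, not real CVE data.
advisories = [
    ("CVE-EXAMPLE-1", "0.9.8", "0.9.8k", date(2009, 1, 1)),   # applies, already public
    ("CVE-EXAMPLE-2", "0.9.8", "0.9.8za", date(2014, 6, 5)),  # applies, not yet public
    ("CVE-EXAMPLE-3", "0.9.8f", "0.9.8za", date(2010, 1, 1)), # version not affected
]
print(known_cves_at_release("0.9.8e", date(2012, 1, 1), advisories))  # 1
```

Real NVD ranges mix inclusive and exclusive bounds, so a faithful implementation needs to carry that flag per range rather than assume inclusivity as this sketch does.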
What we found
| Metric (per binary, first shipment) | Median | Mean | Max |
|---|---|---|---|
| Library age at firmware release | 5.4 years | 5.45 years | 16.8 years |
| Known CVEs at firmware release | 15 | 20.9 | 80 |
The pattern is stark and consistent. Measured at each binary's earliest firmware shipment—the most charitable “from the start” reading—the median open-source library was already 5.4 years old, and the median binary shipped carrying 15 already-published CVEs. 96.3% of binaries shipped with at least one known vulnerability in the bundled library.
| Library | Binaries | Median age (y) | Median known CVEs | % with ≥1 CVE |
|---|---|---|---|---|
| libcrypto (OpenSSL) | 488 | 5.84 | 35 | 98.6 % |
| openssl (OpenSSL) | 135 | 5.11 | 26 | 98.5 % |
| libssl (OpenSSL) | 111 | 3.35 | 20 | 96.4 % |
| curl | 39 | 3.59 | 23 | 97.4 % |
| libcurl | 39 | 3.56 | 22 | 97.4 % |
| busybox | 205 | 9.04 | 4 | 99.0 % |
| dnsmasq | 234 | 4.71 | 2 | 89.3 % |
| lighttpd | 80 | 3.58 | 8 | 91.2 % |
The tails are worse than the medians suggest. More than half of all binaries (54.8%) were at least five years out of date on launch day; 19% were eight or more years behind. On the vulnerability side, 59% shipped with ten or more known CVEs and 10% with fifty or more. The extreme cases read like archaeology: a dnsmasq from 2002 shipping in a Belkin router released in 2019 (16.8 years stale); OpenSSL 0.9.8e from 2007 shipping in 2017 Netgear flagships with 80 of its 82 lifetime CVEs already public; curl 7.42.1 from 2015 shipping in a 2025 Linksys device, again carrying 80 publicly known CVEs.
| Library age at first shipment | Binaries | Share |
|---|---|---|
| < 1 year | 138 | 10.4 % |
| 1–2 years | 97 | 7.3 % |
| 2–3 years | 108 | 8.1 % |
| 3–5 years | 259 | 19.5 % |
| 5–8 years | 476 | 35.8 % |
| 8+ years | 253 | 19.0 % |
| Known CVEs at first shipment | Binaries | Share |
|---|---|---|
| 0 | 49 | 3.7 % |
| 1–4 | 312 | 23.4 % |
| 5–9 | 183 | 13.8 % |
| 10–24 | 313 | 23.5 % |
| 25–49 | 338 | 25.4 % |
| 50+ | 136 | 10.2 % |
| Known CVEs at ship | Distribution | Binaries | |
|---|---|---|---|
| 0 | ████████ | 49 | |
| 1–4 | ██████████████████████████████████████████████████ | 312 | ← mode 1 (busybox/dnsmasq) |
| 5–9 | █████████████████████████████ | 183 | |
| 10–14 | ███████████████████ | 118 | |
| 15–19 | ███████████████ | 94 | ← trough (median lives here) |
| 20–29 | █████████████████████ | 134 | |
| 30–39 | ██████████████████████████████████████ | 234 | ← mode 2 (OpenSSL family) |
| 40–49 | ███████████ | 71 | |
| 50–59 | ████████ | 53 | |
| 60–69 | ███████████ | 71 | |
| 70–80 | ██ | 12 | |
A natural worry is that the high counts are outliers dragging the average around. They are not. By a standard outlier criterion (Tukey's 1.5×IQR rule), the dataset contains zero high outliers: even the single most-vulnerable binary sits just below the outlier threshold. The high counts are not anomalies; they are a second mode. The distribution is bimodal, split by library family: BusyBox, dnsmasq, and lighttpd have small CVE histories and cluster low (medians of 2–8 known CVEs), while the security-critical OpenSSL family and curl cluster high (a family-wide median of 33 for OpenSSL and 23 for curl). In other words, the libraries that matter most for a network device (the TLS/crypto stack and the HTTP client) are precisely the ones shipping with dozens of known holes.
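For readers who want to run the same outlier check on their own counts, Tukey's rule flags any value above Q3 + 1.5 × (Q3 − Q1). A minimal sketch follows; the quartile interpolation method is an assumption (different tools compute quartiles slightly differently), and the sample values are made up.

```python
import statistics

def tukey_high_fence(values):
    """Upper fence of Tukey's rule: Q3 + 1.5 * IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 + 1.5 * (q3 - q1)

def high_outliers(values):
    """Values strictly above the upper fence."""
    fence = tukey_high_fence(values)
    return [v for v in values if v > fence]

# Made-up counts, not corpus data.
print(high_outliers([1, 2, 3, 4, 5, 6, 7, 8, 9]))      # []
print(high_outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 50]))  # [50]
```

Because the fence scales with the interquartile range, a wide bimodal distribution pushes it far out, which is why a second mode of high counts does not register as a set of outliers.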
Has it gotten better over time?
No. Grouping every shipment by its firmware release year and fitting a trend, we find that library staleness grew by about 0.16 years per calendar year and the known-CVE load grew by about 1.3 CVEs per year. The share of shipments carrying at least one known CVE rose from ~91% in 2006–2010 to ~97–99% in every later era. The most recent firmware we have (2021–2025) ships a median 5.25-year-old library with 19 known CVEs, no better than a decade earlier. We treat this temporal finding as suggestive rather than definitive, because our recent-year sample is thin, but there is plainly no evidence of improvement.
| Era | Shipments | Median age (y) | Median known CVEs | % ≥1 CVE |
|---|---|---|---|---|
| 2006–2010 | 222 | 3.66 | 11 | 91.4 % |
| 2011–2015 | 1,723 | 5.72 | 17 | 98.5 % |
| 2016–2020 | 1,219 | 5.00 | 16 | 98.8 % |
| 2021–2025 | 170 | 5.25 | 19 | 97.1 % |
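One simple way to fit such a trend is an ordinary least-squares line through per-year medians. The sketch below shows that approach under stated assumptions: the exact fitting procedure behind the figures above may differ, the function names are hypothetical, and the sample shipments are invented.

```python
from collections import defaultdict
from statistics import mean, median

def yearly_medians(shipments):
    """Map firmware release year -> median value, from (year, value) pairs."""
    by_year = defaultdict(list)
    for year, value in shipments:
        by_year[year].append(value)
    return {year: median(values) for year, values in sorted(by_year.items())}

def ols_slope(points):
    """Least-squares slope of y on x for a dict {x: y} with at least two distinct x."""
    xs, ys = list(points), list(points.values())
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator

# Invented (year, library age in years) pairs, not corpus data.
shipments = [(2010, 3.0), (2010, 5.0), (2011, 4.5), (2012, 5.0), (2012, 5.0), (2012, 6.0)]
print(ols_slope(yearly_medians(shipments)))  # 0.5
```

Fitting on per-year medians gives thin years the same weight as dense ones, which is one reason a slope like this should be read as suggestive when recent years have few samples.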
Is a corpus like this representative?
We owe readers an honest accounting of the limits, because the obvious objection is that this is a modest, opportunistic sample drawn from firmware that happens to be emulatable. Three points make us confident the core conclusion holds well beyond the sample.
Emulation shaped what we collected, not what we measured. The firmware emulation frameworks Firmadyne and FirmAE determined which firmware entered the corpus, but our metrics are read from static metadata (library version, library release date, firmware release date), none of which depends on emulation succeeding. And because every figure is computed relative to each firmware's own release date, the common “your corpus is just old” critique simply does not bite: staleness is measured at launch, not today.
The selection bias runs in our favor. Every filter that produced this corpus selects the more responsible end of the market. We sampled established brands with real security programs and public download portals, not the vast white-label and no-name long tail that dominates IoT by volume and patches even less. We measured five of the best-run open-source projects anywhere—projects with disciplined versioning and published CVEs—not the proprietary SDKs and board-support libraries that get even less attention. Our numbers are therefore best read as a conservative floor for consumer IoT, not a worst case.
The result is robust inside the sample. It does not depend on our two dominant vendors: every vendor with at least ten samples shows a multi-year median staleness, and dropping Netgear and D-Link entirely leaves the picture largely unchanged (median 5.6 years stale, 14 known CVEs, 87.8% with ≥1 CVE). Collapsing the data to distinct (vendor, product, library, version) combinations, which removes any inflation from the same binary being reused across products, gives 878 independent cases and nearly the same medians (4.9 years, 15 CVEs). A finding that survives discarding 91% of the rows is not a sampling artifact; the effect size is simply too large for sampling noise to erase.
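The collapsing step amounts to keeping one row per (vendor, product, library, version) key and recomputing the medians. A rough sketch, assuming hypothetical row field names and assuming the earliest firmware date is the row kept for each key:

```python
from datetime import date
from statistics import median

def collapse(rows):
    """Keep one row per (vendor, product, library, version), preferring the earliest firmware."""
    best = {}
    for row in rows:
        key = (row["vendor"], row["product"], row["library"], row["version"])
        if key not in best or row["firmware_date"] < best[key]["firmware_date"]:
            best[key] = row
    return list(best.values())

# Invented rows with hypothetical field names, not corpus data.
rows = [
    {"vendor": "V", "product": "P", "library": "busybox", "version": "1.0",
     "firmware_date": date(2015, 1, 1), "age_years": 4.0},
    {"vendor": "V", "product": "P", "library": "busybox", "version": "1.0",
     "firmware_date": date(2013, 1, 1), "age_years": 2.0},
    {"vendor": "V", "product": "Q", "library": "busybox", "version": "1.0",
     "firmware_date": date(2014, 1, 1), "age_years": 3.0},
]
distinct = collapse(rows)
print(len(distinct), median(r["age_years"] for r in distinct))  # 2 2.5
```

Comparing the medians before and after this step is a quick way to see whether repeated shipments of the same build are driving a result.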
What we do not claim is a precise percentage for “all of IoT,” nor any extrapolation to RTOS or microcontroller-class devices, which follow a different software model we did not study. But the mechanism behind the result is industry-wide and independent of which devices are emulatable: firmware is built once against a frozen vendor SDK or board-support package, the open-source versions are pinned at SDK-creation time, and the image is rarely rebuilt before launch—let alone refreshed over the product's service life. That pipeline is why devices launch outdated, and it does not care whether a given device made it into our corpus. The same pattern appears in independent large-scale firmware studies and in annual open-source-risk reports, which is what we would expect if the cause is structural rather than peculiar to our sample.
Conclusion
For the slice of IoT we can measure directly (Linux-based consumer routers, gateways, and cameras from major vendors), shipping years-old open-source libraries riddled with already-public CVEs is the norm, not the exception. The median binary ships half a decade behind upstream with fifteen known vulnerabilities baked in, and there is no sign the industry is improving. Because we sampled the disciplined end of the market and the best-maintained libraries, the broader reality is very likely worse.
This is exactly the risk landscape Aarno Labs builds for. When a vulnerability lives in a stale dependency and the vendor has moved on—or never intended to update the device at all—the practical path forward is often to analyze and remediate the binary directly, with assurance, as we have written about previously with our CodeHawk Binary Patcher. The dataset behind this post is now available for the research community; if you work on firmware security, software composition analysis, or binary remediation, we would love for you to use it. And if you are wrestling with vulnerabilities in devices the vendor won't fix, please get in touch.