As part of DARPA’s E-BOSS program, Aarno recently participated in a one-day on-site evaluation hosted by the Technical Area 2 team, Cromulence. For this evaluation, Cromulence supplied challenge applications and CVE records, and asked performers to rapidly determine whether each vulnerability was reachable, remediate reachable vulnerabilities, and synthesize proof-of-vulnerability inputs.
This kind of workflow exposes a central difficulty in vulnerability triage: reachable is not the same as triggerable.
A function may be present in a vulnerable library. It may even be reachable from an application entry point. But that still does not prove that an attacker can drive the application into the vulnerable state. Conversely, a vulnerability record may point at an imprecise function, misleading static analysis toward the wrong question. What defenders ultimately need is concrete evidence: an input that demonstrates triggerability in the deployed application context.
That is the role of DIODE.
DIODE is Aarno’s input-synthesis workflow for moving from candidate reachability to concrete proof-of-vulnerability generation. Given a seed input and a target execution region, DIODE uses dynamic taint tracking to identify which input bytes influence control flow and vulnerability-relevant program state. It then builds symbolic expressions over those input-derived values, explores alternate branches by solving modified constraints, and validates the resulting inputs by running the program.
When DIODE succeeds, the output is not just a static argument that a vulnerability might matter. It is a concrete PoV input, along with runtime evidence showing how that input drives the program into the vulnerable behavior.
This post walks through one E-BOSS case study: CVE-2025-28164 in the pdfmaker challenge. The example is useful because the vulnerability was not a simple path-to-function problem. The provided CVE metadata pointed at one libpng function, while the actual trigger condition involved a violated error-handling contract between libpng and libharu. That distinction made the case ambiguous for static reasoning, but a good fit for DIODE’s runtime, input-driven analysis.
The challenge: imprecise CVE metadata and a non-obvious trigger condition
The pdfmaker challenge uses libharu, a PDF generation library, to embed PNG or JPEG images into a generated PDF. The CVE record supplied during the evaluation identified a libpng vulnerability:
{
  "cve-id": "CVE-2025-28164",
  "cve-description": "Buffer Overflow vulnerability in libpng 1.6.43-1.6.46 allows a local attacker to cause a denial of service via png_create_read_struct() function.",
  "package-name": "libpng",
  "package-version": ">=1.6.43|<=1.6.46",
  "cwe-id": "120",
  "cwe-name": "Buffer Copy without Checking Size of Input ('Classic Buffer Overflow')",
  "affected-function": "png_create_read_struct",
  "affected-file": "pngread.c"
}

The record names png_create_read_struct as the affected function. That is a real function on the PNG-loading path, but it is not the most useful description of the trigger condition. The actual issue arises from the way libharu integrates with libpng’s fatal error mechanism.
libpng’s error handling relies on a specific contract: when libpng calls png_error(), that error is fatal. The registered error handler must not return. The normal recovery mechanism is setjmp/longjmp, although an application could choose to abort or exit instead. What matters is that control must not continue back into libpng after a fatal error.
libharu registers a custom PNG error callback, PngErrorFunc. In libharu 2.4.5, that callback records an error but then returns normally:
// libharu-2.4.5/src/hpdf_image_png.c
static void
PngErrorFunc (png_structp png_ptr,
              const char *msg)
{
    char error_number[16];
    HPDF_UINT i;
    HPDF_STATUS detail_no;
    HPDF_Error error;

    HPDF_MemSet (error_number, 0, 16);

    for (i = 0; i < 15; i++) {
        error_number[i] = *(msg + i);
        if (*(msg + i + 1) == ' ')
            break;
    }

    error = (HPDF_Error)png_get_error_ptr (png_ptr);
    detail_no = (HPDF_STATUS)HPDF_AToI (error_number);
    HPDF_SetError (error, HPDF_LIBPNG_ERROR, detail_no);
} // returns normally

That return is the bug.
libpng’s internal code is written with the assumption that png_error() does not return. Code after a png_error() call is intentionally not defensive, because in a correctly integrated client that code is unreachable. If the application’s error callback returns anyway, execution continues down a path libpng was never designed to support. That path can operate on attacker-controlled values that have already been deemed invalid.
For example:
if (uval > PNG_UINT_31_MAX)
    png_error(png_ptr, "PNG unsigned integer out of range");
return uval; // intended to be unreachable after png_error()

In a correct integration, the return uval statement is not reached when uval is invalid. In the vulnerable libharu integration, PngErrorFunc returns, png_error() returns, and the invalid value continues to flow through libpng’s parser.
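The difference between a returning handler and a non-returning one can be reproduced without libpng at all. The following is a minimal sketch, not libpng or libharu code: every `toy_` name, both handlers, and `invalid_value_escapes` are hypothetical stand-ins that only mirror the shape of the range check above.

```c
#include <setjmp.h>
#include <stdint.h>

#define TOY_UINT_31_MAX 0x7fffffffU

/* Hypothetical stand-in for a libpng-style fatal error hook. */
typedef void (*toy_error_fn)(const char *msg);

static jmp_buf toy_recover;   /* recovery point used by the correct handler */
static int toy_errors_seen;   /* number of fatal errors reported */

/* Broken handler: records the error, then returns to the caller. */
void returning_handler(const char *msg) {
    (void)msg;
    toy_errors_seen++;
}

/* Correct handler: records the error, then never returns. */
void jumping_handler(const char *msg) {
    (void)msg;
    toy_errors_seen++;
    longjmp(toy_recover, 1);
}

/* Same shape as the range check: the return is meant to be
 * unreachable when the value is out of range. */
static uint32_t toy_get_uint_31(uint32_t uval, toy_error_fn on_error) {
    if (uval > TOY_UINT_31_MAX)
        on_error("unsigned integer out of range");
    return uval;
}

/* Returns 1 if an out-of-range value flowed past the check, 0 otherwise. */
int invalid_value_escapes(uint32_t uval, toy_error_fn on_error) {
    toy_errors_seen = 0;
    if (setjmp(toy_recover) != 0)
        return 0;             /* the fatal error unwound to here */
    uint32_t v = toy_get_uint_31(uval, on_error);
    return v > TOY_UINT_31_MAX;
}
```

With `returning_handler`, an out-of-range value such as 0x80000019 is handed back to the caller as if it were valid. With `jumping_handler`, control never reaches the `return uval` line for that value.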
This makes the vulnerability more subtle than “can the application reach png_create_read_struct?” The more relevant question is: can attacker-controlled PNG input cause libpng to call png_error() while using libharu’s returning error handler?
That is a triggerability question, not just a reachability question.
Why static reasoning is not enough
During the evaluation, Aarno’s reachability analysis identified candidate paths from the application into the libpng/libharu PNG-loading region. That was useful triage: it showed that the application exercised code related to the CVE-attributed function and should not be dismissed as merely containing an unused vulnerable dependency.
But the CVE metadata did not precisely describe the real trigger condition. It named png_create_read_struct, while the actual defect was broader: any fatal libpng error reached through libharu’s returning error callback could create an invalid continuation path.
That distinction matters. Static reachability can answer questions like:
- Is the vulnerable package present?
- Is the affected function present?
- Is there a plausible call path from an application entry point to that function?
- Is the function imported, exported, or called through indirect edges?
Those are valuable triage questions. But they do not necessarily identify the input condition that triggers the vulnerability, especially when the root issue is a violated library contract rather than a single obviously dangerous call.
A static review can also become anchored to the CVE’s function attribution. If the CVE points at a function that is on the setup path rather than the failure path, the analysis can spend its effort proving or disproving the wrong thing. In this case, the crucial fact was not merely that png_create_read_struct was reachable. The crucial fact was that attacker-controlled PNG data could force libpng to call png_error(), and libharu’s error handler would return when it must not.
This is where DIODE changes the nature of the evidence.
Instead of relying only on a static argument about whether a path exists, DIODE observes an actual execution. It tracks which input bytes influence branch conditions and parser state. It then asks: what would happen if we changed these input-derived values? Can we flip a constraint and drive execution into a crashing state?
That is how we move from “probably reachable” to “concretely triggerable.”
Claude Sonnet 4.6: A plausible but wrong static conclusion
We also asked an LLM to review the application and CVE metadata. Its conclusion was technically detailed and plausible: it argued that the relevant libpng failure path depended on the simplified API, while pdfmaker used libharu through the traditional API, and it therefore concluded that the CVE was unreachable in the application. Given the CVE’s attribution to png_create_read_struct, that was a reasonable line of analysis. But it answered the wrong question. The trigger did not depend on the simplified API path. It depended on any libpng fatal error reaching libharu’s returning PngErrorFunc.
DIODE’s workflow
DIODE starts with a seed input that exercises the relevant parsing path. For pdfmaker, the application takes two attacker-controlled fields: a title string and an uploaded image. The vulnerable behavior is in PNG processing, so the useful seed is a valid PNG that reaches libpng’s chunk parser through libharu.
DIODE runs the application with dynamic taint tracking enabled. As the program processes the seed, DIODE records branch constraints whose operands depend on input bytes. For each such branch, it can build a symbolic expression over the corresponding bytes.
The workflow is:
- Run the program on a seed input.
- Track input-derived bytes through parsing and control flow.
- Extract symbolic expressions for input-derived branch constraints.
- Select constraints to explore.
- Invert a constraint and solve for concrete byte values.
- Mutate the input accordingly.
- Re-run the program and check whether the new input triggers the vulnerability.
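The loop above can be illustrated with a deliberately small toy. This is a sketch under heavy assumptions, not DIODE's implementation: the target logs its own constraints instead of using taint tracking, the "solver" just picks bound + 1, and every name here (`toy_constraint`, `toy_run`, `toy_synthesize`) is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_CONSTRAINTS 8
#define TOY_BOUND 0x7fffffffU

/* Hypothetical record: on the seed run, the big-endian 32-bit value at
 * `offset` satisfied value <= bound. */
typedef struct {
    size_t   offset;
    uint32_t bound;
} toy_constraint;

static uint32_t read_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

static void write_be32(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Stand-in target: treats the input as consecutive 4-byte fields and returns
 * 1 if any field exceeds the bound (the "vulnerable" outcome), else 0.
 * When `log` is non-NULL, records one constraint per field that passed. */
int toy_run(const uint8_t *in, size_t len, toy_constraint *log, size_t *n_log) {
    size_t n = 0;
    if (n_log)
        *n_log = 0;
    for (size_t off = 0; off + 4 <= len; off += 4) {
        if (read_be32(in + off) > TOY_BOUND)
            return 1;
        if (log && n_log && n < TOY_MAX_CONSTRAINTS) {
            log[n].offset = off;
            log[n].bound = TOY_BOUND;
            *n_log = ++n;
        }
    }
    return 0;
}

/* For each logged constraint: invert it, mutate a copy of the seed, re-run,
 * and count the mutants that reach the vulnerable outcome. The first such
 * mutant is copied to `first_pov` when that pointer is non-NULL. */
size_t toy_synthesize(const uint8_t *seed, size_t len, uint8_t *first_pov) {
    toy_constraint log[TOY_MAX_CONSTRAINTS];
    uint8_t mutant[64];
    size_t n_log = 0, hits = 0;

    if (len > sizeof mutant || toy_run(seed, len, log, &n_log) != 0)
        return 0;   /* seed too large for this toy, or already failing */

    for (size_t i = 0; i < n_log; i++) {
        memcpy(mutant, seed, len);
        /* Inverted constraint is value > bound; bound + 1 is one solution. */
        write_be32(mutant + log[i].offset, log[i].bound + 1U);
        if (toy_run(mutant, len, NULL, NULL) == 1) {   /* validate by re-running */
            if (hits == 0 && first_pov)
                memcpy(first_pov, mutant, len);
            hits++;
        }
    }
    return hits;
}
```

The part worth noticing is the final re-run: a mutant only counts once the target actually reaches the failing outcome on it, which is the difference between a solved constraint and a validated input.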
This is narrower than general-purpose fuzzing and more targeted than broad concolic exploration. DIODE is not trying to maximize coverage across the whole program. It is trying to use runtime evidence from a relevant execution to synthesize an input that proves triggerability.
The post-event engineering lesson: input-source mapping matters
During the one-day evaluation, DIODE did not produce the official PoV before time expired. The post-event analysis exposed an important engineering issue in our harnessing of the challenge.
Because pdfmaker takes both a title and an image, and because DIODE at the time expected a single seed file, we initially packed both attacker-controlled fields into one file: title bytes first, then PNG bytes. The harness split that packed seed into the two application inputs.
That seemed natural. Both fields come from the attacker. The title is shorter, so placing it first was a reasonable layout.
But pdfmaker does not process the uploaded PNG bytes directly. It extracts the image from the multipart upload, writes it to a temporary file, and then libharu re-reads that file through HPDF_LoadPngImageFromFile, which eventually calls fread().
That detail changed the taint coordinate system.
The bytes libpng processed were tainted relative to the temporary PNG file, not relative to the original packed seed. So when DIODE’s solver identified, for example, PNG byte 0x4c as the byte to mutate, that was correct relative to the PNG file. But the mutator wrote to byte 0x4c of the packed seed. Since the packed seed started with the title field, byte 0x4c in the packed seed was not byte 0x4c of the PNG. It was shifted by the title prefix.
The solver had identified the right kind of mutation. The harness applied it to the wrong byte.
Once we understood this, the fix was straightforward: use the PNG itself as the DIODE seed and keep the title fixed in the client. The vulnerability is in PNG processing, so the title does not need to be part of the symbolic input. With that harness correction, the taint offsets and mutation offsets aligned.
This was a useful product lesson. Multi-source inputs, temporary files, and re-read boundaries need first-class modeling. But the underlying DIODE workflow was doing the right thing: it had to connect input-derived constraints to the correct concrete input bytes.
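To make the offset mismatch concrete, here is a small sketch of the translation the original packed-seed harness was missing. It is not the fix we shipped (we switched to using the PNG alone as the seed); the `packed_layout` struct and both functions are hypothetical, and the title length in any usage is an illustrative value.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packed-seed layout: title bytes first, then PNG bytes. */
typedef struct {
    size_t title_len;   /* bytes of title at the start of the packed seed */
    size_t png_len;     /* bytes of PNG that follow the title */
} packed_layout;

/* Translate an offset reported relative to the PNG file into an offset in
 * the packed seed. Returns 0 on success, -1 if the offset is out of range. */
int png_to_packed_offset(const packed_layout *l, size_t png_off,
                         size_t *packed_off) {
    if (png_off >= l->png_len)
        return -1;
    *packed_off = l->title_len + png_off;
    return 0;
}

/* Apply a one-byte mutation, given at a PNG-relative offset, to the
 * packed seed. Returns 0 on success, -1 if the offset is out of range. */
int mutate_png_byte(uint8_t *packed, const packed_layout *l,
                    size_t png_off, uint8_t value) {
    size_t off;
    if (png_to_packed_offset(l, png_off, &off) != 0)
        return -1;
    packed[off] = value;
    return 0;
}
```

With a five-byte title, for instance, PNG byte 0x4c lives at packed-seed byte 0x51. Writing to packed-seed byte 0x4c instead lands five bytes early, inside unrelated PNG data.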
DIODE synthesizes a direct PoV
With the harness corrected, DIODE ran end-to-end on a benign PNG seed and found a direct trigger in a single constraint-flip round.
The relevant libpng code is in png_get_uint_31():
png_uint_32 /* PRIVATE */
png_get_uint_31(png_const_structrp png_ptr, png_const_bytep buf)
{
    png_uint_32 uval = png_get_uint_32(buf);

    if (uval > PNG_UINT_31_MAX)
        png_error(png_ptr, "PNG unsigned integer out of range");

    return uval;
}

This function is called when libpng reads a PNG chunk length. The PNG format stores each chunk length as a four-byte integer. libpng requires that length to fit within PNG_UINT_31_MAX. If the high bit is set, libpng calls png_error().
In a correct client, that fatal error does not return.
In this challenge, because libharu’s PngErrorFunc returns normally, libpng continues into code that assumes the invalid value could not exist.
DIODE observed the branch condition:
uval > PNG_UINT_31_MAX

It tracked the input bytes flowing into uval, represented the check as a symbolic expression, and inverted the branch. The resulting solution was simple: set the most significant byte of a PNG chunk length to 0x80.
For one seed, DIODE identified byte 0x4c as the most significant byte of a tEXt chunk’s length field. Changing that byte from 0x00 to 0x80 produced a chunk length of 0x80000019, which exceeds PNG_UINT_31_MAX.
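The arithmetic behind that single-byte change is easy to check by hand. The sketch below decodes a four-byte, most-significant-byte-first length field and compares it against 0x7fffffff, the value of PNG_UINT_31_MAX. The function names and the `TOY_UINT_31_MAX` macro are ours, chosen for illustration, and are not libpng identifiers.

```c
#include <stdint.h>

#define TOY_UINT_31_MAX 0x7fffffffU   /* same value as PNG_UINT_31_MAX */

/* Decode a PNG chunk length: four bytes, most significant byte first. */
uint32_t chunk_length(const uint8_t len_field[4]) {
    return ((uint32_t)len_field[0] << 24) |
           ((uint32_t)len_field[1] << 16) |
           ((uint32_t)len_field[2] << 8)  |
            (uint32_t)len_field[3];
}

/* Returns 1 if the length would fail the 31-bit range check, 0 otherwise. */
int length_out_of_range(const uint8_t len_field[4]) {
    return chunk_length(len_field) > TOY_UINT_31_MAX;
}
```

The bytes 00 00 00 19 decode to 0x19 and pass the check. Changing only the first byte to 0x80 gives 0x80000019, which is above 0x7fffffff and therefore takes the png_error() branch.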
The resulting execution produced:
libpng error: PNG unsigned integer out of range
==Got deadly signal, dumping current buffer
==ABORTING due to signal 6 Exit value: 134

This is a compact proof of triggerability: a single-byte mutation to an otherwise normal PNG drives the application into the vulnerable behavior.
It is also a different trigger from the TA2-provided PoV. TA2’s PoV triggered the bug through chunk-data exhaustion: a malformed chunk caused libpng to eventually read invalid bytes as a chunk type and raise a fatal error. DIODE found a more direct route: violate the explicit chunk-length sanity check itself.
Both paths exercise the same underlying defect: libpng raises a fatal error, libharu’s error handler returns, and execution continues when it must not.
DIODE generalizes beyond one PoV
The strongest result was not merely that DIODE reproduced a crash. It generalized the trigger condition.
The relevant constraint appears once for each PNG chunk length processed by libpng. A PNG file is a sequence of chunks — for example, IHDR, gAMA, PLTE, tEXt, IDAT, and IEND. Each chunk has a length field. Each length field is checked by png_get_uint_31().
DIODE’s constraint log exposed this structure automatically. The same symbolic pattern appeared at multiple offsets, each corresponding to a different chunk length. DIODE could therefore synthesize crashing inputs by flipping the high bit of the length field for multiple chunk types.
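The repeated structure is visible from the PNG layout alone: after the 8-byte signature, each chunk is a 4-byte length, a 4-byte type, the data, and a 4-byte CRC. The sketch below walks that layout and lists where each length field starts. It is a hand-written illustration, not how DIODE derives the offsets (DIODE gets them from the constraint log), and `be32_at` and `chunk_length_offsets` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 32-bit value. */
static uint32_t be32_at(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Walk a PNG buffer and record the offset of each chunk's length field.
 * CRCs are not verified. Returns the number of offsets written to `out`. */
size_t chunk_length_offsets(const uint8_t *png, size_t len,
                            size_t *out, size_t max_out) {
    size_t pos = 8;   /* skip the 8-byte PNG signature */
    size_t n = 0;

    /* Each chunk needs at least length + type + CRC = 12 bytes. */
    while (n < max_out && pos + 12 <= len) {
        uint32_t clen = be32_at(png + pos);
        out[n++] = pos;
        if (clen > len - pos - 12)
            break;    /* chunk claims more data than remains in the buffer */
        pos += 12 + (size_t)clen;
    }
    return n;
}
```

Each returned offset is the most significant byte of one length field, so each one is a candidate for the same high-bit mutation.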
Across valid seed PNGs with different chunk mixes, DIODE produced triggers for several chunk types, including:
- IHDR
- gAMA
- PLTE
- tEXt
- IDAT
This matters because it demonstrates that DIODE was not narrowly memorizing the TA2 PoV or exploiting a single hand-identified offset. It was operating on the program’s input-derived constraints. Once it observed the parser’s length checks, it could synthesize a family of related PoVs.
In fact, an even smaller valid PNG seed — a 1×1 white pixel image — was sufficient. That seed exposed fewer total constraints than the larger PNG but still included multiple png_get_uint_31() checks. Any one of those chunk-length checks was enough to trigger the vulnerable error-handling path.
That is the value of input synthesis over manual PoV construction: DIODE can turn a runtime constraint set into concrete exploitability evidence, and it can reveal that the trigger surface is broader than a single crafted file.
What this case shows
This case study highlights three lessons.
First, CVE metadata is often not precise enough to answer triggerability. The supplied record named png_create_read_struct, but the actual issue was a violated libpng error-handling contract. A tool or analyst anchored too tightly to the named function could miss the broader condition.
Second, reachability analysis and input synthesis answer different questions. Reachability is essential triage: it tells us whether vulnerable code is present and plausibly exposed. DIODE adds a different kind of evidence: whether attacker-controlled input can actually drive the program into the vulnerable state.
Third, runtime input synthesis depends on accurate modeling of input sources. In the one-day evaluation, our initial packed-seed harness caused taint offsets and mutation offsets to disagree. Once we corrected the seed model, DIODE synthesized a distinct PoV automatically and generalized the trigger across multiple PNG chunk types.
Bonus finding: DIODE found an unrelated crash
While running DIODE on valid PNG seeds, we also found a separate crashing input unrelated to the TA2-provided CVE.
After loading the uploaded image, pdfmaker scales it to fit the PDF page:
int width = HPDF_Image_GetWidth(image) * 500 / HPDF_Image_GetHeight(image);

There is no guard against a zero-height image. DIODE’s taint analysis observed the PNG IHDR height bytes flowing into the denominator. By exploring alternate values for those bytes, DIODE produced a PNG with a zero-height IHDR field, triggering a divide-by-zero crash in the application’s rendering logic.
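One way to guard this computation is sketched below. This is a hypothetical helper, not code from pdfmaker, and it assumes the image dimensions arrive as unsigned 32-bit values. It also widens the multiplication to 64 bits, which the original expression does not do, so results for very large widths can differ from the original.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical guarded version of the scaling step. Stores the scaled width
 * in *out and returns 0 on success. Returns -1 if the height is zero or the
 * result does not fit in an int; *out is left untouched in that case. */
int scaled_width(uint32_t img_width, uint32_t img_height, int *out) {
    if (img_height == 0)
        return -1;    /* reject the zero-height case instead of dividing */

    uint64_t w = (uint64_t)img_width * 500u / img_height;
    if (w > (uint64_t)INT_MAX)
        return -1;

    *out = (int)w;
    return 0;
}
```

A caller would treat a -1 return as an invalid image and skip rendering, rather than letting the division run on attacker-controlled dimensions.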
This did not count toward the evaluation target, but it illustrates the same point: once DIODE observes how input bytes flow into program decisions and computations, it can synthesize concrete inputs that expose real application behavior.
Conclusion
The important lesson from this E-BOSS case is not that static analysis was useless, that LLMs are always wrong, or that reachability analysis has no value. The lesson is more specific:
Triggerability requires evidence.
Static reachability can tell us where to look. CVE metadata can identify packages, versions, and functions of interest. Human or LLM review can provide useful hypotheses. But for high-confidence vulnerability triage, defenders ultimately need to know whether attacker-controlled input can drive the deployed application into the vulnerable state.
DIODE provides that missing step.
In pdfmaker, the vulnerable behavior depended on a subtle library integration contract: libpng fatal errors must not return, but libharu’s error handler returned. Once DIODE was given an accurately modeled PNG input source, it identified the input-derived length checks, solved for concrete mutations, generated a direct PoV, and generalized the trigger across multiple PNG chunk types.
That is the core promise of DIODE: moving vulnerability analysis from plausible reachability to demonstrated triggerability.
This post was co-authored by Gary Nguyen, who has since left Aarno Labs.