This is the second in a series of posts that detail our CodeHawk Analysis Platform’s capabilities to help discover, understand, and remediate (with or without vendor intervention) vulnerabilities on production stripped binaries. The first post focused on how CodeHawk’s automated memory safety analysis flagged and enabled us to rapidly understand CVE 2024-12248, a buffer overflow vulnerability in the Contec CMS 8000 Patient Monitor. This post details how Aarno Labs employed CodeHawk’s high-assurance binary patching workflow to verifiably fix the vulnerability without costly and ad-hoc direct binary manipulation. This was work performed as part of our AMdP research project with funding from ARPA-H’s DigiHeals program in collaboration with STR and Vector 35. The CodeHawk Binary Patcher was initially developed under DARPA’s AMP program on our MRAM project.

Motivation

Unpatched vulnerabilities in legacy devices and software pose a serious risk. Users depend on the original vendor for opaque and unpredictable updates—or, in the case of end-of-life devices, forced to seek external workarounds such as ad-hoc network filters or complete device replacement. The CodeHawk Binary Patcher was developed to change that: it enables users to remediate vulnerabilities in end-of-life devices without vendor support, and with greater assurance than many vendor-supplied updates. Today, binary patching remains a niche, error-prone, and costly process that depends on deep reverse engineering expertise. CodeHawk streamlines this by introducing familiar source-level development workflows to binary patching—no source code required—and then applying rigorous validation and verification to ensure the patch’s correctness.

Aarno Labs employed CodeHawk to rapidly develop a patch for CVE 2024-12248 during the period between disclosing the vulnerability to Contec and the release of Contec's official patch. We had a patch very quickly, and given Contec’s lack of response to previously reported vulnerabilities and the severity of this flaw, this provides an excellent case study on the capabilities of CodeHawk to address a common scenario in mission-critical applications. In general, CodeHawk’s high-assurance firmware patches offer a path forward for organizations unwilling to accept the risk of running an unpatched device. By applying CodeHawk’s patch, they can continue using the device with greater confidence. Currently, U.S. regulatory agencies do not recognize third-party binary patches to medical device firmware as compliant. However, this position is under active review, and projects like ours are part of ongoing regulatory discussions.

CodeHawk Binary Patcher

The CodeHawk Binary Patcher provides workflows to modify a stripped binary without requiring reverse engineering experience and to verify and validate the changes are correct . A stripped binary is lifted into valid and modifiable C Code, with checkable proofs that validate the preservation of semantics of the translation. The lifted C Code can then be analyzed with CodeHawk’s C Analyzer to find potential sources of undefined behaviors, including memory safety vulnerabilities. When a vulnerability (or other bug) is discovered and understood, the developer can fix it by directly modifying the lifted C code, employing idiomatic C, e.g., variable names, types, and control flow constructs like conditional expressions, early returns, and loops.

CodeHawk then recognizes the changes to the C code and synthesizes binary instructions to perform the changed semantics. CodeHawk does not recompile anything and operates at an expression granularity. This enables CodeHawk to create minimally invasive patches that preserve as much of the original binary as possible to enable precise verification. Next, CodeHawk automatically inserts or replaces the new instructions in the binary. For modifications that insert instructions, CodeHawk automatically finds free space (for ELF binaries), and a trampoline is created in the found free space.

An important benefit of using CodeHawk Binary Patcher is that it allows you to focus on the high-level details of your remediation and think through the best approach instead of being bogged down by the amount of details required to successfully patch a binary. We will explore this with an example later in this post.

Assurance

Once CodeHawk has produced a modified binary, it focuses on demonstrating correctness. We focus on two aspects of correctness: 1) did CodeHawk correctly apply the semantics of the C modification to the binary, and 2) for vulnerabilities, does the patch close the vulnerability? For 1), CodeHawk provides a series of validations of the patch, built on its abstract interpretation and binary invariants framework. It then creates a comparison of the C changes (introduced by the developer) and the lifted binary changes that CodeHawk automatically enacted. For 2) CodeHawk lifts the modified binary to C, and applies its C analysis to attempt to prove that the change removes the undefined behaviors present in the original program related to the vulnerability.

CodeHawk also includes workflows that help ensure the patch does not affect correct behaviors (that should not have been altered by a remediation). For example, CodeHawk includes a C semantic comparison that helps determine whether intended function was affected by the patch, presented in the familiar diff format. Furthermore, CodeHawk’s invariants can be used as a basis for manually developing verifications of higher-level behavioral properties, e.g., state transitions and control flow correctness, and preservation of protocol adherence.

The high-assurance, developer-friendly, binary patching workflows are available in our CodeHawk BinaryNinja Plugin. If you are interested in using CodeHawk, please contact Aarno Labs.

Patching

As described in the first post in this series, CVE 2024-12248 is a buffer overflow vulnerability that can lead to remote code execution. The buffer overflow is caused by a lack of input sanitization on untrusted network data sent to the patient monitor, causing the code to use potentially malicious data as an index into a global array. Two functions are involved in the vulnerability: a listener function that consumes the network data and a parsing function that operates on the data.

Patch Development

Where should the patch be placed? The bounds can be checked in the parse function, just before the array accesses, eliminating the need to transfer any responsibility to the caller (listener) function. Alternatively, the bounds checks can be placed in the listener function where the data is received. There is no a priori best choice; usually, the choice is governed by considering which of the two functions is better suited to handle the error case, that is, if the index value is out of bounds. For example, if there was a need to log the error, one function may be preferred based on the information in the log message.

If we elevate the discussion, we can appreciate how our approach to binary patching enables high-level conversations similar to those that occur when developing fixes at the source code level. These discussions may be challenging to consider when manually modifying binary instructions.

Patching the Listener Function

In the listener function, data packets are received and processed in an infinite loop. Our strategy for handling the error case is to abandon the entire packet silently. This can be accomplished by cutting short the loop by invoking continue when invalid data is detected. We do this right before token_1 is assigned the value received:

As you can see, the patch can be written in straightforward C source code. It refers to local variables, uses struct accesses, adds a compound if statement, and executes continue to drop the packet and go back to waiting for network data.

Once the patch has been typed, we can ask the CodeHawk C Analyzer to confirm that our proposed fix addresses the issue. If you remember from the first post, the C Analyzer call to the parsing function had 4 violations:

With our proposed fix, the violations related to buffer overflow and underflow using token_1 are no longer there (the remaining violation refers to another field and is not exploitable):

Patching the Parsing Function

The parsing function handles a single packet of data at a time. It distinguishes several cases controlled by the value data->token_4 - 1, where data->token_4 is also a field assigned from data read from the network. The vulnerability causing CVE-2024-12248 is in case 2 ( data->token_4 = 3 ). The patch can be inserted at the start of that case:

As in the listener function case, the patch can be expressed in high-level terms, allowing us to refer directly to the data pointer passed as a function argument, and add an early return to prevent the buffer overflow. In this case, the early return is equivalent to dropping the packet, and thus has the same effect as the patch in the listener function.

Once again, we can ask the C Analyzer to check if our proposed fix addresses the issue. First, we can look at its output before we type the patch:

As explained in the previous post, whether the array access is safe is delegated to the caller of the function, as represented by the index-upper-bound and index-lower-bound preconditions. After typing our proposed fix, we can see that the patch closes the vulnerability with a verification of the absence of undefined behaviors (related to memory safety) that enabled the vulnerability:

The index-lower-bound and index-upper-bound proof obligations that were delegated to the API are no longer there since they are now proven locally safe using invariants obtained from the checks. Furthermore, the function preconditions for these proof obligations are now eliminated, along with the corresponding supporting proof obligations in the listener function.

Patch Generation

For our remediation, we chose to patch the listener function. Once the patch has been added to the lifting, the next step is to generate the patched binary. This is a completely automated process, relieving the developer from having to worry about the nuances, pitfalls, footguns, and minutiae of binary patching: Are there registers at my chosen patch location that have the values that I’m interested in, or do I need to log them from the stack or memory? What address should I jump to continue the loop iteration? Where should I place the trampoline? CodeHawk Binary Patcher takes care of all this, and applies the patch after a single button is clicked!

Since our patch adds code, a trampoline is inserted at the chosen location, encapsulating the bounds checks and continuing the loop if bad network data is received. Adding a trampoline minimizes the changes to existing binary instructions, and as we will see in the next section, guarantees minimal disruptions to the binary.

Assurance

Once a patched binary has been generated, we evaluate the assurances that CodeHawk generated. The assurances are generated by performing a relational analysis of the original and patched binary, validating that the patch was implemented correctly, and generating a lifting of the patched binary function, which includes the trampoline implementation.

Once the relational analysis is done, the UI changes to allow us to explore the generated assurances:

The initial screen allows you to get a high-level overview of the changes performed by the patch. The system will report on the number of functions changed in the patched binary:

And on the role that each of those changed functions played:

This information confirms that the patch was implemented at the correct location and that no other part of the binary was modified.
In the sidebar, we can see the rest of the assurances available to us, from high-level to low-level:

Let’s look at some of them. The High-level lifting analysis compares the proof obligations generated by the C analyzer for the original lifting and the patched lifting. It’s important to note that this is performed on the lifting generated from the patched binary as generated by the system. Crucial to this step is CodeHawk’s ability to validate the trampoline and inline its functionality as part of the lifting for the patched function:

This enables the C analyzer to consider the checks as part of its verification. When we look at the analysis of the call to the parse function, we can see that the violations underlying the buffer overflow are now gone, which proves that we have successfully mitigated the vulnerability:

The Binary lifting comparison compares the lifting generated for the original binary and the lifting generated for the patched binary.

The crucial detail to gather from this view is that the only part of the function that changed is where we inserted the patch. The rest of this large, complex function has been untouched, which means that any extra testing and certification that needs to be performed on the patched binary before deploying it into production can be laser-focused on this small part of the binary. This also proves that CodeHawk does not recompile anything; it surgically inserts the modification, aiming for minimal binary-level disruption.

Finally, the User lifting comparison shows a comparison between the lifting that we modified and submitted to the system with our patch, and the lifting generated from the patched binary:

This allows you to evaluate whether the semantics of the patched binary match what was intended from the submitted lifting-level patch. In this case we can see that the check is equivalent. Interestingly, even though we used the structure field for our comparison when we developed the patch, the system determined that the value to compare was already in a register as a result of it being the return value from a function. It used that in its trampoline implementation instead of having to load it from stack memory where the structure resides. This led to a simpler and more efficient patch. These are the details that CodeHawk Binary Patcher abstracts away to let you focus on the important aspects of vulnerability mitigation.

We hope you have enjoyed this walkthrough of CodeHawk’s binary patching capabilities. Please don’t hesitate to get in touch if you’d like to try out the system.

Aarno Labs Blog

Contents

Solutions

Projects:

Technical Areas:

Feed:

Share:

High-Assurance Remediation of CVE-2024-12248

Motivation

CodeHawk Binary Patcher

Assurance

Patching

Patch Development

Patching the Listener Function

Patching the Parsing Function

Patch Generation

Assurance