Extracting and Verifying Cryptographic Models from C Protocol Code by Symbolic Execution


Consider the problem of verifying security properties of a cryptographic protocol coded in C. We propose an automatic solution that needs neither a pre-existing protocol description nor manual annotation of source code. First, symbolically execute the C program to obtain symbolic descriptions for the network messages sent by the protocol. Second, apply algebraic rewriting to obtain a process calculus description. Third, run an existing protocol analyser (ProVerif) to prove security properties or find attacks. We formalise our algorithm and appeal to existing results for ProVerif to establish computational soundness under suitable circumstances. We analyse only a single execution path, so our results are limited to protocols with no significant branching. The results in this paper provide the first computationally sound verification of weak secrecy and authentication for (single execution paths of) C code.

💡 Research Summary

The paper tackles the long‑standing gap between cryptographic protocol specifications and their low‑level implementations in C. Existing tools either require a pre‑written protocol description, rely on manual annotations, or are too coarse to handle authentication properties and pointer‑rich code. The authors propose a fully automatic pipeline that starts from raw C source code and ends with a computationally sound verification of weak secrecy and authentication using the well‑established protocol analyser ProVerif.

The pipeline consists of three main stages. First, the C code is compiled to a simple stack‑based intermediate language called CVM (C Virtual Machine). CVM abstracts away high‑level C constructs while preserving the essential operations needed for protocol execution: reading from the network, generating random values, writing to the network, signalling events, and a single test instruction that aborts execution if a boolean condition fails. By restricting control flow to a single linear path, the language eliminates loops and recursion, making subsequent analysis tractable.
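To make the shape of such a language concrete, here is a minimal Python sketch of a straight-line stack machine with the five kinds of operation listed above. It is an illustration only: the instruction names (`read`, `random`, `write`, `event`, `test`, plus the helpers `const`, `dup`, `eq`) and the tuple encoding are assumptions made for this example, not the paper's actual CVM instruction set.

```python
import os


class TestFailed(Exception):
    """Raised when a test instruction pops a false condition."""


def run(program, network_in):
    """Execute a straight-line toy program; return (sent messages, events)."""
    stack, sent, events = [], [], []
    inbox = list(network_in)
    for op, *args in program:
        if op == "const":        # push a literal bitstring
            stack.append(args[0])
        elif op == "dup":        # duplicate the top of the stack
            stack.append(stack[-1])
        elif op == "read":       # take the next message from the network
            stack.append(inbox.pop(0))
        elif op == "random":     # push fresh random bytes of the given length
            stack.append(os.urandom(args[0]))
        elif op == "eq":         # compare the two topmost values
            b, a = stack.pop(), stack.pop()
            stack.append(a == b)
        elif op == "test":       # abort execution if the condition is false
            if not stack.pop():
                raise TestFailed()
        elif op == "event":      # signal an event carrying the top value
            events.append((args[0], stack.pop()))
        elif op == "write":      # send the top value to the network
            sent.append(stack.pop())
        else:
            raise ValueError(f"unknown instruction: {op}")
    return sent, events


# Hypothetical responder: accept only the message b"ping", then send a nonce.
program = [
    ("read",), ("dup",), ("const", b"ping"), ("eq",), ("test",),
    ("event", "accepted"), ("random", 4), ("write",),
]

sent, events = run(program, [b"ping"])
print(events, len(sent[0]))
```

Because the only control construct is `test`, every run either follows the same instruction sequence to the end or aborts, which is the property that keeps the later analysis simple.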

Second, the CVM program is symbolically executed. Unlike prior symbolic execution approaches that treat each byte as an independent variable, this work models variables as bit‑strings of potentially unknown length, allowing the analysis of buffers whose size is determined at runtime. During symbolic execution, each memory region is associated with a symbolic expression (e.g., hmac(01‖x, k)) that captures how its contents are derived from inputs, random values, and cryptographic primitives. The cryptographic primitives themselves are not interpreted concretely; instead, the analyst supplies abstract symbolic models (e.g., mac, enc) that act as a trusted base.
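The idea of attaching a symbolic expression to each memory region can be sketched in a few lines of Python. The class names (`Sym`, `Const`, `Concat`, `App`), the `length` helper, and the buffer names `msg` and `tag` are all hypothetical and chosen for this example; the paper's own expression language is richer. The sketch rebuilds the expression hmac(01‖x, k) mentioned above (printed with an ASCII `||`) and shows that the length of a buffer can itself stay symbolic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    """A bitstring of unknown contents and unknown length."""
    name: str

    def __str__(self):
        return self.name


@dataclass(frozen=True)
class Const:
    """A concrete bitstring, written as hex digits."""
    hex: str

    def __str__(self):
        return self.hex


@dataclass(frozen=True)
class Concat:
    left: object
    right: object

    def __str__(self):
        return f"{self.left} || {self.right}"


@dataclass(frozen=True)
class App:
    """Uninterpreted application of an abstract primitive such as hmac."""
    fn: str
    args: tuple

    def __str__(self):
        return f"{self.fn}({', '.join(map(str, self.args))})"


def length(e):
    """Return (known byte count, tuple of symbolic length terms)."""
    if isinstance(e, Const):
        return (len(e.hex) // 2, ())
    if isinstance(e, Concat):
        (a, s), (b, t) = length(e.left), length(e.right)
        return (a + b, s + t)
    return (0, (f"len({e})",))   # size only known at run time


# Symbolic memory: each buffer maps to the expression describing its contents.
x, k = Sym("x"), Sym("k")        # network input and key (assumed names)
memory = {}
memory["msg"] = Concat(Const("01"), x)
memory["tag"] = App("hmac", (memory["msg"], k))

print(memory["tag"])             # hmac(01 || x, k)
print(length(memory["msg"]))     # (1, ('len(x)',))
```

Note that `hmac` is never evaluated: it stays an opaque function symbol, mirroring the way the primitives are modelled abstractly rather than interpreted.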

The output of symbolic execution is a program written in an intermediate modelling language (IML). IML removes destructive updates and pointer arithmetic, representing all data manipulations with constructors and destructors on bit‑strings. This form is amenable to translation into the applied pi‑calculus, the formalism understood by ProVerif.
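One way to picture the removal of destructive updates is a renaming pass that gives every overwrite of a buffer a fresh name, so that the result is a sequence of pure bindings. The Python sketch below does only that. The statement encoding, the function name `to_single_assignment`, the primitives `concat` and `hash`, and the output syntax are assumptions for illustration; the output is merely process-calculus-flavoured text, not actual IML.

```python
def to_single_assignment(stmts):
    """Rename buffers so that no name is ever assigned twice."""
    version = {}                      # buffer name -> latest version number

    def cur(name):
        return f"{name}{version[name]}"

    def fresh(name):
        version[name] = version.get(name, 0) + 1
        return cur(name)

    lines = []
    for stmt in stmts:
        kind = stmt[0]
        if kind == "in":              # receive a message into a buffer
            lines.append(f"in({fresh(stmt[1])});")
        elif kind == "new":           # generate a fresh random value
            lines.append(f"new {fresh(stmt[1])};")
        elif kind == "assign":        # overwrite target with fn(args)
            _, target, fn, args = stmt
            # Render arguments first so they refer to the old versions.
            rendered = ", ".join(cur(a) for a in args)
            lines.append(f"let {fresh(target)} = {fn}({rendered}) in")
        elif kind == "out":           # send the current buffer contents
            lines.append(f"out({cur(stmt[1])});")
        else:
            raise ValueError(f"unknown statement: {kind}")
    return lines


# Hypothetical trace in which the same buffer is overwritten twice.
stmts = [
    ("in", "buf"),
    ("new", "nonce"),
    ("assign", "buf", "concat", ["buf", "nonce"]),
    ("assign", "buf", "hash", ["buf"]),
    ("out", "buf"),
]

for line in to_single_assignment(stmts):
    print(line)
```

After renaming, each name denotes exactly one value, so the listing can be read as nested bindings over bitstrings with no notion of mutable memory.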

Third, the IML program is systematically translated into an applied pi‑calculus process. Constructors become message constructors, destructors become pattern‑matching projections, and the special “event” statements become ProVerif events. The authors prove three theorems: (1) the symbolic execution preserves the security property of the original CVM program; (2) the translation from IML to the pi‑calculus preserves the property; (3) under the computational soundness result of Backes, Pfitzmann and others (cited as
