Review of Serval System Code Verification Paper
Luke Nelson, James Bornholt, Ronghui Gu, Andrew Baumann, Emina Torlak, and Xi Wang. 2019. Scaling symbolic evaluation for automated verification of systems code with Serval. In ACM SIGOPS 27th Symposium on Operating Systems Principles (SOSP '19), October 27–30, 2019, Huntsville, ON, Canada. ACM, New York, NY, USA, 18 pages. https://doi.org/10.1145/3341301.3359641
To verify a system's correctness, developers typically have to write a specification of its intended behavior and then construct a proof, which is tedious and requires a large time investment. The authors introduce Serval, an extensible framework for push-button verification that builds on Rosette and uses a symbolic evaluation strategy combining bounded model checking and symbolic execution. Here is how Serval overcomes three common issues with verifiers:
- Difficult to reuse: Serval lifts a regular program (an interpreter for the target instruction set) to work on symbolic values, so the same code doubles as a reusable verifier (see the sketch after this list)
- Complex generated constraints: these commonly make verification intractable. Serval addresses this by adopting symbolic profiling, which helps the verifier developer identify bottlenecks and apply domain-specific knowledge to optimize constraint generation
- Only a restricted class of systems can be verified: although Serval does not support concurrency, the authors manage to apply it to ports of the CertiKOS kernel and the Komodo security monitor.
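To make the reuse point concrete, here is a minimal toy sketch of what "lifting a program to work on symbolic values" means. This is not Serval's actual code (Serval is built on Rosette, a Racket dialect); the `Sym` class and the tiny `interp_add` instruction are hypothetical, only meant to show that the same interpreter code can run on both concrete and symbolic operands.

```python
# Toy sketch (assumed names, not Serval's API): a symbolic value type that
# builds constraint terms, and an interpreter step that is oblivious to
# whether its operands are concrete or symbolic.

from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Sym:
    """An opaque symbolic value; arithmetic builds terms instead of numbers."""
    term: str

    def __add__(self, other):
        return Sym(f"(+ {self.term} {fmt(other)})")

def fmt(v: Union[int, "Sym"]) -> str:
    return v.term if isinstance(v, Sym) else str(v)

def interp_add(regs: dict, dst: str, src: str):
    """The same interpreter code runs on concrete ints *and* on Sym values,
    because '+' dispatches to Sym.__add__ when an operand is symbolic."""
    regs[dst] = regs[dst] + regs[src]

# Concrete run (ordinary testing):
regs = {"a0": 1, "a1": 2}
interp_add(regs, "a0", "a1")      # regs["a0"] == 3

# Symbolic run (verification-style): the result is a term over unknown inputs.
regs = {"a0": Sym("x"), "a1": 2}
interp_add(regs, "a0", "a1")      # regs["a0"] == Sym("(+ x 2)")
print(regs["a0"])
```

In Rosette this lifting is automatic for the whole language, which is why an ordinary interpreter for RISC-V, x86-32, LLVM IR, or BPF can be reused as a verifier.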
Serval provides many utilities for verifier developers to use; for instance, proof strategies for state-machine refinement and noninterference properties are provided. With the help of the symbolic profiler, the authors also identify a range of common optimizations for verifying low-level systems code:
- Symbolic program counters: time is wasted exploring infeasible paths. Serval solves this by splitting on a symbolic pc register (moving from merged, bounded-model-checking-style evaluation back to symbolic-execution-style path splitting); see the sketch after this list. If the program involves unbounded jumps, a symbolic pc register would still force exploration of every code location, and the authors treat this case as a likely security vulnerability rather than explore it.
- Symbolic memory addresses: as with the pc, a symbolic memory address leads to exploring every possible location in memory. The authors observe that this is often caused by information lost when compiling C/C++ or other high-level code down to assembly, and they recover it by recognizing and rewriting common forms of memory-access patterns.
- Symbolic system registers: reuse the representation invariant to rewrite symbolic system registers into concrete values
- Monolithic dispatching: split on the input value that selects which trap handler to dispatch to
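As an illustration of the symbolic-pc optimization, here is a small hedged sketch (Python pseudocode rather than Serval's Rosette implementation; `feasible_targets`, the jump table, and `run_from` are hypothetical stand-ins). The point is that a symbolic program counter gets case-split into its concrete feasible targets, while a pc whose targets cannot be bounded is reported as an error instead of being explored exhaustively.

```python
def feasible_targets(pc_term, jump_table):
    """Stand-in for a solver query: which concrete addresses can the symbolic
    pc equal? Here indirect jumps are assumed to go through a known jump table."""
    return jump_table.get(pc_term)           # None models "unbounded / unknown"

def step_symbolic_pc(state, pc_term, jump_table, run_from):
    """Case-split a symbolic program counter into concrete targets."""
    targets = feasible_targets(pc_term, jump_table)
    if targets is None:
        # Unbounded jump: flagged as an error (per the review, likely a
        # vulnerability) instead of exploring every code location.
        raise RuntimeError("unconstrained symbolic pc; refusing to explore all code")
    # One path per concrete target (symbolic-execution-style splitting);
    # the caller merges the per-path results afterwards.
    return [run_from(dict(state), t) for t in targets]

# Usage sketch: an indirect jump whose target register can be 0x100 or 0x200.
jump_table = {"ra": [0x100, 0x200]}
paths = step_symbolic_pc({"regs": {}}, "ra", jump_table,
                         run_from=lambda st, pc: (pc, st))
# -> two paths, each starting from a concrete pc.
```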
Pros:
- The introduction and identification of reusable optimizations for verification are quite innovative. The authors show that common optimization techniques, like splitting on symbolic pc registers, help across a range of scenarios. By leveraging these shared optimizations, verifier developers can put less effort into applying domain-specific knowledge when analyzing the symbolic profiler's output.
- With Serval's help, verifier developers no longer need to worry about the correctness of a compiler or interpreter for the source code. Additionally, there is no need to spend years of effort implementing a compiler or interpreter tailored to a specific language. The situation resembles SAT and NP-complete problems: building one great SAT solver is enough to solve all NP-complete problems well, and similarly, one good platform (Serval + Rosette) suffices to solve verification problems across languages.
- This turns out to be quite useful for identifying specification violations and bugs. At the same time, verifier developers are not required to write complicated interpreters or optimizations to conduct verification. For instance, writing a new verifier for a new architecture (BPF) requires only 472 lines of code. Yet verification time is short (less than an hour for the proofs of CertiKOS and Komodo), and the results are fruitful. For BPF specifically, the authors manage to find 9 bugs for RISC-V and 6 bugs for x86-32, hard-to-catch corner cases that unit testing misses. As a result, Serval manages to verify systems in a reasonable amount of time and with reasonable effort.
Questions for the authors
- How could Serval be adapted to identify timing side-channel vulnerabilities? Is it possible to conduct an output "taint" verification (e.g., with a counter of executed instructions as a symbolic variable)? A rough sketch of this idea follows the questions.
- How do different architectures lead to differences in the verifiers? In particular, for x86 and RISC-V, do the verifiers differ only in how the input is decoded and interpreted (i.e., different front ends over the same back end)? If so, why not use a transpiler?
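To clarify the first question, here is a rough sketch of what I have in mind (my own speculation, not a facility the paper describes; `interp_step`, `leaky_step`, and the state layout are hypothetical): thread an instruction counter through the interpreter state and require that executions agreeing on public inputs take the same number of steps.

```python
# Speculative sketch of the instruction-counter idea (not part of Serval):
# count executed instructions and check a timing-noninterference condition.

def run_with_counter(interp_step, state, fuel=1000):
    """Run a toy interpreter, counting executed instructions in state['steps']."""
    state = dict(state, steps=0)
    for _ in range(fuel):
        if state.get("halted"):
            break
        state = interp_step(state)
        state["steps"] += 1
    return state

def timing_noninterference(interp_step, public, secret_a, secret_b):
    """Timing check: identical public inputs must yield identical step counts."""
    s1 = run_with_counter(interp_step, {**public, **secret_a})
    s2 = run_with_counter(interp_step, {**public, **secret_b})
    return s1["steps"] == s2["steps"]

# Toy usage: a loop whose trip count depends on a "secret" leaks timing.
def leaky_step(state):
    if state["secret"] > 0:
        return dict(state, secret=state["secret"] - 1)
    return dict(state, halted=True)

print(timing_noninterference(leaky_step, {}, {"secret": 3}, {"secret": 7}))  # False
```

In a symbolic setting the counter would itself be a term over the secret inputs, so the check becomes an unsatisfiability query rather than a comparison of two concrete runs.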