Marcelo Schmitt

How are Linux device drivers being tested?

Device drivers are a substantial part of the Linux kernel, accounting for about 66% of the project's lines of code. Testing these components is fundamental to providing confidence in the operation of GNU/Linux systems under diverse workloads. Testing is so essential that it is considered part of the software development cycle. However, testing device drivers can be hard for several reasons: a driver may not expose a user-space interface, may depend on a specific architecture, may require custom configuration symbols, and so on. This prompts the question: how are device drivers being tested?

Answering this question is the main goal of my master's research project at the Institute of Mathematics and Statistics of the University of São Paulo (IME-USP). To provide a comprehensive answer, I carried out a systematic mapping study and a grey literature review. With the former, I gathered information from peer-reviewed articles published in academic venues, while the latter provided data from online publications sponsored by reputable organizations and magazines.

There is a variety of ways to test software, though. One may feed a program random or unexpected inputs (random testing / fuzzing), exercise only a few parts of the program (unit testing, integration testing), run a program under extreme conditions (stress testing), analyze the code semantics (semantic check, static analysis), or apply any of many other software testing techniques. This is because software testing “is the process of determining if a program behaves as expected” [1], or “an activity in which a system or component is executed under specified conditions, the results are observed or recorded, and an evaluation is made of some aspect of the system or component” [2].

It is arguable that sometimes we don't even need to run a program to estimate whether it's going to work as desired. If we relax our concept of software testing a little, we may consider code review a sort of testing as well: “Code walkthrough, also known as peer code review, is used to review code and may be considered as a static testing technique” [1]. However, assessing every such testing practice would be a cumbersome task, one that would not fit in a master's program. So, to limit the scope of this investigative work, we (my advisors and I) decided to look only at a subset of the tools used for kernel testing. Specifically, the tools that enable device driver testing.

This article presents a discussion of some of the tools employed to test Linux device drivers. The tools addressed here are the ones most cited by the publications assessed in our mapping study and grey literature review. Nevertheless, these tools do not portray all the solutions available for kernel testing, nor do they encompass all possible approaches to driver testing. The details of our research methods are available here.

Linux Kernel Testing Tools

Our answer to the question of how Linux drivers are being tested stems from an assessment of the tools that are said to enable driver testing. In fact, we derived a couple of subquestions to guide ourselves throughout this investigation. What testing tools are being used by the Linux community to test the kernel? What are the main features of these tools? Since we need to get to know the testing apparatus before evaluating its role in assessing the functioning of device drivers, we decided to build a catalog that synthesizes the information about these testing tools. Thus, one of our goals is to catalog the available Linux kernel testing tools, their characteristics, how they work, and in what contexts they are used.

Besides that, I'd like to give something back to the kernel community. To that end, I intend to use the testing tool catalog to provide advice to fellow kernel developers interested in enhancing their workflow with testing tools. Also, there is a Kernel Testing Guide [3] documentation page that could benefit from our work. Moreover, a mailing list thread [4] indicates that there is a desire for a more complete testing guide. That thread includes exchanges such as the following:

“Thank you for writing this much needed document.”

“Thanks, Shuah: I hope I haven't misrepresented kselftest too much. :-)”

“Looks great. How about adding a section for Static analysis tools? A mention coccicheck scripts and mention of smatch?”

“Good idea. I agree it'd be great to have such a section, though I doubt I'm the most qualified person to write it. If no one else picks it up, though, I can try to put a basic follow-up patch together when I've got some time.”

To build the proposed test tool catalog, we carried out an evaluation process to assess the usage of each testing tool selected throughout our study. We searched each project's repository, looked through its documentation for instructions on how to install and use the tool, installed it, and, finally, made basic use of it. Moreover, we reached out to the authors by email when we faced setbacks in using a testing tool. We also visited the commit history of some projects and their corresponding mailing lists between 2022-01-12 and 2022-01-24.

Let us finally talk about those testing tools.

What are the options?

To some extent, several tools can facilitate device driver testing. It's nearly impossible to analyze them all. Yet, this post covers the twenty Linux kernel testing tools selected by our study for being either focused on driver testing or most cited by online publications. These tools make up a heterogeneous group of test solutions comprising diverse features and testing techniques. From unit testing to end-to-end testing, dynamic or static analysis, many ways of putting Linux to the test have been conceived. The following table outlines test types and the tools associated with them.

The checks match tests and tools according to what we found expressly reported in the literature, with a few complementary marks added by me. Also, some testing types overlap with each other. For instance, unit testing may be considered a sort of regression testing, fuzzing is also a form of end-to-end testing, and fault injection tools often instrument the source code to trigger error paths. Thus, it is quite possible that some tools are not marked with every type of test they can provide. If you think that's the case, feel free to leave a comment at the end of the page. Improvement suggestions are greatly appreciated.

Table: test types matched to tools (Kselftest, 0-day, KernelCI, LKFT, Trinity, Syzkaller, LTP, ktest, Smatch, Coccinelle, jstest, TuxMake, Sparse, KUnit, SymDrive, FAUmachine, ADFI, EH-Test, COD, and Troll). The test types, with the number of tools marked for each, are: Unit testing (2), Regression testing (1), Stress testing (2), Fuzz testing (3), Reliability testing (1), Robustness testing (1), Stability testing (1), End-to-end testing (3), Build testing (2), Static analysis (3), Symbolic execution (1), Fault injection (1), Code instrumentation (2), Concolic execution (1), and Local analysis and grouping (1).

To give an overview of how easy (or not) it is to use these tools, I made a chart of setup versus usage effort. First, I defined a set of tasks that comprise the work of setting up a test tool. Then, I counted how many of these tasks I carried out while setting up each of the tools considered. A test tool got a point if I had to:

Finally, I gave five points to the tools I could not set up after trying all the above. The effort ratings were set to None for tools with no effort points, Low for tools with one or two points, Moderate for tools with three or four points, and High for tools with five points.

The classification of usage effort took into account if:

In some cases, I couldn’t get test results because I could not run the tests myself or the test results were unavailable to me. Those tools got three effort points. Lastly, the usage effort ratings were set to None, Low, Moderate, and High for tools with zero, one, two, and three points, respectively.

Figure 1. Setup and usage effort of Linux kernel testing tools.

The next table shows the exact traits that led me to classify the testing tools the way they appear in the above picture. The cases where I was unable to set up or to run a tool are indicated with a U. Also, there are two slight adjustments I'd like to note. First, I gave an additional D to ktest because it took me almost three days to set up everything needed to run it. Second, I added two extra points to FAUmachine since its documentation was completely outdated and unusable.

Tool Setup Effort Points Usage Effort Points
Kselftest A R
Trinity B, C R, S
Syzkaller A, B, E R, S
LTP B, C, D R, S
ktest A, C, D, E, extra D R, S
Smatch A, B R
Coccinelle A R
jstest A R
TuxMake A R
Sparse A R
KUnit - R
SymDrive U U
FAUmachine A, B, E, +2 R, S
ADFI U U
EH-Test U U
COD U U
Troll A, B, E R, T

One may, of course, disagree with these criteria in many ways. The next section presents some evidence to support my considerations about these tools. Anyhow, if you're unhappy with the information given here or feel like a tool was misrepresented, don't hesitate to write a comment at the end of the post.

Tool Characterization

This section briefly describes each evaluated test tool and recounts the experience we had when using them to test the Linux kernel.

Kselftest

Kernel selftests (kselftest) is a unit and regression test suite distributed with the Linux kernel tree under the tools/testing/selftests/ directory [5] [6] [7]. Kselftest contains tests for various kernel features and subsystems, such as breakpoints, cpu-hotplug, efivarfs, ipc, kcmp, memory-hotplug, mqueue, net, powerpc, ptrace, rcutorture, timers, and vm [10]. These tests are intended to be small, developer-focused tests that target individual code paths, short-running units supposed to terminate in a timely fashion (about 20 minutes). Kselftest consists of shell scripts and user-space programs that test kernel APIs and features. Test cases may span kernel and user space, with a user-space program working in conjunction with a kernel test module [6]. Even though kselftest's main purpose is to provide kernel developers and end users with a quick method of running tests against the Linux kernel, the test suite is run every day on several Linux kernel integration test rings such as the 0-Day robot and the Linaro Test Farm [7]. The stated goal is for kselftest to someday become a comprehensive test suite for the Linux kernel [20] [25].
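
To give a concrete idea of what such a test looks like, here is a minimal sketch of a selftest written against the kselftest harness (tools/testing/selftests/kselftest_harness.h); the test name and the values being checked are made up for illustration.

    /* Minimal sketch of a kselftest test using the kselftest harness.
     * It would live in a subdirectory of tools/testing/selftests/ and be
     * built by that subdirectory's Makefile. */
    #include "../kselftest_harness.h"

    TEST(example_addition)
    {
        int a = 2, b = 2;

        /* The harness reports each TEST() as passed or failed in TAP format. */
        ASSERT_EQ(4, a + b);
    }

    TEST_HARNESS_MAIN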

We had a smooth experience while using kselftest. There are recent patches and ongoing discussions on the project mailing list as well as recent commits in the subsystem tree. The documentation presents all the instructions necessary to compile and run the tests, and there are sections exemplifying how to run only subsets of them. Some kselftest tests require additional libraries, which can be listed with the kselftest_deps.sh script. Unfortunately, as of the date we evaluated the tool, the documentation did not mention that build dependencies script. Nevertheless, the kselftest documentation had enough information for us to run some tests without any problems.

0-day test robot

The 0-day test robot is a test framework and infrastructure that runs several tests over the Linux kernel, covering core components such as virtual memory management, the I/O subsystem, the process scheduler, file systems, networking, device drivers, and more [31]. Static analysis tools such as sparse, smatch, and coccicheck are run by 0-day as well [24]. These tests are provided by Intel as a service that picks up patches from the mailing lists and tests them, often before they are accepted for inclusion [25]. 0-day also tests key developers' trees before patches move forward in the development process. The robot is credited with finding 223 bugs during a development period of about 14 months, from Linux release 4.8 to Linux 4.13 (which came out September 3, 2017). With that, the 0-day robot achieved the rank of top bug reporter for that period [25]. Analyzing the Linux 5.4 development cycle, though, Corbet [34] reports that there have been worries that Intel's 0-day testing service is not proving as useful as it once was.

KernelCI

KernelCI is an effort to test upstream Linux kernels in a continuous integration (CI) fashion. The project's main goal is to improve the quality, stability, and long-term maintenance of the Linux kernel. It is a community-led test system that follows an open philosophy, enabling the same collaboration around testing that open source enables around the code itself [19] [27]. KernelCI generates various configurations for different kernel trees, submits boot jobs to several labs around the world, and collects and stores the test results in a database. The test database kept by KernelCI includes tests run natively by KernelCI, but also results from Red Hat's CKI, Google's syzbot, and many others [23] [27].

LKFT

Linaro's LKFT (Linux Kernel Functional Testing) is an automated test infrastructure that builds and tests Linux release candidates on the arm and arm64 hardware architectures [24]. The mission of LKFT is to improve the quality of Linux by performing functional testing on real and emulated hardware targets. Every week, LKFT runs tests on over 350 release/architecture/target combinations, on every git-branch push made to the latest six Linux long-term-stable releases. In addition, Linaro claims that their test system can consistently report results from nearly 40 of these test setup combinations in under 48 hours [28] [35]. LKFT incorporates and runs tests from several test suites such as LTP, kselftest, libhugetlbfs, perf, the v4l2-compliance tests, KVM-unit-tests, the SI/O Benchmark Suite, and KUnit [36].

Trinity

Trinity is a random tester (fuzzer) that specializes in testing the system call interfaces that the Linux kernel presents to user space [33]. Trinity employs some techniques to pass semi-intelligent arguments to the syscalls being called. For instance, it accepts a directory argument from which it will open files and pass the corresponding file descriptors to system calls under test. This can be useful for discovering failures in filesystems. Thus, Trinity can find bugs in parts of the kernel other than the system call interface. Some areas where people used Trinity to find bugs include the networking stack, virtual memory code, and drivers [32] [33].

Trinity is accessible through a repository on GitHub. The latest commit to that repository was made about a month and a half before our inspection date. From the recent commit history, we estimate that the project's change rate is roughly one commit per month and that the tool has been maintained by three core developers. Trinity's documentation is scarce and has not been updated for four years. Although there are some usage examples, the documentation does not contain an installation guide.

In our experience with Trinity, we let the fuzzer run for a few minutes. It looks like Trinity is still working the same way Konovalov [13] described: “Trinity is a kernel fuzzer that keeps making system calls in an infinite loop.” There is no precise number of tests to run and no time limit for their completion. After being interrupted, the program shows the number of executed system calls, how many ended successfully, and how many terminated with failures.
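
As a rough illustration of that idea (this is not Trinity's code, which adds the semi-intelligent argument generation described above), a toy fuzzer that keeps issuing system calls with pseudo-random arguments could look like the sketch below. Anything like it should only ever be run inside a disposable virtual machine.

    /* Toy system-call fuzzer: pick an arbitrary syscall number and random
     * arguments in an infinite loop, printing progress periodically.
     * Purely illustrative; run only in a throwaway VM. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    int main(void)
    {
        unsigned long calls = 0, failures = 0;

        srand(getpid());
        for (;;) {
            long nr = rand() % 450;   /* arbitrary syscall number range */
            long ret = syscall(nr, rand(), rand(), rand(), rand(), rand(), rand());

            if (ret == -1)
                failures++;
            if (++calls % 100000 == 0)
                printf("calls: %lu, failures: %lu\n", calls, failures);
        }
    }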

Syzkaller

Syzkaller is said to be a state-of-the-art Linux kernel fuzzer [13]. The syzbot system is a robot developed as part of the syzkaller project that continuously fuzzes the main Linux kernel branches and automatically reports the bugs it finds to kernel mailing lists. Syzbot can test patches against bug reproducers, which can be useful for testing bug-fix patches, debugging, or checking whether a bug still happens. While syzbot can test patches that fix bugs, it does not support applying custom patches during fuzzing: it always tests vanilla, unmodified git trees. Nonetheless, one can always run syzkaller locally on any kernel to better test a particular subsystem or patch [18]. Syzbot is receiving increasing attention from kernel developers. For instance, Sasha Levin said that he hoped that failure reproducers from syzbot fuzz testing could be added as part of the testing for the stable tree at some point [11].

Syzkaller is accessible from a GitHub repository. The project received numerous contributions around our evaluation window, the majority of them committed by five core developers. The Syzkaller project mailing list also had several messages from around our evaluation period.

The Syzkaller documentation is fairly complete. It contains detailed instructions on how to install and use Syzkaller as well as several troubleshooting sections with tips for dealing with possible setup problems. The documentation also includes pages describing how the fuzzer works, how to report bugs found in the Linux kernel, and how to contribute to the tool.

When run, Syzkaller prints execution environment information to the terminal and starts an HTTP server. The server pages display detailed test information such as code coverage, the number of syscall sequences executed, the number of crashes, execution logs, and so on.

LTP

The Linux Test Project (LTP) is a test suite that contains a collection of automated and semi-automated tests to validate the reliability, robustness, and stability of Linux and related features [10] [14]. By default, the LTP run script includes tests for filesystems, disk I/O, memory management, inter-process communication (IPC), the process scheduler, and the system call interface. Moreover, the test suite can be customized by adding new tests, and the LTP project welcomes contributions [10].
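
For reference, a new test written with LTP's C API is typically a small file along the lines of the sketch below (the check itself is just a placeholder); LTP's build system and run scripts take care of executing it and collecting the verdict.

    /* Minimal sketch of an LTP test case using the C API (tst_test.h).
     * The assertion is a placeholder; real tests exercise a kernel feature. */
    #include "tst_test.h"

    static void run(void)
    {
        int value = 1 + 1;

        if (value == 2)
            tst_res(TPASS, "example check passed");
        else
            tst_res(TFAIL, "example check failed");
    }

    static struct tst_test test = {
        .test_all = run,
    };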

Some Linux testing projects are built on top of LTP or incorporate it in some way. For example, LTP was chosen as a starting point for Lachesis, while the LAVA framework provides commands to run LTP tests from within it [10]. Another test suite that runs LTP is LKFT [24] [36].

LTP is available from a GitHub repository to which many developers committed in the weeks preceding our evaluation period. There were also a few discussions in progress on the project's mailing list. In addition, the LTP documentation contains installation and usage guides as well as other information we found helpful.

We ran a few LTP syscall tests separately and had a pleasing first impression of the test suite. The completion time of each test was short, and their results (pass or fail) were very clear. It took about 30 minutes to run the entire collection of system call tests. LTP also has a set of device driver tests, but many of them are outdated and do not work anymore.

ktest

ktest provides an automated test suite that can build, install, and boot-test Linux on a target machine. It can also run post-boot scripts on the target system to perform further testing [10] [16]. ktest has been included in the Linux kernel repository under the directory tools/testing/ktest. The tool consists of a Perl script (ktest.pl) and a set of configuration files containing test setup properties. In addition to build and boot tests, ktest also supports git bisect, config bisect, randconfig, and patch check as additional types of tests. If a cross-compiler is installed, ktest can also run cross-compile tests [17] [10].

ktest is available from within the Linux kernel repository under the tools/testing/ktest/ directory. Despite belonging to the kernel project, the last contribution to ktest dates from five months before our inspection date. Its documentation is scant, containing only a description of the configuration options along with a brief description of the existing example configuration files. There is no installation guide, nor any list of test dependencies. To set up ktest, we followed the guidelines provided by Jordan [16] and adapted several runtime configurations. We only ran an elementary build-and-boot test over a couple of patches. Nevertheless, we think ktest could be useful for automating many test activities mentioned in the literature, such as patch checking, bisecting, and config bisect.

Smatch

Smatch (the source matcher) is a static analyzer developed to detect programming logic errors. For instance, smatch can detect errors such as attempts to unlock an already unlocked spinlock. It is written in C and uses Sparse as its C parser. Smatch is also run on Linux kernel trees by autotest bots such as 0-day and Hulk Robot [10] [15] [24].
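
For illustration, the hypothetical driver fragment below contains the kind of double-unlock mistake Smatch is designed to flag: the error path releases a spinlock that the normal path has already released. All names are made up.

    /* Hypothetical kernel code with a double-unlock bug of the sort Smatch
     * reports. */
    #include <linux/spinlock.h>
    #include <linux/errno.h>

    static DEFINE_SPINLOCK(example_lock);

    static int example_update(int value)
    {
        int ret = 0;

        spin_lock(&example_lock);
        if (value < 0)
            ret = -EINVAL;
        spin_unlock(&example_lock);

        if (ret)
            spin_unlock(&example_lock);   /* bug: the lock was already released */

        return ret;
    }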

We got Smatch by cloning this repository. The project's commit history showed contributions made close to the time we evaluated the tool, although a single developer had authored the majority of those changes. The mailing list archives we found had registered no messages for years. Smatch also has a mailing list at vger.kernel.org, but we did not find mail archives for it. The Smatch documentation is brief; nevertheless, it contains instructions on how to install and use the source matcher. Within a few minutes, we had set up Smatch and run some static tests against Linux drivers.

Coccinelle / coccicheck

Coccinelle is a static analysis engine that provides a language for specifying matches and transformations in C code. Coccinelle is used to aid collateral evolution of source code and to help catch certain bugs that can be expressed semantically. Collateral evolution is needed when client code has to be updated due to changes in a library API: renaming a function, adding function parameters, and reorganizing data structures are examples of changes that may lead to collateral evolution. Bug hunting and fixing are also aided by coccicheck, a collection of semantic patches that uses the Coccinelle engine to interpret and run a set of tests; coccicheck is available in the Linux kernel under a make target of the same name [37] [29] [30]. Moreover, coccicheck is run on Linux kernel trees by automated test robots such as 0-day and Hulk Robot [24] [29].

Coccinelle can be obtained through the package manager of many GNU/Linux distributions, as a compressed tar.gz file from the project’s web page, or through a GitHub repository. As of the day we evaluated the static analyzer, Coccinelle’s repository had some recent commits, most of them by a single developer. Also, the project’s mailing list had ongoing conversations and patches under review.

The Linux kernel documentation has a page with installation and usage instructions for both Coccinelle and coccicheck. Moreover, the kernel has a Makefile target named “coccicheck” for running coccicheck semantic patches. In a few minutes, we installed Coccinelle and ran some checks on Linux drivers.

jstest

jstest is a user-space utility program that displays joystick information such as device status and incoming events. It can be used to test the features of the Linux joystick API as well as the functionality of a joystick driver [9] [22] [38].

jstest is part of the Linux Console Project and can be obtained from SourceForge or through the package manager of some GNU/Linux distributions. However, as of the date we evaluated jstest, the project's repository had gone about a year without updates, and the associated mailing lists had been without discussions or patches for even longer. Despite that, the jstest documentation was helpful, as it listed the dependency packages and the installation steps for the tool. The manual page is brief yet informative, containing what you need to use the tool. To use jstest, one must provide the path to a joystick or gamepad device. jstest displays the inputs obtained from joysticks and gamepads and thus can be used to test the functioning of drivers for these devices in a black-box fashion.

TuxMake

TuxMake, by Linaro, is a command line tool and Python library designed to make building the Linux kernel easier. It seeks to simplify Linux kernel building by providing a consistent command line interface for building the kernel across a variety of architectures, toolchains, kernel configurations, and make targets. By removing the friction of dealing with different build requirements, TuxMake helps developers, especially newcomers, build-test the kernel for uncommon toolchain/architecture combinations. Moreover, TuxMake comes with a set of curated, portable build environments distributed as container images. These versioned and hermetic filesystem images make it easier to describe and reproduce builds and build problems. Although it does not support every Linux make target, the TuxMake team plans to add support for additional targets such as kselftest, cpupower, perf, and documentation. TuxMake is part of TuxSuite, which in turn is part of Linaro's main Linux testing effort [21] [26] [28].

TuxMake is available from its GitLab repository and can also be downloaded as a package for many GNU/Linux distros. The contributions to the project's repository were recent as of the date we evaluated the tool. In addition, the TuxMake documentation contains installation instructions and examples of how to use the tool. We note, however, that TuxMake focuses on build testing and thus only builds the artifacts bound to make targets, without triggering the execution of further test cases even when those targets would do so by default.

Sparse

Sparse (semantic parser) is a semantic checker for C programs originally written by Linus Torvalds to support his work on the Linux kernel. Sparse does semantic parsing of source code files in a few phases, summarized as full-file tokenization, pre-processing, semantic parsing, lazy type evaluation, inline function expansion, and syntax tree simplification [50]. The semantic parser can help test C programs by performing type checking, lock checking, and value range checking, as well as by reporting various errors and warnings while examining the code [49] [50]. In fact, Sparse also comprises a compiler frontend capable of parsing most ANSI C programs as well as a collection of compiler backends, one of which is a static analyzer that takes the same name [51].
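
One well-known class of problems Sparse reports is the misuse of address-space annotations such as __user. The hypothetical fragment below dereferences a user-space pointer directly instead of copying the value with get_user()/copy_from_user(), which Sparse flags with a warning.

    /* Hypothetical fragment that misuses a __user pointer; Sparse warns
     * about dereferencing a pointer from another address space. */
    #include <linux/uaccess.h>
    #include <linux/errno.h>

    static int example_read_arg(int __user *uptr)
    {
        int val = *uptr;   /* wrong: the value should be copied from user space */

        return val > 0 ? 0 : -EINVAL;
    }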

The kernel build system has a couple of make options that support checking code with static analyzers, and it uses Sparse as the default checker [10]. Also, autotest robots such as 0-day and Hulk Robot run Sparse on kernel trees they test [24].

SymDrive

Renzelmann et al. [39] focused on testing Linux kernel device drivers using symbolic execution. This technique consists of replacing a program's input with symbolic values: rather than using actual data for a given function, symbolic execution explores input values throughout the range of possible values for each parameter. SymDrive intercepts all calls into and out of a driver with stubs that call a test framework and checkers. Stubs may invoke checkers, passing them the parameters of the function under analysis, the function's return value, and a flag indicating whether the checker is running before or after the function under test. Driver state can be accessed by calling a supporting library. Thus, checkers can evaluate the behavior of a function under test from the execution conditions and the obtained results.
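
Conceptually (this is not SymDrive's actual code, and every name below is hypothetical), a generated stub wrapping a driver entry point might look like the following sketch, with the checker receiving the arguments, the return value, and a pre/post flag as described above.

    /* Conceptual sketch of the stub-and-checker mechanism. */
    #include <linux/device.h>

    int real_probe(struct device *dev);   /* the driver's original probe() */

    static void probe_checker(struct device *dev, int retval, int pre_call)
    {
        /* A checker could validate arguments, locking rules, return codes, ... */
    }

    static int probe_stub(struct device *dev)
    {
        int ret;

        probe_checker(dev, 0, 1);     /* check before the call, on the arguments */
        ret = real_probe(dev);        /* the driver function runs under the symbolic engine */
        probe_checker(dev, ret, 0);   /* check after the call, on the result */
        return ret;
    }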

SymDrive stood out among related works as a tool for testing drivers in the kernel through symbolic code execution. To set up SymDrive, we followed the installation steps listed on the developers' page [40]. One of the first steps of the setup consists of compiling and installing S2E, a software platform that provides functionality for symbolic execution on virtual machines. The S2E documentation mentions the use of Ubuntu as a prerequisite for setting up the platform [41] [42] [43], yet we found indications of compatibility with other OSes after inspecting the compilation and installation scripts. We also encountered installation error messages notifying us that S2E is compatible only with a restricted set of processors. However, even though we had configured the VM with a compatible processor, our 10 GB of system memory was not enough to prevent the installation scripts from failing due to lack of RAM.

We found out that the S2E mailing list was semi-open, meaning that only subscribed addresses may send emails to it, and a moderator must first approve each subscription request. It took a month for an S2E moderator to accept our subscription request. By the time they granted us access, we were assessing other testing tools and did not want to come back to this one. Finally, our email to the authors of the SymDrive paper went unanswered. So, after a series of setbacks related to installation and lack of access to support, we gave up on installing S2E and evaluating SymDrive.

FAUmachine

Buchacker and Sieh [44] developed a framework for testing the fault tolerance of GNU/Linux systems by injecting faults into an entirely simulated running system. FAUmachine runs a User Mode Linux (UML) port of the Linux kernel, which maps every UML process onto a single process in the host system. Thus, a complete virtualized machine runs on top of a real-world Linux machine as a single process. To inject faults into the virtualized system, the framework launches a second process in the host system. Every time a UML process makes a system call, returns from a system call, or receives a signal, it is stopped by this auxiliary host process. The host process then decides whether the halted process will continue with or without the signal received, whether errors should be returned from system calls instead of the actual values, and so on. This technique of virtualization combined with the interception of processes has the benefit of maintaining binary compatibility of programs while allowing the injection of faults into core kernel functionality as well as peripheral, external, real-time clock, and interrupt/exception faults.

To test with FAUmachine, we first asked the authors of [44] to point out the tool's repository. Next, we downloaded the associated repositories and installed the packages needed for building and installation. The project documentation is outdated and is not maintained by the developers. For instance, two packages indicated in the documentation as necessary for the build are deprecated and no longer needed. In reply to one of our messages, the project maintainer said that questions could be answered by email: “Just forget *any* documentation you find regarding FAUmachine. None is correct any more. Sorry for that. We just don't have time to update these documents. I think you must ask your questions using e-mail.”

After compiling and installing FAUmachine, we tried to run some tests by setting up an example from the FAUmachine source files. The experiment consisted of starting a virtual machine and installing a Debian image on its disk. However, the experiment's run script failed during image installation. Our follow-up email to the maintainer asking for help with the experiment went unanswered. Still, within the menus and options of the virtual machine management window, it was possible to see items referencing system fault injection. The evaluation of these tests, however, could not be completed.

ADFI

Cong et al. [45] introduced a tool that generates fault scenarios for testing device drivers based on previously collected runtime traces. ADFI hooks internal kernel APIs so that function calls and return values are intercepted and recorded in trace files. A fault scenario generator takes trace files as input and iteratively produces fault scenarios in which an intercepted return value delivered to a driver is replaced by a fault. Each fault scenario is then run, and the resulting stack traces are collected to feed further iterations of the fault scenario generator. ADFI employs this test method to exercise driver error-handling code paths that would otherwise rarely be followed.

According to [45], the efforts to run ADFI include (1) preparing a configuration file for driver testing; (2) crash analysis; and (3) (optionally) compilation flag modification to support test coverage. ADFI automatically runs each generated fault scenario, one after another, so test execution is automated.

Nevertheless, there is no link or web page address for the ADFI project repository in the article we found about the tool. Moreover, the authors did not respond to our email asking how to get ADFI. Thus, it was not possible to evaluate ADFI as we could not even get the tool.

EH-Test

Bai et al. [46] approached device driver testing in a similar way. They developed a kernel module to monitor and record driver runtime information. A pattern-based fault extractor then takes the runtime data plus the driver source code and kernel interface functions as input and extracts target functions from them. EH-Test selects target functions using driver-specific knowledge such as function return types and whether the values returned by functions are checked inside an if statement. The C programming language has no built-in error handling mechanism (such as try-catch), so developers often use an if statement to decide whether error handling code should be triggered in device drivers. A fault injector module then generates test cases in which target function returns are replaced by faulty values. Finally, a probe inserter generates a separate loadable driver for each test case; these loadable driver modules have target function calls replaced by an error function in their code.
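
To make the idea of a target function concrete, consider the hypothetical probe function below (not taken from EH-Test): because the return value of the allocation is checked by an if statement, a tool like EH-Test can generate a test-case module in which that call is replaced by a stub returning NULL, forcing the error handling path to run.

    /* Hypothetical driver probe: kmalloc() is a target function, and the if
     * statement guards the error handling code that fault injection exercises. */
    #include <linux/slab.h>
    #include <linux/errno.h>

    static int example_probe(void)
    {
        void *buf = kmalloc(256, GFP_KERNEL);   /* target function call */

        if (!buf)               /* error handling path taken when the call "fails" */
            return -ENOMEM;

        /* ... normal initialization using buf ... */
        kfree(buf);
        return 0;
    }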

As for evaluating EH-Test, we downloaded the tool’s source code and, with some adjustments, we managed to build some of the test modules. However, some EH-Test components do not build with current GCC and LLVM versions. We mailed the main author asking for some installation and usage guidance, but we had no feedback.

COD

B. Chen et al. [47] presented a test approach based on hybrid symbolic-concrete (concolic) execution. Their work focuses on testing LKMs (Linux kernel modules) using two main techniques: (1) automated test case generation from LKM interfaces with concolic execution; (2) automated test case replay that repeatedly reproduces detected bugs.

During test case generation, the COD Agent component sequentially executes commands from an initial test case to trigger functionalities of the target LKMs through the base kernel. Two custom kernel modules intercept interactions between the base Linux kernel and the LKMs under test and add new tainted values to a taint analysis engine. When all commands in the test harness have finished, COD captures the runtime execution trace into a file and sends it to a symbolic engine. A trace replayer performs symbolic analysis over the captured trace file, then sends the generated test cases back to the execution environment. These steps repeat to produce more test cases until some criterion (such as elapsed time) is met.

In test case replay mode, the COD Test Case Replayer picks a test case and executes the commands in the test harness to trigger functionalities of the target LKMs. Three custom kernel modules intercept the interactions between the kernel and the LKMs under test, modify these interactions when needed, and capture kernel API usage information. After all commands in the test harness have finished, COD retrieves the kernel API usage information from the custom kernel modules and checks for potential bugs. This process repeats for each test case given as input.

Yet, for reasons analogous to ADFI's, COD could not be tested either. There is no repository link in [47] nor any instructions on how to get COD. We sent an email to the paper's authors, but it went unanswered.

Troll

Rothberg et al. [48] developed a tool to generate representative kernel compilation configurations for testing. Troll parses files locally for configuration options (#ifdef) and creates a partial kernel compilation configuration. This initial step is called sampling. Each partial configuration is then abstracted as a node in a configuration compatibility graph (CCG); in this graph, mutually compatible configurations are linked by an edge. In the next step (merging), Troll searches the CCG for the largest clique (a set of nodes that are all linked to each other) and merges all the partial configurations that belong to that clique. The compilation configuration obtained from the biggest clique covers most of the #ifdef blocks and generates several warnings when Sparse analyzes the code built with such an arrangement. With a valid configuration providing good coverage of the different configuration options (#ifdef), further automated testing is more likely to find bugs.
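
To make the notion of a configuration option concrete, the hypothetical fragment below (CONFIG_EXAMPLE_DMA is a made-up symbol) shows code guarded by #ifdef: which branch gets compiled, and therefore which code tools such as Sparse can analyze, depends on whether the kernel configuration enables the symbol. This is why Troll tries to produce configurations that cover as many such blocks as possible.

    /* Hypothetical driver fragment; only one branch is compiled for a given
     * kernel configuration. */
    #include <linux/printk.h>

    static int example_setup(void)
    {
    #ifdef CONFIG_EXAMPLE_DMA
        pr_info("example: using DMA transfers\n");
    #else
        pr_info("example: falling back to PIO\n");
    #endif
        return 0;
    }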

Since Troll was designed to generate Linux kernel build configurations, it does not fit neatly into the kernel test category. Despite that, we decided to give Troll a try. On our first attempt, we found that some new Kconfig features were not supported by Undertaker, a tool whose output is needed to feed Troll. Also, the Undertaker mailing list was semi-open. Since our adjustments to the kernel symbols were insufficient to make Undertaker generate partial kernel configurations, our last resort was to reach out to Troll's developers. Surprisingly, the authors were very responsive and helped us set up the latest Undertaker version. After that, we ran an example from the Troll documentation that generates Linux kernel compilation settings. The uses of Troll presented in [48] are analogous to the documentation example we ran, except that they require more than 10 GB of system memory to complete. Due to that, we could not reproduce those use cases.

History

  1. V1: Release

References

[1] Aditya P. Mathur. “Foundations of Software Testing”. (2013) Pearson India. URL: https://www.oreilly.com/library/view/foundations-of-software/9788131794760/

[2] “IEEE Standard Glossary of Software Engineering Terminology”. In IEEE Std 610.12-1990. (1990) Pages 1-84. URL: https://ieeexplore.ieee.org/document/159342

[3] “Kernel Testing Guide”. URL: https://www.kernel.org/doc/html/latest/dev-tools/testing-overview.html

[4] “Re: [PATCH v3] Documentation: dev-tools: Add Testing Overview”. URL: https://lore.kernel.org/linux-doc/CABVgOS=2iYtqTVdxwH=mcFpcSuLP4cpJ4s6PKP4Gc-SH6jidgQ@mail.gmail.com/

[5] “Linux Kernel Selftests”. (2021) URL: https://www.kernel.org/doc/html/latest/dev-tools/kselftest.html

[6] Shuah Khan. “Kernel Validation With Kselftest”. (2021) URL: https://linuxfoundation.org/webinars/kernel-validation-with-kselftest/

[7] “Kernel self-test”. (2019) URL: https://kselftest.wiki.kernel.org/

[8] “A Tour Through RCU’s Requirements”. (2021) URL: https://www.kernel.org/doc/html/latest/RCU/Design/Requirements/Requirements.html

[9] “xpad - Linux USB driver for Xbox compatible controllers”. (2021) URL: https://www.kernel.org/doc/html/latest/input/devices/xpad.html

[10] Shuah Khan. “Linux Kernel Testing and Debugging”. (2014) URL: https://www.linuxjournal.com/content/linux-kernel-testing-and-debugging

[11] Jake Edge. “Maintaining stable stability”. (2020) URL: https://lwn.net/Articles/825536/

[12] Jonathan Corbet. “Some 5.12 development statistics”. (2021) URL: https://lwn.net/Articles/853039/

[13] Andrey Konovalov. “Fuzzing Linux Kernel”. (2021) URL: https://linuxfoundation.org/webinars/fuzzing-linux-kernel/

[14] Manoj Iyer. “LTP HowTo”. (2012) URL: http://ltp.sourceforge.net/documentation/how-to/ltp.php

[15] “Smatch The Source Matcher”. (2021) URL: http://smatch.sourceforge.net/

[16] Daniel Jordan. “So, you are a Linux kernel programmer and you want to do some automated testing…”. (2021) URL: https://blogs.oracle.com/linux/ktest

[17] “Ktest”. (2017) URL: https://elinux.org/Ktest

[18] Dmitry Vyukov and Andrey Konovalov and Marco Elver. “syzbot”. (2021) URL: https://github.com/google/syzkaller/blob/master/docs/syzbot.md

[19] “Distributed Linux Testing Platform KernelCI Secures Funding and Long-Term Sustainability as New Linux Foundation Project”. (2019) URL: https://www.prnewswire.com/news-releases/distributed-linux-testing-platform-kernelci-secures-funding-and-long-term-sustainability-as-new-linux-foundation-project-300945978.html

[20] “Linux Kernel Developer: Shuah Khan”. (2017) URL: https://linuxfoundation.org/blog/linux-kernel-developer-shuah-khan/

[21] Dan Rue. “Portable and reproducible kernel builds with TuxMake”. (2021) URL: https://lwn.net/Articles/841624/

[22] “Linux Joystick support - Introduction”. (2021) URL: https://www.kernel.org/doc/html/latest/input/joydev/joystick.html

[23] Mark Filion. “How Continuous Integration Can Help You Keep Pace With the Linux Kernel”. (2016) URL: https://www.linux.com/audience/enterprise/how-continuous-integration-can-help-you-keep-pace-linux-kernel/

[24] “2020 Linux Kernel History Report”. (2020) URL: https://linuxfoundation.org/wp-content/uploads/2020_kernel_history_report_082720.pdf

[25] Jonathan Corbet, Greg Kroah-Hartman. “2017 Linux Kernel Development Report”. (2017) URL: https://www.linuxfoundation.org/wp-content/uploads/linux-kernel-report-2017.pdf

[26] Dan Rue, Antonio Terceiro. “Linaro/tuxmake - README.md”. (2021) URL: https://gitlab.com/Linaro/tuxmake

[27] “Welcome to KernelCI”. (2021) URL: https://kernelci.org/

[28] “Rapid Operating System Build and Test”. (2021) URL: https://www.linaro.org/os-build-and-test/

[29] Luis R. Rodriguez, Nicolas Palix. “coccicheck [Wiki]”. (2016) URL: https://bottest.wiki.kernel.org/coccicheck

[30] Luis R. Rodriguez, Tyler Baker, Valentin Rothberg. “linux-kernel-bot-tests - start [Wiki]”. (2016) URL: https://bottest.wiki.kernel.org/

[31] “Linux Kernel Performance”. (2021) URL: https://01.org/lkp

[32] Dave Jones. “Linux system call fuzzer - README”. (2017) URL: https://github.com/kernelslacker/trinity

[33] Michael Kerrisk. “LCA: The Trinity fuzz tester”. (2013) URL: https://lwn.net/Articles/536173/

[34] Jonathan Corbet. “Statistics from the 5.4 development cycle”. (2019) URL: https://lwn.net/Articles/804119/

[35] “Linaro’s Linux Kernel Functional Test framework”. (2021) URL: https://lkft.linaro.org/

[36] “Tests in LKFT”. (2021) URL: https://lkft.linaro.org/tests/

[37] “Coccinelle: A Program Matching and Transformation Tool for Systems Code”. (2022) URL: https://coccinelle.gitlabpages.inria.fr/website/

[38] Stephen Kitt. “jstest - joystick test program”. (2009) URL: https://sourceforge.net/p/linuxconsole/code/ci/master/tree/docs/jstest.1

[39] Matthew J. Renzelmann and Asim Kadav and Michael M. Swift. “SymDrive: Testing Drivers without Devices”. (2012) Pages 279-292. 

[40] Matthew J. Renzelmann and Asim Kadav and Michael M. Swift. “SymDrive Download and Setup”. (2012) URL: https://research.cs.wisc.edu/sonar/projects/symdrive/downloads.shtml

[41] Cyberhaven. “Creating analysis projects with s2e-env - S2E 2.0 documentation”. (2020) URL: http://s2e.systems/docs/s2e-env.html

[42] Adrian Herrera. “s2e-env/README.md at master - S2E/s2e-env”. (2020) URL: https://github.com/S2E/s2e-env/blob/master/README.md

[43] Cyberhaven. “Building the S2E platform manually - S2E 2.0 documentation”. (2020) URL: http://s2e.systems/docs/BuildingS2E.html

[44] Buchacker, K. and Sieh, V.. “Framework for testing the fault-tolerance of systems including OS and network aspects”. (2001) Pages 95-105. 

[45] Cong, Kai and Lei, Li and Yang, Zhenkun and Xie, Fei. “Automatic Fault Injection for Driver Robustness Testing”. (2015) Pages 361–372. URL: https://doi.org/10.1145/2771783.2771811

[46] Jia-Ju Bai and Yu-Ping Wang and Jie Yin and Shi-Min Hu. “Testing Error Handling Code in Device Drivers Using Characteristic Fault Injection”. (2016) Pages 635-647. 

[47] Chen, Bo and Yang, Zhenkun and Lei, Li and Cong, Kai and Xie, Fei. “Automated Bug Detection and Replay for COTS Linux Kernel Modules with Concolic Execution”. (2020) Pages 172-183. 

[48] Rothberg, Valentin and Dietrich, Christian and Ziegler, Andreas and Lohmann, Daniel. “Towards Scalable Configuration Testing in Variable Software”. (2016) Pages 156–167. URL: https://doi.org/10.1145/3093335.2993252

[49] “Sparse”. (2022) URL: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/dev-tools/sparse.rst?h=v5.17-rc7

[50] Neil Brown. “Sparse: a look under the hood”. (2016) URL: https://lwn.net/Articles/689907/

[51] “Welcome to sparse’s documentation”. (2022) URL: https://sparse.docs.kernel.org/en/latest/