Testing Guidelines

This document describes the standards for writing tests in the MagAOX, MagAOX-scoob, and XWCTk codebases to ensure code quality, reliability, and maintainability. It also applies more generally to any C++ development within UASAL.

There are many resources that cover testing in general and test writing in C++ in particular, so this document is not intended as an introduction to testing; instead it highlights several basic concepts to establish a common language and starting point. To skip the background and go straight to the specific guidelines for writing tests, see the Guidelines section below.

Testing primer

What is the point of testing?

Testing is one pillar of software validation, alongside formal verification and code review. In C++, hidden issues such as undefined behavior, memory leaks, and pointer errors make systematic testing essential. Software testing is hard because:

  • The space of potential inputs and states grows combinatorially.

  • Failures often occur at discrete boundaries (overflow, null pointers).

  • Random or ad-hoc tests miss corner cases.

The goal of a tester is to find inputs and conditions that break the code, not merely to show it works.
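
As a concrete, hypothetical illustration (the function below is not from these codebases): a midpoint calculation that passes with typical inputs but breaks at the integer boundary due to signed overflow.

#include <climits>
#include <cassert>

// Hypothetical example: midpoint of two non-negative ints.
// Looks correct, but (lo + hi) overflows when both values are near INT_MAX.
int midpoint(int lo, int hi)
{
    return (lo + hi) / 2;   // undefined behavior for large inputs
    // A safe form is: lo + (hi - lo) / 2
}

int main()
{
    assert(midpoint(2, 6) == 4);                  // a typical input passes
    assert(midpoint(INT_MAX - 1, INT_MAX) >= 0);  // a boundary test exposes the bug
}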

Test first programming

Aspirationally, one should aim to write tests before writing the code they are testing. This is known as Test-Driven Development (TDD). TDD encourages you to think about the requirements and expected behavior of your code before implementation, leading to better-designed and more maintainable code.

However, in practice, this is often not feasible for all code. Instead, test early and often: aim to write tests as soon as possible after writing the code. It is far more pleasant to test your code as you develop it than to face a large untested codebase later.
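
As a minimal sketch of the test-first workflow using Catch2 (clampGain is a hypothetical function, not from these codebases): the test is written first to state the intended behavior, fails until the implementation exists, and then drives the implementation.

#include <catch2/catch_test_macros.hpp> // Catch2 v3; use <catch2/catch.hpp> for v2

// Step 1: declare the intended interface and write the test against it.
// The test fails (or does not link) until clampGain is implemented.
double clampGain(double g);

TEST_CASE("gain is clamped to [0, 1]", "[tdd-example]")
{
    REQUIRE(clampGain(-0.5) == 0.0);
    REQUIRE(clampGain(0.3) == 0.3);
    REQUIRE(clampGain(2.0) == 1.0);
}

// Step 2: write the simplest implementation that makes the test pass.
double clampGain(double g)
{
    return g < 0.0 ? 0.0 : (g > 1.0 ? 1.0 : g);
}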

Coverage

Coverage aims to be an objective measure of how thoroughly a set of tests exercises the code. Three common coverage metrics are:

  • Statement coverage: Has every line of code been executed at least once?

  • Branch coverage: For each conditional (if, switch, ternary), have all branches been taken?

  • Path coverage: Have all possible sequences of branches been traversed?

Path coverage quickly becomes infeasible for all but tiny functions, but aiming for at least 80 percent statement coverage on newly written code is a practical baseline.
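
To make the distinction concrete, here is a small hypothetical example (not from these codebases) showing how a single test can reach full statement coverage while missing a branch.

// Hypothetical function with a single conditional.
double clampPositive(double x)
{
    double y = x;
    if (y < 0.0)   // two branches: condition true / condition false
        y = 0.0;
    return y;
}

// A single test with x = -1.0 executes every statement (100% statement
// coverage), but the "condition false" branch is never taken, so branch
// coverage is only 50%. Adding a test with x = +1.0 covers both branches.
// Path coverage would further require every combination of branch outcomes
// when several conditionals appear in sequence.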

The two most widespread coverage analysis tools for C++ are the GNU gcov/lcov toolchain and LLVM's llvm-cov. We adopt and support the GNU gcov/lcov toolchain for coverage analysis within UASAL.

Types of tests

Many different types of tests exist; a balanced test suite includes at least:

  • Unit Tests: verify single classes or functions in isolation.

  • Integration Tests: exercise multiple modules together, catching interface mismatches or system-level failures.

  • Performance Tests: benchmark critical code paths for throughput and latency (this can be done with, for example, Google Benchmark; see the sketch below).
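
As a hedged sketch of what such a benchmark might look like with Google Benchmark (the benchmarked loop is a hypothetical stand-in for a critical code path):

#include <benchmark/benchmark.h>
#include <vector>
#include <numeric>

// Hypothetical hot path: summing a vector, standing in for a critical code path.
static void BM_VectorSum(benchmark::State& state)
{
    std::vector<double> v(state.range(0), 1.0);
    for (auto _ : state)   // the timed loop; the framework chooses the iteration count
    {
        double s = std::accumulate(v.begin(), v.end(), 0.0);
        benchmark::DoNotOptimize(s);   // keep the compiler from eliding the work
    }
}
BENCHMARK(BM_VectorSum)->Arg(1024)->Arg(65536);

BENCHMARK_MAIN();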

Stubs and mocks

For testing in general, and for unit tests in particular, it is important to isolate the code under test from its dependencies. To do so, replace real dependencies with test doubles:

  • Stubs: simple implementations that return fixed data.

  • Mocks: programmable objects that record calls, enforce interaction expectations, and simulate errors.

Using stubs and mocks ensures that failures point directly at the code under test, not at external dependencies.
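
A minimal sketch, using hypothetical names, of how a stub and a hand-rolled mock can replace a hardware dependency hidden behind an interface; mocking frameworks such as GoogleMock can generate such doubles automatically, but hand-rolled versions are often sufficient for small interfaces.

#include <stdexcept>

// Hypothetical dependency interface standing in for a hardware device.
struct ICamera
{
    virtual ~ICamera() = default;
    virtual double readTemperature() = 0;
};

// Stub: a simple implementation that returns fixed data.
struct StubCamera : ICamera
{
    double readTemperature() override { return 20.0; }
};

// Hand-rolled mock: records calls and can be programmed to simulate errors.
struct MockCamera : ICamera
{
    int calls = 0;
    bool failNext = false;
    double readTemperature() override
    {
        ++calls;
        if (failNext)
            throw std::runtime_error("simulated sensor timeout");
        return 20.0;
    }
};

// Code under test that accepts an ICamera& can be exercised with a StubCamera
// or MockCamera instead of real hardware, so a failure points at the code
// under test rather than at the device.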

Test automation and Continuous Integration

Manual testing is error-prone and slow. A robust project automatically compiles and executes the entire test suite with a single command. This can be done locally or integrated into a Continuous Integration (CI) pipeline, such that every change to the codebase triggers a full test run.

CI systems such as GitHub Actions or GitLab CI pick up every pushed change, run the full test suite, collect coverage reports, and enforce coverage thresholds before allowing new code to merge.

Regression testing is baked into this pipeline: every time a previously passing test fails, the CI issues an alert, ensuring new changes do not let old bugs creep back.

Guidelines

  • We adopt and support the Catch2 framework.

  • Write tests for new functionality using the Catch2 framework.

  • Place test files in the corresponding tests directory for the module or application being tested.

  • Test files should follow the naming convention <module_name>_test.cpp (e.g., fsmCtrl_test.cpp for the fsmCtrl module).

  • Use the SCENARIO structure provided by Catch2 for Behavior-Driven Development (BDD) style tests:

    • GIVEN: Sets up the initial conditions.

    • WHEN: Describes the action being tested.

    • THEN: Specifies the expected outcome.

  • Use REQUIRE for critical assertions that must pass.

  • Use CHECK for non-critical assertions that provide additional information.

Example:

SCENARIO( "xxxx", "[template]" )
{
    GIVEN("xxxxx")
    {
        int rv;
        WHEN("xxxx")
        {
            rv = [some test];
            REQUIRE(rv == 0);
        }
    }
}
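
Once built, a Catch2 test binary can be run directly, and individual scenarios can be selected by tag, for example by passing "[template]" on the command line.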

Generating coverage analysis reports

We adopt the GNU gcov/lcov toolchain for generating coverage reports and encourage automation via GitHub Actions.

The workflow for generating coverage reports:

  • Build the code with coverage instrumentation flags (e.g., --coverage for gcc) and optimizations disabled (-O0).

  • Zero the coverage counters to establish a fresh baseline (equivalent to removing existing .gcda files).

  • Run the full test suite to exercise the code and collect coverage data.

  • Use lcov to capture the coverage data into a .info file.

  • Filter out non-project files (e.g., system headers, third-party libraries, test-only code).

  • Generate an HTML report with genhtml.

Example (with dummy build and test commands):

# Compile code and tests with coverage flags
make all_code
make all_tests

# Zero coverage counters
lcov --directory . --zerocounters

# Run tests to generate coverage data
./run_all_tests.sh

# Capture coverage data
lcov --directory . --capture --output-file coverage.info

# Filter out unwanted files/dirs
lcov --remove coverage.info "*/apps/*/tests/*" "/usr/*" "/sys/*" "/tty/*" --output-file coverage_filtered.info

# Generate HTML report (use filtered info)
genhtml coverage_filtered.info --output-directory coverage_report --title "Coverage Report"

This can be automated via a GitHub Action that outputs the generated HTML report as an artifact and/or publishes it to a server. The frequency of running coverage reports can be adjusted based on the repository's needs, but a common approach is to run it on every pull request to the main branch.

External code review tools (e.g. Codacy) can directly consume the .info file to display coverage metrics alongside code quality analysis, or display coverage badges in the repository’s README.

Filtering

The filtering step allows the reported percentages to reflect only the repository being analyzed; it can be omitted if a broader, system-level view is required.

Typically excluded directories include /usr/*, /sys/*, and /tty/*, as well as the directories containing the tests themselves.

This can be achieved with:

lcov --remove coverage.info "/usr/*" "/sys/*" "/tty/*" "*/apps/*/tests/*" --output-file coverage_filtered.info

HTML report

The generated HTML site provides:

  • Overall aggregate coverage percentage for the repository, for both line and function coverage.

  • Per-directory and per-file coverage percentages for both line and function coverage.

  • Drill-down to per-file, line-level views showing which lines are covered and which are not.

The report makes it easy to monitor and address blind spots in testing.

As an example, the report for the MagAOX repository can be found here: https://magao-x.org/docs/coverage/.