Model-based testing with Spec# and Spec Explorer

Jon Jacky, University of Washington
jon@u.washington.edu

Introduction
Theme
Spec Explorer tool
Concepts and vocabulary
Spec# language
Concepts
Features
Modeling
Abstraction
Coding
Exploration
Algorithm
Configuration
Properties and state groupings
Scenario control
Methodology
vs. model checking
Test case generation
Conformance testing
Testing nondeterministic systems
On-the-fly testing
Integration with other test tools

Introduction

Speaker introduction (MSR/FSE visitor, experience with embedded controls, Z)
Purpose: describe features, methodology; discuss motivation, design issues
We support Specification and Testing
- We don't support Specification to Implementation - we don't propose a development method
- We won't discuss static analysis today (but Spec#/Boogie supports that too)
Specifications are written by testers for testing
Implementation (IUT) is a black box - we can only execute it to check conformance
(we can instrument the binary to connect it to the test harness)
Testing checks actual IUT + infrastructure (the only assurance method which does)
Use at MS: features were included (or excluded) based on experience
Textbook in preparation

top

Theme

Spec# "model program" (executable specification) serves as

Specification
Test case generator
Test oracle

top

Spec Explorer tool

Supports specification and testing

Development of Spec# model programs ("literate programming" editor, debugger)
Validation of Spec# model programs
- Simulation (Main execution)
- Exploration (FSM generation)
- Visualization
- Safety (reachability), liveness analysis
Test case generation
Conformance testing (includes test harness for lockstep execution with oracle)

Supports different ways of testing

Offline test case generation or on-the-fly testing
"Complete" coverage (of spec) or stochastic testing (operational profile)

top

Concepts and vocabulary

State-based worldview as in Z, VDM, TLA, ...

State: Information stored in a system (in a program, state variables)
Action: Unit of behavior, may change state (but need not, could be a function)
Scenario: Sample of behavior, sequence of actions
Level of abstraction: Choice of variables and granularity of actions to test
(for example: actions are API method calls, state is certain static variables)
Specification: Represents all allowed scenarios (at chosen level of abtract.)
Test case: A particular scenario (could be one action, usually more)
Test suite: Collection of test cases

top

Spec# language concepts

Extends C#, the production programming language

History: based on AsmL compiler, provides most AsmL features with C# syntax

Contract: (Part of) specification expressed in the programming language

Declarative contract: pre, post, invariant

Runtime checking (non-conforming scenario will throw exception)
Test case generation (preconditions are interpreted as guards)
Static checking (complementary, different tool, we won't discuss today)
Our methodology uses pre only, not post or invariant

Model program: method bodies

Our methodology uses assignments (not assertions) to define postconditions
Executable specifications confer many advantages
- Validation (simulation)
- Test case generation (computes next state, makes exploration possible)
- Conformance test (acts as oracle)

top

Spec# language features

Declarative contracts

requires, ensures - pre, post - any "pure" Boolean expressions
==>, <==> - implies, equivalence, new Boolean operators
Exists, Exists1, Forall - Quantifiers, new kinds of Boolean expressions

Model programs

structure - generalized struct (inheritance, recursion but still value types)
Tuple, Set, Seq, Map - Collection types (immutable value types, no aliasing)
Comprehensions - expressions whose values are collections
Model programs typically contain no iteration, no index variables
Internal nondeterminism (choose), but not recommended in our methodology

top

Modeling: Abstraction

Write minimum code needed to generate scenarios of interest --- no need to be comprehensive

Choose level of abstraction (state variables, actions in IUT to test)

All state (files, messages in transit) is in model variables, nothing external
Global point of view, each agent can see into other agents' state
Each action (at chosen level of abstraction) is coded as a method in the model
(need not correspond 1:1 to methods in IUT)
Model program has a single thread,interleaving actions can represent concurrency
Actions are atomic, actions interleave, no interleaving within action bodies
Multiple assignments within an action can represent parallelism (global state)

Be careful with objects (no isomorph elimination), maybe use value types instead

Model introduces another state space, reconcile with IUT during conformance test

top

Modeling: Coding

To code each action

When is it enabled? (requires ...)
- multiple actions enabled in same state models nondeterminism,
- allows for interleaving concurrency
What (if anything) does it return? (return ...)
What is next state? Is it different? (assignments, ... = ... )

Distinguish top level [Action] methods from helper methods

Possibly write Main method(s) to simulate scenario(s)

top

Test design

Generating test suites from Spec# model

Rationale: there must be equivalence classes. We don't need exhaustive testing.

Different fault models call for different kinds of test suites:

Wrong logic/wrong expression:: Complete but minimal coverage over small domains
Problem scaling up data structures (like hash table resize, editor buffer gap):: Vary a few properties over large ranges
Unreliable infrastucture, hidden state leaks out:: Long test cases, revisit the same (model) states

top

Exploration

Exploration is the primary technique for analyzing model programs.

Exploration generates a Finite State Machine (FSM) from the model program for ---

Validation (visualization, check safety and liveness)
Offline test case generation

Exploration executes the model program in special environment, building the FSM as it goes.

Each invocation (method call including args) is a transition in the FSM
Execute all enabled invocations from a state (backtracking, in effect)
Execute each method with all combinations of arguments from given finite domain
(can simulate internal nondeterminism with additional arguments).

Generated FSM is usually an underapproximation of model program.

Generated FSM can be nondeterministic.

top

Exploration algorithm

Exploration treats model program state as first class.

Spec# compiler generates code with storage management hooks
(alternatively, could use regular compiler but special runtime)

The algorithm itself.

top

Exploration configuration

Model program defines all scenarios
Test tool configuration defines scenarios for a particular test suite

Make the FSM finite

Domains (of method call arguments)
Filters (Boolean expressions on state,can limit sizes of data structures etc)
State groupings (sampling based on properties)

Control exploration order (when sampling)

State weights (weight of next state, can depend on current state)
Action weights (frequency of operations, can match operational profile)
Breadth first (many short scenarios) vs. depth first (fewer long scenarios)
Limit N of states

top

Exploration with properties and state groupings

State groupings (sampling based on properties)

one property
multiple properties (union rather than product)
progress properties (for scenario control)
can use properties for isomorph elimination when objects are used

For exploration or visualization

top

Exploration scenario control

Scenario control - for this we sometimes have to alter model

Extra preconditions (Can... properties in preconditions)
Extra state variables (modes)
Can generate negative tests (that should fail)
- For example, model action with complementary precondition
- Bind positive and negative model actions to same IUT action
- Use action weights to set frequences of of positive, negative cases

top

Exploration methodology

Achieve understanding, check properties, scale up, repeat ...

First, use domains and filters to generate small but complete FSM
Next, sample a larger space using state groupings
Finally, specify exploration order using priorities and weights etc.
At each step, confirm generated FSM has the expected properties: visualize, check safety and liveness

top

Exploration vs. model checking

Complete exploration over small domains is like concrete state model checking
BUT we have no temporal logic formula to check against

Two temporal logic formulas are "baked in"

Safety checking (states that violate safety condition stop exploration)
Liveness checking (states with no path to "accepting" states are marked)

Exploration with sampling is quite different philosophy:
incomplete, use judgment and heuristics to sample from presumed equivalence classes

top

Test case generation

Offline test case generation: traverse FSM generated by exploration

Postman tour gives minimal link coverage (not path coverage)
(BUT maybe you don't want just link coverage, maybe don't want minimal coverage)
Identify "accepting states" where test run may terminate
Identify "cleanup actions" that make progress toward accepting state
Tool ensures each test case reaches accepting state (via cleanup actions)

Tool can store test suite internally for subsequent conformance test
OR tool can write out test suite as C# program

top

Conformance testing

Tool can act as test harness for conformance testing
Tool can reference and execute IUT (binary, DLL)
Model and IUT can be at different levels of abstraction, must reconcile model state space with IUT state space
- Write wrapper or test driver around IUT
- Wrapper can translate IUT values to model values
- [Probe] actions can return (translated) IUT state variables
Action bindings, type bindings defined in configuration
Object bindings made dynamically
Lockstep execution, model with IUT, check:
- actions are enabled
- correct return values

top

Testing nondeterministic systems

External nondeterminism: multiple actions enabled in the same state
Internal nondeterminism: one action may have multiple outcomes
We recommend simulate internal nondeterminism with input args chosen by tool
during exploration
Controllable actions: called by tool
Observable actions (events): originate from IUT, observed by tool
IUT is instrumented so test tool can track observable actions via callbacks

top

On-the-fly testing

Unlike offline test case generation: no FSM generation, no FSM traversal
Deals efficiently with nondet, doesn't bother considering choices not taken
Tool executes model and IUT lockstep
Tool uses model state to determine which actions enabled
Chooses from enabled actions according to weights in test config, can be based on operational profile.
Uses model as oracle for enabledness, return values
May revisit same model state repeatedly (probe for hidden states in IUT)
No stored FSM or test suite, on-they-fly test can run indefinitely
Can configure N of actions to run, or run until failure
After N, runs until "accepting state", invoking only "cleanup actions"

top

Integration with other test tools

Spec Explorer is completely self contained modeling and testing tool
BUT can also be integrated with other tools
Can write out test suites as C# programs
Provides an automation framework, facilities can be accessed programatically

top