Model-based testing with Spec# and Spec Explorer
Jon Jacky, University of Washington
jon@u.washington.edu
Introduction
Theme
Spec Explorer tool
Concepts and vocabulary
Spec# language
Concepts
Features
Modeling
Abstraction
Coding
Exploration
Algorithm
Configuration
Properties and state groupings
Scenario control
Methodology
vs. model checking
Test case generation
Conformance testing
Testing nondeterministic systems
On-the-fly testing
Integration with other test tools
- Speaker introduction (MSR/FSE visitor, experience with embedded controls, Z)
- Purpose: describe features, methodology; discuss motivation, design issues
- We support Specification and Testing
- We don't support Specification to Implementation - we don't propose a development method
- We won't discuss static analysis today (but Spec#/Boogie supports that too)
- Specifications are written by testers for testing
- Implementation (IUT) is a black box - we can only execute it to check conformance
(we can instrument the binary to connect it to the test harness)
- Testing checks actual IUT + infrastructure (the only assurance method which does)
- Use at MS: features were included (or excluded) based on experience
- Textbook in preparation
top
Spec# "model program" (executable specification) serves as
- Specification
- Test case generator
- Test oracle
top
Supports specification and testing
- Development of Spec# model programs ("literate programming" editor, debugger)
- Validation of Spec# model programs
- Simulation (Main execution)
- Exploration (FSM generation)
- Visualization
- Safety (reachability), liveness analysis
- Test case generation
- Conformance testing (includes test harness for lockstep execution with oracle)
Supports different ways of testing
- Offline test case generation or on-the-fly testing
- "Complete" coverage (of spec) or stochastic testing (operational profile)
top
State-based worldview as in Z, VDM, TLA, ...
- State
- Information stored in a system (in a program, state variables)
- Action
- Unit of behavior, may change state (but need not, could be a function)
- Scenario
- Sample of behavior, sequence of actions
- Level of abstraction
- Choice of variables and granularity of actions to test
(for example: actions are API method calls, state is certain static variables)
- Specification
- Represents all allowed scenarios (at chosen level of abtract.)
- Test case
- A particular scenario (could be one action, usually more)
- Test suite
- Collection of test cases
top
Extends C#, the production programming language
History: based on AsmL compiler, provides most AsmL features with C# syntax
Contract: (Part of) specification expressed in the programming language
Declarative contract: pre, post, invariant
- Runtime checking (non-conforming scenario will throw exception)
- Test case generation (preconditions are interpreted as guards)
- Static checking (complementary, different tool, we won't discuss today)
- Our methodology uses pre only, not post or invariant
Model program: method bodies
- Our methodology uses assignments (not assertions) to define postconditions
- Executable specifications confer many advantages
- Validation (simulation)
- Test case generation (computes next state, makes exploration possible)
- Conformance test (acts as oracle)
top
Declarative contracts
- requires, ensures - pre, post - any "pure" Boolean expressions
- ==>, <==> - implies, equivalence, new Boolean operators
- Exists, Exists1, Forall - Quantifiers, new kinds of Boolean expressions
Model programs
- structure - generalized struct (inheritance, recursion but still value types)
- Tuple, Set, Seq, Map - Collection types (immutable value types, no aliasing)
- Comprehensions - expressions whose values are collections
- Model programs typically contain no iteration, no index variables
- Internal nondeterminism (choose), but not recommended in our methodology
top
Write minimum code needed to generate scenarios of interest ---
no need to be comprehensive
Choose level of abstraction (state variables, actions in IUT to test)
- All state (files, messages in transit) is in model variables, nothing external
- Global point of view, each agent can see into other agents' state
- Each action (at chosen level of abstraction) is coded as a method in the model
(need not correspond 1:1 to methods in IUT)
- Model program has a single thread,interleaving actions can represent concurrency
- Actions are atomic, actions interleave, no interleaving within action bodies
- Multiple assignments within an action can represent parallelism (global state)
Be careful with objects (no isomorph elimination), maybe use value types instead
Model introduces another state space, reconcile with IUT during conformance test
top
To code each action
- When is it enabled? (requires ...)
- multiple actions enabled in same state models nondeterminism,
- allows for interleaving concurrency
- What (if anything) does it return? (return ...)
- What is next state? Is it different? (assignments, ... = ... )
Distinguish top level [Action] methods from helper methods
Possibly write Main method(s) to simulate scenario(s)
top
Generating test suites from Spec# model
Rationale: there must be equivalence classes.
We don't need exhaustive testing.
Different fault models call for different kinds of test suites:
- Wrong logic/wrong expression:
- Complete but minimal coverage over small domains
- Problem scaling up data structures (like hash table resize, editor buffer gap):
- Vary a few properties over large ranges
- Unreliable infrastucture, hidden state leaks out:
- Long test cases, revisit the same (model) states
top
Exploration is the primary technique for analyzing model programs.
Exploration generates a Finite State Machine (FSM)
from the model program for ---
- Validation (visualization, check safety and liveness)
- Offline test case generation
Exploration executes the model program in
special environment, building the FSM as it goes.
- Each invocation (method call including args) is a transition in the FSM
- Execute all enabled invocations from a state (backtracking, in effect)
- Execute each method with all combinations of arguments from given finite domain
(can simulate internal nondeterminism with additional arguments).
Generated FSM is usually an underapproximation of model program.
Generated FSM can be nondeterministic.
top
Exploration treats model program state as first class.
- Spec# compiler generates code with storage management hooks
(alternatively, could use regular compiler but special runtime)
- The algorithm itself.
top
Model program defines all scenarios
Test tool configuration defines scenarios for a particular test suite
Make the FSM finite
- Domains (of method call arguments)
- Filters (Boolean expressions on state,can limit sizes of data structures etc)
- State groupings (sampling based on properties)
Control exploration order (when sampling)
- State weights (weight of next state, can depend on current state)
- Action weights (frequency of operations, can match operational profile)
- Breadth first (many short scenarios) vs. depth first (fewer long scenarios)
- Limit N of states
top
State groupings (sampling based on properties)
- one property
- multiple properties (union rather than product)
- progress properties (for scenario control)
- can use properties for isomorph elimination when objects are used
For exploration or visualization
top
Scenario control - for this we sometimes have to alter model
- Extra preconditions (Can... properties in preconditions)
- Extra state variables (modes)
- Can generate negative tests (that should fail)
- For example, model action with complementary precondition
- Bind positive and negative model actions to same IUT action
- Use action weights to set frequences of of positive, negative cases
top
Achieve understanding, check properties, scale up, repeat ...
- First, use domains and filters to generate small but complete FSM
- Next, sample a larger space using state groupings
- Finally, specify exploration order using priorities and weights etc.
- At each step, confirm generated FSM has the expected properties:
visualize, check safety and liveness
top
Complete exploration over small domains is like concrete state model checking
BUT we have no temporal logic formula to check against
Two temporal logic formulas are "baked in"
- Safety checking (states that violate safety condition stop exploration)
- Liveness checking (states with no path to "accepting" states are marked)
Exploration with sampling is quite different philosophy:
incomplete,
use judgment and heuristics to sample from presumed equivalence classes
top
Offline test case generation: traverse FSM generated by exploration
- Postman tour gives minimal link coverage (not path coverage)
(BUT maybe you don't want just link coverage, maybe don't want minimal coverage)
- Identify "accepting states" where test run may terminate
- Identify "cleanup actions" that make progress toward accepting state
- Tool ensures each test case reaches accepting state (via cleanup actions)
Tool can store test suite internally for subsequent conformance test
OR tool can write out test suite as C# program
top
- Tool can act as test harness for conformance testing
- Tool can reference and execute IUT (binary, DLL)
- Model and IUT can be at different levels of abstraction,
must reconcile model state space with IUT state space
- Write wrapper or test driver around IUT
- Wrapper can translate IUT values to model values
- [Probe] actions can return (translated) IUT state variables
- Action bindings, type bindings defined in configuration
- Object bindings made dynamically
- Lockstep execution, model with IUT, check:
- actions are enabled
- correct return values
top
- External nondeterminism: multiple actions enabled in the same state
- Internal nondeterminism: one action may have multiple outcomes
- We recommend simulate internal nondeterminism with input args chosen by tool
- during exploration
- Controllable actions: called by tool
- Observable actions (events): originate from IUT, observed by tool
- IUT is instrumented so test tool can track observable actions via callbacks
top
- Unlike offline test case generation: no FSM generation, no FSM traversal
- Deals efficiently with nondet, doesn't bother considering choices not taken
- Tool executes model and IUT lockstep
- Tool uses model state to determine which actions enabled
- Chooses from enabled actions according to weights in test config,
can be based on operational profile.
- Uses model as oracle for enabledness, return values
- May revisit same model state repeatedly (probe for hidden states in IUT)
- No stored FSM or test suite, on-they-fly test can run indefinitely
- Can configure N of actions to run, or run until failure
- After N, runs until "accepting state", invoking only "cleanup actions"
top
- Spec Explorer is completely self contained modeling and testing tool
- BUT can also be integrated with other tools
- Can write out test suites as C# programs
- Provides an automation framework, facilities can be accessed programatically
top