Using the F L i P proof checker

Jonathan Jacky

F L i P Home   Download   Reference   Logics   Code   Tests   Notes   PyPI   GitHub

Flip is a logical framework written in Python. One Flip application is a proof checker for entering and editing proofs in natural deduction style. This page is a tutorial and user's guide; there is also a complete reference. For an introduction to logic and proof in this style, consult a textbook such as Kaye, Huth and Ryan, or Bornat.

Natural deduction Interactive proof Save and restore Err and recover Show information Batch proof Examples Bibliography

Natural deduction

The checker works with proofs expressed in natural deduction style. The checker can use different logics; Flip comes with several. Here is a proof in first-order logic based on an example from Kaye, displayed by the checker's pp command:

Kaye ex. 9.12, ~Ax.P(x) |- Ex.~P(x)  (0)  Comment
~Ax.P(x)                  (1)  Given
|~Ex.~P(x)                (2)  Assumption
||Let x be arbitrary      (3)  New variable for subproof
|||~P(x)                  (4)  Assumption
|||Ex.~P(x)               (5)  E-Introduction (4)
|||F                      (6)  Contradiction (5) (2)
||~~P(x)                  (7)  Reductio Ad Absurdum (4) (6)
||P(x)                    (8)  Not-Elimination (7)
|Ax.P(x)                  (9)  A-Introduction (3) (8)
|F                       (10)  Contradiction (9) (1)
~~Ex.~P(x)               (11)  Reductio Ad Absurdum (2) (10)
Ex.~P(x)                 (12)  Not-Elimination (11)

A proof in this style is a sequence of steps. Each step must include a formula (on the left above) and a justification (on the right). The justification always includes the name of an inference rule and often includes premises, other steps in the proof. Premises are indicated by their step numbers. Proofs begin with steps whose rule is Given or Assumption, which have no premises.

Making an assumption begins a subproof, indicated here by indentation and vertical bars. Discharging an assumption closes the subproof. A step in a subproof can only be justified by premises in the same subproof or in enclosing subproofs (including the main proof).

Interactive proof

The checker supports forward proof. We can only work on one goal at a time. We can only add a step at the end of the proof, or delete steps from the end. If a proffered step is invalid, the checker prints an error message and does not add the step to the proof. Therefore, every step in the stored proof is valid and justified.

The only user interface to the checker is the Python interpreter itself. Formulas, inference rules, and entire proofs are Python expressions. Checker commands are Python functions. You do not need to know much Python to use the checker, just follow the directions here.

Here we demonstrate a few prover commands to enter a proof with no errors. Demonstrations of error handling appear below. A complete list of commands (including several not shown here) appears in the reference.

To begin an interactive proof session, start the Python interpreter with the session file for the logic we wish to use. The proof in this example uses first-order logic (fol). Other Flip logics are discussed in the reference.

python -i -m flip.logic.fol_session

(It is also possible to use the poset logic with python -i -m flip.poset.session. This will be the recommended way to use logics added after version 1.2.)

Python starts, runs the session file, and prints its command prompt. Now we can type checker commands at the prompt.

In many Python versions, we can edit in the command line with the right- and left-arrow keys and the backspace and delete keys. We can retrieve earlier commands to repeat or edit with the up- and down-arrow keys.

Use the clear command to start a new proof, or to start over. All command invocations are function calls, so we must put parentheses after the command name, even when there are no arguments:

>>> clear()

Use the checkp command (check and print) to enter the next proof step. The arguments to this command are the formula, the rule, and possibly some premises.

At this time Flip does not provide a parser. Formulas are ordinary Python expressions. We must enter formulas in prefix syntax with operator names. Then checkp prints the formula in infix syntax with operator symbols:

>>> checkp(Not(A(x, P(x))), given)
~Ax.P(x)                  (0)  Given

The variable x and the relation P are defined by running the fol_session module, as are all the logical operators including Not and A (for all). That module defines several other variables, relations, and functions, and we can define more during a proof session; see the reference.

Python uses zero-based indexing, so the first proof step is always number 0. We usually like to put a comment there, so the first formula in the proof appears in step 1. Let's remove that step 0 and put in a comment instead. Use the back command to remove the last step from the proof.

>>> back()

The comment goes in the formula argument to checkp, coded like this: Text('...'). The rule argument is comment. Once again, checkp prints the step.

>>> checkp(Text('Kaye ex. 9.12'), comment)
Kaye ex. 9.12             (0)  Comment

Use checkp to enter the first few steps

>>> checkp(Not(A(x, P(x))), given)
~Ax.P(x)                  (1)  Given
>>> checkp(Not(E(x, Not(P(x)))), assume)
|~Ex.~P(x)                (2)  Assumption
>>> checkp(New(x), new)
||Let x be arbitrary      (3)  New variable for subproof
>>> checkp(Not(P(x)), assume)
|||~P(x)                  (4)  Assumption

Then use pp (pretty-print) to show our progress so far:

>>> pp()
Kaye ex. 9.12             (0)  Comment
~Ax.P(x)                  (1)  Given
|~Ex.~P(x)                (2)  Assumption
||Let x be arbitrary      (3)  New variable for subproof
|||~P(x)                  (4)  Assumption

Use checkp to add a step with a premise. The premise is the third argument.

>>> checkp(E(x, Not(P(x))), Ei, 4)
|||Ex.~P(x)               (5)  E-Introduction (4)

Some rules have more than one premise; use as many arguments as needed.

>>> checkp(F, contra, 5, 2)
|||F                      (6)  Contradiction (5) (2)

The rapply (rule apply) command is an alternative to checkp for steps that have premises. It is usually the most convenient way to enter a step. Just enter the rule and the premise numbers; rapply itself generates the formula.

>>> rapply(raa,4,6)
||~~P(x)                  (7)  Reductio Ad Absurdum (4) (6)
>>> rapply(ne,7)
||P(x)                    (8)  Not-Elimination (7)
>>> rapply(Ai,3,8)
|Ax.P(x)                  (9)  A-Introduction (3) (8)
>>> rapply(contra,9,1)
|F                       (10)  Contradiction (9) (1)
>>> rapply(raa,2,10)
~~Ex.~P(x)               (11)  Reductio Ad Absurdum (2) (10)
>>> rapply(ne,11)
Ex.~P(x)                 (12)  Not-Elimination (11)

To use rapply with the Ae rule, you must also provide a term or variable to substitute for the bound variable, like this: rapply(Ae, 1, x). For the Ei rule, you must also provide a dictionary that shows how to replace the term in the premise with a bound variable. For example, we could do step 5 (above) this way:

>>> rapply(Ei,4,{x:x})
|||Ex.~P(x)               (5)  E-Introduction (4), with { x:x }

Here the dictionary {x:x} commands the checker to replace the term x in step 4 with the bound variable x.

Use pp to show the finished proof:

>> pp()
Kaye ex. 9.12             (0)  Comment
~Ax.P(x)                  (1)  Given
|~Ex.~P(x)                (2)  Assumption
||Let x be arbitrary      (3)  New variable for subproof
|||~P(x)                  (4)  Assumption
|||Ex.~P(x)               (5)  E-Introduction (4)
|||F                      (6)  Contradiction (5) (2)
||~~P(x)                  (7)  Reductio Ad Absurdum (4) (6)
||P(x)                    (8)  Not-Elimination (7)
|Ax.P(x)                  (9)  A-Introduction (3) (8)
|F                       (10)  Contradiction (9) (1)
~~Ex.~P(x)               (11)  Reductio Ad Absurdum (2) (10)
Ex.~P(x)                 (12)  Not-Elimination (11)

The ptree (print tree) command is an alternative to pp that prints the proof as a tree with the conclusion at the root, where each premise is indented under the formula it justifies:

>>> ptree()
Ex.~P(x)                 (-1)  Not-Elimination (11)
  ~~Ex.~P(x)               (11)  Reductio Ad Absurdum (2) (10)
    ~Ex.~P(x)                (2)  Assumption
    F                       (10)  Contradiction (9) (1)
      Ax.P(x)                  (9)  A-Introduction (3) (8)
        Let x be arbitrary      (3)  New variable for subproof
        P(x)                    (8)  Not-Elimination (7)
          ~~P(x)                  (7)  Reductio Ad Absurdum (4) (6)
            ~P(x)                  (4)  Assumption
            F                      (6)  Contradiction (5) (2)
              Ex.~P(x)               (5)  E-Introduction (4)
                ~P(x)                  (4)  Assumption
              ~Ex.~P(x)                (2)  Assumption
      ~Ax.P(x)                  (1)  Given

This format suggests an alternative to forward proof: backward proof from conclusions to premises. However, at this time the checker does not support backward proof.

Save and restore

Use the save command to save work in progress to resume later, or to save a completed proof. The save command takes one argument, a string, the name of the module where the proof is saved:

>>> save('ex912')
Saved in

The save command writes the file in the current working directory, replacing the file if it already exists. The saved file is a Python module with one attribute: the saved proof, whose name is the same as the module name. The proof can be restored into a Python session by using the import statement:

>>> from ex912 import ex912

To resume work on the proof, execute the check_proof command with the name (not quoted) we assigned. The check_proof command checks and prints each step:

>>> check_proof(ex912)
Kaye ex. 9.12             (0)  Comment
~Ax.P(x)                  (1)  Given
|~Ex.~P(x)                (2)  Assumption
... etc. ...
Ex.~P(x)                 (12)  Not-Elimination (11)

Now we could continue editing, issuing checker commands at the prompt.

Err and recover

The preceding example is a bit unrealistic because all of the proffered steps are valid. The whole purpose of a proof checker is to catch invalid steps and other errors. If a proffered step is erroneous, the checker prints a message, and the step is not added to the proof. The stored proof is still valid, so we can always recover by simply trying again.

Invalid steps

At step 5, we might have attempted:

>>> checkp(E(x,P(x)),Ei,4)
|||Ex.P(x)                (5)  E-Introduction (4)
'Fail: Ex.P(x) does not match Ev1.S1 with { S1:~P(x), v1:x }'

Here checkp checks its formula and premise arguments against the inference rule for E-introduction. They do not match, so the step fails. The message indicates that the formula contains P(x) while the premise contains ~P(x). The formula should contain ~P(x) as well. Try again:

>>> checkp(E(x,Not(P(x))),Ei,4)
|||Ex.~P(x)               (5)  E-Introduction (4)

This step is valid; checkp accepts it and adds it to the proof.

A complete list of error messages appears in the reference.

Type errors

Flip's first-order logic distinguishes two types of formulas: Formula (with a capital F), and Term. Arguments of logical operators (And, etc.) must be Formulas, while arguments of functions or relations (Equal, etc.) must be Terms. Variables are Terms, while Letters (Boolean variables) are Formulas. Using a Letter where a Variable is expected is a type error:

>>> checkp(E(q,Not(P(q))),Ei,4)
Traceback (most recent call last):
  File "", line 1, in 
  File "C:\Users\jon\Documents\prover\code\", line 41, in __init__
    Relation.__init__(self, *args)
  File "C:\Users\jon\Documents\prover\code\", line 154, in __init__
    check_type(self, Term, *args)
  File "C:\Users\jon\Documents\prover\code\", line 607, in check_type
    (self.__class__.__name__, i, a.ppf(), type(a), arg_type)
TypeError: P argument 0, q is <class 'formula.Letter'>, must be <class 'formula.Term'>

The problem indicated here is that the formula uses the Letter q as the argument of the relation P, where a Variable or another Term is required.

Another kind of type error occurs when the first argument of the checkp command is not a formula. For example, recall that a comment must be encoded this way: Text('...'). If we fail to do this, it is a type error:

>>> checkp('Kaye ex. 9.17 ...', comment)
"Fail: Kaye ex. 9.17 ... is <type 'str'>, must be Formula"

Syntax errors

Commands and formulas are just Python expressions, so syntax errors are signaled by the Python interpreter:

>>> checkp(E(x,Not(P(x)))),Ei,4)
  File "", line 1
SyntaxError: invalid syntax

The problem here is too many closing parentheses after P(x).

If we attempt to use an identifier that is not defined, Python prints a message like this:

>>> checkp(E(j,Not(P(j))),Ei,4)
Traceback (most recent call last):
  File "", line 1, in 
NameError: name 'j' is not defined

The problem here is that we tried to use j as a variable, but it is not defined in the fol_session module, or in the current session.

Command invocations are function calls so we must include the parentheses, even when there are no arguments. If we omit them, Python prints a message like this:

>>> pp
<function pp at 0x02232270>

Show information

The dir command shows identifiers that are defined in the Python session, including all the prover commands, rules, logical operators, variables, functions and relations:

>>> dir()
['A', 'Ae', 'Ai', 'And', 'Apply', 'E', 'Ee', 'Ei', 'Equal', 'F', 'Formula', 
... etc. ...
'top', 'trans', 'u', 'v', 'v1', 'v2', 'w', 'x', 'xr', 'xx', 'y', 'z']

Type any identifier to print information about it:

>>> x
<formula.Variable object at 0x02222A30>
>>> p
<formula.Letter object at 0x02222AB>
>>> P
<class 'fol.P'>

The help command often prints useful information:

>>> help(P)
Help on class P in module fol:

class P(formula.Relation)
 |  Example of relation, used in Kaye ex 9.16, 9.17 , Huth and Ryan sect. 2.3
 |  ... etc. ... 

The help command prints the arguments and other information about any function, including prover commands:

>>> help(checkp)
Help on function checkp in module nd:

checkp(formula_arg, rule, *premises_etc)
    Check step and update proof state, pretty-print step, print error

The rules command shows all the inference rules in the session. In each rule, the premises appear first, the conclusion appears last:

>>> rules()
[('comment', ['m1']),
 ('impli', ['m1', 'm2', 'Impl(m1,m2)']),
 ('oil', ['m2', 'Or(m1,m2)']),
 ('raa', ['m1', 'F', 'Not(m1)']),
 ('ai', ['m1', 'm2', 'And(m1,m2)']),
 ... etc. ...

The apropos command shows the inference rules that involve the operator given in the argument:

>>> apropos(Not)
[('raa', ['m1', 'F', 'Not(m1)']),
 ('oer', ['Or(m1,m2)', 'Not(m2)', 'm1']),
 ('oel', ['Or(m1,m2)', 'Not(m1)', 'm2']),
 ('ne', ['Not(Not(m1))', 'm1']),
 ('contra', ['m1', 'Not(m1)', 'F'])]

The rule_names attribute associates each rule name with its description. Use pprint to format the output:

>>> pprint(rule_names)
{'Ae': 'A-Elimination',
 'Ai': 'A-Introduction',
 ... etc. ...

Batch proof

An entire proof can be represented by a Python expression: a list of tuples, where each tuple is a step. We can use a text editor to create a file that contains a proof. Here are the contents of

from fol_session import *

ex912 = \
  [(Text('Kaye ex. 9.12, ~Ax.P(x) |- Ex.~P(x)'), comment),
    (Not(A(x, P(x))), given),
    (Not(E(x, Not(P(x)))), assume),
    (New(x), new),
    (Not(P(x)), assume),
    (E(x, Not(P(x))), Ei, 4),
    (F, contra, 5,2),
    (Not(Not(P(x))), raa, 4,6),
    (P(x), ne, 7),
    (A(x, P(x)), Ai, 3,8),
    (F, contra, 9,1),
    (Not(Not(E(x, Not(P(x))))), raa, 2,10),
    (E(x, Not(P(x))), ne, 11)]


Run Python on the file to check the proof:

> python
Valid A-intro, Kaye ex. 9.12, ~Ax.P(x) |- Ex.~P(x)  (0)  Comment
~Ax.P(x)                  (1)  Given
|~Ex.~P(x)                (2)  Assumption
||Let x be arbitrary      (3)  New variable for subproof
... etc. ...

The check_proof function checks and prints each step. If the step fails, check_proof prints an error message and exits.


The test modules contain many proofs in each of the Flip logics. They demonstrate every formula class and inference rule. Each test case is coded as a batch proof.


The proof format printed by the pp command resembles the one in Kaye. Similar formats are used by Huth and Ryan, Bornat, and many other logic textbooks. Sometimes this is called Fitch style. Other proof styles are discussed by Gries and Schneider.

Kaye, The Mathematics of Logic

Huth and Ryan, Logic in Computer Science

Bornat, Proof and Disproof in Formal Logic

Gries and Schneider, A Logical Approach to Discrete Math

F L i P Home   Download   User's guide   Reference   Logics   Code   Tests   Notes   PyPI   GitHub