Skip to content
This repository has been archived by the owner on Feb 15, 2024. It is now read-only.

Proposal for proof language #87

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
392 changes: 392 additions & 0 deletions proof-language.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,392 @@
---
geometry: margin=2cm
---

\newcommand{\limplies}{\to}
\renewcommand{\phi}{\varphi}


Matching logic proof format
===========================

This is a proposed matching logic proof format.

## Relation to Metamath

Note that this language heavily based on the metamath format for compressed proofs.
Besides being a binary rather than ASCII encoding, there are two major changes.

1. We use the same mechanism for constructing terms and proofs, allowing better
compression.
2. We allow emmitting multiple proofs from the same representation. This allows
us to represent lemmas and compression uniformly.
That is, we can store the entire contents of a single metamath file in this format,
rather than just the proof of a single lemma.

## Relation to postfix notation/RPN

The format may be viewed as a generalization
of postfix notation (incidentally also called Lukaseiwicz notation),
for an expression computing the conclusion of a proof.

We also allow marking subexpressions (i.e. either proofs or patterns) for reuse. This
allows us to represent lemmas, and compactly representing configurations and notation.

(This needs input from ZK folk.) Finally, we allow marking constructed proofs
as public. This serves two purposes. First, it allows us to publish lemmas
that may be reused in other proofs. Second, it allows us to publish the main
result proved by the proof.


## As a stack machine

It may also be viewed language interpreted using a stack machine.
This stack machines constructs terms---either patterns or proofs. We treat
patterns and proofs uniformly---at an abstract level, they are both terms we
must construct, with particular well-formedness criteria. It is intended for the
most part, that **the verifier only needs to keep proved conclusions in
memory**.


## Goals:

- No upfront parsing cost.
- Easily verifiable using ZK.
- Low memory usage.
- Composability.


## Non-goals:

- Parallization: I have not been thinking explicitly about parallel verification of this format,
because I expect parallization to happen by composing units of this format into larger proofs.
Even so I think with some additional metadata, we could make
this format parallizable.

- Human readability: We aim to be a simple binary format. To become human
readable, we expect the user to run a secondary program to pretty-print the
proof. Note that symbols, lemmas, and notation do not have names.


Terms
=====

As mentioned previously, we may view this format as a language for constructing
terms that meet particular well-formedness criteria. Let us first enumerate
those terms, in a python-esque pseudocode.

`Term`s are of two types, `Pattern`, and `Proof`.
Every `Term` has a well-formedness check.
Every `Proof` has a conclusion.

```python
abstract class Term:
abstract def well_formed:
...

abstract class Pattern(Term):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe consider adding Contexts that should have the distinguished hole variable check, as we discussed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I recall what we discussed. MetaVars already have an application_context field that is used for this check.

def well_formed():
# Fails if any two MetaVars with the same id has
# different meta-requirements
# MetaVars are introduced later.
...

abstract class Proof(Term):
abstract def conclusion:
...
```

`Pattern`s
----------

There are two classes of `Pattern`,
the expected concrete constructors for matching logic patterns,
and a representation for various "meta" patterns.

### Standard patterns

```python
class Symbol(Pattern):
name: uint32

class SVar(Pattern):
name: uint32

class EVar(Pattern):
name: uint32

class Implication(Pattern):
left: Pattern
right: Pattern

class Application(Pattern):
left: Pattern
right: Pattern

class Exists(Pattern):
var: EVar
subpattern : Pattern

class Mu(Pattern):
var: SVar
subpattern : Pattern

def well_formed():
return super.well_formed()
and subpattern.var_occurs_positively(var)
```

### Meta-patterns

Meta-patterns allow us to represent axiom- and theorem-schemas through the use
of metavariables:

```python
class MetaVar(Pattern):
name: uint32

# Meta-requirements that must be satisfied by any instantiation.
fresh: list[EVar | SVar] # variables that must not occur in an instatiation
positive: list[SVar] # variables that must only occur positively in an instatiation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't at least positive, negative rather be in Pattern? Pattern also have positive/negative occurrences of variables, there's little reason why we should compute them. These could also be actual references in the concrete pattern.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow. For concrete patterns, we don't need additional metadata to know whether something is positive or negative. We can just look at the term and check. For metavariables, that is not the case, since we cannot see "inside" the metavariable.

Note that a smart implementation, may keep track of positivity/negativity track in a clever way. That is not our concern here. This is not an implementation, its a specification of the language.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, it's a specification. I was pointing more to the current python implementation.

It is a bit redundant if you can hold direct references to them. Especially later on when we want to do substitutions more efficiently. But sure, we can do optimizations later.

negative: list[SVar] # variables that must only occur negatively in an instatiation
application_context: list[EVar] # Variables that must only occur as a hole variable in an application context.
```

Each `MetaVar` has a list of requirements that must be met by any instantiation.
These may also be used by the well-formedness checks for `Proof`s.

We also need to represent substitutions applied to `MetaVar`s.

```python
class ESubst(Pattern):
pattern: MetaVar
var: EVar
plug: Pattern

class SSubst(Pattern):
pattern: MetaVar
var: SVar
plug: Pattern
```


`Proof`s
--------

### Axiom Schemas

Axiom schemas are `Proof`s that do not need any input arguments.
They may use `MetaVar`s to represent their schematic nature.

```python
class Lukaseiwicz(Proof):
def conclusion():
phi1 = MetaVar('phi1')
return Implication(Implication(Implication(MetaVar(phi1) , ...)...)...)

class Quantifier(Proof):
def conclusion():
x = EVar('#x')
y = EVar('#y')
phi = MetaVar('phi', fresh=[y])
return Implication(ESubst(phi, x, y), Exists(x, phi))

class PropagationOr(Proof):
def conclusion():
hole = EVar('#hole')
C = MetaVar(application_context=(EVar('#hole'),))
phi1 = MetaVar('#phi1')
phi2 = MetaVar('#phi2')
return Implication(ESubst(C, or(phi1, phi2), hole), or(ESubst(C, phi1, hole), ESubst(C, phi2, hole)))

...
```


### Meta Variable Instantiation

Using a single rule of meta inference, we may instantiate metatheorems.
This allows us to prove theorem schemas, such as $\phi \limplies \phi$.

```python
class InstantiateSchema(Proof):
subproof : Proof
metavar_id: uint32
instantiation: Pattern

def well_formed():
# Fails if the instantiation does not meet the
# disjoint/positive/freshness/application ctx
# criterea of the metavar.

def conclusion():
return subproof.meta_substitute(metavar, instantiation)
```

We may also use metavariables to represent notation.

```python
class InstantiateNotation(Pattern):
notation: Pattern
metavar_id: uint32
instantiation: Pattern

def well_formed():
# Fails if the instantiation does not meet the
# disjoint/positive/freshness/application ctx
# criterea of the metavar.

def conclusion():
return notation.meta_substitute(metavar, instantiation)
```


### Ordinary inference

```python
class ModusPonens(Proof):
premise_left: Implication
premise_right: Pattern

def conclusion():
nishantjr marked this conversation as resolved.
Show resolved Hide resolved
return premise_left.right

def well_formed():
assert premise_right == premise_left.left

class Generalization(Proof):
premise: Implication

def phi1():
premise.left
def phi2():
premise.right

def conclusion():
return Implication(ESubst(MetaVar(phi), EVar(x), EVar(y)), Exists(EVar(x), MetaVar(phi)))

def well_formed():
assert EVar(x) is fresh in phi1()
...
```

Instructions and semantics
==========================

Each of these instructions checks that the constructed `Term` is well-formed before pushing onto the stack.
Otherwise, execution aborts, and verification fails.

* Supporting:

`List n:uint32`
: Consume $n$ items from the stack, and push a list containing those items to the stack.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add the explanation we discussed, please.


* Variables and Symbols:

`EVar <uint32>`
Copy link
Contributor

@fiedlr fiedlr Jul 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's think about whether this should pop from the stack instead (adding this comment just for reference).

: Push an `EVar` onto the stack.

`SVar <uint32>`
: `Push an `SVar` onto the stack.

`Symbol <uint32>`
: `Push an `Symbol` onto the stack.

`MetaVar <uint32>`
: Consume the first four entries from the stack (corresponding to the meta-requirements),
and push an `MetaVar` onto the stack.

* Connectives:

`Implication`/`Application`/`Exists`/`Mu`
: Consume the first two patterns from the stack, and push an implication/application/exists/mu onto the stack with appropriate argumetns.

* Axiom Schemas

`Lukaseiwicz`/`Quantifier`/`PropagationOr`/`PropagationExists`/`PreFixpoint`/`Existance`/`Singleton`
: Push Proof term corresponding to axiom schema onto the stack.

* Meta inference

`InstantiateSchema <metavar_id:uint32>`
: Consume a `Proof` and `Pattern` off the stack, and push the instantiated proof term to the stack.

`InstantiateNotation <metavar_id:uint32>`
: Consume two `Pattern`s off the stack, and push the instantiated proof term to the stack.

* Inference rules

`ModusPonens`/`Generalization`/`Frame`/`Substitution`/`KnasterTarski`
: Consume one or two `Proof` terms off the stack and push the new proof term.

* Memory manipulation:

`Save`
: Store the top of the stack to the lowest unused index in memory.

`Load i:uint32`
: Push the `Term` at index $i$ to the top of the stack.

`Delete i:uint32`
: Remove the `Term` at index $i$ from memory. This is not strictly needed,
but will allow the verifier to use less memory. The memory slot is not considered free.

* Journal manipulation.

`Publish`
: publish the entry at the top of the stack to the journal.

* Stack manipulation.

`Pop`
: Consume the top of the stack. (Do we need this? Probably not.)

Verification
============

The verifier takes two inputs[^1]:

1. `theory` A file with instructions to construct patterns
and place them on the stack to represent the theory $\Gamma$.
This is public.

All patterns on the stack after running this file are considered axioms.
Before verification of the proof, these are moved to the first free memory locations (That is, after the notation).

All patterns placed in memory using the `Save` instruction may be used as notation.
Note that these patterns are not considered proved, and can only be used to buildother patterns.

TODO: We should review this. Do we want a specific instruction for publishing notation?

2. `lemmas`: A file with instructions to construct patterns corresponding the
lemmas previously proved in other proof files. An aggregation circuit will
need to check that the theory used by that proof is a subset of the theory
for the corrent proof.

All patterns on the stack after running this file are considered proved lemmas.

TODO: Should we allow notation from these as well?
If we do, composing proofs may get difficult---notation may accumulate
forcing the checker to use a lot more memory than needed.

3. `proof`: A file with instructions to construct proofs about the theory.
This file is private, and should be rolled-up by ZK.

The verifier consumes the theory file. The stack should now only include `Pattern` terms.
These are moved into the initial entries of the memory, as `Proof` terms.
The verifier then consumes the `proof` file and produces a "journal"[^2],
containing a list of instructions to construct the patterns of theorems proved.

[^1]: This needs input from the ZK folk for the exact input mechanism.
[^2]: Using Risc Zero terminology

The verifier state consists of a stack of `Term`s, and
an indexed list of saved `Term`s, called the memory.

Composition
===========

We may compose proofs together, by allowing proofs to reuse results contained in another proofs journal.
A subset of the contents of the journal of a proof $p$ may be included as the input lemmas of another proof $q$,
**so long as** the gamma of $p$ is subset of gamma of $q$.

Loading