Specification Refinement
When teaching formal methods, I often get asked how we can connect the specifications we write to the code we produce. One way we do this is with refinement. All examples are in TLA+.1
Imagine we’re designing a multithreaded counter. We don’t know how we’re going to implement it yet, so we start by specifying the abstract behavior.2
---- MODULE abstract ----
EXTENDS Integers, Sequences
VARIABLES pc, counter
vars == <<pc, counter>>
\* Two threads
Threads == 1..2
\* State transition action
Trans(thread, from, to) ==
/\ pc[thread] = from
/\ pc' = [pc EXCEPT ![thread] = to]
\* Both threads start in the `start` state
Init ==
/\ pc = [t \in Threads |-> "start"]
/\ counter = 0
Next ==
\* Pick one thread
\/ \E t \in Threads:
\* Move it to `done`
/\ Trans(t, "start", "done")
\* Increment counter
/\ counter' = counter + 1
Spec == Init /\ [][Next]_vars
====
We find out that our implementation system can’t do atomic updates. If we want to increment the counter, we have to nonatomically read the value into a local variable and write a new value in a separate step. We add that to a new specification:
---- MODULE bad ----
EXTENDS Integers, Sequences
VARIABLES pc, counter, tmp
vars == <<pc, counter, tmp>>
Threads == 1..2
States == {"start", "inc", "done"}
Trans(thread, from, to) ==
/\ pc[thread] = from
/\ pc' = [pc EXCEPT ![thread] = to]
Init ==
/\ pc = [t \in Threads |-> "start"]
/\ counter = 0
/\ tmp = [t \in Threads |-> 0]
GetCounter(t) ==
/\ tmp' = [tmp EXCEPT ![t] = counter]
/\ UNCHANGED counter
IncCounter(t) ==
/\ counter' = tmp[t] + 1
/\ UNCHANGED tmp
Next ==
\/ \E t \in Threads:
\* Step to get the counter value
\/ /\ Trans(t, "start", "inc")
/\ GetCounter(t)
\* Step to increment the counter
\/ /\ Trans(t, "inc", "done")
/\ IncCounter(t)
Spec == Init /\ [][Next]_vars
====
We now have an additional thread state and a local tmp
per thread. This is trying to do the same thing as abstract
with a more fleshed-out implementation. I want to encode that “trying to do” as a property. Is bad
a faithful implementation of abstract
?
Refinement
TLA+ specifications are “just” temporal logic predicates, or boolean functions. If we have
Spec == Init /\ [][Next]_vars
Then Spec
is the predicate “Init
is true for the initial state and [Next]_vars
is true of all subsequent states.” This specifies our system: a sequence of states counts as a system behavior if-and-only-if Spec
is true for that sequence.
When we tell TLC to check Inv
as an invariant of Spec
, that’s equivalent to verifying that Spec => []Inv
. If Spec
is true of a sequence of states, then Inv
is true for all states in that behavior. That makes Inv
an invariant. Liveness properties, like “L
is eventually true”, work the same way.
We say Impl
refines Abstract
if
ImplSpec => AbstractSpec
If ImplSpec
is true, then AbstractSpec
is also true. But AbstractSpec
is the predicate that describes valid behaviors of Abstract
. So if this statement is true, any behavior of Impl
is also a behavior of Abstract
. If we make Impl
a more detailed version of Abstract
, we can use this to “close the gap” between our high-level behavior abstractions and our low-level code.
I think of it like a game: we initialize both specs and make “moves” in the Impl
state space. After every move, we make an “equivalent” move in the Abstract
state space. If we can always make an “equivalent” move, then Impl
refines Abstract
. Otherwise it’s possible for Impl
to do something that Abstract
cannot do, so Impl
cannot be a more detailed implementation of Abstract
.3
To prove refinement, we need to specify what counts as an “equivalent” move. That means finding a way to transform bad
behaviors into abstract
behaviors. This transformation is called a refinement mapping.
Checking Correspondence
bad
is more detailed than abstract
: it has one more thread state and an additional variable. As part of the refinement mapping we want to say that these are “implementation details” that don’t matter to the refinement. We do this the INSTANCE
keyword. This lets us import another module into our current one, but replace the values of state variables with expressions.
Mapping ==
INSTANCE abstract WITH
counter <- counter,
pc <- [t \in Threads |->
IF pc[t] = "inc" THEN "start"
ELSE pc[t]
]
Refinement == Mapping!Spec
To understand what’s going on here, let’s look at a step where pc[1]
transitions from "start"
to "inc"
. This ends us in a valid state for bad
, but and invalid one for abstract
, which only has threads states "start"
and "done"
. However, in our refinement mapping, we replace all instances of "inc"
with "start"
. So Mapping!pc[1] = "start"
, which is a valid state for abstract
.
Now that we have a refinement mapping, we can test Refinement
as a temporal property. 4
Error: Action property of module abstract is violated.
Error: The behavior up to this point is:
State 1:
/\ counter = 0
/\ pc = <<"start", "start">>
/\ tmp = <<0, 0>>
State 2:
/\ counter = 0
/\ pc = <<"start", "inc">>
/\ tmp = <<0, 0>>
State 3:
/\ counter = 0
/\ pc = <<"inc", "inc">>
/\ tmp = <<0, 0>>
State 4:
/\ counter = 1
/\ pc = <<"done", "inc">>
/\ tmp = <<0, 0>>
State 5:
/\ counter = 1
/\ pc = <<"done", "done">>
/\ tmp = <<0, 0>>
This error trace is a valid behavior for bad
, but it is not one for abstract
. There is no way to take the abstract
spec and end with both threads in "done"
and counter = 1
. Therefore, we conclude that our implementation is not a valid refinement of our design.5
Fixing this
How can we fix the race condition? By implementing a lock, of course!
---- MODULE good ----
EXTENDS Integers, Sequences
VARIABLES pc, counter, tmp, lock
vars == <<pc, counter, tmp, lock>>
Threads == 1..2
States == {"start", "inc", "done"}
Trans(thread, from, to) ==
/\ pc[thread] = from
/\ pc' = [pc EXCEPT ![thread] = to]
Init ==
/\ pc = [t \in Threads |-> "start"]
/\ counter = 0
/\ tmp = <<0, 0>>
/\ lock = 0
AcquireLock(t) ==
/\ lock = 0
/\ lock' = t
ReleaseLock(t) ==
/\ lock = t
/\ lock' = 0
GetCounter(t) ==
/\ tmp' = [tmp EXCEPT ![t] = counter]
/\ UNCHANGED counter
IncCounter(t) ==
/\ counter' = tmp[t] + 1
/\ UNCHANGED tmp
Next ==
\/ \E t \in Threads:
\/ /\ Trans(t, "start", "inc")
/\ GetCounter(t)
/\ AcquireLock(t)
\/ /\ Trans(t, "inc", "done")
/\ IncCounter(t)
/\ ReleaseLock(t)
Spec == Init /\ [][Next]_vars
Mapping ==
INSTANCE abstract WITH
counter <- counter,
pc <- [t \in Threads |->
IF pc[t] = "inc" THEN "start"
ELSE pc[t]
]
Refinement == Mapping!Spec
====
This passes the model checker: good
is a valid refinement of abstract
. We can then refine good
into a more detailed spec, perhaps closer to our machine bytecode, and eventually refine the spec into executable code. There isn’t any tooling to do this in TLA+, but it’s possible. I’ll talk a bit more about spec-to-code refinements later.
Refinement Guarantees in TLA+
Surprisingly, good
also a valid refinement of bad
! You can think of the refinement as showing that the implementation behaviors map to a subset of the abstract behaviors. good
maps to a subset of bad
’s behaviors, the subset of bad
that also conforms with abstract
.
One consequence of this: we know that a given Impl
behavior is also an Abstract
behavior, but we don’t know if a given Abstract
behavior is possible in Impl
. We can’t guarantee that every state of Abstract
is reachable by our refinement. But we can guarantee all of Abstract
’s safety and liveness properties:
- If
Abstract
will never do a “bad” thing, thenImpl
will never do it, either. - If
Abstract
is guaranteed to do a “good” thing, thenImpl
is guaranteed to do it, too.
good
can have additional properties beyond abstract
, such as safety and liveness properties on tmp
and counter
. In general, Impl
will have a strict superset of Abstract
’s properties.
Refining is Hard
I’ve not seen much industrial use of refinement. Part of this is a lack of knowledge and resources, but there are also a few technical challenges with refining specifications.
The first is that implementations are much larger than specifications, so refinements of a spec will be larger too. As a rule of thumb, the difficulty to write and verify a spec is exponential on its length. That’s why most specs are either high-level models of large systems or low-level models of small systems. With refinement you have the worst of both worlds.
Additionally, all refinements (in TLA+) need to totally describe the behavior of Abstract
. Since all state variables need to be refined, we can’t declare any behavioral properties as unnecessary for the refinement.6 Since good
simultaneously updates lock
and tmp
, all further refinements of good
must also simultaneously change those two variables. We couldn’t write a spec which sets the lock and updates tmp
as two separate actions.7
Finally, part of the reason we use formal specifications in the first place is to get away from the messiness of machine code. If you try to refine all the way from abstract spec to implementable code, at some point you’ll have to deal with Unicode and floating point semantics and filesystems and all that jazz. For these reasons, while refining specifications to code is possible, refining specifications to more detailed specifications is (relatively) more popular.
Ultimately, while I’m optimistic about broader industry adoption of formal methods, I don’t think many people will use refinement. It adds too many essential difficulties on top of what’s already a challenging discipline.8
Refinement in other formalisms
This article was TLA-centric because that’s the specification language I know best. But user-friendly spec-to-code refinement is one of the holy grails of formal methods, so lot of formalisms explore it in some way. To my understanding, the most “refinement-oriented” specification language is Event-B, which I say entirely because refinement is part of the core syntax. It also has a lot of third party tooling for refining specifications to code. But I haven’t used Event-B enough to know how much either happens in practice.
You can also refine code to other code. This isn’t often done because if you already have the code, you can just use the code. But it can make sense if you need to convert between languages. Most famously, seL4 was first specified in Isabelle/HOL, then refined into Haskell, and then refined again into C.
Thanks to Ron Pressler for feedback and corrections. Any further errors are mine alone. I shared the first draft of this essay on my newsletter. If you like my writing, why not subscribe?
- Disclaimer, I’m only talking about refinement as it relates to formal specification of systems. So I’m not covering things like stepwise refinement or the refinement calculus. Nothing against them, they’re just out of scope here. [return]
- If you’re following along with the TLA+ toolbox, you’ll need to disable deadlock checking. When the algorithm terminates, there are no possible next states, so the model “deadlocks”. [return]
- “Formal methods as adversarial game” has a long history and is often used to prove bisimulation. [return]
- It’s a temporal property and not an invariant because we’re looking at both states and state transitions here. See Action Properties. [return]
- Or, more precisely, that
Mapping
is not a correct refinement mapping. It could be that we just misunderstood the relationship between our design and implementation, and a different transformation would show us a valid refinement. However, seeing the refinement mapping fail should make us less confident that there is a refinement. [return] - TLA+ has a mechanism for variable hiding, the temporal existential operator (
\EE
), which lets us remove state variables from refinement mappings. But the model checker can’t check specifications which use it. [return] - This gets especially annoying when you’re using auxilary variables to check complex properties. The refinement also needs to map the aux vars, even though they have no bearing on behavior whatsoever. [return]
- One place I think refinement could be very useful: refining partial specifications. That’d be when
Abstract
is an incomplete specification that covers only a small part of the system’s state. You could, for example, write a specification for a stack ADT and use refinement to verify that some state variables in your system behave like stacks. This would be one way to make reusable libraries and get around the composition problem. [return]