Using Abstract Data Types in TLA+
In my 2021 TLAConf Talk I introduced a technique for encoding abstract data types (ADTs). For accessibility I’m reproducing it here. This post is aimed at intermediate-level TLA+ users who already understand action properties and the general concept of refinement.
Say we want to add a LIFO stack to a specification. You can push to and pop from the end of this stack but you cannot change data in the middle. We could represent this internally with a sequence, along with Push
and Pop
actions:
VARIABLES seq
Push(x) ==
stack' = Append(seq, x)
\* For simplicity we can pop an empty stack
\* Which will be a noop
Pop ==
stack' = SubSeq(seq, 1, Len(seq)-1)
While these actions describe a stack, there’s nothing stopping us from modifying seq
directly.1 To prevent that, we first define a separate StackADT
module:
---- MODULE StackADT ----
LOCAL INSTANCE Sequences
LOCAL INSTANCE Integers
CONSTANT StackType
VARIABLE stack
Init == stack \in Seq(StackType)
Push(x) ==
stack' = Append(stack, x)
Pop ==
stack' = SubSeq(stack, 1, Len(stack)-1)
Next ==
\/ Pop
\/ \E x \in StackType:
Push(x)
Spec == Init /\ [][Next]_stack
====
Spec
defines all of the valid behaviors of the module. At the beginning, Init
must be true, meaning the stack
variable is a sequence of StackType
values. For every step afterwards Next
must be true, meaning we either push a new Stacktype
value or pop from the existing stack.
While we can’t model check Spec
directly, we can use it as a property for another spec, much like we do with refinement.2 Here’s an example main
spec:
---- MODULE main ----
EXTENDS Integers, Sequences, TLC
VARIABLES seq
Workers == {"a", "b"}
Stack == INSTANCE StackADT WITH stack <- seq, StackType <- Workers
Range(f) == {f[x] : x \in DOMAIN f}
Init == seq = <<>>
Next ==
\/ /\ seq # <<>>
/\ Stack!Pop
\/ \E w \in Workers:
/\ w \notin Range(seq)
/\ Stack!Push(w)
Spec == Init /\ [][Next]_seq
Refinement == Stack!Spec
====
The StackADT module is parameterized under the “namespace” Stack
, meaning we pass in values for its variables and constants. StackType
is set to a fixed value, but stack
is set to a variable. This effectively replaces all instances of stack
in the module with seq
.
\* This can be used directly in the spec
Stack!Push(x) ==
- stack' = Append(stack, x)
+ seq' = Append(seq, x)
We can, of course, still modify seq
directly. However, we also have the Refinement
property. Expanding the namespace, we get
Refinement == Stack!Spec
\* Is equivalent to
Refinement == Stack!Init /\ [][Stack!Next]_seq
This is saying that as a property of main
, Stack!Init
is true in the first state and Stack!Next
is true for every next state. If we initialize seq
to something not in Stack!Init
or modify seq
in a way incompatible with Stack!Next
, TLC raises an error. As an experiment, try making one of these changes to Next
:
Next ==
+ \/ Stack!Push("fakeval")
\/ /\ seq # <<>>
Next ==
+ \/ seq' = <<"b">> \o seq
\/ /\ seq # <<>>
In the first change, we push a value not in the stack’s type. In the second, we prepend a value to the beginning of the stack, as opposed to pushing it on the end. Model checking either case will raise an error.
Adding mutations
Popping from the stack should return the popped value. In steps where Pop
happens, the popped value is just stack[Len(stack)]
. If we instead called Push
or don’t change the stack at all, there is no popped value, so we shouldn’t be able to retrieve it.
We can emulate that behavior with this operator:
Top == CASE Pop -> stack[Len(stack)]
Most of the time we use actions like mutations: When TLC sees Pop
in Next
, it generates a state where we popped from the queue. Top
instead uses an action as a boolean expression. For any given transition, either we popped from the queue and Pop
is true or we didn’t and Pop
is false. As with any other boolean expression, we can use value of Pop
to branch a conditional.
Since the CASE
statement only has one branch, nothing matches if Pop
is false. TLC will raise an error, guaranteeing that Top
can only be called when we’ve already popped from the stack. It also returns the last element of stack
, not the last element of stack'
, meaning it’s the popped value.
In practice, this means that if you call Top
by itself, you get an error, while if you call Pop
then Top
, it works as expected. I think this is an acceptable compromise. When you call Pop
, the popped element is “stored into” Top
for the rest of the state-step, then it disappears in the next step. If you try to access the stored element without calling Top
, you get an error. It’s not entirely ergonomic, but it still enforces the system properties.
(This doesn’t maintain the value of Top
between steps; it only exists in the same step we called Pop
. If you need the popped value for multiple nonatomic steps you will have to store it in a variable. But we can use Top
several times in different places of the same step and it will always refer to the same value. stack'
describes what stack
will be in the next step; as long as we’re in the current step, stack
will always be the same value, as will its last element.)
Dynamic Stacks
So far we’ve only looked at specifying a single stack. But you might have something like
CONSTANT Workers
VARIABLE queue
Type ==
queue \in [Workers -> Stacks]
With queue
we have a separate stack for each worker, where the spec can have any number of workers. The StackADT
module takes a single stack, not a set of them. We can still use it, we just need to be clever. TLA+ modules can be partially parameterized:
S(stack) == INSTANCE StackADT WITH StackType <- Workers
Explanation of Partial Parameterization
Partial parameterized modules are a bit complex, so I’m going to step out of the example and showcase it with a simpler one.
---- MODULE Point ---
CONSTANTS X, Y
Point == <<X, Y>>
Add(x, y) == <<X + x, Y + y>>
====
To fully parameterize Point
, I’d write
P == INSTANCE Point WITH X <- 1, Y <- 2
So P!Point = <<1, 2>>
. But I could also write
Q(X) == INSTANCE Point WITH Y <- 2
Q is a partially parameterized: it fixes Y
but not X
. We have to provide X
whenever we call any of the operators in Q
. So Q(3)!Point = <<3, 2>>
, Q(0)!Add(1, -1) = <<1, 1>>
, etc.
As with total parameterization, this also works with actions. If I have
---- MODULE Counter ----
VARIABLE x
Inc == x' = x + 1
====
---- MODULE main ----
VARIABLE y, z
C(x) == INSTANCE Counter
Then C(y)!Inc
with increment y
as an action and C(z)!Inc
will increment z
.
With this, we could pop from the first queue with S(queues[1])!Push(3)
. Then we can non-deterministically pop one queue with a quantifier:
Next ==
\E w \in Workers:
\/ S(queue[w])!Pop
\/ \* ...
Our property would be that every queue follows Stack!Spec
.
AllQueuesAreStacks ==
\A w \in Workers:
S(queue[w])!Spec
Unfortunately, neither of these work. If you try to check these TLC will throw an error. There are workarounds, but they’ll take a little explaining.
Fixing the Property
Let’s start with AllQueuesAreStacks
. While it’s a valid TLA+ property, TLC isn’t sophisticated enough to top-level partially-parameterized refinements. But it can check action properties! It’d look something like this:
AllQueuesAreStacks ==
\A w \in Workers:
LET q == queue[w] IN
[][S(q)!Next]_q
Remember, S(q)!Spec == S(q)!Init /\ [][S(q)!Next]_q
.3 This is closer to something TLC can check, but it still can’t handle actions inside quantifiers. We need to move the action outside the quantifier:
AllQueuesAreStacks ==
[][
\A w \in Workers:
LET q == queue[w] IN
q' # q => S(q)!Next
]_queue
Derivation
Let P = S(q)!Next
for brevity, so the “inner action property” is [][P]_q
.
[P]_q
is formally defined to meanP \/ UCHANGED q
.- By definition of
=>
,a => b
≡b \/ ~a
.P \/ UNCHANGED q
≡~(UNCHANGED q) => P
.
UNCHANGED q
≡q' = q
~(UNCHANGED q)
≡~(q' = q) ≡ q' # q
[][P]_q => [](q' # q => P)
Finally, []
commutes with \A
, meaning \A x: []P(x)
≡ [](\A x: P(x))
. So we can pull out the []
and place the entire quantifier inside.
That’s equivalent and TLC can check it! It’s all a bit messy, but it’s conceptually well-founded, at least. Now to fix the action.
Fixing the action
This is a lot harder, and it’s not just a limitation of the model checker. S(queue[w])!Next
is semantically invalid.
Imagine if Next
was just x' = x + 1
. Then S(queue[w])!Next
would be
queue[w]' = queue[w] + 1
This does not fully define queue
! If queue = {a: 1}
, S(queue["a"])!Next
is satisfied by queue' = {a: 2}
, but it’s also satisfied by queue' = {a: 2, b: ↓}
. Since this is almost certainly not what you want, TLC reports that as an error. Instead, we have the unergonomic alternative of4
queue' = [queue EXCEPT ![w] = queue[w] + 1]
Since functions have to be handled in this special way, and Stack
works on a single variable, passing in queue[w]
would leave the rest of queue
undefined.5
So you’ll have to write new operators to handle things, which basically means copypasting S!Pop
:
Next ==
\E w \in Workers:
\/ queue' = [queue EXCEPT ![w] = SubSeq(queue[w], 1, Len(queue[w])-1)]
It’s a bit of leaky abstraction, but that tends to be okay with specifications, because they’re still overall much smaller than programs. Importantly, you’re still protected by the refinement property! AllQueuesAreStacks
is a property on how queue
changes in value, not that queue
uses the appropriate actions. So even though we’re manipulating queue
in a different way than Stack
“expects”, it’s still in a way that’s compatible with Stack(w)!Spec
, and the property holds.
Discussion
I don’t know how effective these techniques are in practice. As far as I can tell, nobody else has used them before, certainly not for dynamic values. I think it’s interesting, though, and I hope that this eventually allows for more reusability. One of the big limitations on formal methods is how hard it is to automatically compose specifications. But we might be able to automate 80% of composition, and I think this technique can help us get there.
Thanks to Marianne Bellotti and Andrew Helwer for feedback.
- I’m being super teleological here. Formally the actions don’t modify the stack, they describe the ways the stack can change between next-states. But everybody understands what I mean when I say “modify” and I’m fine trading precision for clarity. [return]
- We can’t check
Spec
directly because 1) theInit
usesSeq(StackType)
, so there’s infinite starting states, and 2)Next
is unbounded, so there’s infinite reachable states. We can use it as a property because only needs to be checked on every transition of our main spec. So if our main spec is bounded, it doesn’t matter thatSpec
represents an unbounded system, because we only look at a bounded subset. [return] - I’m not including
S(q)!Init
in the property for simplicity. The complicated bit isS(q)!Next
anyway. [return] - You could write that more cleanly as
![w] = @ + 1
. I cover@
in the TlaConf talk. [return] - We don’t have this problem with
AllQueuesAreStacks
because we’re using it as a property, not an action. Oncequeue'
is specified, we can writequeue[w]' = queue[w] + 1
as an expression. [return]