Using Abstract Data Types in TLA+

In my 2021 TLAConf Talk I introduced a technique for encoding abstract data types (ADTs). For accessibility I’m reproducing it here. This post is aimed at intermediate-level TLA+ users who already understand action properties and the general concept of refinement.

Say we want to add a LIFO stack to a specification. You can push to and pop from the end of this stack but you cannot change data in the middle. We could represent this internally with a sequence, along with Push and Pop actions:

VARIABLES seq

Push(x) ==
  stack' = Append(seq, x)

\* For simplicity we can pop an empty stack
\* Which will be a noop
Pop == 
  stack' = SubSeq(seq, 1, Len(seq)-1)

While these actions describe a stack, there’s nothing stopping us from modifying seq directly.¹ To prevent that, we first define a separate StackADT module:

---- MODULE StackADT ----

LOCAL INSTANCE Sequences
LOCAL INSTANCE Integers
CONSTANT StackType
VARIABLE stack

Init == stack \in Seq(StackType)

Push(x) == 
  stack' = Append(stack, x)

Pop == 
  stack' = SubSeq(stack, 1, Len(stack)-1)
 
Next == 
  \/ Pop
  \/ \E x \in StackType:
     Push(x)

Spec == Init /\ [][Next]_stack  

====

Spec defines all of the valid behaviors of the module. At the beginning, Init must be true, meaning the stack variable is a sequence of StackType values. For every step afterwards Next must be true, meaning we either push a new Stacktype value or pop from the existing stack.

While we can’t model check Spec directly, we can use it as a property for another spec, much like we do with refinement.² Here’s an example main spec:

---- MODULE main ----

EXTENDS Integers, Sequences, TLC
VARIABLES seq

Workers == {"a", "b"}
Stack == INSTANCE StackADT WITH stack <- seq, StackType <- Workers
Range(f) == {f[x] : x \in DOMAIN f}

Init == seq = <<>>

Next ==
  \/ /\ seq # <<>>
     /\ Stack!Pop
  \/ \E w \in Workers:
     /\ w \notin Range(seq)
     /\ Stack!Push(w)
  
Spec == Init /\ [][Next]_seq

Refinement == Stack!Spec
====

The StackADT module is parameterized under the “namespace” Stack, meaning we pass in values for its variables and constants. StackType is set to a fixed value, but stack is set to a variable. This effectively replaces all instances of stack in the module with seq.

\* This can be used directly in the spec
Stack!Push(x) == 
- stack' = Append(stack, x)
+ seq' = Append(seq, x)

We can, of course, still modify seq directly. However, we also have the Refinement property. Expanding the namespace, we get

Refinement == Stack!Spec
\* Is equivalent to
Refinement == Stack!Init /\ [][Stack!Next]_seq

This is saying that as a property of main, Stack!Init is true in the first state and Stack!Next is true for every next state. If we initialize seq to something not in Stack!Init or modify seq in a way incompatible with Stack!Next, TLC raises an error. As an experiment, try making one of these changes to Next:

Next ==
+ \/ Stack!Push("fakeval")
  \/ /\ seq # <<>>

Next ==
+ \/ seq' = <<"b">> \o seq  
  \/ /\ seq # <<>>

In the first change, we push a value not in the stack’s type. In the second, we prepend a value to the beginning of the stack, as opposed to pushing it on the end. Model checking either case will raise an error.

Adding mutations

Popping from the stack should return the popped value. In steps where Pop happens, the popped value is just stack[Len(stack)]. If we instead called Push or don’t change the stack at all, there is no popped value, so we shouldn’t be able to retrieve it.

We can emulate that behavior with this operator:

Top == CASE Pop -> stack[Len(stack)]

Most of the time we use actions like mutations: When TLC sees Pop in Next, it generates a state where we popped from the queue. Top instead uses an action as a boolean expression. For any given transition, either we popped from the queue and Pop is true or we didn’t and Pop is false. As with any other boolean expression, we can use value of Pop to branch a conditional.

Since the CASE statement only has one branch, nothing matches if Pop is false. TLC will raise an error, guaranteeing that Top can only be called when we’ve already popped from the stack. It also returns the last element of stack, not the last element of stack', meaning it’s the popped value.

In practice, this means that if you call Top by itself, you get an error, while if you call Pop then Top, it works as expected. I think this is an acceptable compromise. When you call Pop, the popped element is “stored into” Top for the rest of the state-step, then it disappears in the next step. If you try to access the stored element without calling Top, you get an error. It’s not entirely ergonomic, but it still enforces the system properties.

(This doesn’t maintain the value of Top between steps; it only exists in the same step we called Pop. If you need the popped value for multiple nonatomic steps you will have to store it in a variable. But we can use Top several times in different places of the same step and it will always refer to the same value. stack' describes what stack will be in the next step; as long as we’re in the current step, stack will always be the same value, as will its last element.)

Dynamic Stacks

So far we’ve only looked at specifying a single stack. But you might have something like

  CONSTANT Workers

  VARIABLE queue 

  Type ==
    queue \in [Workers -> Stacks]

With queue we have a separate stack for each worker, where the spec can have any number of workers. The StackADT module takes a single stack, not a set of them. We can still use it, we just need to be clever. TLA+ modules can be partially parameterized:

S(stack) == INSTANCE StackADT WITH StackType <- Workers

Explanation of Partial Parameterization

Partial parameterized modules are a bit complex, so I’m going to step out of the example and showcase it with a simpler one.

---- MODULE Point --- 
CONSTANTS X, Y

Point == <<X, Y>>
Add(x, y) == <<X + x, Y + y>>
====

To fully parameterize Point, I’d write

P == INSTANCE Point WITH X <- 1, Y <- 2

So P!Point = <<1, 2>>. But I could also write

Q(X) == INSTANCE Point WITH Y <- 2

Q is a partially parameterized: it fixes Y but not X. We have to provide X whenever we call any of the operators in Q. So Q(3)!Point = <<3, 2>>, Q(0)!Add(1, -1) = <<1, 1>>, etc.

As with total parameterization, this also works with actions. If I have

---- MODULE Counter ----
VARIABLE x

Inc == x' = x + 1
====

---- MODULE main ----
VARIABLE y, z

C(x) == INSTANCE Counter

Then C(y)!Inc with increment y as an action and C(z)!Inc will increment z.

With this, we could pop from the first queue with S(queues[1])!Push(3). Then we can non-deterministically pop one queue with a quantifier:

Next ==
  \E w \in Workers:
    \/ S(queue[w])!Pop
    \/ \* ...

Our property would be that every queue follows Stack!Spec.

AllQueuesAreStacks ==
  \A w \in Workers:
    S(queue[w])!Spec

Unfortunately, neither of these work. If you try to check these TLC will throw an error. There are workarounds, but they’ll take a little explaining.

Fixing the Property

Let’s start with AllQueuesAreStacks. While it’s a valid TLA+ property, TLC isn’t sophisticated enough to top-level partially-parameterized refinements. But it can check action properties! It’d look something like this:

AllQueuesAreStacks ==
    \A w \in Workers:
      LET q == queue[w] IN
        [][S(q)!Next]_q

Remember, S(q)!Spec == S(q)!Init /\ [][S(q)!Next]_q.³ This is closer to something TLC can check, but it still can’t handle actions inside quantifiers. We need to move the action outside the quantifier:

AllQueuesAreStacks ==
  [][
    \A w \in Workers:
      LET q == queue[w] IN
        q' # q => S(q)!Next
  ]_queue

Derivation

Let P = S(q)!Next for brevity, so the “inner action property” is [][P]_q.

[P]_q is formally defined to mean P \/ UCHANGED q.
By definition of =>, a => b ≡ b \/ ~a.
- P \/ UNCHANGED q ≡ ~(UNCHANGED q) => P.
UNCHANGED q ≡ q' = q
- ~(UNCHANGED q) ≡ ~(q' = q) ≡ q' # q
[][P]_q => [](q' # q => P)

Finally, [] commutes with \A, meaning \A x: []P(x) ≡ [](\A x: P(x)). So we can pull out the [] and place the entire quantifier inside.

That’s equivalent and TLC can check it! It’s all a bit messy, but it’s conceptually well-founded, at least. Now to fix the action.

Fixing the action

This is a lot harder, and it’s not just a limitation of the model checker. S(queue[w])!Next is semantically invalid.

Imagine if Next was just x' = x + 1. Then S(queue[w])!Next would be

queue[w]' = queue[w] + 1

This does not fully define queue! If queue = {a: 1}, S(queue["a"])!Next is satisfied by queue' = {a: 2}, but it’s also satisfied by queue' = {a: 2, b: ↓}. Since this is almost certainly not what you want, TLC reports that as an error. Instead, we have the unergonomic alternative of⁴

queue' = [queue EXCEPT ![w] = queue[w] + 1]

Since functions have to be handled in this special way, and Stack works on a single variable, passing in queue[w] would leave the rest of queue undefined.⁵

So you’ll have to write new operators to handle things, which basically means copypasting S!Pop:

Next ==
  \E w \in Workers:
    \/ queue' = [queue EXCEPT ![w] = SubSeq(queue[w], 1, Len(queue[w])-1)]

It’s a bit of leaky abstraction, but that tends to be okay with specifications, because they’re still overall much smaller than programs. Importantly, you’re still protected by the refinement property! AllQueuesAreStacks is a property on how queue changes in value, not that queue uses the appropriate actions. So even though we’re manipulating queue in a different way than Stack “expects”, it’s still in a way that’s compatible with Stack(w)!Spec, and the property holds.

Discussion

I don’t know how effective these techniques are in practice. As far as I can tell, nobody else has used them before, certainly not for dynamic values. I think it’s interesting, though, and I hope that this eventually allows for more reusability. One of the big limitations on formal methods is how hard it is to automatically compose specifications. But we might be able to automate 80% of composition, and I think this technique can help us get there.

Thanks to Marianne Bellotti and Andrew Helwer for feedback.

I’m being super teleological here. Formally the actions don’t modify the stack, they describe the ways the stack can change between next-states. But everybody understands what I mean when I say “modify” and I’m fine trading precision for clarity. ^[return]
We can’t check Spec directly because 1) the Init uses Seq(StackType), so there’s infinite starting states, and 2) Next is unbounded, so there’s infinite reachable states. We can use it as a property because only needs to be checked on every transition of our main spec. So if our main spec is bounded, it doesn’t matter that Spec represents an unbounded system, because we only look at a bounded subset. ^[return]
I’m not including S(q)!Init in the property for simplicity. The complicated bit is S(q)!Next anyway. ^[return]
You could write that more cleanly as ![w] = @ + 1. I cover @ in the TlaConf talk. ^[return]
We don’t have this problem with AllQueuesAreStacks because we’re using it as a property, not an action. Once queue' is specified, we can write queue[w]' = queue[w] + 1 as an expression. ^[return]