May 15, 2012
CLOS FSM with MOP Sauce

Common Lisp in the desire to be as cool as possible includes in its specification the Common Lisp Object System, or CLOS, which itself can be introspected and altered in great detail using the MetaObject Protocol, or MOP, as described in The Art of the Metaobject Protocol. Unfortunately, MOP didn’t make it into the ANSI standard, but most implementations include MOP as it is described in the book, and a compatibility package :closer-mop (available in Quicklisp) makes using the symbols described seamless between implementations.

One of the features of MOP is the ability to make an instance of a CLOS class funcallable, that is allow a class instance to be a valid first argument to funcall. This behavior in a lot of ways can resemble the traditional method model from other languages, but that’s not how I intend use it here.

The Idea

I’m going to show and tell an implementation of a generic finite state machine that uses the MOP concepts of funcallable-standard-object and funcallable-standard-class to marshal the flow of incoming events to the machine and the concept of CLOS generic method dispatch to handle the execution of transition handlers for any given state of the machine.

These features will allow us to build a structure that lets us focus solely on the problem at hand and defer features like event data binding, state-dependent method selection and unexpected state handling entirely to the language without pushing the boundaries of any specific feature.

The Design

The design we’re going for is such that we can define a class with a state slot that will hold :keyword name of a state. We’ll make instances of this class funcallable so that when we make an instance we will be able to simply (funcall fsm-instance fsm-evemt) repeatedly and have the machine dispatch to the correct event, perform any logic, and transition to the next state based on the input.

We would be able to query the state of the machine with (state fsm-instance) and receive a keyword, and we should be able to drive events into the machine until we’re in a desired or unexpected state. Any attempt to feed the machine an event while the machine is in an invalid state should result in an error.

Setup

Dependencies in Common Lisp these days should come from Quicklisp. Install it now before even reading the next paragraph, and during the installation phase ask it to write itself to your lisp configuration’s init file.

The code samples below, as should any code using MOP intending to run more than once, uses the :closer-mop package. Fetch it from Quicklisp before attempting these examples by evaluating:

(ql:quickload :closer-mop)

At the REPL before attempting to evaluate any example code.

Implementation

We’ll first declare the base of our state machine. We’ll call the base standard-state-machine to follow the convention CL seems to have of sticking the standard- prefix to base classes.

(defclass standard-state-machine (c2mop:funcallable-standard-object)
  ((state :initform :initial :initarg :state
          :accessor state)
   (last-event :initform (get-internal-real-time)
               :accessor last-event))

  (:metaclass c2mop:funcallable-standard-class))

This grants us the majority of our desired features from the design. The resulting class holds a :keyword state name in a state slot, and is declared to be funcallable by the inclusion of the funcallable-standard-object base and the funcallable-standard-class metaclass for the class itself.

When events fire through this state machine we will need to invoke a method that we can use to dispatch to various handlers in the machine depending on the state we happen to be in. We can use the CLOS generic method feature called eql specializers to distinguish which method should be invoked based on the name of the state a machine happens to be in. EQL specializers are just like type specializers for method parameters, but instead of the comparison of the parameter to the specialization happening based on the type of the parameter, it happens as if an (eql parameter-value parameter-specialization) test is performed to determine if a given method is applicable to an invocation.

With this in mind we can design a generic method that is aware of three things as parameters that we can choose to specialize: The current state machine, the state of the current state machine, and the event that was received by the machine.

(defgeneric standard-state-machine-event (machine state event))

We won’t define any such methods now, because right now we have no state machine to model, and as such have no state handing methods to define.

To actually invoke a function when the funcallable instance is called as a function a lambda must be bound to be the funcallable instance function of a given instance of a class. Since there is no mandate for an inherent this pointer in CL, the best place to bind this function is often in one of the specializations of initialize-instance and close over the current instance in the process. The attachment of a function to an instance is done with the MOP function set-funcallable-instance-function, which places no special requirements on the args list of the attached function.

We’ll bind our driver to the state machine instance in the :before specializer of its initialize-instance method, and use an args list that expects only the event that is being delivered to the state machine:

(defmethod initialize-instance :before ((machine standard-state-machine) &key)
  (c2mop:set-funcallable-instance-function
   machine
   #'(lambda (event)
       (multiple-value-bind (next-state recur-p)
           (standard-state-machine-event machine (state machine) event)

         (setf (last-event machine) (get-internal-real-time)
               (state machine) (or next-state (state machine)))

         (if recur-p
             (funcall machine event)
             (values machine (state machine)))))))

Now when we have instance of the state machine invoked with (funcall fsm-instance event), the inner lambda expression will get passed the event and compute a return. This function despite having some length does very little.

First, it dispatches the event to our previous standard-state-machine-event method using the instance of the state machine we closed over as the first parameter, the state of the state machine as the second parameter and the event that the machine was called with.

It expects a multi-value return, but if the method only returns one value, no error is raised and the second value, recur-p is left bound to nil. Its meaning will become abundant in a moment.

Once the state handler is called the new state returned by the handler is stored as the current state of the state machine, and if a second value was returned, the machine is invoked again with the same event it had just received before returning from the handler the value of the current machine instance for chaining and the value of the current state.

The second, recur-p value is a hook that can be used from a state handler to force the machine to retry handling the event in a new state. This is useful when the same event should loop through the machine a second time before the transition is complete, and the optional second value return from a state handler allows that without additional external plumbing.

Functionally, the implementation of the state machine is complete. It can now cycle through a graph of events, provided that those events are defined as specialized methods. But, in its current state, the API leaves very much to be desired.

The Polish

As the state machine stands now, to implement one that for all inputs remains in the default :initial state, one would have to be defined explicitly by specializing the standard-state-machine event.

(defmethod standard-state-machine-event
            ((fsm standard-state-machine)
             (state (eql :initial))
             event)
  nil)

The trailing nil makes the return value explicit, due to the length of the specializations. This makes the structure of the dispatch clear, and the nil return value could be traced through the driver above to determine that this machine would do absolutely nothing given any kind of input. Remaining eternally in the :initial state.

To make the task of defining states for a given type of state machine we can write a simple macro to write the above form for us. After all, the majority of it is filler.

(defmacro defstate (machine-type state-name (machine-sym event-sym) &body body)
  `(defmethod standard-state-machine-event
       ((,machine-sym ,machine-type) (state (eql ,state-name)) ,event-sym)
     ,@body))

Using the macro above we can now declare an identical looping :initial state for the standard-state-machine as follows:

(defstate standard-state-machine :initial (fsm event)
  nil)

Which results in identical code to the one we wrote previously. The symbols given in the argument list of the defstate forms are bound to the machine evaluating the event and the event that is being sent through the machine. The remaining forms will be evaluated just as in any other defun/defmethod with the return value treated as either next state for the machine to enter, nil to stay in the same state, or a multivalue return with one of the previous and a non-nil second value to indicate the event should be fired through the machine a second time before the final state is stored.

To create subclasses of the standard state machine, we again run into a similar situation as before with boilerplate requirements. Each subclass of standard-state-machine must not only include the standard-state-machine class in the list of parents, it must also include funcallable-standard-class as its metaclass, as in:

(defclass my-fsm (standard-state-machine)
  ()
  (:metaclass c2mop:standard-funcallable-class))

This peculiarity could be documented a thousand times in a thousand places, but that will make it no less ugly or difficult to remember. It would be much easier to provide a familiar tasting API to consumers that results in the same code. For example, using a structure like this to generate code equivalent to the above

(deffsm my-fsm ()
  ())

This leaves the consumer free to do whatever their heart desires with CLOS without disturbing the requirements of our function. Such a construct can be stated simply as another tiny macro.

(defmacro deffsm (name parents slots &rest options)
  `(defclass ,name ,(append (list 'standard-state-machine) parents)
     ,slots
     (:metaclass c2mop:funcallable-standard-class)
     ,@options))

The Result

If we were to put all of that code together in one place with a bunch of documentation strings for the methods, classes and slots in a slightly more condensed form than this article, it might look something like this. Which is actually a minimal extraction of the state machine that drives the HTTP parser in Hinge. If this article gets enough interest, I will extract standard-state-machine into a QuickLisp compatible package for even easier reuse.

The Demo

Let’s construct an FSM for determining if a sequence of characters contains the string “Hi”, but before the character “!” appears. Once a string contains “Hi” it can contain “!” characters again.

We’ll model this as a machine of three states, with events being single characters of input. The states will be :initial, :want-i and :done.

In the :initial state, we’ll accept any input without transitioning except for “!” which will cause the machine to transition into an error state so that no more input can pass, and “H” will cause the machine to transition to the :want-i state.

Any event other than “i” in the :want-i state will transition back to :initial and the event “i” will cause the machine to enter the state :done, which will be a no-op looping state that allows input to pass through uninspected. The implementation for such a machine would look like this, assuming we have previously defined the :fsm package.

(fsm:deffsm hi-fsm ()
  ())

(fsm:defstate hi-fsm :initial (fsm c)
  (case c
    (#\! :error)
    (#\H :want-i)))

(fsm:defstate hi-fsm :want-i (fsm c)
  (if (char-equal c #\i)
      :done
      (values :initial t)))

(fsm:defstate hi-fsm :done (fsm c)

We can observe it in action by running it through a couple of strings and measuring the terminal state of the machine:

(let ((fsm (make-instance 'hi-fsm))
       (input "Oh? Hello there. Hi. How are you!?"))
  (map 'list 
       (lambda (c) 
         (if (eql :error (fsm:state fsm))
             (format t "Skipping: ~S~%" c)
             (funcall fsm c))) 
       input)
  (fsm:state fsm))

Resulting in :DONE as the final state of hi-fsm.

This will iterate the string character by character through the machine we just defined, and unless the machine is in an error state, it will submit each token, then return the state of the machine when it has completed its run.

If we change the string to one where neither “Hi” nor “!” appear at all, the machine will remain in :initial and if “!” appears before “Hi”, then the machine will leave the loop in an :error state, and any tokens after the “!” won’t even be sent into the machine. Changing the binding of input in the above to “Go away! Now.” results in the value of :ERROR and the printing of the lines

Skipping: #\ 
Skipping: #\N
Skipping: #\o
Skipping: #\w
Skipping: #\.

So there you have it. You now know how to leverage the powers of Common Lisp and MOP to build a pretty cool and useful abstraction for a pretty common algorithm.

  1. null-miracle-blog reblogged this from sshrkv
  2. sshrkv posted this