Sunday 16 November 2008

Transactional boosting

This post is about software transactional memory (STM) and how to retain its benefits (composability) while trying to minimize its drawbacks (poor performance due to false conflicts).

Software transactional memory: Composability and false conflicts



Software transactional memory (STM) is a concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing (Wikipedia). Below is a simple example of a concurrent, unordered singly-linked list using STM as supported by GHC Haskell.


data List a = Node {val :: a, next :: TVar (List a)}
            | Nil

insert :: TVar (List a) -> a -> STM ()
find   :: Eq a => TVar (List a) -> a -> STM Bool
-- True = present, False = absent
delete :: Eq a => TVar (List a) -> a -> STM Bool
-- True = present and deleted, False = absent

-- atomically delete 3 and 4 if both are present
transaction :: TVar (List Int) -> IO Bool
transaction root = atomically $ do
  b1 <- find root 3
  b2 <- find root 4
  if b1 && b2
    then do
      delete root 3
      delete root 4
    else return False


The straightforward definitions of the linked list operations are left out. They all run within the STM monad, which guarantees that each operation is executed atomically, i.e. to its full extent or not at all. In the programming language setting, 'not at all' means that a conflicting operation is rolled back and re-run (retried).

In Haskell, we can easily build 'bigger' STM operations by gluing together 'smaller' STM operations. For example, the transaction checks if elements 3 and 4 are both present in the list and, if so, deletes both elements. The entire transaction is executed atomically, which guarantees that in case we return True, elements 3 and 4 were present but have now been deleted. Neat!
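
For readers who want to try this out, here is one way the omitted definitions could look. This is only a sketch of a naive STM list, not necessarily the code used in the paper: it inserts at the head (the list is unordered), deletes the first matching node, and adds a hypothetical helper fromList for building test lists.

```haskell
import Control.Concurrent.STM

-- Unordered singly-linked list; every link is a TVar.
data List a = Node { val :: a, next :: TVar (List a) }
            | Nil

-- Insert at the head (the list is unordered).
insert :: TVar (List a) -> a -> STM ()
insert root x = do
  old  <- readTVar root
  rest <- newTVar old
  writeTVar root (Node x rest)

-- True = present, False = absent
find :: Eq a => TVar (List a) -> a -> STM Bool
find ptr x = do
  cell <- readTVar ptr
  case cell of
    Nil -> return False
    Node v nxt
      | v == x    -> return True
      | otherwise -> find nxt x

-- True = present and deleted (first occurrence), False = absent
delete :: Eq a => TVar (List a) -> a -> STM Bool
delete ptr x = do
  cell <- readTVar ptr
  case cell of
    Nil -> return False
    Node v nxt
      | v == x    -> do
          rest <- readTVar nxt
          writeTVar ptr rest      -- unlink the matching node
          return True
      | otherwise -> delete nxt x

-- Atomically delete 3 and 4 if both are present.
transaction :: TVar (List Int) -> IO Bool
transaction root = atomically $ do
  b1 <- find root 3
  b2 <- find root 4
  if b1 && b2
    then do
      _ <- delete root 3
      delete root 4
    else return False

-- Hypothetical helper (not part of the original interface).
fromList :: [a] -> IO (TVar (List a))
fromList xs = do
  root <- newTVarIO Nil
  atomically (mapM_ (insert root) xs)
  return root
```

Note that find and delete read every TVar on the path to the element, so the whole prefix of the list ends up in the transaction's read set.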

The benefits of STM over, say, locks are well documented. However, STM often has a significant performance disadvantage. Our (naive) STM implementation protects the entire list. Suppose that elements 3 and 4 are stored towards the tail of the list. We then need to traverse (almost) the entire list, and every node we read along the way becomes part of the transaction, which potentially leads to false conflicts with other concurrent STM operations. The effect is that many concurrent STM operations are serialized (i.e. executed sequentially instead of in parallel).

Comparing concurrent linked-list implementations in Haskell



I recently wrote a paper with the above title to be presented at DAMP'09, in collaboration with Simon Marlow and Edmund Lam. Our results show that the naive STM implementation is no match for more fine-tuned implementations which


  • either side-step the STM protocol (only each individual node operation is performed within STM),
  • employ lock-coupling via Haskell's MVars, or
  • rely on atomic compare-and-swap.

Please check the paper for the details.
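
To give a flavour of the second variant, here is a minimal sketch of lock-coupling (hand-over-hand locking) with MVars. It is written for this post under simplifying assumptions and is not the code from the paper: each link is an MVar that doubles as the lock for that link, and exception safety is ignored.

```haskell
import Control.Concurrent.MVar

-- Each link is an MVar, which doubles as the lock for that link.
data List a = Node a (MVar (List a))
            | Nil

-- Insert at the head: only the root link is locked.
insert :: MVar (List a) -> a -> IO ()
insert root x = do
  old  <- takeMVar root
  rest <- newMVar old
  putMVar root (Node x rest)

-- Hand-over-hand traversal: take the next link before
-- releasing the one we currently hold.
find :: Eq a => MVar (List a) -> a -> IO Bool
find root x = takeMVar root >>= go root
  where
    go held Nil = putMVar held Nil >> return False
    go held cell@(Node v nxt)
      | v == x    = putMVar held cell >> return True
      | otherwise = do
          nextCell <- takeMVar nxt   -- lock the next link first
          putMVar held cell          -- then release the one behind us
          go nxt nextCell

-- Delete the first occurrence; True = present and deleted.
delete :: Eq a => MVar (List a) -> a -> IO Bool
delete root x = takeMVar root >>= go root
  where
    go held Nil = putMVar held Nil >> return False
    go held cell@(Node v nxt)
      | v == x    = do
          rest <- takeMVar nxt       -- lock the link of the node to remove
          putMVar held rest          -- unlink it; its own MVar is now garbage
          return True
      | otherwise = do
          nextCell <- takeMVar nxt
          putMVar held cell
          go nxt nextCell
```

At most two links are locked at any time, so operations working on different parts of the list can proceed in parallel. But every operation has released all its locks by the time it returns, so there is no obvious way to glue a find and a delete into one atomic step.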

The (well-known) downside of these more efficient singly-linked list operations (efficient in the sense that they enjoy a higher degree of concurrency) is that they are not composable anymore. That is, it is non-trivial to implement the above transaction (atomically find and delete 3 and 4) using the more efficient implementations.

Transactional boosting



Maurice Herlihy and Eric Koskinen wrote two recent papers on a topic they refer to as transactional boosting: How to recover the 'high-level' benefits of STM while relying on 'low-level', highly concurrent data-structures.
The papers are (see Eric's website):

  • Transactional Boosting: A Methodology for Highly-Concurrent Transactional Objects
    M. Herlihy, E. Koskinen. PPoPP 2008
  • Aggressive Transactional Boosting
    E. Koskinen, M. Herlihy.
    Under submission. August 2008.


My understanding of how the transactional boosting approach is meant to work is that


  • the user needs to verify if primitive operations 'commute', otherwise
  • the operands of (non-commuting) operations will be protected by an abstract lock,
  • each operation requires an inverse which will be executed if the transaction fails to commit.
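
To make these points a bit more concrete, here is a rough sketch of how I picture them. It reflects my reading of the bullets above, not the algorithm from the papers. All names (Base, LockTable, Txn, boostedInsert, ...) are made up for this post, an IORef holding a Data.Set merely stands in for a highly concurrent set, and deadlock between transactions is not handled at all. The assumption is that operations on different keys commute, so there is one abstract lock per key, and every successful operation logs its inverse.

```haskell
import Control.Concurrent.MVar
import Control.Monad (void, when)
import Data.IORef
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Stand-in for a highly concurrent set of Ints (assumption).
type Base      = IORef (Set.Set Int)
-- One abstract lock per key.
type LockTable = MVar (Map.Map Int (MVar ()))

-- Per-transaction state: abstract locks held, inverses to run on abort.
data Txn = Txn
  { heldLocks :: IORef (Map.Map Int (MVar ()))
  , undoLog   :: IORef [IO ()]
  }

newTxn :: IO Txn
newTxn = Txn <$> newIORef Map.empty <*> newIORef []

-- The primitive ('low-level') operations; True = the set changed.
rawInsert, rawDelete :: Base -> Int -> IO Bool
rawInsert base k = atomicModifyIORef' base $ \s ->
  if Set.member k s then (s, False) else (Set.insert k s, True)
rawDelete base k = atomicModifyIORef' base $ \s ->
  if Set.member k s then (Set.delete k s, True) else (s, False)

-- Take the abstract lock for key k unless this transaction holds it already.
acquire :: LockTable -> Txn -> Int -> IO ()
acquire table txn k = do
  mine <- readIORef (heldLocks txn)
  when (Map.notMember k mine) $ do
    lock <- modifyMVar table $ \m ->
      case Map.lookup k m of
        Just l  -> return (m, l)
        Nothing -> do
          l <- newMVar ()
          return (Map.insert k l m, l)
    takeMVar lock
    modifyIORef' (heldLocks txn) (Map.insert k lock)

logInverse :: Txn -> IO Bool -> IO ()
logInverse txn inv = modifyIORef' (undoLog txn) (void inv :)

-- Boosted operations: lock the key, run the primitive, log the inverse.
boostedInsert, boostedDelete :: Base -> LockTable -> Txn -> Int -> IO Bool
boostedInsert base table txn k = do
  acquire table txn k
  changed <- rawInsert base k
  when changed $ logInverse txn (rawDelete base k)
  return changed
boostedDelete base table txn k = do
  acquire table txn k
  changed <- rawDelete base k
  when changed $ logInverse txn (rawInsert base k)
  return changed

releaseAll :: Txn -> IO ()
releaseAll txn = do
  locks <- readIORef (heldLocks txn)
  mapM_ (\l -> putMVar l ()) (Map.elems locks)
  writeIORef (heldLocks txn) Map.empty

-- Commit: the effects are already in place, just drop the log and the locks.
commit :: Txn -> IO ()
commit txn = writeIORef (undoLog txn) [] >> releaseAll txn

-- Abort: run the inverses, newest first, then release the locks.
abort :: Txn -> IO ()
abort txn = do
  inverses <- readIORef (undoLog txn)
  sequence_ inverses
  writeIORef (undoLog txn) []
  releaseAll txn
```

The locks are taken eagerly and held until commit or abort, which is what makes me think of this as pessimistic.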


It seems that the transactional boosting approach assumes pessimistic concurrency. I'm still not clear about the exact details of the approach but it definitely looks very promising.

Incidentally, I ran across the issue of transactional boosting in my own work on concurrent execution of multi-set rewrite rules.

Multi-set rewriting and transactional boosting



Below are simple examples of (monomorphic for simplicity) multi-set rewrite rules.
Right-hand sides are omitted because they don't matter here.

A, B <==> ...
A, C <==> ...


Multi-set rewrite rules operate on a store, a multi-set of elements. We can concurrently apply multi-set rewrite rules if they operate on non-conflicting parts of the store. Suppose we have the store

{A, A, B, C}

Then, the first rule can be applied on (the first) A and B and the second rule can be
applied on (the second) A and C. There's no conflict. Hence, both rules can be applied concurrently.

The point is that the store is implemented as a concurrent data-structure and execution of a rewrite rule can be viewed as an STM transaction. For example, the operational interpretation of the first rule is

atomically $ do
  find A
  find B
  if both present
    then do delete A
            delete B
    else return ()
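
Here is a small runnable version of this operational reading. It is a simplified sketch and not the implementation from our papers: the store is a single TVar holding element counts, and the names Elem, Store, removeHead and applyRule are chosen just for this example. Since the whole store lives in one TVar, this shows what a rule application means, not how to get parallelism.

```haskell
import Control.Concurrent.STM
import qualified Data.Map.Strict as Map

data Elem = A | B | C deriving (Eq, Ord, Show)

-- A multi-set, represented as element counts (only positive counts stored).
type Store = Map.Map Elem Int

fromElems :: [Elem] -> Store
fromElems es = Map.fromListWith (+) [(e, 1) | e <- es]

-- Remove the elements of a rule head from the store,
-- or return Nothing if some of them are missing.
removeHead :: [Elem] -> Store -> Maybe Store
removeHead hd store
  | Map.isSubmapOfBy (<=) need store = Just (Map.differenceWith minus store need)
  | otherwise                        = Nothing
  where
    need = fromElems hd
    minus have n = if have > n then Just (have - n) else Nothing

-- Apply a rule (given by its head) as one atomic transaction.
-- True = the rule fired, False = the head did not match.
applyRule :: TVar Store -> [Elem] -> IO Bool
applyRule var hd = atomically $ do
  store <- readTVar var
  case removeHead hd store of
    Just store' -> writeTVar var store' >> return True
    Nothing     -> return False
```

Starting from the store {A, A, B, C}, applying the head [A, B] and then [A, C] succeeds both times and leaves the store empty, so a further attempt to apply [A, B] fails.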


The first implementation of concurrent multi-set rewrite rules, in collaboration with Edmund Lam (my PhD student btw), presented here

Edmund S. L. Lam, Martin Sulzmann:
A concurrent constraint handling rules implementation
in Haskell with software transactional memory. DAMP 2007:

used a (naive) STM list to implement the store. Our experiments confirm that such an approach won't scale. Due to a high(ly likely) number of false conflicts, multi-set rewrite rules are executed sequentially, though we ought to execute them in parallel.

Our latest implementation, presented here

Martin Sulzmann, Edmund S. L. Lam:
Parallel execution of multi-set constraint rewrite
rules. PPDP 2008

uses a more low-level, 'more concurrent' data-structure which is *not* composable. Hence, we had to implement our own STM protocol on top of the low-level data-structure (the PPDP paper is a bit unclear about this, but we'll present what we've achieved in its full glory in a forthcoming journal submission).

Summary



STM is great but it often has poor performance. There are many applications where we would like to have a 'transactional boosting' approach. As far as I can tell, the approach laid out by Maurice Herlihy and Eric Koskinen is fairly different from the approach we used for concurrent execution of multi-set rewrite rules. Possibly, there's some unifying theory but I have the feeling that transactional boosting relies on some domain-specific assumptions. Anyway, interesting times with many interesting topics. I'm sure there's more to say about transactional boosting.
