Thursday, 11 December 2008

Equality, containment and intersection among regular expressions via symbolic manipulation

Three standard regular expression operations

data RE a where
Phi :: RE a -- empty language
Empty :: RE a -- empty word
L :: a -> RE a -- single letter taken from alphabet a
Choice :: RE a -> RE a -> RE a -- r1 + r2
Seq :: RE a -> RE a -> RE a -- (r1,r2)
Star :: RE a -> RE a -- r*
deriving Eq

equality :: Eq a => RE a -> RE a -> Bool
intersect :: Eq a => RE a -> RE a -> RE a
contains :: Eq a => RE a -> RE a -> Bool

implemented via symbolic manipulation of regular expressions (no explicit finite automata construction) by adapting Antimirov's regular expression rewriting algorithm via partial derivatives.

The implementation is available via the Haskell platform hackage. Check out the regexpr-symbolic package.

The source code describes some of the technical details. You can also check out an earlier post which considers the specific case of intersection.

Saturday, 6 December 2008

Parallel Join Patterns with Guards and Propagation

A paper in the works for quite some time. You can download the latest version here.

Here's what the abstract says:
Join patterns are a powerful concurrency abstraction to coordinate multiple events. We propose an extension with guards and propagation and argue that both features are essential in many programming situations. We develop a parallel execution scheme which we have fully implemented as a library in Haskell. Our results provide new insights how to write parallel programs on multi-core architectures.

Interested? You can get the implementation via the Haskell platform hackage. Download the join package which in turn relies on the multisetrewrite package. The multisetrewrite package provides the basic functionality to execute join patterns (with guards and propagation) in parallel. The join package adds synchronous and asynchronous arguments and join method calls. The user interface is still a bit clumsy to use but this will hopefully be improved over time.

Friday, 5 December 2008

Concurrent Goal-Based Execution of Constraint Handling Rules

We (my PhD student Edmund and me) have just submitted a paper with this title to Theory and Practice of Logic Programming. The paper complements our PPDP'08 paper on parallel Constraint Handling Rules execution. You can get the paper here.

Here's what the abstract says:
We introduce a systematic, concurrent execution scheme for Constraint Handling Rules (CHR) based on a previously proposed goal-based CHR semantics. We establish strong correspondence results to the abstract CHR semantics, thus guaranteeing that any answer in the concurrent, goal-based CHR semantics is reproducible in the abstract CHR semantics. Our work provides the foundation to obtain efficient, parallel CHR execution schemes.

Thursday, 4 December 2008

STM with control: Communication for retrying transactions

While working on the join pattern with guards and propagation implementation, we (my PhD student Edmund and me) came across an interesting problem which points out the current limited control/communication capabilities of retrying STM transactions (in Haskell). More on join patterns later. Here's the gist of the problem and a solution in form of a new Haskell library.

The task



We're given a list of switches, they can either be on or off


> data Switch = On | Off


If all switches are on, we switch them all off and report True If one switch is off, we do nothing and report False.

In a sequential setting, this task is straightforward. But what if the switches are shared among concurrent threads performing the above task?

In a concurrent setting, the task must be formulated as follows.

Atomically (
If all switches are on, we switch them all off and report True
If one switch is off, we do nothing and report False. )

Sounds like an application of STM. Indeed, here's a solution.


> verifyAndSwitchOff :: [TVar Switch] -> IO Bool
> verifyAndSwitchOff ts =
> atomically $ do
> bs <- mapM (\s -> do v <- readTVar s
> case v of
> On -> return True
> Off -> return False)
> ts
>
> if (and bs)
> then do mapM (\s -> do writeTVar s Off) ts
> return True
> else return False



We first check that all switches are on. Only then we switch all of them off. Isolation and atomicity of STM are important here.

Isolation guarantees that no intermediate step is visible to any other thread. After checking that all switches are on, we have the guarantee that they are still on, once we switch them off. Atomicity guarantees that all operations are executed all at once or not all which implies that some of the initially read switch values has changed. In this case, we abort the transaction and discard all changes so far and start over again.


It's tempting to immediately turn a switch off once we've read that the switch is on. But what if we read that a switch is still on? We need some additionally functionality to explicitly abort the transactions. That's what the retry and orElse combinators are for.


> verifyAndSwitchOff2 :: [TVar Switch] -> IO Bool
> verifyAndSwitchOff2 ts =
> let loop [] = return True
> loop (s:xs) = do v <- readTVar s
> case v of
> On -> do writeTVar s Off
> loop xs
> Off -> return False
> in atomically $
> (do b <- loop ts
> if b then return True
> else retry)
>
> `orElse`
>
> return False


The loop reads the status of each switch and turns the switch off if the switch is on. Otherwise, we 'abort' by returning False. For example, in case of the initial list [On, On, Off, On] we reach the third switch and the intermediate result [Off, Off, Off, On]. We can't just naively exit at this stage because the tasks demands that all changes are done atomically, ie to its full extent or not at all.

Here comes the rescue. The retry combinator allows the programmer to explicitly abort a transaction. That is, we discard all changes (which is important) here. The orElse combinator allows the programmer to compose two transaction. The second transaction will be executed if the first transaction retries. Exactly what we require.

Making the task more interesting



Suppose each switch is attached to an action, sort of a continuation. Let's say an IO action.


> type Cnt = IO ()


We want to return the continuation of the switch which first 'failed'. Here's the updated task description.

Atomically (
If all switches are on, we switch them all off and report True
Otherwise, return the continuation of the first turned on switch.)

The signature of the function we need to implement


verifyAndSwitchOffCnt :: [(TVar Switch, Cnt)] -> IO Cnt


What are our options how to implement the above?

We could adapt our first solution verifyAndSwitchOff. This is pretty straightforward. What about verifyAndSwitchOff2? Previously, we simply return False in the second orElse branch. For the more interesting task, we need to iterate over the list of pairs of switches and continuation and select the continuation of the first switch which is still on. This means that in both case we possibly need to iterate over the list twice. This is inefficient and leads to clumsy code.

Below we make use of an extension of STM where a retrying transaction can pass some information to the second transaction composed with orElse.

The extension is implemented as the STMControl.STMC library. The new primitives are:


newTVarC :: b -> STMC a (TVar b)
readTVarC :: TVar b -> STMC a b
writeTVarC :: TVar b -> b -> STMC a ()
atomicallyC :: STMC a b -> IO b
retryC :: a -> STMC a b
orElseC :: STMC a b -> (a -> STMC a b) -> STMC a b



STMC stands for STM with more control. The first parameter refers to the type of the value transmited in case of a retryC. The parameter in the orElseC combinator is now a function which accepts the result transmitted of the retrying transaction.

The implementation is fairly simple and uses an (abused) reader monad on top of STM. Here's finally an application of the new functionality.


> verifyAndSwitchOffCnt :: [(TVar Switch, Cnt)] -> IO Cnt
> verifyAndSwitchOffCnt tcs =
> let loop [] = return Nothing
> loop ((s,c):xs) = do v <- readTVarC s
> case v of
> On -> do writeTVarC s Off
> loop xs
> Off -> return (Just c)
> in atomicallyC $
> (do r <- loop tcs
> case r of
> Nothing -> return (return ()) -- default (donothing) continuation
> Just c -> retryC c)
>
> `orElseC`
>
> (\c -> return c)


The loop checks and resets all switches. In case a switch is turned off we return the attached continuation. We use the Maybe type to indicate failure (Just c) or success (Nothing). In case of success, we return the donothing continuation. Otherwise, we retryC to discard all changes made so far and pass the continuation to the second transaction which then returns this continuation.

I'd say compared to the other options, the STMC solution looks pretty clean and easy to understand.

Making the task even more interesting



That's the final extension, I promise. Suppose that to each switch there's also an attached STM Bool transaction. In addition to testing that the switch is off, we also test that the STM Bool transaction evaluates to True. Otherwise, we discard our changes and return the attached continuation.


> verifyAndSwitchOffCnt2 :: [(TVar Switch, STM Bool, Cnt)] -> IO Cnt
> verifyAndSwitchOffCnt2 tcs =
> let loop [] = return Nothing
> loop ((s,stm,c):xs) =
> do v <- readTVarC s
> case v of
> On -> do writeTVarC s Off
> b <- lift $ stm

STMC are (reader monad) lifted STM operations

> if b then loop xs
> else return (Just c)
> Off -> return (Just c)
> in atomicallyC $
> (do r <- loop tcs
> case r of
> Nothing -> return (return ()) -- default (donothing) continuation
> Just c -> retryC c)
>
> `orElseC`
>
> (\c -> return c)


The required changes to our previous solution are minor. In fact, I can't see how to implement the even more interesting task with STM alone. I believe that this task strictly requires STMC or something similar.


The complete source code of this example is available as part of the stmcontrol package on the Haskell platform hackage.

Implementation details




type STMC a b = ReaderT (IORef a) STM b

retryC :: a -> STMC a b
retryC x = do msgChan <- ask
lift $ unsafeIOToSTM (writeIORef msgChan x) -- L1
lift $ retry


What if the transaction is aborted just after execution of L1? Is our use of unsafeIOToSTM (un)safe? It's safe. It's okay to discard the IO action in case the transaction is rolled back. The important point is that if the transaction is explicitly retried, the IO action has been executed (which is the case as far as I understand the interplay of STM and unsafeIOToSTM).

Monday, 1 December 2008

XHaskell

My PhD student Kenny has just submitted his thesis on XHaskell - Adding Regular Expression Type to Haskell.

You can download the latest XHaskell implementation and Kenny's thesis draft here