Sunday, 27 January 2013

Non-injective type functions and ambiguous types




> {-# LANGUAGE  ScopedTypeVariables, MultiParamTypeClasses, FunctionalDependencies, TypeFamilies #-}

Type inference for standard Hindley/Milner boils down to
solving simple unification (Herbrand) constraints.
Extensions such as type annotations and type functions require
more complex constraints. For users of such extensions, it becomes harder
to understand the reason why type inference (i.e. the underlying constraint solver)
has failed.

A re-occuring problem of type inference failure seems due to
*non-injectivity* of type functions. GHC 7.4.1 provides some improved
error message for such situations (some examples will shortly).
I argue that the underlying reason for type inference failure
in this context is that the program's type is *ambiguous*.
In case of ambiguous types, type inference, to be successful, is required to guess
types  (i.e. the underlying constraint solver is required to guess
solutions). But type inferencer generally don't guess and therefore fail.

Hence, instead of non-injectivity, ambiguity should be blamed for failure
(where of course non-injectivity is a possible reason for ambiguity).

Interestingly, recasting the type function program in terms of
functional dependencies yields the desired reason for
failure (i.e. the program's type is detected as ambiguous).

I yet have to check what's the status of the ambiguity check
in more recent versions of GHC such as GHC 7.6.

== Type functions are not injective

OK, so let's start with some basic definitions

A function F is injective iff
from F a ~ F b we conclude that a ~ b.

In general, type functions are not necessarily injective
e.g. we could define
type instance F Int = Bool
type instance F Char = Bool

== Type inference failure due to non-injective type functions?

Consider the following simple example.

> type family F a
> f1 :: F a
> f1 = undefined

The above program is type correct.
The type inference problem boils down to the
following constraint problem

(1) forall a. exists t. F a ~ t

The unification type variable 't' arises from the program text (undefined).
The 'exists' quantifier indicates we can apply unification.
The type annoation demands 'F a' for any type 'a'.
Therefore, there's an outermost forall quantifier.

It's easy to see that the above constraint problem is satisfiable,
e.g. take t = F a.

[Note: t = F a is not necessarily the only solution, we will come
back to this issue shortly]

Here's a slightly more complicated variant of f1.

> type family G a
> {-
> f2 :: G b
> f2 = f2
> -}

The above is rejected by GHC.
"   Couldn't match type `G b' with `G a0'
    NB: `G' is a type function, and may not be injective
    ...
"

Let's examine in more detail what's happening here.
The type inference problem boils down here to
the constraint problem

(2) forall b. exists b. G b ~ G a

The 'G a' bit arises from the program text and
'G b' is the type demanded by the type annotation.

It's quite easy to see that the constraint problem (2)
is satisfiable, e.g. take a = b. But GHC's type inferencer
is not capable of making such a guess and therefore
GHC rejects the program f2.

The slightly odd thing is that in case of constraint problem (1),
GHC chooses the obvious guess t = G a.
Depending on G's instances, further solutions are possible.

For example, for

> type instance G a = Int

t = Int would be another guess/solution.

== Ambiguous types

I argue that the actual reason for type inference failure is
that the type of f1 and f2 is ambiguous.
Let's slightly recast the type of f2 as follows:

> {-
> f3 :: (G b ~ c) => c
> f3 = f3
> -}

I have simply substituted 'G b' by the fresh type variable 'c'.
Here, the unambiguity check demands that fixing
the variable 'c' also fixes the variable 'b' in the context (G b ~ c)
As we know, this is not necessarily the case because
of non-injectivity of type function G.

== The case of functional dependencies

For fun, let's recast the above programs with functional dependencies.

> class FD a b | a -> b

> fd1 :: FD a b => b
> fd1 = undefined

The above program is accepted by GHC 7.4.1. It's not hard to see
that fd1's type is ambiguous (so here GHC took a guess).

The next program is rejected.

> {-
> fd2 :: FD a b => b
> fd2 = fd2
> -}

GHC says
"
    Ambiguous type variable `a0' in the constraint:
      (FD a0 b) arising from a use of `fd2'
    Probable fix: add a type signature that fixes these type variable(s)
"

== Type inference in the presence of ambiguous types

Suppose we need this program (and switch on a hypothetical flag -XNoUnambiguityCheck).
How can we convice the type inferencer to accept this program?
We need some sort of type-level lambda annotation to fix the 'a' in 'FD a b => b'.
We can mimic such a construct by
(a) turning 'a' into an extra function argument and
(b) lexically scoped type annotations.


> fd3 :: (FD a b ) => a -> b
> fd3 = undefined

> fd4 :: forall a b. (FD a b) => b
> fd4 = fd3 (undefined :: a)


== Conclusion

Someone might say. I never write such contrived ambiguous type annotations.
Well, I argue that such types can arise quite frequently when doing
some of the now fashionably type-level programming in Haskell.
To be continued.

1 comment:

Martin Sulzmann said...

Some typo, should be (forall b. exists a. G b ~ G a).
The issue raised in one sentence:
*Non-injectivity may cause ambiguity and ambiguity may cause type inference failure*
The issue has been raised several times by myself and others as early as 2008! I'm wondering how many people/emails have stumbled over this issue since 2008 and Conal's email?

My dense blog post makes no attempt to provide a comprehensive introduction to this subtle issue. I've simply put together a minimal set of examples plus pointing out the connection to functional dependencies.