Breaking Haskell circular imports with the C preprocessor (cpp)

This article shows a quick way to solve the “circular imports” problem in Haskell using the C preprocessor. If you are in a hurry, go straight to the C preprocessor section, and do not skip the Caveats.

The problem

The difficulty is well known:

  1. you start a new Haskell project. After a cup of coffee, with glee and hope, you flesh out your first data types and functions:

    data Big = Big
    data Test = Test Big
    
    testFun :: Big -> Test
    testFun = undefined
    
    bigFun :: Test -> IO Big
    bigFun = undefined
  2. after a while the initial module keeps getting bigger, so you decide it would be a good idea to split it into manageable pieces:

    -- A.hs
    module A where
    
    import B  -- needed for Test
    
    data Big = Big
    bigFun :: Test -> IO Big
    bigFun = undefined
    
    -- B.hs
    module B where
    
    import A  -- needed for Big
    
    data Test = Test Big
    testFun :: Big -> Test
    testFun = undefined
  3. you eagerly run cabal new-build, but GHC complains:

    Module imports form a cycle:
             module ‘B’ (src/B.hs)
            imports ‘A’ (src/A.hs)
      which imports ‘B’ (src/B.hs)

    “Lord, why me?!”

Existing solutions

There are three accepted ways of dealing with circular imports:

C preprocessor

What I am proposing here is a quick hack that uses the C preprocessor (cpp) to break circular imports. Download the working example and let us see what it is all about. We will start from file A.hs:

module A where

#define beta
#include "B.hs"

data Big = Big

bigFun :: Test -> IO Big
bigFun = undefined

We do not import B via import B, but using #include. Note the #define beta before it. Examining B.hs will reveal the trick:

#ifdef beta
-- data declaration
data Test = Test Big
#undef beta

#elif 1
-- your module
module B where

import A

someFun :: Big -> Test
someFun = undefined
#endif

The data declaration is sandwiched between #ifdef beta and #undef beta, while the rest of the module goes between #elif 1 (which behaves like a plain #else) and #endif.
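To make the effect of the guard concrete, here is roughly what GHC ends up compiling for A.hs once cpp has expanded the #include with beta defined. This is a hand-written sketch, not actual preprocessor output: blank lines and the line markers cpp emits are left out, and the comments are mine.

```haskell
-- A.hs, approximately as GHC sees it after cpp has run with beta defined.
module A where

-- Spliced in from the #ifdef beta branch of B.hs.
data Test = Test Big

data Big = Big

bigFun :: Test -> IO Big
bigFun = undefined
```

Module A no longer mentions module B at all, so the cycle is gone. If you want to inspect the real expansion, GHC's -E flag (stop after preprocessing) should let you do that.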

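In other words, B.hs is preprocessed twice. The first time is when A.hs includes it: beta is defined, so only the data declaration is spliced into A. The second time is when GHC compiles B.hs as a module in its own right: beta is not defined, so only the module proper survives. The benefits of this approach are: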

I advise adding ghc-options: -XCPP to your .cabal file; otherwise you have to put {-# LANGUAGE CPP #-} at the top of every module.
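If you go the per-module route, the pragma has to sit above the module header. The following is a minimal, self-contained sketch of that layout; the macro name GUARD and the binding guardState are hypothetical names chosen for this illustration and are not part of the working example.

```haskell
{-# LANGUAGE CPP #-}
-- The pragma must come before the module header.
module Main where

-- GUARD is a hypothetical macro name used only for this illustration.
#define GUARD

#ifdef GUARD
guardState :: String
guardState = "defined"
#else
guardState :: String
guardState = "undefined"
#endif

main :: IO ()
main = putStrLn guardState
```

Running this prints defined; deleting the #define line should make it print undefined instead, which is the same switch that A.hs flips for B.hs.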

Caveats

I have tested this solution in real life; these are the warnings:

Conclusion

If you need a quick fix for mutually recursive modules in Haskell, using CPP is an option that can save you headaches. Hopefully the problem will soon be addressed at the compiler level.