Week 14: Notes

This week we discussed

newtype declarations
immutable arrays
applicative functors
parsing
monadic parsing

You can read about newtype declarations and applicative functors in chapter 11 "Functors, Applicative Functors and Monoids" of the Learn You a Haskell book.

arrays

As we discussed in the lecture, Haskell has immutable arrays that allow you to access elements in O(1) using the ! operator. You can build an array from a list or an association list. Our library quick reference documents the array API.

counting sort

This function sorts a list of integers in a fixed range in linear time using an array.

sort :: Int -> Int -> [Int] -> [Int]
sort lo hi xs =
    let arr = accumArray (+) 0 (lo, hi) [(x, 1) | x <- xs]
    in concat [replicate x i | (i, x) <- assocs arr]

arrays and recursion

Array elements are evaluated lazily, and may even refer to other elements of the same array:

> a = listArray (1, 3) [10, 20, a ! 1 + a ! 2]
> a ! 1
10
> a ! 3
30

And so we can build an entire array using a recursive calculation. This function builds an array of the first N Fibonacci numbers in linear time:

fibs :: Int -> [Int]
fibs n = 
    let arr = listArray (1, n) (1 : 1 : [arr ! (i - 2) + arr ! (i - 1) | i <- [3 .. n]])
    in elems arr

parsing

In the lecture we went through an extended example of parsing in Haskell. We first wrote code to parse characters and strings of digits without using monads:

import Data.Char

data Parser a = Parser (String -> Maybe (a, String))

anyChar :: Parser Char
anyChar = Parser (\s ->
    case s of
        "" -> Nothing
        c : cs -> Just (c, cs)
    )

apply :: Parser a -> String -> Maybe (a, String)
apply (Parser p) s = p s

(<|>) :: Parser a -> Parser a -> Parser a
p <|> q = Parser (\s ->
    case apply p s of
        Just (x, s') -> Just (x, s')
        Nothing -> apply q s
    )

satisfy :: (Char -> Bool) -> Parser Char
satisfy f = Parser (\s -> do
    (c, cs) <- apply anyChar s
    if f c then Just (c, cs) else Nothing
    )

digit :: Parser Char
digit = satisfy isDigit

letter :: Parser Char
letter = satisfy isAlpha

alphaNum :: Parser Char
alphaNum = letter <|> digit

none :: Parser [a]
none = Parser (\s ->
    Just ([], s)
   )

-- parse 0 or more digits (digit*); never fails
digits :: Parser String
digits = digits1 <|> none

-- parse 1 or more digits (digit+), or fail
digits1 :: Parser String
digits1 = Parser (\s -> do
    (c, s') <- apply digit s
    (cs, s'') <- apply digits s'
    Just (c : cs, s'')
    )

parsing with monads

Our function digits1 above is awkward because it has to manually keep track of various versions of the input string (i.e. s, s', and s''). It would be much nicer if we could just write

digits1 :: Parser String
digits1 = do
    c <- digit        -- parse a digit (or fail)
    cs <- digits      -- parse more digits
    return (c : cs)   -- return them all

We can make this possible by turning our Parser type into a monad.

The code below also includes a generalized combinator many1 that can parse one or more repetitions of any other parser. With that, we can easily implement digits1 by calling many1 digit.

import Prelude hiding (fail)
import Control.Monad hiding (fail)
import Data.Char

data Parser a = Parser (String -> Maybe (a, String))

-- a Parser that always fails
fail :: Parser a
fail = Parser (\s -> Nothing)

anyChar :: Parser Char
anyChar = Parser (\s ->
    case s of
        "" -> Nothing
        c : cs -> Just (c, cs)
    )

apply :: Parser a -> String -> Maybe (a, String)
apply (Parser p) s = p s

-- a parser combinator
(<|>) :: Parser a -> Parser a -> Parser a
p <|> q = Parser (\s ->
    case apply p s of
        Just (x, s') -> Just (x, s')
        Nothing -> apply q s
    )

instance Monad Parser where
    return x = Parser (\s -> Just (x, s))

    p >>= q = Parser (\s -> do
        (x, s') <- apply p s
        (y, s'') <- apply (q x) s'
        return (y, s'')
      )

instance Functor Parser where
    fmap = liftM

instance Applicative Parser where
    pure = return
    (<*>) = ap

satisfy :: (Char -> Bool) -> Parser Char
satisfy f = do
    c <- anyChar
    if f c then return c else fail

digit :: Parser Char
digit = satisfy isDigit

letter :: Parser Char
letter = satisfy isAlpha

space :: Parser Char
space = satisfy isSpace

-- parse a letter OR a digit
alphaNum :: Parser Char
alphaNum = letter <|> digit

-- parse nothing, return an empty list
none :: Parser [a]
none = return []

-- (many p) parses 0 or more instances of p, i.e. p*  (never fails)
many :: Parser a -> Parser [a]
many p = many1 p <|> none

-- (many p1) parses 1 or more instances of p, i.e. p+  (could fail)
many1 :: Parser a -> Parser [a]
many1 p = do
    x <- p
    xs <- many p
    return (x : xs)

digits :: Parser String
digits = many digit

digits1 :: Parser String
digits1 = many1 digit

int0 :: Parser Int
int0 = do
    ds <- digits1
    return (read ds)

int :: Parser Int
int = read <$> digits1   -- fmap read digit1

parsing expressions

We'd now like to parse fully parenthesized arithmetic expressions defined by the following unambiguous context-free grammar:

int = digit+

op = '+' | '-' | '*'

expr = int | '(' expr op expr ')'

For example, ((3 + 4) * (2 + 3)) is an expression in this language.

We can represent such expressions using this datatype:

data Op = Plus | Minus | Times
  deriving (Show)

data Expr = Const Int | OpExpr Expr Op Expr
  deriving (Show)

Here is a parser for these expressions, extending the previous code:

-- (token p) parses a p, optionally preceded by whitespace
token :: Parser a -> Parser a
token p = many space >> p

-- (sym c) parses the character c, optionally preceded by whitespace, or fails
sym :: Char -> Parser Char
sym c = token (satisfy (== c))

op :: Parser Op
op = (sym '+' >> return Plus)
 <|> (sym '-' >> return Minus)
 <|> (sym '*' >> return Times)

int_expr :: Parser Expr
int_expr = token (Const <$> int)

binary_expr :: Parser Expr
binary_expr = do
    sym '('
    e <- expr
    o <- op
    f <- expr
    sym ')'
    return $ OpExpr e o f

expr :: Parser Expr
expr = int_expr <|> binary_expr

We can run the parser like this:

> apply expr "((3 + 4) * (2 + 3))"
Just (OpExpr (OpExpr (Const 3) Plus (Const 4)) Times (OpExpr (Const 2) Plus (Const 3)),"")

Parsec

Haskell includes a useful monadic parser combinator library called Parsec. Actually in our parsing code above we've written a small subset of Parsec. If you understand the code above, it shouldn't be difficult for you to come up to speed on the full Parsec library.