Tests with Deriving Via

Posted on January 26, 2021

I have been using both hedgehog and QuickCheck based property-based testing frameworks, I’m fairly comfortable in writing tests and generators in both. Theoretical aspects aside, for a user, I feel like hedgehog is more ergonomic as it does automatic shrinking and does away with typeclasses. The former is important as writing good shrinkers is hard, remembering to write shrinkers is even harder. The latter is important when you need to modify your generation for some tests.

In this post, I’ll show that using DerivingVia extension and generic coercions can help you write almost as ergonomic Arbitrary definitions for QuickCheck. The initial idea is taken from the Deriving Via paper, but taken a little bit further. This post assumes some level of understanding of type level programming.

For the examples, we’re using a Person as shown in the examples below. The test we’ll implement will be the tripping property. For the expected values, the name is something name-like and age is a range between 1-99.

I’ll use hedgehog to write the ideal case. The generator is light-weight, but has been customized for the business case. I’m using the hedgehog-corpus package for the name-like generation.

import GHC.Generics (Generic)

import Data.Text (Text)

import qualified Data.Aeson as A

import Hedgehog
import qualified Hedgehog.Gen as Gen
import qualified Hedgehog.Range as Range
import qualified Hedgehog.Corpus as Corpus

data Person
  = Person { name :: Text
           , age :: Int
           }
  deriving stock (Show, Eq, Generic)
  deriving anyclass (A.ToJSON, A.FromJSON)

genValidPerson :: Gen Person
genValidPerson =
  Person <$> Gen.element Corpus.simpsons
         <*> Gen.integral (Range.linear 0 99)

prop_encoding :: Property
prop_encoding = property $ do
  p <- forAll genValidPerson
  pure p === A.eitherDecode (A.encode p)

For comparison, this is what I would write with QuickCheck without any helpers. There’s quite a bit of added complexity, especially in the shrinker, and only with two fields.

import GHC.Generics (Generic)

import Data.Text (Text)
import qualified Data.Aeson as A

import Test.QuickCheck

data Person
  = Person { name :: Text
           , age :: Int
           }
  deriving stock (Show, Eq, Generic)
  deriving anyclass (A.ToJSON, A.FromJSON)

instance Arbitrary Person where
  arbitrary = Person <$> elements simpsons <*> choose (1,99)
    where
      simpsons = ["bart", "marge", "homer", "lisa", "ned"]
  shrink Person{name,age} =
    [Person name' age'
    | name' <- [name]
    , age' <- shrinkIntegral age
    , age' >= 1
    , age' <= 99
    ]

prop_encoding :: Person -> Property
prop_encoding p = pure p === A.eitherDecode (A.encode p)

Good, now that the base is done, let’s see what we can do about making QuickCheck more ergonomic. The solution I’m outlining here relies on these features.

  • DerivingVia extension which can automatically generate instances for you if two types are Coercible
  • Isomorphism between the Generic representation of two types. For example (a,b) has a Generic representation that is the same as data Foo = Foo a b
  • QuickCheck modifiers, for example PrintableString which modify the arbitrary generation

The paper defines this piece of code for deriving Arbitrary instances for anything that is generically isomorphic to something that is already an instance.

newtype SameRepAs a b = SameRepAs a

instance
  ( Generic a
  , Generic b
  , Arbitrary b
  , Coercible (Rep a ()) (Rep b ())
  )
  => Arbitrary (a `SameRepAs` b) where
  arbitrary = SameRepAs . coerceViaRep <$> arbitrary
    where
      coerceViaRep :: b -> a
      coerceViaRep =
        to . (coerce :: Rep b () -> Rep a ()) . from

For my implementation, I’ll be cleaning the code from the paper. I’m swapping the type parameters of the newtype and extract the coercion function to top-level so that I can define the shrink as well.

newtype Isomorphic a b = Isomorphic b

type GenericCoercible a b =
  ( Generic a
  , Generic b
  , Coercible (Rep a ()) (Rep b ())
  )

genericCoerce :: forall a b. GenericCoercible a b => a -> b
genericCoerce =
  to . (coerce @(Rep a ()) @(Rep b ())) . from

instance
  ( Arbitrary a
  , GenericCoercible a b
  )
  => Arbitrary (a `Isomorphic` b) where
  arbitrary = Isomorphic . genericCoerce @a @b <$> arbitrary
  shrink (Isomorphic b) =
    Isomorphic . genericCoerce @a @b
      <$> shrink (genericCoerce @b @a b)

With this, we can now write Arbitrary instances using the tuple representation as an intermediary. At least as long as the child types have their instances properly set.

data Person
  = Person { name :: Text
           , age :: Int
           }
  deriving stock (Show, Eq, Generic)
  deriving anyclass (A.ToJSON, A.FromJSON)
  deriving (Arbitrary) via ((Text, Int) `Isomorphic` Person)

This is already a marked improvement to the original Arbitrary instance we wrote, but this does not yet satisfy our original requirement of generating only ‘valid’ persons. I would like to modify the instance generation on a more ad-hoc fashion. For this to happen, I would need some modifiers that control the arbitrary generation. I would like to write something like the instance below.

type Simpsons = '["marge", "bart", "homer", "lisa", "ned"]
data Person
  = Person { name :: Text
           , age :: Int
           }
  deriving stock (Show, Eq, Generic)
  deriving anyclass (A.ToJSON, A.FromJSON)
  deriving (Arbitrary)
    via ((Corpus Simpsons Text, Range 1 99 Int) `Isomorphic` Person)

Let’s start by defining the Range as it’s more straightforward. This is just a newtype with a couple of phantom type variables, which is used in choosing the range of the generator. Shrinking is already quite complex (and probably not optimal!), I wouldn’t want to write this multiple times.

newtype Range (from :: Nat) (to :: Nat) a = Range a

instance
  ( KnownNat from
  , KnownNat to
  , Num a
  , Ord a
  , Integral a
  ) => Arbitrary (Range from to a) where
  arbitrary = Range . fromInteger <$> choose (natVal $ Proxy @from, natVal $ Proxy @to)
  shrink (Range x) = Range <$> shrunk
    where
      shrunk =
        [ x'
        | x' <- shrinkIntegral x
        , x >= fromInteger (natVal $ Proxy @from)
        , x <= fromInteger (natVal $ Proxy @to)
        ]

Then the corpus. Just like the Range it’s a newtype with a phantom variable, providing the input for the random generation. There’s an extra typeclass involved to act as a typelevel function.

newtype Corpus (corpus :: [Symbol]) a = Corpus a

class FromCorpus (corpus :: [Symbol]) where
  fromCorpus :: [String]

instance FromCorpus '[] where
  fromCorpus = []

instance (KnownSymbol x, FromCorpus xs) => FromCorpus (x ': xs) where
  fromCorpus = symbolVal (Proxy @x) : fromCorpus @xs

instance (FromCorpus corpus, IsString x) => Arbitrary (Corpus corpus x) where
  arbitrary = Corpus . fromString <$> elements (fromCorpus @corpus)

With these instances out of the way, we can redo our original test with automatic instances.

import GHC.Generics (Generic)

import Data.Text (Text)
import qualified Data.Aeson as A

import Test.QuickCheck

import Isomorphic

data Person
  = Person { name :: Text
           , age :: Int
           }
  deriving stock (Show, Eq, Generic)
  deriving anyclass (A.ToJSON, A.FromJSON)
  deriving Arbitrary via ((Corpus Simpsons Text, Range 1 99 Int) `Isomorphic` Person)

prop_encoding :: Person -> Property
prop_encoding p = pure p === A.eitherDecode (A.encode p)