Keeping decoders simple

6 December 2018, by Jasper Woudenberg

It can take a bit of time to wrap your head around how JSON decoders work. Especially functions like Json.Decode.andThen take a bit of practice to use. The nice thing is that you can almost always work around using these functions, and often this leads to nicer code.

To illustrate I've come up with some JSON describing people and their pets. We're going to attempt writing a simple Elm decoder for it. To make things more challenging I intentionally made the JSON structure a bit unpleasant to work with. Let's take a look at it!

[
  {
    "id": 1,
    "species": "Human",
    "name": "Jasper"
  },
  {
    "id": 2,
    "species": "Dog",
    "name": "Otis",
    "owner": 1
  }
]

The JSON above contains a single person and a single pet, mixed together into a list. A longer sample can contain multiple owners and pets, still in a single list, and in any order 😨. Luckily we're free to model this data differently in our Elm application. The following Model type can store the same data, but ensures all pets have an owner.

type alias Model =
    List Human


type alias Human =
    { id : Id
    , name : String
    , pets : List Pet
    }


type alias Pet =
    { id : Id
    , name : String
    , animal : Animal
    }


type Animal
    = Dog
    | Cat
    | Fish


type Id
    = Id Int

Doesn't that look much nicer? Once we get the data into our Elm application, it's going to be so much nicer to work with! To load JSON data into the application in Elm we'll need to write a JSON decoder. But in this particular case that is no simple task, because the shapes of the JSON and the Elm type are so different.

We can write a decoder that decodes our JSON data right into our Model type, but to do so we'll need every tool in the decoder toolbox. Luckily there's a simpler approach, always available to us, that only uses basic Json.Decode primitives. To use it, we're going to need an extra Elm type, one that looks like the JSON we're trying to read. Let's call this type BackendData.

type alias BackendData =
    List Individual


type alias Individual =
    { id : Int
    , species : String
    , name : String
    , owner : Maybe Int
    }

This type has almost exactly the same shape as the JSON we're trying to decode! Because of this we'll have a much easier time writing a decoder for BackendData than we would writing one for Model. Let's write that decoder now.

import Json.Decode as Decode exposing (Decoder)


dataDecoder : Decoder BackendData
dataDecoder =
    Decode.list individualDecoder


individualDecoder : Decoder Individual
individualDecoder =
    Decode.map4
        (\id species name owner -> Individual id species name owner)
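        -- Each value lives under a key of the JSON object, so read it with Decode.field.
        (Decode.field "id" Decode.int)
        (Decode.field "species" Decode.string)
        (Decode.field "name" Decode.string)
        (Decode.maybe (Decode.field "owner" Decode.int))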

Not bad, right? If we choose the right Elm type, we can always decode a JSON structure using only a small number of tricks:

A JSON array we can represent in Elm with a List. We decode it using Decode.list.
A JSON object we can represent in Elm with a record. We decode it using one of the Decode.map functions (depending on the number of fields the record has), reading each field with Decode.field.
JSON values (numbers and strings) can be represented in Elm by Ints, Floats, and Strings.
A field that may be absent can be represented in Elm by a Maybe. We decode it using Decode.maybe.
A field that may be null can be represented in Elm by a Maybe. We can decode it using Decode.nullable.
Occasionally you will need oneOf, if a field in the JSON can contain two different types.
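To make the last three tricks concrete, here is a minimal sketch. The names ownerIfPresent, ownerOrNull and idAsString are made up for this example, and the idea that an id could arrive as either a number or a string is an assumption for illustration only; our sample JSON always uses numbers.

```elm
module Example exposing (idAsString, ownerIfPresent, ownerOrNull)

import Json.Decode as Decode exposing (Decoder)


-- The "owner" key may be missing from the object entirely.
ownerIfPresent : Decoder (Maybe Int)
ownerIfPresent =
    Decode.maybe (Decode.field "owner" Decode.int)


-- The "owner" key is always there, but its value may be null.
ownerOrNull : Decoder (Maybe Int)
ownerOrNull =
    Decode.field "owner" (Decode.nullable Decode.int)


-- Hypothetical case: "id" holds either a string or a number.
-- We try each decoder in turn and end up with a String either way.
idAsString : Decoder String
idAsString =
    Decode.field "id"
        (Decode.oneOf
            [ Decode.string
            , Decode.map String.fromInt Decode.int
            ]
        )
```

Running Decode.decodeString ownerIfPresent on "{}" gives Ok Nothing, while ownerOrNull fails on that same input because the key is missing. One thing to keep in mind: Decode.maybe also returns Nothing when the field is present but holds the wrong type, so it can hide mistakes in the data.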

This is a nice result, but we're not done yet. We have succeeded in decoding our JSON into a BackendData type, but it's not a Model yet. The JSON we started out with was unpleasant to work with because its structure is so different from the Model type we'd like to use in our Elm application. Our BackendData type has the same structure as our JSON and so inherits its problems as well.

We'll need a function that turns our BackendData into a Model. Let's call it fromBackendData. It will have a type like this.

fromBackendData : BackendData -> Result String Model

The function returns a Result because there are a couple of things that can go wrong in this transformation. One example: our data can contain a whale, which is neither a person nor a pet. In cases like that Result allows us to return an error to indicate something went wrong.
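As a small illustration of the whale problem, one piece of such a transformation could turn the species string into an Animal. This is only a sketch: animalFromSpecies is a hypothetical helper name and the error message is made up, so the real code may well do this differently.

```elm
module Example exposing (Animal(..), animalFromSpecies)


type Animal
    = Dog
    | Cat
    | Fish


-- Hypothetical helper: map the species string from the JSON onto an Animal.
-- Anything we don't recognise as a pet (a whale, say) becomes an Err.
animalFromSpecies : String -> Result String Animal
animalFromSpecies species =
    case species of
        "Dog" ->
            Ok Dog

        "Cat" ->
            Ok Cat

        "Fish" ->
            Ok Fish

        _ ->
            Err ("Unknown species: " ++ species)
```

Because the helper returns a Result, whatever calls it is forced to decide what happens to the whale.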

fromBackendData is going to take some effort to write, and that's exactly why it's great that we don't need to worry about decoders anymore. Writing JSON decoders is hard, and writing transformation functions is also hard. We were always going to have to deal with both these problems, but introducing BackendData allows us to face one problem at a time rather than a combined mega-problem.

There are other benefits. The Elm compiler is going to help us a lot writing fromBackendData. It won't offer the same amount of help writing a decoder that directly decodes the Model.

We can also write tests to help us get the transformation logic right. We could write tests for a decoder that directly decodes the Model too, but we'd need to pass it JSON strings as inputs. Our transformation function takes and returns regular Elm types, so its tests will be easier to write and maintain.
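Here is a sketch of what such a test could look like, assuming the elm-explorations/test package is installed. petsOf is a hypothetical helper invented for this example (it is not taken from the real implementation); the point is only that both the input and the expected output are plain Elm values, with no JSON strings in sight.

```elm
module TransformTest exposing (suite)

import Expect
import Test exposing (Test, describe, test)


type alias Individual =
    { id : Int
    , species : String
    , name : String
    , owner : Maybe Int
    }


-- Hypothetical helper: all individuals whose owner is the given id.
petsOf : Int -> List Individual -> List Individual
petsOf ownerId individuals =
    List.filter (\individual -> individual.owner == Just ownerId) individuals


suite : Test
suite =
    describe "petsOf"
        [ test "finds the pets belonging to an owner" <|
            \_ ->
                let
                    jasper =
                        Individual 1 "Human" "Jasper" Nothing

                    otis =
                        Individual 2 "Dog" "Otis" (Just 1)
                in
                petsOf 1 [ jasper, otis ]
                    |> Expect.equal [ otis ]
        ]
```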

Because it's pretty long I'll leave the implementation of fromBackendData out of this blogpost. If you're interested, check out the code in Ellie (a big thank you to Tony Gu for a number of fixes and improvements to the code!).

That's it! Any time we're faced with writing a complicated JSON decoder, we can instead choose to write an intermediary type, simple decoder, and transformation function.