Code Generators

Code generators are a common pattern in Go projects. Code generators, or codegens for short, provide type safety and move work that would otherwise happen at runtime to build time. Without codegens, we have to fall back on the reflect package at runtime.

Our Problem

Imagine we have a JSON-serialized byte stream received from a user or read from a file.

{
    "first_name": "Amir",
    "last_name": "Ehsandar"
}

How can we unmarshal this data into a struct? In this article, we discuss two approaches to unmarshalling the serialized bytes into the corresponding struct. We also compare three stages at which code can be generated: on the developer’s machine, on CI, or every time before the build process begins.
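As a starting point, here is a minimal sketch of the target we are aiming for, using the standard encoding/json package. The User type and the parseUser helper are names made up for this illustration; only the JSON keys come from the example above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// User is a hypothetical struct mirroring the JSON document above.
// The struct tags map the snake_case keys to exported Go fields.
type User struct {
	FirstName string `json:"first_name"`
	LastName  string `json:"last_name"`
}

// parseUser (illustrative helper) decodes a JSON document into a User.
func parseUser(data []byte) (User, error) {
	var u User
	if err := json.Unmarshal(data, &u); err != nil {
		return User{}, fmt.Errorf("unmarshal user: %w", err)
	}
	return u, nil
}

func main() {
	data := []byte(`{"first_name": "Amir", "last_name": "Ehsandar"}`)

	u, err := parseUser(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u.FirstName, u.LastName)
}
```

The rest of the article is about what happens behind a call like json.Unmarshal, and how generated code can replace it.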

Unmarshalling

We can unmarshal our JSON into a struct using reflection. The reflection API provides a few functions to iterate over a struct’s fields and inspect their data types. We can also use codegens to generate the unmarshalling code for a struct. There are a few open source modules available on GitHub, which we’ll mention later on.

Unmarshalling JSON using the reflection API

This is how the built-in encoding/json package works. Here’s an example of using reflect to iterate through the fields of an arbitrary struct:

package main

import (
	"fmt"
	"reflect"
)

func main() {
	type anotherStructure struct {
		Field string
	}

	type sampleStructure struct {
		Name   string
		Age    int
		Height float32

		Another anotherStructure
	}

	sType := reflect.TypeOf(sampleStructure{})

	for i := 0; i < sType.NumField(); i++ {
		field := sType.Field(i)

		fmt.Printf("%s -> %s\n", field.Name, field.Type.Name())
	}
}

Output:

Name -> string
Age -> int
Height -> float32
Another -> anotherStructure
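Inspecting types is only half of unmarshalling; the other half is writing values into the fields. The sketch below shows that step under simplifying assumptions: the input is already parsed into a map of strings, only string fields are handled, and tag options such as omitempty are ignored. The user type and the fillFromMap function are hypothetical names, and this is not how encoding/json is actually implemented.

```go
package main

import (
	"fmt"
	"reflect"
)

// user is a hypothetical struct used only for this illustration.
type user struct {
	FirstName string `json:"first_name"`
	LastName  string `json:"last_name"`
}

// fillFromMap copies values into the exported string fields of the struct
// that dst points to, matching map keys against each field's `json` tag.
// Simplification: the tag is used verbatim, so options like ",omitempty"
// are not supported.
func fillFromMap(dst interface{}, values map[string]string) error {
	v := reflect.ValueOf(dst)
	if v.Kind() != reflect.Ptr || v.Elem().Kind() != reflect.Struct {
		return fmt.Errorf("dst must be a pointer to a struct, got %T", dst)
	}
	v = v.Elem()
	t := v.Type()

	for i := 0; i < t.NumField(); i++ {
		field := t.Field(i)

		// Unexported fields have a non-empty PkgPath and cannot be set.
		if field.PkgPath != "" || field.Type.Kind() != reflect.String {
			continue
		}
		key, ok := field.Tag.Lookup("json")
		if !ok {
			continue
		}
		if s, found := values[key]; found {
			v.Field(i).SetString(s)
		}
	}
	return nil
}

func main() {
	var u user
	values := map[string]string{"first_name": "Amir", "last_name": "Ehsandar"}

	if err := fillFromMap(&u, values); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", u)
}
```

Every call repeats the same type inspection and tag lookups, which is the runtime overhead that generated code avoids.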

Marshalling a struct to JSON using generated code

To eliminate this overhead at runtime, a number of third-party modules generate the marshalling and unmarshalling code for your structs ahead of time. We use easyjson as an example here. The code generated for nearly the same struct as above looks like this:

func easyjson89aae3efEncodeGithubComEhsundarBlahEasyjsonsample(out *jwriter.Writer, in sampleStructure) {
	out.RawByte('{')
	first := true
	_ = first
	{
		const prefix string = ",\"name\":"
		out.RawString(prefix[1:])
		out.String(string(in.Name))
	}
	{
		const prefix string = ",\"age\":"
		out.RawString(prefix)
		out.Int(int(in.Age))
	}
	{
		const prefix string = ",\"height\":"
		out.RawString(prefix)
		out.Float32(float32(in.Height))
	}
	out.RawByte('}')
}

This generated code saves a lot of CPU time at runtime. In exchange, generating it takes a while and adds a few extra steps to the Go development workflow.
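To see why this style is cheaper, here is a hand-written sketch in the same spirit that depends only on the standard library. It is not easyjson output: appendJSONString is a hypothetical helper, and the lowercase key names are assumed to match the generated code above. The point is that the field names and types are fixed at compile time, so nothing is inspected at runtime.

```go
package main

import (
	"fmt"
	"strconv"
)

// sampleStructure matches the struct used earlier, minus the nested field.
type sampleStructure struct {
	Name   string
	Age    int
	Height float32
}

// appendJSONString (illustrative helper) appends s as a JSON string literal.
// It escapes quotes, backslashes and ASCII control characters, and assumes
// that s is valid UTF-8.
func appendJSONString(dst []byte, s string) []byte {
	dst = append(dst, '"')
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case c == '"' || c == '\\':
			dst = append(dst, '\\', c)
		case c < 0x20:
			dst = append(dst, fmt.Sprintf("\\u%04x", c)...)
		default:
			dst = append(dst, c)
		}
	}
	return append(dst, '"')
}

// MarshalJSON writes each field directly, with no reflection involved.
// Caveat: NaN and infinite Height values would produce invalid JSON here.
func (s sampleStructure) MarshalJSON() ([]byte, error) {
	buf := make([]byte, 0, 64)
	buf = append(buf, `{"name":`...)
	buf = appendJSONString(buf, s.Name)
	buf = append(buf, `,"age":`...)
	buf = strconv.AppendInt(buf, int64(s.Age), 10)
	buf = append(buf, `,"height":`...)
	buf = strconv.AppendFloat(buf, float64(s.Height), 'f', -1, 32)
	buf = append(buf, '}')
	return buf, nil
}

func main() {
	s := sampleStructure{Name: "Amir", Age: 30, Height: 1.8}

	out, err := s.MarshalJSON()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```

Writing this by hand for every struct is tedious and easy to get wrong, which is the job a code generator takes over.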

easyjson’s benchmarks compare it with the standard library and a few other popular tools:

lib       json size  MB/s  allocs/op  B/op
standard  regular    22    218        10229
standard  small      9.7   14         720
easyjson  regular    125   128        9794
easyjson  small      67    3          128
ffjson    regular    66    141        9985
ffjson    small      17.6  10         488
codec     regular    55    434        19299
codec     small      29    7          336
ujson     regular    103   N/A        N/A

Code Generation Stage

Generation on the Developer’s Machine

TODO

Generation on CI and Commit to Another Repository

Generation before Build Process

Conclusion

Stage  Suitable for
Local  - small projects
       - generated code used only by the same module
CI     - another repository is needed to commit the generated code to
       - adds extra complexity, and generated code is available with a delay
Build  - a freshly cloned repository cannot be built without generating the code first
       - there is no change history for generated code in git