Snapshot test your PostgreSQL queries in Go with the pgx driver
If your unit tests understand SQL syntax, you can cover many behaviors of your code without the need to start a real DB. Want to understand PostgreSQL syntax? Simply import the PostgreSQL parser into your program; https://github.com/pganalyze/pg_query_go helps you do exactly that.
How about using a containerized DB?
Sometimes the repository layer doesn’t have much logic (maybe concatenating a few WHERE conditions); it only propagates the SQL statement to the DB.
Testing with a simple containerized DB is a good option where you can:
- verify the syntax, and
- check the logic, which is where the real complexity happens, with all the fake data
A good option? Yes, but not the best one for every case. For example, the UI snapshot-test technique has its place whenever we don’t want to start a real browser to test UI code. Snapshot-testing your SQL statements has its place too. Hopefully, I can show you some of its advantages.
A better tool for you
Before going further: https://github.com/cockroachdb/copyist is a good tool for you if you’re using the standard database/sql interface. We’re using pgx’s custom interface, so we need to write some testing helpers, but the idea is the same.
OK, now the snapshot test
In this post we use a simple example query, but you can see the real benefit with the complex queries that make you hate your ORM lib.
Here is our example:
type UsersInvoicesFilter struct {
IDs []string
CreatedAt time.Time
}
const FetchUsersInvoicesStmt = `SELECT
invoices.invoice_id,
invoices.user_id,
users.billing_address,
invoices.amount
FROM users LEFT JOIN invoices ON users.user_id = invoices.user_id
WHERE users.user_id = ANY($1) AND invoices.created_at >= $2`
func FetchUsersInvoices(ctx context.Context, tx pgx.Tx, filter UsersInvoicesFilter) ([]*Invoice, error) {
rows, err := tx.Query(ctx, FetchUsersInvoicesStmt, &filter.IDs, &filter.CreatedAt)
if err != nil {
return nil, err
}
defer rows.Close()
var invoices []*Invoice
for rows.Next() {
invoice := Invoice{}
err := rows.Scan(
&invoice.InvoiceID,
&invoice.UserID,
&invoice.BillingAddress,
&invoice.Amount,
)
if err != nil {
return nil, fmt.Errorf("scanning invoice data error: %w", err)
}
invoices = append(invoices, &invoice)
}
if err := rows.Err(); err != nil {
return nil, fmt.Errorf("unexpected error: %w", err)
}
return invoices, nil
}
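The Invoice struct is not shown in the snippet above; a plausible definition, with field types that are my assumptions, would be:
// Invoice mirrors the columns selected by FetchUsersInvoicesStmt.
// The field types are assumptions; the post does not spell them out.
type Invoice struct {
	InvoiceID      string
	UserID         string
	BillingAddress string
	Amount         int64
}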
All seasoned Go developers will find this code block familiar. Let’s walk through some examples of what a unit test can help to cover.
Validating PostgreSQL syntax
- increase the code coverage percentage just to look good 💩
- avoid “accidental changes” when refactoring 👍
- check PostgreSQL syntax 👍
Example use case: a function to fetch users’ invoices. In this example we have 2 tables:
- users(user_id, email, billing_address)
- invoices(invoice_id, user_id, amount, created_at)
A single query to fetch the users and their invoice data with 2 conditions:
SELECT
invoices.invoice_id,
invoices.user_id,
users.billing_address,
invoices.amount
FROM users JOIN invoices WHERE users.user_id = invoices.user_id
WHERE users.user_id = ANY($1) AND invoices.created_at >= $2
Now, import pg_query "github.com/pganalyze/pg_query_go/v2" and then test the basic syntax with:
func Test_parsingUsersInvoicesQuery(t *testing.T) {
json, err := pg_query.ParseToJSON(FetchUsersInvoicesStmt)
if err != nil {
t.Fatal(err)
}
t.Log(json)
}
Then run your test
go test -v
--- FAIL: Test_parsingUsersInvoicesQuery (0.00s)
users_repo_test.go:46: syntax error at or near "WHERE"
users_repo_test.go:49:
FAIL
exit status 1
FAIL examples 0.005s
Oops, what’s wrong with my WHERE condition? I think you already know: I intentionally put in the wrong join condition, and with a short query like this it’s very easy to spot. But I guess at some point you must have written something more complex than CRUD queries, and this is where the test can be helpful. Besides that, it returns the result in a tiny fraction of a second.
Correcting the JOIN invoices WHERE to JOIN invoices ON and rerunning the test, a formatted version of the JSON output can be found here: https://gist.github.com/nvcnvn/d73d441b7878c47e85a654eef61819db
The output shows how PostgreSQL parses and manages the SQL syntax tree. That is not trivial in general, but in our case you can compare the tree with the original query since it is fairly simple.
Not just text comparison, but understanding SQL semantics
Sometimes adding a new line, a space, or a tab does not change the semantics of the query, so we have a small helper for this issue.
"ColumnRef":{
"fields":[
{
"String":{
"str":"users"
}
},
{
"String":{
"str":"billing_address"
}
}
],
"location":49
}
Because this is a legit parser, it comes with the location of each token, but we don’t care about the locations. https://github.com/kinbiko/jsonassert with its "<<PRESENCE>>" marker is very handy for this case. Used together with a simple regex replace-all, we can store this JSON “template” for comparison later. The flow is:
- run the test the first time and record the JSON string
- modify all the location fields to check their presence only (ignore the value)
- then for each later run, parse the query under test and compare it with the stored template (see the sketch after this list)
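Here is a minimal sketch of that flow, assuming imports for os, regexp, pg_query, and jsonassert. The snapshot path, the test name, and the first-run recording logic are my assumptions; the examples repo may organize this differently.
func Test_FetchUsersInvoicesSnapshot(t *testing.T) {
	actual, err := pg_query.ParseToJSON(FetchUsersInvoicesStmt)
	if err != nil {
		t.Fatal(err)
	}

	snapshotPath := "testdata/users_invoices_stmt.json" // hypothetical location
	locations := regexp.MustCompile(`"location":-?\d+`)

	expected, err := os.ReadFile(snapshotPath)
	if os.IsNotExist(err) {
		// First run: record the template with every location value replaced
		// by the jsonassert "<<PRESENCE>>" directive.
		template := locations.ReplaceAllString(actual, `"location":"<<PRESENCE>>"`)
		if writeErr := os.WriteFile(snapshotPath, []byte(template), 0o644); writeErr != nil {
			t.Fatal(writeErr)
		}
		t.Skip("snapshot recorded, rerun the test to compare")
	}
	if err != nil {
		t.Fatal(err)
	}

	// Later runs: compare the freshly parsed tree against the stored template;
	// location values are only checked for presence.
	jsonassert.New(t).Assertf(actual, "%s", string(expected))
}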
Avoid mistakes when refactoring
Assuming this code is already working:
type UsersInvoicesFilter struct {
IDs []string
CreatedAt time.Time
}
func FetchUsersInvoices(ctx context.Context, tx pgx.Tx, filter UsersInvoicesFilter) ([]*Invoice, error) {
rows, err := tx.Query(ctx, FetchUsersInvoicesStmt, &filter.IDs, &filter.CreatedAt)
One day, you come up with the idea that we should have a generic interface for filters in general, something like:
type Filter interface {
Args() []interface{}
}
type UsersInvoicesFilter struct {
IDs []string
CreatedAt time.Time
}
func (f *UsersInvoicesFilter) Args() []interface{} {
return []interface{}{&f.IDs, &f.CreatedAt}
}
and then, to use it
// FetchUsersInvoices(context.Background(), mockDB, &UsersInvoicesFilter{})
func FetchUsersInvoices(ctx context.Context, tx pgx.Tx, filter Filter) ([]*Invoice, error) {
rows, err := tx.Query(ctx, FetchUsersInvoicesStmt, filter.Args())
See the issue? I have misused filter.Args(); the correct usage is filter.Args()... with the slice expanded into variadic arguments, as shown below. This mistake can also be caught by a unit test.
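The corrected call, spreading the slice into variadic arguments:
// Spread the args slice so pgx receives each bind parameter separately,
// instead of a single []interface{} value bound to $1.
rows, err := tx.Query(ctx, FetchUsersInvoicesStmt, filter.Args()...)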
The idea is to understand the SQL tree structure, then check whether the number of args sent to the tx.Query method matches the number of WHERE conditions.
https://github.com/tidwall/gjson can be used for this; for example:
actual, expected := len(m.queryArgsLogs[0]),
len(gjson.Get(m.statementLogs[0], "stmts.0.stmt.SelectStmt.whereClause.BoolExpr.args").Array())
if actual != expected {
m.t.Errorf("WHERE conditions have %d but function call send %d", expected, actual)
}
We have a mock written in helper_test.go for recording the Query call args.
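A minimal sketch of such a recording mock is below; it assumes pgx v4 and imports for context, errors, testing, github.com/jackc/pgx/v4, and pg_query. The real helper in the examples repo may differ (for instance, it likely returns a fake pgx.Rows instead of an error).
// MockTx records every statement and its bind args passed to Query.
// Embedding pgx.Tx satisfies the interface; any method we did not implement
// will panic if it is accidentally called.
type MockTx struct {
	pgx.Tx

	t             *testing.T
	statementLogs []string
	queryArgsLogs [][]interface{}
}

func (m *MockTx) Query(ctx context.Context, sql string, args ...interface{}) (pgx.Rows, error) {
	stmtJSON, err := pg_query.ParseToJSON(sql)
	if err != nil {
		m.t.Fatalf("invalid SQL passed to Query: %v", err)
	}
	m.statementLogs = append(m.statementLogs, stmtJSON)
	m.queryArgsLogs = append(m.queryArgsLogs, args)
	// Stop the caller here; we only care about what was recorded.
	return nil, errors.New("mock tx: no rows")
}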
Let’s run the test with the bad code:
go test -v
=== RUN Test_FetchUsersInvoices
helper_test.go:68: WHERE conditions have 2 but function call send 1
--- FAIL: Test_FetchUsersInvoices (0.01s)
FAIL
exit status 1
FAIL examples 0.013s
more than just increasing code coverage percentage
With a fairly small amount of test code (not counting the helper, which can be reused by the whole team), we already get some code coverage without the need to start any Docker container.
go test -v -cover
=== RUN Test_FetchUsersInvoices
--- PASS: Test_FetchUsersInvoices (0.01s)
PASS
coverage: 53.3% of statements
ok examples 0.012s
But the Manabie backend team uses these kinds of tests as an automatic code review. For example, when we have a rule that all queries touching a table need to have a special filter, this comes in handy since we can code that rule into the MockDB to check it.
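As an illustration only, with a made-up rule (the table and column names are hypothetical, not Manabie’s actual rule), such a check over the recorded statements could look like:
// Hypothetical rule: every query that touches the "invoices" table must
// filter on created_at. A real check should walk the parse tree with gjson
// instead of string-matching the JSON, but this shows the shape of the idea.
func (m *MockTx) checkInvoicesRule(stmtJSON string) {
	touchesInvoices := strings.Contains(stmtJSON, `"relname":"invoices"`)
	filtersCreatedAt := strings.Contains(stmtJSON, `"str":"created_at"`)
	if touchesInvoices && !filtersCreatedAt {
		m.t.Errorf("query touches invoices but has no created_at filter")
	}
}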
I think this post is long enough; please refer to our examples folder https://github.com/manabie-com/manabie-com.github.io/tree/main/content/posts/snapshot-test-your-postgresql-in-golang/examples and try it yourself. Maybe write a unit test to check that the number of Scan args matches the number of SELECT targets; a rough sketch of that idea closes this post.
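A sketch of that last idea, assuming the same recording approach as above; the fake Rows type and its wiring into MockTx are hypothetical:
// mockRows is a fake pgx.Rows (hypothetical) that yields exactly one empty row
// and records how many destinations Scan received. Embedding pgx.Rows keeps
// the sketch short; unimplemented methods would panic if called.
type mockRows struct {
	pgx.Rows

	scanArgsCount int
	consumed      bool
}

func (r *mockRows) Next() bool {
	if r.consumed {
		return false
	}
	r.consumed = true
	return true
}

func (r *mockRows) Scan(dest ...interface{}) error {
	r.scanArgsCount = len(dest) // leave the destinations zero-valued; we only count them
	return nil
}

func (r *mockRows) Close()     {}
func (r *mockRows) Err() error { return nil }

// After wiring mockRows into MockTx.Query and calling FetchUsersInvoices,
// compare the SELECT target count from the parsed statement with the Scan count:
//   targets := gjson.Get(m.statementLogs[0], "stmts.0.stmt.SelectStmt.targetList").Array()
//   if rows.scanArgsCount != len(targets) {
//       t.Errorf("SELECT has %d targets but Scan received %d destinations", len(targets), rows.scanArgsCount)
//   }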