Posts Tagged With: Code

Let’s talk about Javascript string encoding

Node string encoding is all over the place. Let's try to straighten out how it works.

First, some very basics about string encoding. A string is a series of bytes. A byte is 8 bits, each of which can be 0 or 1, so a byte can have 28 or 256 different values. Encoding is the process of squashing the graphics you see on screen, say, 世 - into actual bytes. There are over a million possible Unicode characters - think about all of the different languages and emoji and symbols on the planet.

You could easily represent all of the characters in the Unicode set with an encoding that says simply "assign one number, 4 bytes (or 32 bits) long, for each character in the Unicode set." One 32-bit combo for each character means you can have 4 billion distinct characters. But this would be really inefficient for most documents. For example, the English Bible or dictionary or people's email folders are mostly the characters a-z, A-Z, 0-9, and punctuation. It would be inefficient to waste 4 bytes on every "a" in the document - we want a way to represent it more efficiently.

The most common encoding is UTF-8. The main advantage of UTF-8 is that it needs a single byte (instead of four) to encode Unicode characters 0-127 - the so-called ASCII set, which includes the ones I listed above. Less common characters are represented with two bytes, and even less common characters with three bytes, and so on.

Because Javascript was invented twenty years ago in the space of ten days, it uses an encoding that uses two bytes to store each character, which translates roughly to an encoding called UCS-2, or another one called UTF-16. The letter 'a' is Unicode code point 97, so stored in a Javascript string, the first byte of a UTF-16 code point would be 97 and the second byte would be 0. The euro symbol () is Unicode code point 8364, so the first byte would be 172 and the second byte would be 32. (The relationship between them: 8364 = (32 << 8) + 172).

That only gets you Unicode code points 0 through (256*256 = ) 65536, though, so: if the first two-byte character is between 55296 and 56319, it's a surrogate, and you have to read the second two-byte character to figure out what Unicode code point it represents. Javascript will handle this multi-character read for you if you use the new codePointAt operator, which can return all Unicode code points, up to 1.1 million. The old charCodeAt operator only reads one character at a time (between 0 and 65536) and you'll have to read/decode the second one yourself.

Where this gets complicated

Because everything else in the world is moving to UTF-8, Node is also trying to move to UTF-8. This gets confusing because Node sometimes asks you to choose which encoding you want, in places where you wouldn't really expect it to ask.

Frequently, you want to convert from a UTF-16 encoded Javascript string to UTF-8 bytes in some form, whether a Buffer or a Uint8Array. Unfortunately Node doesn't offer easy API's to do that; it buries the conversion logic in C++ code, which eventually calls out to the V8 binary to handle it.

Buffers

Node offers a Buffer type, where a Buffer is an array of bytes, and an API, Buffer.from(string, encoding). encoding defaults to UTF-8. So far, so good! Unfortunately, sometimes encoding refers to the encoding of string, and sometimes it refers to the encoding of the bytes in the Buffer.

Buffer.from('7468697320697320612074c3a97374', 'hex') will decode the input as a series of hex characters, and store the bytes corresponding to each 2-digit hex character in the Buffer. But in var a = 'tést'; Buffer.from(a, 'utf8'); the 'utf8' refers to how the bytes will be stored in the resulting Buffer. Remember, strings are always two bytes per character, so declaring they are 'utf8' doesn't make sense.

Similarly, the encoding parameter in buf.toString(encoding) does not refer to the encoding of the output string - it refers to the encoding of the data in the buffer. Unless of course you specify hex or base64, in which case it does refer to the encoding of the output string. Got it?

HTTP

Incoming HTTP requests are often encoded using UTF-8. Node will give you data as a Buffer via the HTTP request's .on('data') event handler, which can be decoded using the content-type from the HTTP headers. In Express, this is handled in the body-parser module, which defers to the iconv-lite module for the actual encoding and decoding.

But if you expect e.g. a JSON or XML input, you will probably eventually want to turn that Buffer into a string - see the JSON section below.

When writing string data as an HTTP response, Node will wait until the last possible minute to convert that string into a Buffer that can be written to the socket. This leads to odd behaviors like the net library needing to know what encoding to use for a string you pass to it.

JSON

JSON.parse operates on a string and returns a JSON object, which may be a string or contain strings. JSON.parse and JSON.stringify don't do anything with the string encoding - if you pass in a garbage string, you get a garbage string back out.

So you need to depend on something else to determine the character encoding, before you pass a string to JSON.parse. Usually this will be your HTTP middleware; you have to trust (or verify) that it handles character sets correctly, before creating UTF-8 strings.

Files on disk

Node source files are expected to be encoded with UTF-8. That means you can encode UTF-8 source characters in a string, like this:

var x = "¢"

Where the cent character is the UTF-8 encoded byte sequence "\xc2\xa2". When Node starts and you try to reference x in your program, it will be re-encoded as a UTF-16 string. If you type the literal characters:

var x = "\xc2\xa2";

This will be turned into the UTF-16 string "\xc2\x00\xa2\x00". So be careful to mind your inputs and outputs.

Conclusion

Encoding in Node is extremely confusing, and difficult to get right. It helps, though, when you realize that Javascript string types will always be encoded as UTF-16, and most of the other places strings in RAM interact with sockets, files, or byte arrays, the string gets re-encoded as UTF-8.

This is all massively inefficient, of course. Most strings are representable as UTF-8, and using two bytes to represent their characters means you are using more memory than you need to, as well as paying an O(n) tax to re-encode the string any time you encounter a HTTP or filesystem boundary.

There's nothing stopping us from packing UTF-8 bytes into a UTF-16 string: to use each of the two bytes to store one UTF-8 character. We would need custom encoders and decoders, but it's possible. And it would avoid the need to re-encode the string at any system boundary.

Liked what you read? I am available for hire.

Pragmatic Web Development in Go

You should write your next web server in Go. Yes, you! Compared with Ruby, PHP, Python, or Javascript, you're going to get great memory and latency performance, and libraries that do what you expect.

The standard library can be a bit lacking though, if you are used to developing with a tool like Rails. I found myself adding the same helpers to every web project that I started in Go, so I compiled those tools into a starter pack. Most of the code is thin wrappers around the standard library, and any/all of it can be ripped out. I wanted to go over some of the tools in the starter pack.

Serving Static Assets

I compile static assets into the binary. This way you don't need to worry about copying static assets to your deployment server, passing in the relative path from the working directory to the static asset directory, or ensuring that you have the right version of the static assets for the right version of the binary.

A make target runs go-bindata, which compiles everything in the static directory and the templates directory into a single Go file. When a request comes in for static assets, we find the right one and serve it:

data, err := assets.Asset(strings.TrimPrefix(r.URL.Path, "/"))
if err != nil {
    rest.NotFound(w, r)
    return
}
http.ServeContent(w, r, r.URL.Path, s.modTime, bytes.NewReader(data))

Tests ensure that the static files on disk and the Go assets are up to date.

Routing

The Go http router only offers exact match, or the ability to route a directory, e.g. http.Handle("/static/", staticHandler()). I use a http.Handler that takes a regular expression as the first argument, instead of a string, and lets you specify which methods you want to handle.

r := new(handlers.Regexp) // github.com/kevinburke/handlers
r.HandleFunc(regexp.MustCompile("^/$"), []string{"GET"}, func(w http.ResponseWriter, r *http.Request) {
    w.Header().Set("Content-Type", "text/html; charset=utf-8")
    io.WriteString(w, "<html><body><h1>Hello World</h1></body></html>")
})

You can use regexp.FindStringSubmatch to pull parameters out of the URL, or replace this with your own routing framework.

HTTP/2 Server Push

"We shall go south," Paul said.

"Even if I say we shall turn back to the north when this day is over?" Stilgar said. "We shall go south," Paul repeated.

r.HandleFunc(regexp.MustCompile(`^/$`), []string{"GET"}, func(w http.ResponseWriter, r *http.Request) {
    if pusher, ok := w.(http.Pusher); ok {
        pusher.Push("/static/style.css", nil)
    }
    // Render the homepage HTML
})

It's really, really easy to do server push in Go, and the idea of pushing resources your clients are about to ask for is so exciting that I wanted to use it even if the benefits aren't going to be that large for your project.

Server push requires that you terminate TLS in your Go server. Run make generate_cert to generate a self-signed certificate for use locally. Use autocert.NewListener() for one-line certificate provisioning in production.

If you can't terminate TLS (say you are running on App Engine Flex or Heroku), the starter pack falls back to sending a preload Link header.

Logging

Being able to see details about every incoming request and response helps with debugging and is rarely a performance bottleneck on a domain that's being browsed by humans. In particular it's useful to see the user agent, the status code, the request ID (for correlating with downstream services), and any username used for HTTP basic authentication.

INFO[07:52:24.810-07:00] method=GET path=/ time=0 bytes=468 status=200
remote_addr=[::1]:50170 host=localhost:7065 user_agent=HTTPie/1.0.0-dev
request_id=a8cf0a8c-4f30-4b01-a49d-396611b3ca15

I use a logging middleware for this but you are welcome to rip it out and use your own.

Flash Messages

You'll probably want to show error and success messages to your end users. Again this is trying to take the simplest possible approach - a flash message is set by encrypting a string and setting the result as a cookie. Retrieving the message deletes the cookie.

func FlashSuccess(w http.ResponseWriter, msg string, key *[32]byte) {
	c := &http.Cookie{
		Name:     "flash-success",
		Path:     "/",
		Value:    opaque(msg, key),
		HttpOnly: true,
	}
	http.SetCookie(w, c)
}

We use the secretbox library to encrypt and decrypt cookies using a secret key - you can use the same to encrypt session ID's or other data in a tamper-safe way.

Loading Configuration

You will probably need to configure the server somehow. I prefer loading configuration from a YAML file to the command line or to environment variables, because it's more difficult to accidentally leak secrets from YAML to e.g. Sentry (as you can with environment variables), and your secrets can't be read from /proc/<pid>/env by other users.

main.go walks you through loading a YAML file and using it to configure the server.

Release Binaries

Run make release version=0.x.y to increment the server's Version, cross compile binaries for Windows, Mac, and Linux, and push them to the releases page on Github. You'll want to set a GITHUB_TOKEN in the environment with permission to push new releases to your project, and you'll probably need to change the username in your version of the Makefile.

Conclusion

It's not too difficult to get a lot of the core functionality of a larger framework in a different language, with a lot better performance and a more understandable set of libraries. It should be pretty easy to rip any of these parts out; you should deploy this by copying the files into your own project and modifying as you see fit.

I hope the starter pack is useful and I hope you choose Go for your next web development project!

Liked what you read? I am available for hire.

Why Leap Seconds are Tricky and How to Deal With Them

You may have seen this on New Year's Eve:

I'd heard a little about this problem, but I didn't understand how it broke code, and what to do about it. So here is an explainer.

Background

Earlier this year, the International Earth Rotation and Reference Systems Service decided to add an additional second to the year. This is due to the fact that the Earth's day-to-day rotation does not perfectly line up with a 86400 second day.

So when the clock hit 23:59:59 on New Year's Eve, it usually ticks over to 00:00:00, but this year, repeated the second 23:59:59. So if you had measurement code that looked like:

a := time.Now() // Dec 31 2016 23:59:59 and 800ms
// While this operation is in progress, the leap second starts
doSomeExpensiveOperation()
duration := time.Since(a) // Taken at Dec 31 2016 23:59:59 (#2!) and 100ms
fmt.Println(duration)

We usually assume in our code that time can't go backwards, but the duration there is -700ms! This can cause big problems if you have code that always expects nonnegative time durations. In particular code that accepts timeouts might panic or immediately error if it gets a negative value. Cloudflare passed a negative duration to rand.Int63n, which crashed some of their servers.

But this is only a problem on leap seconds, which occur once a year at most. That's not good, but I can handle one problem a year. That would be nice, but unfortunately time on your servers can jump around quite a bit. An individual server might get a few seconds ahead or behind the rest of your servers. When it gets the correct time from the NTP server, the time on the server might jump forward or backward a few seconds. If you run any number of servers you are bound to hit this problem sooner or later.

Ok, how do I deal with it?

Good question! The Linux community implemented CLOCK_MONOTONIC as an option to clock_gettime to solve this problem. CLOCK_MONOTONIC returns an integer number of nanoseconds that only ever increases. You can use this to get more accurate (and always nonnegative) deltas than getting the system time twice and subtracting the samples.

You'll have to check whether your programming language uses CLOCK_MONOTONIC for implementing calls to get the system time. Many languages don't use it for the simple "what time is it" call, because it returns you time from a random starting point, not the epoch.

So you can't really translate between a CLOCK_MONOTONIC value and any given human date; it's only useful when you take multiple samples and compare them.

In particular

The Go standard library does not use CLOCK_MONOTONIC for calls to time.Now(). Go has a function that implements CLOCK_MONOTONIC, but it's not a public part of the standard library. You can access it via the monotime library - replace your calls to time.Now() with monotime.Now(). Note you will get back a uint64 that only makes sense if you call monotime.Now() again a little while later (on the same machine) and subtract it from the first value you got.

import "time"
import "github.com/aristanetworks/goarista/monotime"
func main() {
    a := monotime.Now()
    someExpensiveOperation()
    b := monotime.Now()
    // b - a will *always* be greater than 0
    fmt.Println(time.Duration(b - a))
}

Porting code errata

I ported Logrole (a Twilio log viewer) to implement monotonic time. In the past it was really easy to call t := time.Now() and then, later, somewhere else in the code, call diff := time.Since(t) to get a Duration. This let you both use t as a wall clock time, and get an elapsed amount of time since t, at some unspecified point later in the codebase and in time. With CLOCK_MONOTONIC you have to separate these two use cases; you can get a delta or you can get a Time but you can't get both.

The patch was pretty messy and an indication of the problems you might have in your own codebase.

Another problem to watch out for is a uint64 underflow. If you have code like this:

if since := time.Since(aDeadline); since > 0 {
    return nil, errors.New("deadline exceeded")
}

You might port it to something like:

now := monotime.Now()
deadline := now + uint64(500*time.Millisecond)
someExpensiveOperation()
if monotime.Now() - deadline > 0 {
    return nil, errors.New("deadline exceeded")
}

The problem here is you are subtracting uint64 values and if now is under the deadline, you'll underflow and get a very large uint64 value (which is greater than 0), so you'll always execute the if condition. Instead you need to avoid the subraction as well and use

if monotime.Now() > deadline {

instead.

Conclusion

It is hard to get this right, but it's probably a good practice to separate out wall clock time values from time values used for deltas and timeouts. If you don't do this from the start though you are going to run into problems. It would be nice if this was better supported in your language's standard library; ideally you'd be forced to figure out which use case you needed a time value for, and use it for that.

Liked what you read? I am available for hire.

Tradeoffs in Software Provisioning Tools

A while ago my friend Alan and I were discussing configuration management. In particular we wondered why every configuration management tool has to ship a DSL, or be loaded from YAML files.

We wondered if it would be possible to just write code that deploys servers — it might let you describe what you want to do much more precisely.

I started working on a library that lets you do this. Basically, turn every module + state combination in Ansible into a function, add any required arguments as part of the method signature, and add a Opts dictionary for any optional arguments. Right now it looks something like this.

if err := core.AddGroup(ctx, host, "wheel", core.GroupOpts{
    System: false,
    Gid: "1001",
}); err != nil {
    log.Fatal(err)
}

But starting to implement this led to several more non-obvious tradeoffs.

Abstraction

This is the most obvious reason to use a configuration management tool. Whether you are deploying OpenBSD or Ubuntu or Darwin, you still need to create users and create folders and install packages. A good provisioning tool will abstract these for you, and choose sensible defaults.

However abstractions can be leaky; maybe one filesystem offers a feature that others don't, and it can be hard to make this available while also saying "this is only supported on these systems."

Run Commands On Local Machine vs. Remote Machine

Do you want to run commands on the machine that triggered the provisioning process, or the machine being provisioned? Take mysql for example. If you have a mysql client on the local machine, you can issue commands to the remote machine using the mysql protocol on the wire.

This requires the remote machine to expose port 3306, which you might not want to do. You also need to trust mysql's ability to encrypt a connection and trust a remote certificate. Compared with SSH, mysql has had much less auditing of its security code, and is not as good of a bet for encrypting/safely compressing content going over the wire. (This becomes more salient when you have N protocols for issuing commands over the wire, instead of just SSH.)

Another option would be to SSH to the remote machine, then run a mysql client on that machine to configure/provision MySQL. But this requires that you have a MySQL client on the remote machine. It's also considerably trickier to issue multiple MySQL commands over a single SSH session, and take action based on the results.

Run Multiple Commands Per SSH Connection

A single task in Ansible for "create this recursive directory" embeds a ton of complexity. If the directory exists, but has the wrong permissions, or the wrong owner, the directories are recursively created and chowned/chmodded. You can do this with SSH commands, e.g. ssh host mkdir foo && chmod 755 foo && chown -R root:wheel foo, but it gets more and more complicated, and tougher to determine which command failed, the more commands you layer on.

You can work around this by issuing each command as part of a single SSH connection, then getting the result, and making some decision based on it. But this vastly increases the latency of what you're trying to do, even if you enable pipelining you're talking about 1 second latency per operation.

Ansible works around this by copying a Python file to the remote machine, then running that file on the remote machine with several arguments. This occurs for each directive that Ansible runs. This has two implications: each machine needs to have Python on it, and Ansible is really slow - think one second per directive you put in Ansible.

With Go, we could copy a binary to the remote host and then run it. This would let us take advantage of Go's standard libraries (instead of issuing Unix commands directly via SSH). We could either compile+SCP this binary on the fly, or ship precompiled binaries for each architecture as part of the distribution.

But if we are going to go to that length, why not just add tools to compile the user's entire program, SCP that to the remote filesystem, and run it there?

Run Multiple Directives Per SSH Connection

The only way you are going to get really fast execution is by executing multiple directives/tasks/modules as part of a single SSH connection to the host. The way to achieve the most benefits would be to compile the user's entire configuration program, SCP the binary to the host, then run the binary on the host.

But this requires giving up some flexibility as well. Some Ansible tasks involve copying data from the remote machine to a local machine - for example, mysql_db in target mode. You can do this over the SSH connection, but it might be tricky to separate output that's part of control flow - e.g. "RUN: add group wheel" - from output that's supposed to be copied to the local machine — e.g. a mysql dump. Alternatively, if you need to copy a file from the local machine to the remote machine, you need to bundle that file as part of the target you SCP to the remote machine.

For Go specifically, you'd either need a Go binary on the remote machine, and then copy all of the source files, or you'd need to cross compile the source on the local machine, which means things like user.Current() won't work.

Conclusion

There are a few thorny problems that weren't immediately apparent when I started working on this. For the moment I'm going to try to proceed with the Go solution, porting over an existing set of Ansible directives, and I'm going to try to prioritize speed of remote execution.

But I'm much less confident is going to work well, without a lot of effort.

Liked what you read? I am available for hire.

An API Client that’s Faster than the API

For the past few weeks I've been working on Logrole, a Twilio log viewer. If you have to browse through your Twilio logs, I think this is the way that you should do it. We were able to do some things around performance and resource conservation that have been difficult to accomplish with today's popular web server technologies.

Picture of Logrole

Fast List Responses

Twilio's API frequently takes over 1 second to return a page of calls or messages via the API. But Logrole usually returns results in under 100ms. How? Every 30 seconds we fetch the most recent page of Calls/Messages/Conferences and store them in a cache. When we download a page of resources, we get the URL to the next page - Twilio's next_page_uri — immediately, but a user might not browse to the next page for another few seconds. We don't have to wait for you to hit Next to get the results - on the server side, we fetch/cache the next page immediately, so it's ready when you finally hit the Next button, and it feels really snappy.

The cache is a simple LRU cache. We run Go structs through encoding/gob and then gzip before storing them, which takes the size of a cache value from 42KB to about 3KB. At this size, about 8,300 values can fit in 25MB of memory.

var buf bytes.Buffer
writer := gzip.NewWriter(&buf)
enc := gob.NewEncoder(writer)
if err := enc.Encode(data); err != nil {
	panic(err)
}
if err := writer.Close(); err != nil {
	panic(err)
}
c.mu.Lock()
defer c.mu.Unlock()
c.c.Add(key, buf.Bytes())
c.Debug("stored data in cache", "key", key, "size", buf.Len(),
    "cache_size", c.c.Len())

Right now one machine is more than enough to serve the website, but if we ever needed multiple machines, we could use a tool like groupcache to share/lookup cached values across multiple different machines.

Combining Shared Queries

The logic in the previous paragraphs leads to a race. If the user requests the Next page before we've finished retrieving/saving the value in the cache, we'll end up making two requests for the exact same data. This isn't too bad in our case, but doubles the load on Twilio, and means the second request could have gotten the results sooner by reusing the response from first request.

The singleflight package is useful for ensuring only one request ever gets made at a time. With singleflight, if a request is already in progress with a given key, a caller will get the return value from the first request. We use the next page URI as the key.

var g singleflight.Group
g.Do(page.NextPageURI, func() (interface{}, error) {
    // 1) return value from cache, if we've stored it
    // 2) else, retrieve the resource from the API
    // 3) store response in the cache
    // 4) return response
})

This technique is also useful for avoiding thundering herd problems.

Canceling Unnecessary Queries

You've configured a 30 second request timeout in nginx, and a query is taking too long, exceeding that timeout. nginx returns a 504 Gateway Timeout and moves on, but your server is still happily processing the request, even though no one is listening. It's a hard problem to solve because it's much easier for a thread to just give up than to give up and tell everyone downstream of you that they can give up too. A lot of our tools and libraries don't have the ability to do that kind of out of band signaling to a downstream process.

Go's context.Context is designed as an antidote for this. We set a timeout in a request handler very early on in the request lifecycle:

ctx, cancel := context.WithTimeout(req.Context(), timeout)
defer cancel()
req = req.WithContext(ctx)
h.ServeHTTP(w, req)

We pass this context to any method call that does I/O - a database query, a API client request (in our case), or a exec.Command. If the timeout is exceeded, we'll get a message on a channel at ctx.Done(), and can immediately stop work, no matter where we are. Stopping work if a context is canceled is a built in property of http.Request and os/exec in Go 1.7, and will be in database/sql starting with Go 1.8.

This is so nice - as a comparison, one of the most popular npm libraries for "stop after a timeout" is the connect-timeout library, which let you execute a callback after a particular amount of time, but does nothing to cancel any in-progress work. No popular ORM's for Node support canceling database queries.

It can be really tricky to enforce an absolute deadline on a HTTP request. In most languages you compute a timeout as a duration, but this timeout might reset to 0 every time a byte is received on the socket, making it difficult to enforce that the request doesn't exceed some wall-clock amount of time. Normally you have to do this by starting a 2nd thread that sleeps for a wall-clock amount of time, then checks whether the HTTP request is still in progress and kills the HTTP request thread if so. This 2nd thread also has to cleanup and close any open file descriptors.

Starting threads / finding and closing FD's may not be easy in your language but Contexts make it super easy to set a deadline for sending/receiving data and communicating that to a lot of different callers. Then the http request code can clean up the same way it would for any canceled request.

Metrics

I've been obsessed with performance for a long time and one of the first things I like to do in a new codebase is start printing out timestamps. How long did tests take? How long did it take to start the HTTP server? It's impossible to optimize something if you're not measuring it and it's hard to make people aware of a problem if you're not printing durations for common tasks.

Logrole prints three numbers in the footer of every response: the server response time, the template render time, and the Twilio API request time. You can use these numbers to get a picture of where the server was spending its time, and whether template rendering is taking too long. I use custom template functions to implement this - we store the request's start time in its Context, and then print time elapsed on screen. Obviously this is not a perfect measure since we can't see the time after the template footer is rendered - mainly the ResponseWriter.Write call. But it's close enough.

Page Footer

HTML5

Logrole loads one CSS file and one font. I would have had to use a lot more Javascript a few years ago, but HTML5 has some really nice features that eliminate the need for Javascript. For example, there's a built in date picker for HTML5, that people can use to filter calls/messages by date (It's best supported in Chrome at the moment). Similarly you don't need Javascript to play recordings anymore. HTML5 has an <audio> element that will provide controls for you.

I've needed Javascript in only three places so far:

  • a "click to show images" toggle button where the CSS necessary to implement it would have been too convoluted
  • a "click-to-copy" button
  • To submit a user's timezone change when they change it in the menu bar (instead of having a separate "Submit" button).

About 50 lines in total, implemented in the HTML below the elements where it's needed.

Conclusion

Combining these techniques, we get a server that uses little memory, doesn't waste any time doing unnecessary work, and responds and renders a slow data set extraordinarily quickly. Before starting a new project, evaluate the feature set of the language/frameworks you plan to use - whether the ORM/API clients you are planning to use support fast cancelation, whether you can set wall-clock timeouts and propagate them easily through your stack, and whether your language makes it easy to combine duplicate requests.

If you are a Twilio customer I hope you will give Logrole a try - I think you will like it a lot.

Thanks to Bradley Falzon, Kyle Conroy and Alan Shreve for reading drafts of this post. Thanks to Brad Fitzpatrick for designing and implementing most of the code mentioned here.

Liked what you read? I am available for hire.

Dumb Tricks to Save Database Space

I have seen a few databases recently that could have saved a lot of space by being more efficient with how they stored data. Sometimes this isn't a big problem, when a table is not going to grow particularly quickly. But it can become a big problem and you can be leaving a lot of disk savings on the table.

Let's review some of the benefits of smaller tables:

  • Indexes are smaller. This means your database needs less space to index your tables, and more RAM can be used to cache results.

  • The cache can hold more objects, since the objects are smaller.

  • You'll delay the point at which your database won't fit on a single disk, and you have to shard.

  • Query results which might have fit in 2 TCP packets will now fit in one.

  • Backups complete more quickly.

  • Your application servers will use less RAM to hold the result.

  • Migrations complete more quickly.

  • Full table searches complete more quickly.

Let's review some common data types and strategies for holding these. If these are obvious to you - great! You can stop reading at any point. They're not obvious to a lot of people.

A brief reminder before we get started - a bit is a single 0 or 1, and a byte is a series of 8 bits. Every ASCII character can be represented in a single byte.

UUID's

It's common to store UUID's as text fields in your database. A typical UUID representation - "ad91e02b-a147-4c47-aa8c-1f3c2240c0df" - will take up 36 bytes and more if you store it with a prefix like SMS or user_. A UUID uses only 16 different characters (the hyphens are for display only, like hyphens in a phone number). This means you only need 4 bits to store a UUID character. There are 32 characters in a UUID, so can fit a UUID in 16 bytes, a saving of 55%. If you're using Postgres, you can use the uuid data type to store UUID's, or the binary data type - in MySQL, you can use a binary(16).

CREATE TABLE users (id uuid PRIMARY KEY);

It's often useful to store a prefix with a UUID, so you know what the ID represents from looking at it - for example, SM123 or user_456. I wrote a short library that stores a prefix with a UUID, but strips it before writing to the database. To read UUID's out of the database with a prefix, attach them to the SELECT statement:

SELECT 'user_' || id FROM users LIMIT 5;

My old team at Shyp recently converted text ID's to UUID's and wrote about that process on their engineering blog.

UUID's in JSON

It's becoming more common to store relational data in JSON or JSONB columns. There are a lot of reasons to do this or not do this - I don't want to rehash that discussion here. JSONB does lead to inefficient data storage for UUID's, however, since you are limited to storing characters that are valid JSON. If you are storing UUID's this means you can't get down to 16 bytes, since you can't just any byte in JSON. You can base64 encode your 16 byte UUID. In Go, that encoding dance looks something like this:

import "encoding/base64"
import "encoding/hex"
import "strings"
rawUUID := "ad91e02b-a147-4c47-aa8c-1f3c2240c0df"
// Strip the hyphens
uuidStr := strings.Replace(rawUUID, "-", "", 4)
// Decode the hex string into a slice of 16 bytes.
bits, _ := hex.DecodeString(uuidStr)
// Re-encode that 16-byte slice using base64.
fmt.Println(base64.RawURLEncoding.EncodeToString(bits))

That outputs rZHgK6FHTEeqjB88IkDA3w, which is only 22 bytes, a 38% improvement.

Use smaller numbers

A default Postgres integer is 4 bytes and can hold any number from -2147483648 to 2147483648. If you know that the integer you are storing is never going to exceed 32,760, you can use a smallint (2 bytes) to store it and save 2 bytes per row.

Use an enum type

Let's say you have a subscription that can have one of several states (trial, paid, expired). Storing the strings "trial", "paid", "expired" in the database can take up extra space. Instead use an enum type, which is only 4 bytes (1 byte in MySQL) and ensures you can't accidentally write a bad status like "trail". Another alternative is to store a smallint and convert them to values that make sense in the application, but this makes it harder to determine what things are if you're querying the database directly, and doesn't prevent mistakes.

Use binary for password hashes

Most password hashing algorithms should give you back raw bytes. You should be able to store the raw bytes directly in the database using bytea.

Move fields out of JSON columns

One downside of JSON/JSONB is that the key gets stored alongside the value for each row in the application. If you are storing a boolean like {"show_blue_button": true} in JSON, you're using 18 bytes per row to store the string "show_blue_button" and only one bit to store the boolean true. If you store this field in a Postgres column, you are only using one or two bits per row. Moving this to a column pays off in terms of space even if you only need the show_blue_button boolean once every 70-140 rows. It's much easier to add indexes on columns than JSON fields as well.

Conclusion

That's it! A small amount of thought and work upfront can pay big dividends down the line. Migrating columns after they're already in place can be a pain. In general, the best approach is to do the following:

  • Add the new column with the correct type.

  • Edit your application to write/update both the new and the old column.

  • Backfill the new column, copying over all values from the old column for old records in the database. If the table is large, do this in batches of 1000 rows or so to avoid locking your table for too long.

  • Edit your application to read exclusively from the new column.

  • Drop the old column.

I hope this helps!

Inspired by some tweets from Andrey Petrov and a Heap Analytics post about JSONB.

Liked what you read? I am available for hire.

Real Life Go Benchmarking

I've been following the commits to the Go project for some time now. Occasionally someone will post a commit with benchmarks showing how much the commit improves performance along some axis or another. In this commit, they've increased the performance of division by 7 (a notoriously tricky number to divide by) by about 40% on machines running the ARM instruction set. I'm not 100% sure what the commit does, but it switches around the instructions that get generated when you do long division on ARM in a way that makes things faster.

Anyway, I wanted to learn how to do benchmarks, and practice making something faster. Some motivation came along when Uber released their go-zap logging library, with a set of benchmarks showing my friend Alan's logging library as the worst performer. So I thought it would be a good candidate for benchmarking.

Fortunately Alan has already included a set of benchmarks in the test suite. You can run them by cloning the project and then calling the following:

go test -run=^$ -bench=.

You need to pass -run=^$ to exclude all tests in the test suite, otherwise all of the tests will run and also all of the benchmarks. We only care about the benchmarks, so -run=^$ filters out every argument.

We get some output!

BenchmarkStreamNoCtx-4                   	  300000	      5735 ns/op
BenchmarkDiscard-4                       	 2000000	       856 ns/op
BenchmarkCallerFileHandler-4             	 1000000	      1980 ns/op
BenchmarkCallerFuncHandler-4             	 1000000	      1864 ns/op
BenchmarkLogfmtNoCtx-4                   	  500000	      3866 ns/op
BenchmarkJsonNoCtx-4                     	 1000000	      1816 ns/op
BenchmarkMultiLevelFilter-4              	 2000000	      1015 ns/op
BenchmarkDescendant1-4                   	 2000000	       857 ns/op
BenchmarkDescendant2-4                   	 2000000	       872 ns/op
BenchmarkDescendant4-4                   	 2000000	      1029 ns/op
BenchmarkDescendant8-4                   	 1000000	      1018 ns/op
BenchmarkLog15AddingFields-4             	   50000	     29492 ns/op
BenchmarkLog15WithAccumulatedContext-4   	   50000	     33599 ns/op
BenchmarkLog15WithoutFields-4            	  200000	      9417 ns/op
PASS
ok  	github.com/inconshreveable/log15	30.083s

The name on the left is the benchmark name. The number (4) is the number of CPU's used for the benchmark. The number in the middle is the number of test runs; to get a good benchmark, you want to run a thing as many times as feasible, then divide the total runtime by the number of test runs. Otherwise you run into problems like coordinated omission and weighting the extreme outliers too much, or failing to weight them at all.

To get a "good" benchmark, you want to try to isolate the running code from anything else running on the machine. For example, say you run the tip benchmarks while a Youtube video is playing, make a change, and then run the benchmarks while nothing is playing. Video players require a lot of CPU/RAM to play videos, and all other things being equal, the benchmark is going to be worse with Youtube running. There are a lot of ways to accidentally bias the results as well, for example you might get bored with the tip benchmarks and browse around, then make a change and observe the console intensely to see the new results. You're biasing the results simply by not switching tabs or screens!

If you have access to a Linux machine with nothing else running on it, that's going to be your best bet for benchmarking. Otherwise, shut down as many other programs as are feasible on your main machine before starting any benchmark tests.

Running a benchmark multiple times can also be a good way to compensate for environmental effects. Russ Cox's benchstat program is very useful for this; it gathers many runs of a benchmark, and tells you whether the results are statistically significant. Run your benchmark with the -count flag to run it multiple times in a row:

go test -count=20 -run=^$ -bench=Log15AddingFields | tee -a master-benchmark

Do the same for your change, writing to a different file (change-benchmark), then run benchstat:

benchstat master-benchmark change-benchmark

You'll get really nice looking output. This is generally the output used by the Go core development team to print benchmark results.

$ benchstat benchmarks/tip benchmarks/early-time-exit
name                 old time/op  new time/op  delta
StreamNoCtx-4        3.60µs ± 6%  3.17µs ± 1%  -12.02%  (p=0.001 n=8+6)
Discard-4             837ns ± 1%   804ns ± 3%   -3.94%  (p=0.001 n=7+6)
CallerFileHandler-4  1.94µs ± 2%  1.88µs ± 0%   -3.01%  (p=0.003 n=7+5)
CallerFuncHandler-4  1.72µs ± 3%  1.65µs ± 1%   -3.98%  (p=0.001 n=7+6)
LogfmtNoCtx-4        2.39µs ± 2%  2.04µs ± 1%  -14.69%  (p=0.001 n=8+6)
JsonNoCtx-4          1.68µs ± 1%  1.66µs ± 0%   -1.08%  (p=0.001 n=7+6)
MultiLevelFilter-4    880ns ± 2%   832ns ± 0%   -5.44%  (p=0.001 n=7+6)
Descendant1-4         855ns ± 3%   818ns ± 1%   -4.28%  (p=0.000 n=8+6)
Descendant2-4         872ns ± 3%   830ns ± 2%   -4.87%  (p=0.001 n=7+6)
Descendant4-4         934ns ± 1%   893ns ± 1%   -4.41%  (p=0.001 n=7+6)
Descendant8-4         990ns ± 2%   958ns ± 2%   -3.29%  (p=0.002 n=7+6)

OK! So now we have a framework for measuring whether a change helps or hurts.

How can I make my code faster?

Good question! There's no one answer for this. In general, there are three strategies:

  • Figure out a way to do less work than you did before. Avoid doing an expensive computation where it's not necessary, exit early from functions, &c.

  • Do the same work, but in a faster way; use a different algorithm, or use a different function, that's faster.

  • Do the same work, but parallelize it across multiple CPU's, or threads.

Each technique is useful in different places, and it can be hard to predict where you'll be able to extract performance improvements. It is useful to know how expensive various operations are, so you can evaluate the costs of a given operation.

It's also easy to spend a lot of time "optimizing" something only to realize it's not the bottleneck in your program. If you optimize something that takes 5% of the code's execution time, the best overall speedup you can get is 5%, even if you get the code to run instantaneously. Go's test framework has tools for figuring out where your code is spending the majority of its time. To get the best use out of them, focus on profiling the code execution for a single benchmark. Here, I'm profiling the StreamNoCtx benchmark in the log15 library.

$ go test -cpuprofile=cpu.out -benchmem -memprofile=mem.out -run=^$ -bench=StreamNoCtx -v
BenchmarkStreamNoCtx-4   	  500000	      3502 ns/op	     440 B/op	      12 allocs/op

This will generate 3 files: cpu.out, mem.out, and log15.test. These are binary files. You want to use the pprof tool to evaluate them. First let's look at the CPU profile; I've started it and then run top10 to get the top 10 functions.

$ go tool pprof log15.test cpu.out
Entering interactive mode (type "help" for commands)
(pprof) top5
560ms of 1540ms total (36.36%)
Showing top 5 nodes out of 112 (cum >= 60ms)
      flat  flat%   sum%        cum   cum%
     180ms 11.69% 11.69%      400ms 25.97%  runtime.mallocgc
     160ms 10.39% 22.08%      160ms 10.39%  runtime.mach_semaphore_signal
     100ms  6.49% 28.57%      260ms 16.88%  github.com/inconshreveable/log15.escapeString
      60ms  3.90% 32.47%       70ms  4.55%  bytes.(*Buffer).WriteByte
      60ms  3.90% 36.36%       60ms  3.90%  runtime.stringiter2

The top 5 functions are responsible for 36% of the program's runtime. flat is how much time is spent inside of a function, cum shows how much time is spent in a function, and also in any code called by a function. Of these 5, runtime.stringiter2, runtime.mallocgc, and runtime.mach_semaphore_signal are not good candidates for optimization. They are very specialized pieces of code, and they're part of the Go runtime, so changes need to pass all tests and get approved by the Go core development team. We could potentially figure out how to call them less often though - mallocgc indicates we are creating lots of objects. Maybe we could figure out how to create fewer objects.

The likeliest candidate for improvement is the escapeString function in our own codebase. The list function is super useful for checking this.

(pprof) list escapeString
ROUTINE ======================== github.com/inconshreveable/log15.escapeString in /Users/kevin/code/go/src/github.com/inconshreveable/log15/format.go
      30ms      330ms (flat, cum) 23.40% of Total
      10ms       10ms    225:func escapeString(s string) string {
         .          .    226:	needQuotes := false
         .       90ms    227:	e := bytes.Buffer{}
         .          .    228:	e.WriteByte('"')
      10ms       50ms    229:	for _, r := range s {
         .          .    230:		if r <= ' ' || r == '=' || r == '"' {
         .          .    231:			needQuotes = true
         .          .    232:		}
         .          .    233:
         .          .    234:		switch r {
         .          .    235:		case '\\', '"':
         .          .    236:			e.WriteByte('\\')
         .          .    237:			e.WriteByte(byte(r))
         .          .    238:		case '\n':
         .          .    239:			e.WriteByte('\\')
         .          .    240:			e.WriteByte('n')
         .          .    241:		case '\r':
         .          .    242:			e.WriteByte('\\')
         .          .    243:			e.WriteByte('r')
         .          .    244:		case '\t':
         .          .    245:			e.WriteByte('\\')
         .          .    246:			e.WriteByte('t')
         .          .    247:		default:
         .      100ms    248:			e.WriteRune(r)
         .          .    249:		}
         .          .    250:	}
         .       10ms    251:	e.WriteByte('"')
         .          .    252:	start, stop := 0, e.Len()
         .          .    253:	if !needQuotes {
         .          .    254:		start, stop = 1, stop-1
         .          .    255:	}
      10ms       70ms    256:	return string(e.Bytes()[start:stop])

The basic idea here is to write a string, but add a backslash before double quotes, newlines, and tab characters. Some ideas for improvement:

  • We create a bytes.Buffer{} at the beginning of the function. We could keep a Pool of buffers, and retrieve one, so we don't need to allocate memory for a buffer each time we escape a string.

  • If a string doesn't contain a double quote, newline, tab, or carriage return, it doesn't need to be escaped. Maybe we can avoid creating the Buffer entirely for that case, if we can find a fast way to check whether a string has those characters in it.

  • If we call WriteByte twice in a row, we could probably replace it with a WriteString(), which will use a copy to move two bytes, instead of growing the array twice.

  • We call e.Bytes() and then cast the result to a string. Maybe we can figure out how to call e.String() directly.

You can then look at how much time is being spent in each area to get an idea of how much any given idea will help your benchmarks. For example, replacing WriteByte with WriteString probably won't save much time; you're at most saving the time it takes to write every quote and newline, and most strings are made up of letters and numbers and space characters instead. (It doesn't show up at all in the benchmark but that's because the benchmark writes the phrase "test message" over and over again, and that doesn't have any escape-able characters).

That's the CPU benchmark; how much time the CPU spends running each function in the codebase. There's also the memory profile; how much memory each function allocates. Let's look at that! We have to pass one of these flags to pprof to get it to show memory information.

Sample value selection option (for heap profiles):
  -inuse_space      Display in-use memory size
  -inuse_objects    Display in-use object counts
  -alloc_space      Display allocated memory size
  -alloc_objects    Display allocated object counts

Let's pass one. Notice in this case, the top 5 functions allocate 96% of the objects:

go tool pprof -alloc_objects log15.test mem.out
(pprof) top5
6612331 of 6896359 total (95.88%)
Showing top 5 nodes out of 18 (cum >= 376843)
      flat  flat%   sum%        cum   cum%
   2146370 31.12% 31.12%    6612331 95.88%  github.com/inconshreveable/log15.LogfmtFormat.func1
   1631482 23.66% 54.78%    1631482 23.66%  github.com/inconshreveable/log15.escapeString
   1540119 22.33% 77.11%    4465961 64.76%  github.com/inconshreveable/log15.logfmt
    917517 13.30% 90.42%    1294360 18.77%  github.com/inconshreveable/log15.formatShared
    376843  5.46% 95.88%     376843  5.46%  time.Time.Format

Let's look at a function:

ROUTINE ======================== github.com/inconshreveable/log15.logfmt in /Users/kevin/code/go/src/github.com/inconshreveable/log15/format.go
   1540119    4465961 (flat, cum) 64.76% of Total
         .          .     97:		if i != 0 {
         .          .     98:			buf.WriteByte(' ')
         .          .     99:		}
         .          .    100:
         .          .    101:		k, ok := ctx[i].(string)
         .    2925842    102:		v := formatLogfmtValue(ctx[i+1])
         .          .    103:		if !ok {
         .          .    104:			k, v = errorKey, formatLogfmtValue(k)
         .          .    105:		}
         .          .    106:
         .          .    107:		// XXX: we should probably check that all of your key bytes aren't invalid
         .          .    108:		if color > 0 {
         .          .    109:			fmt.Fprintf(buf, "\x1b[%dm%s\x1b[0m=%s", color, k, v)
         .          .    110:		} else {
   1540119    1540119    111:			fmt.Fprintf(buf, "%s=%s", k, v)
         .          .    112:		}
         .          .    113:	}

In the common case on line 111 (when color = 0), we're calling fmt.Fprintf to write the key, then an equals sign, then the value. Fprintf also has to allocate memory for its own byte buffer, then parse the format string, then interpolate the two variables. It might be faster, and avoid allocations, to just call buf.WriteString(k), then write the equals sign, then call buf.WriteString(v). But you'd want to benchmark it first to double check!

Conclusion

Using a combination of these techniques, I was able to improve the performance of log15 by about 30% for some common code paths, and reduce memory allocations as well. I was not expecting this, but I also found a way to speed up JSON encoding in the Go standard library by about 20%!

Want someone to benchmark/improve performance in your company's application, or teach your team more about benchmarking? I am available for consulting; email me and let's set something up!

Liked what you read? I am available for hire.

Cleaning up Parallel Tests in Go 1.7

I have a lot of tests in Go that integrate with Postgres, and test the interactions between Go models and the database.

A lot of these tests can run in parallel. For example, any test that attempts to write a record, but fails with a constraint failure, can run in parallel with all other tests. A test that tries to read a random database ID and expects to not fetch a record can run in parallel with other tests. If you write your tests so they all use random UUID's, or all run inside of transactions, you can run them in parallel. You can use this technique to keep your test suite pretty fast, even if each individual test takes 20-40 milliseconds.

You can mark a test to run in parallel by calling t.Parallel() at the top of the test. Here's an example test from the job queue Rickover:

func TestCreateMissingFields(t *testing.T) {
  t.Parallel()
  test.SetUp(t)
  job := models.Job{
    Name: "email-signup",
  }
  _, err := jobs.Create(job)
  test.AssertError(t, err, "")
  test.AssertEquals(t, err.Error(), "Invalid delivery_strategy: \"\"")
}

This test will run in parallel with other tests marked Parallel and only with other tests marked Parallel; all other tests run sequentially.

The problem comes when you want to clear the database. If you have a t.Parallel() test clean up after it has made its assertions, it might try to clear the database while another Parallel() test is still running! That wouldn't be good at all. Presumably, the sequential tests are expecting the database to be cleared. (They could clear it at the start of the test, but this might lead to unnecessary extra DB writes; it's better for tests that alter the database to clean up after themselves).

(You can also run every test in a transaction, and roll it back at the end. Which is great, and gives you automatic isolation! But you have to pass a *sql.Tx around everywhere, and make two additional roundtrips to the database, which you probably also need to do in your application).

Go 1.7 adds the ability to nest tests. Which means we can run setup once, run every parallel test, then tear down once. Something like this (from the docs):

func TestTeardownParallel(t *testing.T) {
  // This Run will not return until the parallel tests finish.
  t.Run("group", func(t *testing.T) {
    t.Run("Test1", parallelTest1)
    t.Run("Test2", parallelTest2)
    t.Run("Test3", parallelTest3)
  })
  // <tear-down code>
}

Note you have to lowercase the function names for the parallel tests, or they'll run inside of the test block, and then again, individually. I settled on this pattern:

var parallelTests = []func(*testing.T){
  testCreate,
  testCreateEmptyPriority,
  testUniqueFailure,
  testGet,
}
func TestAll(t *testing.T) {
  test.SetUp(t)
  defer test.TearDown(t)
  t.Run("Parallel", func(t *testing.T) {
    for _, parallelTest := range parallelTests {
      test.F(t, parallelTest)
    }
  })
}

The test mentioned there is the set of test helpers from the Let's Encrypt project, plus some of my own. test.F finds the defined function name, capitalizes it, and passes the result to test.Run:

// capitalize the first letter in the string
func capitalize(s string) string {
  r, size := utf8.DecodeRuneInString(s)
  return fmt.Sprintf("%c", unicode.ToTitle(r)) + s[size:]
}
func F(t *testing.T, f func(*testing.T)) {
  longfuncname := runtime.FuncForPC(reflect.ValueOf(f).Pointer()).Name()
  funcnameparts := strings.Split(longfuncname, ".")
  funcname := funcnameparts[len(funcnameparts)-1]
  t.Run(capitalize(funcname), f)
}

The result is a set of parallel tests that run a cleanup action exactly once. The downside is the resulting tests have two levels of nesting; you have to define a second t.Run that waits for the parallel tests to complete.

=== RUN   TestAll
=== RUN   TestAll/Parallel
=== RUN   TestAll/Parallel/TestCreate
=== RUN   TestAll/Parallel/TestCreateEmptyPriority
=== RUN   TestAll/Parallel/TestUniqueFailure
=== RUN   TestAll/Parallel/TestGet
--- PASS: TestAll (0.03s)
    --- PASS: TestAll/Parallel (0.00s)
        --- PASS: TestAll/Parallel/TestCreate (0.01s)
        --- PASS: TestAll/Parallel/TestCreateEmptyPriority (0.01s)
        --- PASS: TestAll/Parallel/TestUniqueFailure (0.01s)
        --- PASS: TestAll/Parallel/TestGet (0.02s)

The other thing that might trip you up: If you add print statements to your tear down lines, they'll appear in the console output before the PASS lines. However, I verified they run after all of your parallel tests are finished running.

Liked what you read? I am available for hire.

Buying stocks without a time limit

Recently I put the maximum amount of cash into my IRA account. Since stock prices jump up and down all the time, I wondered whether the current price would be the best one to buy the stock at. In particular, I'm not withdrawing money from my IRA for the better part of 40 years, so I'm not going to sell it — I want the absolute lowest price possible.

Stock markets offer a type of order that's useful for this - you can put in a limit order, which says essentially "I'll buy N shares at price X", where X is some price below the last traded price of the stock. Anyone can take the other side of that order at any time, which usually happens when the stock price drops below the limit order! So you can put in a limit order at any time and it will execute whenever the stock price drops below that amount. I'm simplifying slightly but that's the gist.

So if the current stock price is X, what price should I put in my limit order for? You should be able to come up with some kind of model of the stock price and how it moves and say, roughly, "If you set your limit order at Z, there will be a ~98% chance of it executing in the next year." The normal criticism of stock price models is they underestimate the probability of extreme events, which is alright for us, since extreme events make it more likely our limit order will trigger.

I figured someone studied this problem before so I asked about it on Quant Stack Exchange and I got a really good answer! The answer pointed to two different papers, which have attempted to model this.

Essentially the model looks at the volatility of the stock - how often it jumps up and down - the stock's current price, the probability you want the order to hit, and the amount of time you want to wait. The formula is a little complex but it looks like this:

I wrote some code to compute these values for the stocks I wanted to buy. The hardest part is figuring out the volatility, because you need access to a lot of stock prices, and you'll want to compute the standard deviation from that. I was able to get historical price data from Quandl, and current stock prices from Yahoo.

Combine the two and you get output that looks like this:

$ go run erf.go main.go --stock=<stock> --total=5500 --percent 0.95
the standard deviation is: 0.006742
annualized: 0.107022
current price: 74.95
Based on this stock's volatility, you should set a limit order for: $74.35.

Compared with buying it at the current price, you'll be able to buy 0.6 extra shares (a value of $43.77)

Pretty neat! I've put the code up on Github, so you can take a look for yourself.

Liked what you read? I am available for hire.

A Two Month Debugging Story

This is the story about an error that caused our tests to fail maybe one out of one hundred builds. Recently we figured out why this happened and deployed code to fix it.

The Problem

I've been fighting slow test times in Javascript and Sails since pretty much the day I started at Shyp. We run about 6000 tests on a popular CI service, and they take about 7 minutes to run end to end. This is pretty slow, so we spend a lot of time thinking about ways to make them faster.

To ensure each test has a clean slate, we clear the database between each test. Normally you clear tables by running TRUNCATE TABLE table1 table2 table3 table4 CASCADE. Unfortunately this takes about 200ms, which is too slow when you want to run it between every test - it would double the runtime of our test suite.

DELETE FROM runs much faster, but if you have records in table A that reference table B, DELETE FROM tableB will fail, since records in A still point to it. We could draw a dependency graph between our 60 tables and issue the DELETEs in the correct order, but this seems painful to maintain as we add more constraints. Instead our contractor James Cropcho found a solution - disable the constraints, run the DELETE, then re-enable the constraints. This ran in about 20ms.

ALTER TABLE tableA DISABLE TRIGGER ALL;
DELETE FROM tableA;
ALTER TABLE tableA ENABLE TRIGGER ALL;

This worked for a few months. Unfortunately, occasionally a test would time out and fail after 18 seconds - maybe 1 in 100 test runs would fail with a timeout. Normally when the tests fail you get back some kind of error message — a Postgres constraint failure will print a useful message, or a Javascript exception will bubble up, or Javascript will complain about an unhandled rejection.

With this error we didn't get any of those. We also observed it could happen anywhere - it didn't seem to correlate with any individual test or test file.

Where to start looking

For a long time we suspected that our ORM was losing some kind of race and a callback wasn't being hit, deep in the framework code. This type of problem isn't uncommon in Javascript - the most common tool for stubbing the system time causes hangs in many libraries, including Express, Request, async, Bluebird, and Co. These types of issues are incredibly difficult to debug, because when things hang, or fail to hit callbacks, you have virtually no insight into what the stack looks like, or what callback failed to get hit. We could add console.log lines but you have to have some idea what we were looking for. Furthermore annotating every single library and every single query would have caused readability problems elsewhere.

Run the tests forever

Our CI provider lets us SSH into a host, but you have to enable SSH access before the build completes, which isn't practical when 1 in 100 builds fails. My normal approach to debugging intermittent test failures is to run the tests in a continuous loop until they fail, and then inspect the test logs/SSH to the host and look for... something, until you figured out what was going on. Unfortunately that wasn't triggering it often enough either, and even when I did trigger it, our CI provider shuts down SSH access after 30 minutes, and I wasn't fast enough to notice/investigate.

Instead we had to ship whatever logs we needed as artifacts from the test. This required turning on database logging for Postgres, then copying logs to the directory where artifacts get exported. Database logs across multiple containers shared the same filename, which meant only database logs from the first container became available as artifacts, so we also had to add a unique ID to each log file so they'd get exported properly.

This was a lot of steps, and took just enough debugging/setup time that I procrastinated for a while, hoping a better stack trace or error message might reveal itself, or that the problem would go away.

Here are the Postgres settings we enabled:

logging_collector = on
log_statement = 'all'
log_duration = on
# Log if we are waiting on a lock.
deadlock_timeout = 1s
log_lock_waits = on
# Ensure all logs are in one file.
log_rotation_size = 200MB
log_line_prefix = '%m [%p-%c-%l] %e %q%[email protected]%d '

Finally we found something!

20:38:04.947 LOG: process 16877 still waiting for AccessExclusiveLock on relation 16789 of database 16387 after 1000.113 ms
20:38:04.947 DETAIL: Process holding the lock: 16936. Wait queue: 16877.
20:38:23.141 LOG: process 16877 acquired AccessExclusiveLock on relation 16789 of database 16387 after 19193.981 ms
20:38:23.141 ERROR: canceling autovacuum task
20:38:23.141 CONTEXT: automatic vacuum of table "circle_test.public.events"
20:38:23.141 LOG: process 16877 acquired AccessExclusiveLock on relation 16789 of database 16387 after 19193.981 ms

It looks like, unbeknownst to us, autovacuum was running in the background, and conflicting with the multi-statement ALTER TABLE query, in a way that each one was waiting for the other to complete. If you get deep enough in the weeds of Postgres blog posts, you find that stuck VACUUMs can and do happen.

Although it requires a fairly unusual combination of circumstances, a stuck VACUUM can even lead to an undetected deadlock situation, as Tom Lane recently pointed out. If a process that has a cursor open on a table attempts to take an exclusive lock on that table while it's being vacuumed, the exclusive lock request and the vacuum will both block until one of them is manually cancelled.

Not out of the woods

Unfortunately, we continued to see deadlocks. In some cases, a test would continue to make background database queries after they had reported completion. These would interact poorly with our ALTER TABLE commands, and deadlock, leading to a timeout. I posted on the Postgresql mailing list about this, and Alexey Bashtanov suggested a different approach. Instead of running ALTER TABLE, start a transaction and defer enforcement of foreign key constraints until the COMMIT. That way we could run the DELETE FROM's in any order and they'd still work.

BEGIN; SET CONSTRAINTS ALL DEFERRED;
DELETE FROM tableA;
DELETE FROM tableB;
COMMIT

This approach also sped up that query! It now runs in about 10ms, versus 20ms before.

Not out of the woods, again

We thought we'd nipped it in the bud, and certainly we were happy with the performance improvement. Unfortunately we've still been seeing timeouts; less frequently, but maybe once or twice a week. They usually happen when INSERTing records into an empty table; Postgres just hangs for 15 seconds, until the statement_timeout kicks in and Postgres kills the query, which fails the build. There are no messages about locks, so we're not sure what's going on. This is pretty uncharacteristic behavior for Postgres, and we're not doing anything particularly crazy with queries. We're open to suggestions; I posted about it again on the Postgres mailing list but did not get any good feedback.

Conclusion

The amount of work that's gone into tracking this failure down is depressing, though it's surprising that Javascript isn't the culprit for intermittent test failures. It would be nice if our CI provider had a way to auto-enable SSH on failed builds, enabled Postgres logging by default, or disabled autovacuum by default. I'd also recommend if you're adding foreign key constraints to make them DEFERRABLE INITIALLY IMMEDIATE; this gives you the most flexibility to pursue strategies like we did.

Liked what you read? I am available for hire.