Posts Tagged With: Code

The SSL library ecosystem needs diversity

The Heartbleed bug was really bad for OpenSSL - it let an attacker ask a server a simple question like "How are you?" and get back whatever happened to be in the server's memory (password data, private keys that could be used to decrypt all traffic), and the server would have no idea it was happening.

A lot of people have said that we should ditch OpenSSL because this bug is so bad, and because parts of the codebase are odd enough that they would usually indicate bad programmers, except that they are found in a library deployed everywhere.

Ditching OpenSSL is not going to happen any time soon because it is the standard implementation for any server that has to terminate SSL traffic, and writing good crypto libraries is very difficult. So this is not a promising approach.

However, this bug and the subsequent panic (as well as the flood of emails telling me to reset passwords, etc.) illustrate the problem with having every software company in the world rely on the same library. Imagine that there were three different SSL implementations, each with a significant share of the market. A flaw in one would affect, at most, the users of that library. Diversification reduces the value of any one exploit and makes it more difficult to find general attacks against web servers.

This diversity is what makes humans so robust against diseases like the Spanish Flu, which killed ~100 million people but didn't make a dent in the overall human population. Compare that with the banana: commercial bananas are so genetically uniform that a single fungus could wipe out the entire stock of bananas around the world.

You can see the benefits of diversity in two places. One, even within the OpenSSL project, users had different versions of the library installed on their servers. This meant that servers not running versions 1.0.1 through 1.0.1f (like Twilio's) were not vulnerable, which was good.

The second is that servers use different programming languages and different frameworks. This means that the series of Rails CVEs was very bad for Rails deployments but didn't mean anything for anyone else (another good thing).

After Heartbleed I donated $100 to the OpenSSL Foundation, in part because it is really important and in part because it's saved me from having to think about encrypting communication with clients (most of the time) which is really, really neat. I will match that donation to other SSL libraries, under these conditions:

  • The library's source code is available to the public.

  • There is evidence that the code has been used in a production environment to terminate SSL connections.

  • The project has room for more funding.

This is not a very large incentive, but it's at least a step in the right direction; if you want to join my pledge, I'll update the dollar amounts and list your name in this post. A $10 million prize put a rocket into space; I'm hoping a similar incentive can help spur diversity in the SSL ecosystem as well.

A look at the new retry behavior in urllib3

"Build software like a tank." I am not sure where I read this, but I think about it a lot, especially when writing HTTP clients. Tanks are incredible machines - they are designed to move rapidly and protect their inhabitants in any kind of terrain, against enemy gunfire, or worse.

HTTP clients often run in unfriendly territory, because they usually involve a network connection between machines. Connections can fail, packets can be dropped, the other party may respond very slowly, or with a new unknown error message, or they might even change the API out from under you. All of this means writing an HTTP client like a tank is difficult. Here are some examples of things a desirable HTTP client would do for you, that are never there by default.

  • If a request fails to reach the remote server, we would like to retry it no matter what. We don't want to wait around for the server forever though, so we want to set a timeout on the connection attempt.

  • If we send the request but the remote server doesn't respond in a timely manner, we want to retry it, but only for requests where it is safe to send the request again - so-called idempotent requests.

  • If the server returns an unexpected response, we want to always retry if the server didn't do any processing - a 429, 502 or 503 response usually indicates this - as well as for all idempotent requests.

  • Generally we want to sleep between retries to allow the remote connection/server to recover, so to speak. To help prevent thundering herd problems, we usually sleep with an exponential backoff.

Here's an example of how you might code this:

import time

import requests
from requests.exceptions import ConnectionError, ConnectTimeout, ReadTimeout

IDEMPOTENT_METHODS = ('GET', 'PUT', 'DELETE')

def resilient_request(method, uri, retries):
    while True:
        try:
            # time out if we can't connect, or the server is too slow
            resp = requests.request(method, uri, timeout=30)
            if resp.status_code < 300:
                return resp
            if resp.status_code in (429, 502, 503):
                # the server did no processing; always safe to retry
                retries -= 1
                if retries <= 0:
                    raise Exception("out of retries")
                time.sleep(2 ** (3 - retries))
                continue
            if resp.status_code >= 500 and method in IDEMPOTENT_METHODS:
                retries -= 1
                if retries <= 0:
                    raise Exception("out of retries")
                time.sleep(2 ** (3 - retries))
                continue
            return resp
        except (ConnectionError, ConnectTimeout):
            # the request never reached the server; retry no matter what
            retries -= 1
            if retries <= 0:
                raise
            time.sleep(2 ** (3 - retries))
        except ReadTimeout:
            # the server may have processed the request;
            # only retry idempotent methods
            if method not in IDEMPOTENT_METHODS:
                raise
            retries -= 1
            if retries <= 0:
                raise
            time.sleep(2 ** (3 - retries))

Holy cyclomatic complexity, Batman! This got complicated fast, and the control flow here is not simple to follow, reason about, or test. Better hope we caught every failure mode, or we could end up retrying forever, or crashing on a response we didn't anticipate. There are some parts of the above code that we could break into sub-methods, but you can't make the code much more compact than it is there, since most of it is control flow. It's also a pain to write this type of code and verify its correctness; most people just try once, as this comment from the pip library illustrates. That is a shame, and the reliability of services on the Internet suffers for it.

A better way

Andrey Petrov and I have been putting in a lot of work to make it really, really easy for you to write resilient requests in Python. We pushed the complexity of the above code down into the urllib3 library, closer to the request that goes over the wire. Instead of the above, you'll be able to write this:

def retry_callable(method, response):
    """Determine whether to retry this response."""
    return ((response.status >= 400 and method in IDEMPOTENT_METHODS)
            or response.status in (429, 503))

retry = urllib3.util.Retry(read=3, backoff_factor=2,
                           retry_callable=retry_callable)
http = urllib3.PoolManager()
resp = http.request(method, uri, retries=retry)

You can pass a callable to the retries object to determine the retry behavior you'd like to see. Alternatively, you can use the convenience method_whitelist and codes_whitelist helpers to specify which methods and status codes should be retried:

retry = urllib3.util.Retry(read=3, backoff_factor=2,
                           codes_whitelist=set([429, 500]))
http = urllib3.PoolManager()
resp = http.request(method, uri, retries=retry)

And you will get the same results as the long example above. urllib3 does all of the hard work for you to catch the conditions mentioned above, with sane (read: non-intrusive) defaults.

This is coming soon to urllib3 (and with it, to Python Requests and pip); we're looking for a bit more review on the pull request before we merge it. We hope this makes it easier for you to write high performance HTTP clients in Python, and appreciate your feedback!

Thanks to Andrey Petrov for reading a draft of this post.

How to create rich links in your Sphinx documentation

This will be short, but it seems there's some difficulty doing this, so I thought I'd share.

The gist is, any time you reference a class or method in your own library, in the Python standard library, or in another third-party extension, you can provide a link directly to that project's documentation. This is pretty amazing and only requires a little bit of extra work from you. Here's how.

The Simplest Type of Link

Just create a link using the full import path of the class or attribute or method. Surround it with backticks like this:

Use :meth:`requests.Request.get` to make HTTP Get requests.

That link will show up in text as:

Use requests.Request.get to make HTTP Get requests.

There are a few different types of declarations you can use at the beginning of that phrase:

:attr:
:class:
:meth:
:exc:

The full list is in the Sphinx documentation.

I Don't Want to Link the Whole Thing

To specify just the method/attribute name, and not any of the modules or classes that precede it, use a squiggly (~), like this:

Use :meth:`~requests.Request.get` to make HTTP Get requests.

That link will show up in text as:

Use get to make HTTP Get requests.

I Want to Write My Own Text

This gets a little trickier, but still doable:

Use :meth:`the get() method <requests.Request.get>` to make HTTP Get requests.

That link will show up in text as:

Use the get() method to make HTTP Get requests.

I want to link to someone else's docs

In your docs/conf.py file, add 'sphinx.ext.intersphinx' to the end of the extensions list near the top of the file. Then, add the following anywhere in the file:

    # Add the "intersphinx" extension
    extensions = [
        'sphinx.ext.intersphinx',
    ]
    # Add mappings
    intersphinx_mapping = {
        'urllib3': ('http://urllib3.readthedocs.org/en/latest', None),
        'python': ('http://docs.python.org/3', None),
    }

You can then reference other projects' documentation the same way you reference your own, and Sphinx will magically make everything work.
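For example, with the mapping above, this line links straight to urllib3's hosted documentation:

Use :class:`urllib3.PoolManager` to re-use connections to a host.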

I want to write the documentation inline in my source code and link to it

Great! I love this as well. Add the 'sphinx.ext.autodoc' extension, then write your documentation. There's a full guide to the inline syntax on the Sphinx website; confusingly, it is not listed on the autodoc page.

    # Add the "intersphinx" extension
    extensions = [
        'sphinx.ext.autodoc',
    ]
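
Then a docstring like this one (the method and its fields are illustrative, not a real API) will be rendered, and cross-linked, by Sphinx:

    def request(self, method, url, retries=3):
        """Make an HTTP request and return the response.

        :param str method: the HTTP method, e.g. ``GET``.
        :param str url: the URL to request.
        :param int retries: how many times to retry failed requests.
        :returns: a :class:`urllib3.response.HTTPResponse` instance.
        """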

Hope that helps! Happy linking.

New blog post about HAProxy

Over on the Twilio Engineering Blog, I have a new post about optimizing your HAProxy configuration. I wrote it mostly because we had some confusion about these settings in our own configuration, and I figured if we were confused, others would be as well. Here's a sample:

retries 2
option redispatch

When I said a 30 second connect timeout meant HAProxy would try a bad connection for 30 seconds, I lied. It turns out that by default HAProxy will retry the connect attempt 3 times. So our 30 second connect timeout is actually a 120 second connect timeout, blowing through our SLA and meaning we're returning an empty response to the customer.
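In other words, the worst-case connect time is timeout connect multiplied by (retries + 1). A sketch of a configuration that keeps that budget small (the values here are illustrative):

# 3 connect attempts of 5 seconds each: ~15 seconds worst case
timeout connect 5s
retries 2
option redispatch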

Read the full post to learn more about HAProxy.

Automating your IPython Notebook Setup (and getting launchctl to work)

Recently I've fallen in love with the IPython Notebook. It's the Python REPL on steroids, and I've probably just scratched the surface of what it can actually do. This will be a short post, because long posts make me feel pain when I think about blogging more often. This is also really more about setting up launchctl than IPython, but hopefully that's useful too.

Starting it from the command line is kind of a pain (it tries to save .ipynb files in your current directory, and it warns you to save files before closing tabs), so I thought I'd just set it up to run in the background each time my machine boots. Here's how you can get that set up.

Create a virtualenv with IPython

First, you need to install the ipython binary, and the other packages you need to run IPython Notebook.

    # Install virtualenvwrapper, then source it
    pip install virtualenvwrapper
    source /path/to/virtualenvwrapper.sh

    # Create a virtualenv and install the notebook dependencies
    mkvirtualenv ipython
    pip install ipython tornado pyzmq

Starting IPython When Your Mac Boots

Open a text editor and add the following:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
      <key>Label</key>
      <string>com.kevinburke.ipython</string>
      <key>ProgramArguments</key>
      <array>
          <string>/Users/kevin/.envs/ipython/bin/ipython</string>
          <string>notebook</string>
      </array>
      <key>RunAtLoad</key>
      <true/>
      <key>StandardOutPath</key>
      <string>/Users/kevin/var/log/ipython.log</string>
      <key>StandardErrorPath</key>
      <string>/Users/kevin/var/log/ipython.err.log</string>
      <key>ServiceDescription</key>
      <string>ipython notebook runner</string>
      <key>WorkingDirectory</key>
      <string>/Users/kevin/.ipython_notebooks</string>
    </dict>
    </plist>

You will need to replace the word kevin with your name and the relevant file locations on your file system. I also save my notebooks in a directory called .ipython_notebooks in my home directory; you may want to create that directory as well.

Save that in /Library/LaunchDaemons/<yourname>.ipython.plist. Then change the owner to root:

sudo chown root:wheel /Library/LaunchDaemons/<yourname>.ipython.plist

Finally load it:

sudo launchctl load -w /Library/LaunchDaemons/<yourname>.ipython.plist

If everything went okay, IPython should open in a browser tab. If it didn't, check /var/log/system.log for errors, or one of the two logfiles specified in your plist.
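To verify the job is loaded, you can ask launchctl directly and tail the logs named in the plist (the log paths below match the plist above):

    # check that launchd knows about the job
    sudo launchctl list | grep ipython
    # watch the error log for startup problems
    tail -f /Users/kevin/var/log/ipython.err.log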

Additional Steps

That's it! I've also found it really useful to run an nginx redirector locally, along with a new rule in /etc/hosts, so I can visit http://ipython and get redirected to my notebooks. But that is a topic for a different blog post.

Speeding up test runs by 81% and 13 minutes

Yesterday I sped up our unit/integration test runs from 16 minutes to 3 minutes. I thought I'd share the techniques I used during this process.

  • We had a hunch that an un-mocked network call was taking 3 seconds to time out. I patched this call throughout the test code base. It turns out this did not have a significant effect on the runtime of our tests, but it's good to mock out network calls anyway, even if they fail fast.

  • I ran a profiler on the code. Well that's not true, I just timed various parts of the code to see how long they took, using some code like this:

    import datetime
    start = datetime.datetime.now()
    some_expensive_call()
    total = (datetime.datetime.now() - start).total_seconds()
    print "some_expensive_call took {} seconds".format(total)
    

    It took about ten minutes to zero in on the fixture loader, which was doing something like this:

    def load_fixture(fixture):
        model = find_fixture_in_db(fixture['id'])
        if not model:
            create_model(**fixture)
        else:
            update_model(model, fixture)
    

    The call to find_fixture_in_db was doing a "full table scan" of our SQLite database, and taking about half of the run-time of the integration tests. Moreover in our case it was completely unnecessary, as we were deleting and re-inserting everything with every test run.

    I added a flag to the fixture loader to skip the database lookup if we were doing all inserts. This sped up observed test time by about 35%.
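
    Here's a sketch of what that flag might look like (using the helper names from the example above):

    def load_fixture(fixture, all_inserts=False):
        if all_inserts:
            # the database starts empty, so every fixture is new;
            # skip the full-table-scan lookup and insert directly
            create_model(**fixture)
            return
        model = find_fixture_in_db(fixture['id'])
        if not model:
            create_model(**fixture)
        else:
            update_model(model, fixture)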

  • I noticed that the local test runner and the Jenkins build runner were running different numbers of tests. This was really confusing. I ended up doing some fancy stuff with the xunit xml output to figure out which extra tests were running locally. Turns out, the same test was running multiple times. The culprit was a stray line in our Makefile:

    nosetests tests/unit tests/unit/* ...
    

    The tests/unit/* change was running all of the tests in compiled .pyc files as well! I felt dumb because I actually added that tests/unit/* change about a month ago, thinking that nosetests wasn't actually running some of the tests in subfolders of our repository. This change cut down on the number of tests run by a factor of 2, which significantly helped the test run time.

  • The Jenkins package install process would remove and re-install the virtualenv before every test run, to ensure we got up-to-date dependencies with every run. Well that was kind of stupid, so instead we switched to running

    pip install --upgrade .
    

on our setup.py file, which should pull in the correct versions of dependencies when they change (most of them are pinned either with double-equals (==) or greater-than-or-equal (>=) specifiers). Needless to say, skipping the full reinstall every time saved about three to four minutes.

  • I noticed that pip would still uninstall and reinstall packages that were already there. This happened for two reasons. One, our Jenkins box is running an older version of pip, which doesn't have this change from pip 1.1:

    Fixed issue #49 - pip install -U no longer reinstalls the same versions of packages. Thanks iguananaut for the pull request.

    I upgraded the pip and virtualenv versions inside of our virtualenv.

    Also, one dependency in our tests/requirements.txt would install the latest version of requests, which would then be overridden in setup.py by a very specific version of requests, every single time the tests ran. I fixed this by explicitly setting the requests version in the tests/requirements.txt file.
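
    For example, in tests/requirements.txt (the version number here is illustrative):

    # pin to the same version of requests that setup.py expects
    requests==1.2.3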

That's it! There was nothing major that was wrong with our process, just fixing the way we did a lot of small things throughout the build. I have a couple of other ideas to speed up the tests, including loading fewer fixtures per test and/or instantiating some objects like Flask's test_client globally instead of once per test. You might not have been as dumb as we were but you'll likely find some speedups if you check your build process as well.

Eliminating more trivial inconveniences

I really enjoyed Sam Saffron's post about eliminating trivial inconveniences in his development process. It resonated with me, as I tend to get really distracted by minor hiccups in the development process (a page reload taking >2 seconds, switching to a new tab, etc.). I took a look at my own development process and found a few easy wins.

Automatically run the unit tests in the current file

Twilio's PHP test suite is really slow - we're sloppy about having unit tests avoid hitting the disk, which means the suite takes a while to run. I wrote a short vim command that runs only the tests in the current file. This makes the test iteration loop much, much faster, and I can run the entire suite once the current file is passing. The <leader> key in Vim is excellent and I recommend you become familiar with it.

nnoremap <leader>n :execute "!" . "/usr/local/bin/phpunit " . bufname('%') . ' \| grep -v Configuration \| egrep -v "^$" '<CR>

bufname('%') is the file name of the current Vim buffer, and the last two commands are just grepping away output I don't care about. The result is awesome:

[Screenshot: unit test results in Vim]

Auto reloading the current tab when you change CSS

Sam has a pretty excellent MessageBus option that listens for changes to CSS files, and auto-refreshes a tab when this happens. We don't have anything that good yet but I added a vim leader command to refresh the current page in the browser. By the time I switch from Vim to Chrome (or no time, if I'm viewing them side by side), the page is reloaded.

function! ReloadChrome()
    execute 'silent !osascript ' .
                \'-e "tell application \"Google Chrome\" " ' .
                \'-e "repeat with i from 1 to (count every window)" ' .
                \'-e "tell active tab of window i" ' .
                \'-e "reload" ' .
                \'-e "end tell" ' .
                \'-e "end repeat" ' .
                \'-e "end tell" >/dev/null'
endfunction
nnoremap <leader>o :call ReloadChrome()<CR>:pwd<cr>

Then I just hit <leader>o and Chrome reloads the current tab. This works even if you have the "Developer Tools" open as a separate window, and focused - it reloads the open tab in every window of Chrome.

Pushing the current git branch to origin

It turns out that the majority of my git pushes are just pushing the current git branch to origin. So instead of typing git push origin <branch-name> 100 times a day I added this to my .zshrc:

    push_branch() {
        branch=$(git rev-parse --symbolic-full-name --abbrev-ref HEAD)
        git push $1 $branch
    }
    autoload push_branch
    alias gpob='push_branch origin'

I use this for git pushes almost exclusively now.

Auto reloading our API locally

The Twilio API is based on the open-source flask-restful project, running behind uWSGI. One problem we had was changes to the application code would require a full uWSGI restart, which made local development a pain. Until recently, it was pretty difficult to get new Python code running in uWSGI besides doing a manual reload - you had to implement a file watcher yourself, and then communicate to the running process. But last year uWSGI enabled the py-auto-reload feature, where uWSGI will poll for changes in your application and automatically reload itself. Enable it in your uWSGI config with

# poll for code changes every 1 second
py-auto-reload = 1

Or at the command line with uwsgi --py-auto-reload=1.

Conclusion

These changes have all made me a little bit quicker, and helped me learn more about the tools I use on a day to day basis. Hope they're useful to you as well!

Submit forms using Javascript without breaking the Internet, a short guide

Do you write forms on the Internet? Are you planning to send them to your server with Javascript? You should read this.

The One-Sentence Summary

It's okay to submit forms with Javascript. Just don't break the internet.

What Do You Mean, Break the Internet?

Your browser is an advanced piece of software that functions in a specific way, often for very good reasons. Ignore these reasons and you will annoy your users, and user annoyance translates into lower revenue for you.

Here are some of the ways your Javascript form submit can break the Internet.

Submitting to a Different Endpoint Than the Form Action

A portion of your users are browsing the web without Javascript enabled. Some of them, like my friend Andrew, are paranoid. Others are on slow connections and want to save bandwidth. Still others are blind and browse the web with the help of screen readers.

All of these people, when they submit your form, will not hit your fancy Javascript event handler; they will submit the form using the default action and method for the form - which, if unspecified, default to a GET to the current page. Likely, this does not actually submit the form. Which leads to my favorite quote from Mark Pilgrim:

[Image: quote from Mark Pilgrim]

There is an easy fix: make the form action and method default to the same endpoint that you are POSTing to with Javascript.

You are probably returning some kind of JSON object with an error or success message and then redirecting the user in Javascript. Just change your server endpoint to redirect if the request is not an AJAX request. You can detect this because jQuery and most Javascript libraries attach an X-Requested-With: XMLHttpRequest HTTP header to their asynchronous requests; a normal browser form submission won't carry it.
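Here's a minimal sketch of that server-side branch, using Flask for illustration (the /subscribe endpoint is hypothetical):

from flask import Flask, jsonify, redirect, request

app = Flask(__name__)

@app.route('/subscribe', methods=['POST'])
def subscribe():
    # ... save the form data (request.form) somewhere ...
    # jQuery and friends set this header on AJAX requests
    if request.headers.get('X-Requested-With') == 'XMLHttpRequest':
        return jsonify(status='ok')
    # plain form submission: redirect like a normal web page would
    return redirect('/thanks')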

Changing Parameter Names

Don't change the names of the submitted parameters in Javascript - just submit the same names that you had in your form. In jQuery this is easy, just call the serialize method on the form.

var form = $("#form-id");
$.post('endpoint', form.serialize(), function(response) {
    // do something with the response.
});

Attaching the Handler to a Click Action

Believe it or not, there are other ways of submitting a form besides clicking on the submit button. Screen readers, for example, don't click, they submit the form. Also there are lots of people like me who use tab to move between form fields and press the spacebar to submit forms. This means if your form submit starts with:

$("#submit-button").click(function() {
    // Submit the form.
});

You are doing it wrong and breaking the Internet for people like me. You would not believe how many sites don't get this right. Examples in the past week: WordPress, Mint's login page, JetBrains's entire site.

The correct thing to do is attach the event handler to the form itself.

$("#form-id").submit(function() {
    // Write code to submit the form with Javascript
    return false; // Prevents the default form submission
});

This will catch the event no matter how the user submits the form. Note the use of return false to prevent the browser's default form submission, since our Javascript is doing the submitting instead.

Validation

It's harder to break the Internet with validation. To give the user a fast feedback loop, you should detect and prevent invalid input on the client side.

The annoying thing is you have to do this on both the client side and the server side, in case the user gets past the client side checks. The good news is the browser can help with most of the easy stuff. For example, if you want to check that an email address is valid, use the "email" input type:

<input type="email" />

Then your browser won't actually submit a form that doesn't have a valid email. Similarly you can note required fields with the required HTML attribute. This makes validation on the client a little easier for most of the cases you're trying to check for.
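For example, a required field is as simple as:

<input type="text" name="name" required />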

Summary

You can submit forms with Javascript, but most of the time you'll have to put in extra effort to duplicate functionality that already exists in your browser. If you're going to go down that road, please put in the extra effort.

Helping Beginners Get HTML Right

If you've ever tried to teach someone HTML, you know how hard it is to get the syntax right. It's a perfect storm of awfulness.

  • Newbies have to learn all of the syntax, in addition to the names of HTML elements. They don't have the pattern matching skills (yet) to notice when their HTML is not right, or the domain knowledge to know it's spelled "href" and not "herf".

  • The browser doesn't provide feedback when you make mistakes - it will render your mistakes in unexpected and creative ways. Miss a closing tag and watch your whole page suddenly acquire italics, or get pasted inside a textarea. Miss a quotation mark and half the content disappears. Add in layouts with CSS and the problem doubles in complexity.

  • Problems tend to compound. If you make a mistake in one place and don't fix it immediately, you can't determine whether future additions are correct.

This leads to a pretty miserable experience getting started - people should be focused on learning how to make an amazingly cool thing in their browser, but instead they get frustrated trying to figure out why the page doesn't look right.

Let's Make Things A Little Less Awful

What can we do to help? The existing tools to help people catch HTML mistakes aren't great. Syntax highlighting helps a little, but sometimes the errors look as pretty as the actual text. XML validators are okay, but tools like HTML Validator spew out red herrings as often as they do real answers. Plus, you have to do work - open the link, copy your HTML in, read the output - to use it.

We can do better. Most of the failures of the current tools are due to the complexity of HTML - which, if you are using all of the features, is Turing complete. But new users are rarely exercising the full complexity of HTML5 - they are trying to learn the principles. Furthermore the mistakes they are making follow a Pareto distribution - a few problems cause the majority of the mistakes.

Catching Mistakes Right Away

To help with these problems I've written a validator which checks for the most common error types and displays feedback immediately when you refresh the page, so you can instantly find and correct mistakes. It works in the browser, on the page you're working with, so you don't have to do any extra work to validate your file.

Best of all, you can drop it into your HTML file in one line:

<script type="text/javascript" src="https://raw.github.com/kevinburke/tecate/master/tecate.js"></script>

Then if there's a problem with your HTML, you'll start getting nice error messages, like this:

[Screenshot: Tecate's error message output]

Read more about it here, and use it in your next tutorial. I hope you like it, and I hope it helps you with debugging HTML!

It's not perfect - there are a lot of improvements to be made, both in the errors we can catch and on removing false positives. But I hope it's a start.

PS: Because the browser will edit the DOM tree to wipe the mistakes users make, I have to use raw regular expressions to check for errors. I have a feeling I will come to regret this. After all, when parsing HTML with regex, it's clear that the <center> cannot hold. I am accepting this tool will give wrong answers on some HTML documents; I am hoping that the scope of documents turned out by beginning HTML users is simple enough that the center can hold.

How to design your API SDK

I've worked with Twilio's client libraries pretty much every day for the last year and I wanted to share some of the things we've learned about helper libraries.

Should you have helper libraries?

You should think about helper libraries as a more accessible interface to your API. Your helper libraries trade the details of your authentication scheme and URL structure for the ease of "do X in N lines of code." If the benefits of a more accessible interface outweigh the costs, do it.

If people are paying to access your API (Twilio, AWS, Sendgrid, Stripe, for example), then you probably should write helper libraries. A more accessible API translates directly into more revenue for your company.

If you're two founders in a garage somewhere, maybe not. The gap between your company's success and failure is probably not a somewhat easier API interface. Writing a helper library is a lot of work, maybe one to four man-weeks depending on the size of your API and your familiarity with the language in question, plus ongoing maintenance.

You might not need a client library if your customers are all highly experienced programmers. For example the other day I wrote my own client for the Recaptcha API. I knew how I wanted to consume it and learning/installing a Recaptcha library would have been unnecessary overhead.

You may also not need a client library if standard libraries have very good HTTP clients. For example, the Requests library dramatically lowers the barrier for writing a client that uses HTTP basic auth. Developers who are familiar with Requests will have an easier time writing HTTP clients. Implementing HTTP basic auth remains a large pain point in other languages.

How should you design your helper libraries?

Realize that if you are writing a helper library, for many of your customers the helper library will be the API. You should put as much care into its design as you do your HTTP API. Here are a few guiding principles.

  • If you've designed your API in a RESTful way, your API endpoints should map to objects in your system. Translate these objects in a straightforward way into classes in the helper library, making the obvious transformations - translating numbers from strings in the API representation into integers, and translating date strings such as "2012-11-05" into date objects.
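
Here's a sketch of that kind of translation (the class and field names are illustrative, not Twilio's actual library):

import datetime

class Call(object):
    """Wraps a single instance resource returned by the API."""
    def __init__(self, payload):
        # the API returns strings; convert them into richer types
        self.duration = int(payload['duration'])
        self.start_time = datetime.datetime.strptime(
            payload['start_time'], '%Y-%m-%d').date()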

  • Your library should be flexible. I will illustrate this with a short story. After much toil and effort, the Twilio SMS team was ready to ship support for Unicode messages. As part of the change, we changed the API's 'Content-Type' header from

application/json

to

application/json; charset=utf-8

We rolled out Unicode SMS and there was much rejoicing; fifteen minutes later, we found out we'd broken three of our helper libraries, and there was much wailing and gnashing of teeth. It turns out the libraries had hard-coded a check for an application/json content-type, and threw an exception when we changed the Content-Type header.

  • Your library should complain loudly if there are errors. Per the point on flexibility above, your HTTP API should validate inputs, not the client library. For example let's say we had the library raise an error if you tried to send an SMS with more than 160 characters in it. If Twilio ever wanted to ship support for concatenated SMS messages, no one who had this library installed would be able to send multi-message texts. Instead, let your HTTP API do the validation and pass through errors in a transparent way.

  • Your library should use consistent naming schemes. For example, the convention for updating resources should be the same everywhere. Hanging up a call and changing an account's FriendlyName both represent the same concept, updating a resource. You should have methods to update each that look like:

$account->update('FriendlyName', 'blah');
$call->update('Status', 'completed');

It's okay, even good, to have methods that map to readable verbs:

$account->reserveNumber('+14105556789');
$call->hangup();

However, these should always be thin wrappers around the update() methods.

class Call {
    function hangup() {
        return $this->update('Status', 'completed');
    }
}

Having only the readable-verb names is a path that leads to madness. It becomes much tougher to translate from the underlying HTTP request to code, and much trickier to add new methods or optional parameters later.

  • Your library should include a user agent with the library name and version number, something you can correlate against your own API logs. Custom HTTP clients will rarely (read: never) add their own user agent, and standard library maintainers don't like default user agents much.
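
Here's a sketch of setting one with Requests (the version string and URL are illustrative):

import platform

import requests

# put the library name and version first, so it's easy to grep in API logs
headers = {'User-Agent': 'twilio-python/3.6.3 (Python {})'.format(
    platform.python_version())}
resp = requests.get('https://api.twilio.com/2010-04-01/Accounts.json',
                    headers=headers)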

  • Your library needs to include installation instructions, preferably written at a beginner level. Users have varying degrees of experience with things you might take for granted, like package managers, and will try to run your code in a variety of different environments (VPS, AWS, on old versions of programming languages, behind a firewall without admin rights, etc). Any steps your library can take to make things easier are good. As an example, the Twilio libraries include the SSL cert necessary for connecting to the Twilio API.

How should you test your library?

The Twilio API has over 20 different endpoints, split into list resources and instance resources, which support the HTTP methods GET, POST, and sometimes DELETE. Let's say there are 50 different combinations of endpoints and HTTP methods in total. Add in implementations for each helper library, and the complexity grows very quickly - if you have 5 helper libraries you're talking about 250 possible methods, all of which could have bugs.

One solution to this is to write a lot of unit tests. The problem is these take a lot of time to write, and at some level you are going to have to mock out the API, or stop short of making the actual API request. Instead we've taken the following approach to testing.

  1. Start with a valid HTTP request, and the parameters that go with it.
  2. Parse the HTTP request and turn it into a piece of sample code that exercises an aspect of your helper library.
  3. Run that code sample, and intercept the HTTP request made by the library.
  4. Compare the output with the original HTTP request.

This approach has the advantage of actually checking against the HTTP request that gets made, so you can test things like URL encoding issues. You can reuse the same set of HTTP requests across all of your libraries. The HTTP "integration" tests will also detect actions that should be possible with the API but are not implemented in the client.

You might think it's difficult to do automated code generation, but it actually was not that much work, and it's very easy if you've written your library in a consistent way. Here's a small sample that generates snippets for our Python helper library.

def process_instance_resource(self, resource, sid, method="GET", params=None):
    """ Generate code snippets for an instance resource """
    get_line = '{} = {}.get("{}")'.format(self.instance_name, self.base, sid)
    if method == "GET":
        interesting_line = 'print {}.{}'.format(self.instance_name,
            self.get_interesting_property(resource))
        return "\n".join([get_line, interesting_line])
    elif method == "POST":
        update_line = '{} = {}.update("{}", {})'.format(
            self.instance_name, self.base, sid, self.transform_params(params))
        interesting_line = 'print {}.{}'.format(
            self.instance_name, self.get_interesting_property(resource))
        return "\n".join([update_line, interesting_line])
    elif method == "DELETE":
        return '{}.delete("{}")'.format(self.base, sid)
    else:
        raise ValueError("Method {} not supported".format(method))

Generating code snippets has the added advantage that you can then easily embed them into your customer-facing documentation, as we've done in our docs.

How do people use helper libraries?

While pretty much every resource gets used in the aggregate, individual accounts tend to only use one or two resources. This suggests that your API is only being referenced from one or two places within a customer's codebase.

How should you document your helper library?

Per the point above, your library is probably being used in only one or two places in a customer's codebase. This suggests your customer is hiring your API to do a specific job, and your documentation hierarchy should be aligned around those jobs. Combined with the integration test/code snippet generator above, you should have a working code example for every useful action in your API. You will probably also want documentation for the public library interface, such as the types and parameters for each method, but the self-service examples will be OK for 85% of your users.