Continuous Integration for Twilio (part 1)

Introduction

At Simply Business, we are currently working on a project to replace our existing call centre with Twilio’s TaskRouter and Voice APIs.

This project is very different from our normal insurance-related applications, and we encountered a few problems we don’t usually face, mostly around how to set up continuous integration tests. We would like to share our experience with you.

We didn’t come up with the final solution in a day. We started off by automating a simple user journey scenario, then iterated over our process to the point that it could be fully automated on Semaphore CI, an external Continuous Integration (CI) service.

Before you start reading, note that this blog post assumes you are familiar with the following concepts, tools, and terminology:

  • Ruby and a little bit of JavaScript (there are a few React.js code snippets, but they are simple and self-explanatory).
  • Twilio’s TaskRouter terminology, such as Worker and Task.
  • Behavior Driven Development (BDD) tools and techniques such as Cucumber, Gherkin, RSpec, and Mocking.

We will explain our journey across three blog posts.

Fully automating Twilio integration tests consists of many steps, which may look daunting, but don’t give up yet. You don’t have to implement everything we show you. If you find a couple of tricks and manage to incorporate them into your test suites, you’ve gained something from this blog post. Go and get some coffee before you start!

How our system works and why integration tests are important

To give you an idea of how our application works, the diagram below is a modified version of how our frontend, backend, and Twilio work together with end users and our call centre consultants (who act as the Worker in Twilio’s TaskRouter terminology).

flow diagram

It looks complex, but here is the gist of the flow:

  • Our application sends a task to Twilio.
  • Twilio calls back our backend (/task_assigned).
  • Once acknowledged, Twilio sends a WebSocket event to the frontend (reservation.created).
  • On receiving the event, the frontend calls the Worker.
  • The Worker receives the call via the frontend softphone (WebRTC encapsulated by the Twilio.Device JS object).
  • Twilio calls back our backend (/worker_joined) and connects the Worker to a conference.
  • Once the Worker joins, the backend calls the user’s real phone.
  • The user picks up.
  • Twilio calls back our backend (/user_joined) and connects the user to the conference.
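To make the ordering of these steps concrete, here is a minimal plain-Ruby sketch that models the flow as a fixed sequence. It makes no Twilio or Sinatra calls. The CallFlow class, its method names, and the step symbols are hypothetical names chosen for illustration (they loosely mirror the callbacks and events listed above), not our production code.

```ruby
# Hypothetical simulation of the call flow; nothing here talks to Twilio.
class CallFlow
  # Ordered steps, from creating the task to both parties being in the conference.
  STEPS = %i[
    task_created
    task_assigned
    reservation_created
    worker_called
    worker_joined
    user_called
    user_joined
  ].freeze

  attr_reader :completed

  def initialize
    @completed = []
  end

  # Records a step, refusing any step that arrives out of order.
  def advance(step)
    expected = STEPS[@completed.size]
    unless step == expected
      raise ArgumentError, "expected #{expected.inspect}, got #{step.inspect}"
    end

    @completed << step
    self
  end

  # True once every step has happened, i.e. Worker and user share the conference.
  def connected?
    @completed == STEPS
  end
end

flow = CallFlow.new
CallFlow::STEPS.each { |step| flow.advance(step) }
puts flow.connected?
```

The point of the sketch is that no step can be skipped: if a single callback or event never arrives, the Worker and the user never end up in the same conference.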

We use Sinatra for a simple backend service and React.js to build a Single Page App (SPA).

Before the user and worker join the same conference to start the conversation, we have to send many messages back and forth between our application and Twilio (I personally call it the “Twilio Dance”). As you can see, the core logic of our application is tightly integrated with the Twilio environment.

When you write integration tests against a third party, you normally isolate the component and often stub it out using a mocking library. However, because of our tight dependency on Twilio, we wanted a set of integration tests that gives us confidence that our code integrates with Twilio properly every time we deploy to the production environment.
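For comparison, this is roughly what the stubbed approach looks like, as a hand-rolled fake in plain Ruby. FakeTaskRouter, create_task, and enqueue_call are hypothetical names for illustration only. They are not part of Twilio’s Ruby library or of our codebase, and the phone number is a placeholder.

```ruby
# Hypothetical fake: records tasks instead of sending them to Twilio.
class FakeTaskRouter
  attr_reader :created_tasks

  def initialize
    @created_tasks = []
  end

  # Stores the attributes and returns a made-up task identifier.
  def create_task(attributes)
    @created_tasks << attributes
    { "sid" => "fake-task-#{@created_tasks.size}" }
  end
end

# Illustrative application code: asks the router to create a callback task.
def enqueue_call(router, phone_number)
  router.create_task({ "type" => "callback", "phone_number" => phone_number })
end

router = FakeTaskRouter.new
result = enqueue_call(router, "+440000000000") # placeholder number
puts result["sid"]               # fake-task-1
puts router.created_tasks.length # 1
```

A test like this passes without Twilio ever calling back /task_assigned, so it tells us nothing about whether the real exchange works. That is the gap we wanted our integration tests to cover.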

Why testing Twilio integration is hard

At the beginning of the project, we asked Twilio for advice on best practices for automating Twilio integration. Their answer was as follows:

People usually mock API calls to Twilio via libraries like VCR. I don’t think we know any clients who write continuous integration tests around Twilio integration. For testing front end behavior (such as actually receiving calls), manual testing is the normal approach.

This is understandable due to the following aspects of how Twilio integration works:

  • As part of the Task assignment, Twilio has to hit the callback endpoint of our backend. However, most CI platforms do not provide a public-facing endpoint.
  • Phone calls are implemented through WebRTC. Though the actual WebRTC call is abstracted via Twilio’s JS wrapper, the integration server does not necessarily have a Chrome browser with such devices supported.
  • Someone has to manually pick up a phone to test that calls are made.

With these constraints in mind, we initially started with a “set of test tools to make manual automation easy”. Over the course of three months, we improved our configuration to remove repetitive tasks bit by bit. Then one day, one of our engineers, Peter, said, “Actually, we could just put this test onto a CI environment”. He spent a couple of hours working on this and it just worked!!

Throughout the rest of these blog posts, we will look back at how we reached this point in three steps: hop!, step!!, and jump!!!

In the next blog post, we will start writing some code. Are you ready?