568 Days Uptime, 200ms Global Response Time, and How We Do It

Enterprise RADIUS, or ERAD, is Eleven’s secure 802.1X Wi-Fi authentication system, used by leading hotel brands. I had the honor of building this system in 2015, and we continue to improve and maintain it. As of today, it has achieved 568 days of continuous uptime, with no downtime for scheduled maintenance and no unexpected outages. It also achieves worldwide response times of under 200 milliseconds.

Now, I don’t want to mislead you. These statistics are for the user interface component of ERAD. Most web applications also have a backend component. In ERAD’s case, the API backend has achieved an uptime of “only” 252 days.

How did we achieve these results? There are three keys to our success with the frontend component of ERAD:

  1. The frontend consists entirely of static files.
  2. The static files are hosted on AWS S3.
  3. The files are globally distributed using AWS CloudFront.

First, and in my opinion most important, the user interface is purely static. There is no server-side processing and no dynamic logic. The files hosted on the server are sent as-is to the user’s browser. Because of this, we can host the files on Amazon’s S3 service, which has excellent uptime statistics.
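To make "static files on S3" a little more concrete, here is a minimal sketch in Python of how a build folder could be mapped to S3 object keys and content types. This is not ERAD's actual deploy tooling. The function name `build_upload_plan` and the folder layout are assumptions for illustration, and the upload step itself (the AWS CLI or an SDK) is deliberately left out.

```python
import mimetypes
from pathlib import Path
from typing import Dict, List


def build_upload_plan(build_dir: str) -> List[Dict[str, str]]:
    """Map every file under build_dir to an S3 object key and Content-Type.

    Hypothetical helper: it only describes what would be uploaded and
    performs no network calls.
    """
    root = Path(build_dir)
    plan: List[Dict[str, str]] = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue  # skip directories; S3 stores objects, not folders
        # Object keys use forward slashes regardless of the local OS.
        key = path.relative_to(root).as_posix()
        content_type, _ = mimetypes.guess_type(path.name)
        plan.append({
            "local_path": str(path),
            "key": key,
            # Fall back to a generic binary type when the extension is unknown.
            "content_type": content_type or "application/octet-stream",
        })
    return plan
```

Each entry can then be handed to whatever upload tool you prefer. Getting the content type right matters because S3 serves each object with the type that was stored when it was uploaded.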

Using purely static files does introduce some difficulties, but these are good difficulties to have. The constraint forces a separation of concerns, requiring application logic to be contained in the API and not the UI. For example, there is nowhere in the UI that a connection to the database can be made, because doing so would mean embedding the database credentials in the source code that is sent to the user’s browser. Instead, the UI must interact with the database through the API, as would be expected when following best practices.
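The shape of that interaction can be sketched as follows. ERAD's real UI runs in the browser, so this Python snippet is only an illustration of the idea: the client holds an API address and a per-user token, and never a database credential. The base URL, the resource name, and the bearer-token scheme are all assumptions, not details of ERAD's API.

```python
import urllib.request

# Hypothetical API address; the client knows this and nothing about the database.
API_BASE_URL = "https://api.example.com"


def build_list_request(resource: str, session_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a resource collection.

    The only secret involved is the user's own session token, which is
    assumed here to be sent as a bearer token.
    """
    return urllib.request.Request(
        url=f"{API_BASE_URL}/{resource}",
        headers={
            "Authorization": f"Bearer {session_token}",
            "Accept": "application/json",
        },
        method="GET",
    )
```

Everything the client can do is limited to what the API chooses to expose, which is exactly the separation described above.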

This configuration is highly stable and achieves long uptimes because it relies on mature services such as S3 and because it enforces best practices that improve stability. And because the files are hosted on S3, they can be distributed globally via AWS CloudFront, which is how we achieve 200ms response times.
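If you want to check a number like this for your own site, a rough sketch of a timing helper in Python might look like the following. The function names are made up for this example, and a single run only tells you about the location you measure from, so a worldwide figure would need samples taken from several regions.

```python
import time
import urllib.request
from typing import Callable, Sequence


def fetch(url: str) -> bytes:
    """Download a URL and return its body (this makes a real network call)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def response_time_ms(url: str, fetcher: Callable[[str], bytes] = fetch) -> float:
    """Return how long one request to url takes, in milliseconds."""
    start = time.perf_counter()
    fetcher(url)
    return (time.perf_counter() - start) * 1000.0


def within_budget(samples_ms: Sequence[float], budget_ms: float = 200.0) -> bool:
    """Return True if every sample is under the budget (200 ms by default)."""
    return all(sample < budget_ms for sample in samples_ms)
```

For example, you could call `response_time_ms` a handful of times against your own CloudFront URL and pass the results to `within_budget`. The `fetcher` parameter exists so the timing logic can be exercised without touching the network.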