I like to build things. I figured this out when I was young and asked to build a Roman aqueduct out of cardboard and toilet paper rolls for a school project. Not only did I succeed in my first engineering project, but I remember tweaking it all week to ensure it never leaked and looked great too.
My most recent aqueduct wasn’t cardboard, though - it was a multi-node socket.io node.js application at Understoodit. Here’s what I learned along the way.
It’s easy to build an application that runs - but can you make it scale? It’s not good enough to beef up your application server - it needs to scale horizontally. So, here’s how I went about my business.
1. Start with one-to-one:
This should be pretty straightforward. If you’re on Amazon, or any other cloud offering, you can move on. The only tricky part should be opening up the right port to the public.
2. Add a load balancer
A load balancer distributes workload across multiple servers. We use HAProxy, a fast, open source proxy server. It’s free, it’s flexible, it’s widely discussed online, and (most importantly) it just works! See the documentation.
Once you install HAProxy, you need to launch it against your haproxy.cfg file (e.g. haproxy -f /etc/haproxy/haproxy.cfg). Here is an example haproxy.cfg I used to get it up and running:
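Here’s a minimal sketch of such a config, assuming a single node.js process on port 8001 - the IPs and timeout values are illustrative, and the layout is arranged so the line numbers in the notes below match up:

```haproxy
global
    maxconn 4096
    # uncomment for verbose output while testing:
    # debug

defaults
    mode http
    option httplog
    timeout connect 5000
    timeout client 50000
    timeout server 50000

frontend unsecured
    bind 0.0.0.0:80
    timeout client 86400000
    option httpclose
    option forwardfor

    # everything goes to the single node.js process for now
    default_backend www_backend

backend www_backend
    mode http
    option forwardfor
    server server1 127.0.0.1:8001 weight 1 maxconn 1024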
Nothing crazy here. A few things to note:
- Uncomment the debug statement on line 4 to help you get going.
- Notice the front end called “unsecured” (line 13) is binding to port 80 (the standard open web server port). Ensure you can access your server on this port.
- Notice when the connection to the LB is made, it will connect to our default backend (line 20) called “www_backend”. Our default backend is our node.js server running on port 8001 (seen on line 25) of our local machine.
3. Add an SSL certificate!
Should be easy, right? Well, with the release of HAProxy 1.5-dev17 it is A LOT easier. Before, I was using Stud in front of HAProxy. It worked, but it was another process I had to keep track of, and it would never forward my clients’ IPs. If you’re into masochism - give it a shot. Or, just use the new version of HAProxy.
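A sketch of the updated config, assuming the certificate bundle lives at /etc/haproxy/yourcert.pem - the redirect here uses the ssl_fc ACL, which is one common way to do it, and the layout again matches the line numbers in the notes below:

```haproxy
global
    maxconn 4096
    # uncomment for verbose output while testing:
    # debug

defaults
    mode http
    option httplog
    timeout connect 5000
    timeout client 50000
    timeout server 50000

frontend unsecured
    bind 0.0.0.0:80
    redirect scheme https if !{ ssl_fc }

frontend secured
    timeout client 86400000
    mode http
    option httpclose
    option forwardfor
    bind 0.0.0.0:443 ssl crt /etc/haproxy/yourcert.pem
    default_backend www_backend

backend www_backend
    mode http
    option forwardfor
    # still a single node.js process behind the LB;
    # we'll add a second one in the next step
    server server1 127.0.0.1:8001 weight 1 maxconn 1024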
A couple additional things here:
- I have two frontends binding to ports 80 and 443 - unsecured (line 13) and secured (lines 17 & 22), respectively.
- When a client connects on port 80, they are redirected to the secure port 443.
- The statement bind 0.0.0.0:443 ssl crt /etc/haproxy/yourcert.pem (line 22) connects the client to HAProxy using SSL. HAProxy will then process the SSL and connect to the server. SSL is a new feature of HAProxy; if you want more details on creating your .pem file, contact me directly - it can be a pain!
- Try this out - if you see a green lock in Chrome’s address bar - success!
4. Now, let’s get interesting - scale it!
The important thing with multiple socket.io servers is that socket.io needs to maintain each user’s state. We are using socket.io’s xhr-polling transport mechanism. When using xhr-polling, a socket.io process will consider a client disconnected if it doesn’t hear back from it. So, if we have two socket.io processes, it is important that all xhr-polling requests from one client are always directed to the same socket.io process. HAProxy helps ensure this by keeping each user “sticky” to one server via a cookie it manipulates.
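A sketch of the scaled-out config, assuming a second socket.io process on port 8002 (ports and server names are placeholders; the layout matches the line numbers in the notes below):

```haproxy
global
    maxconn 4096
    # uncomment for verbose output while testing:
    # debug

defaults
    mode http
    option httplog
    timeout connect 5000
    timeout client 50000
    timeout server 50000

frontend unsecured
    bind 0.0.0.0:80
    redirect scheme https if !{ ssl_fc }

frontend secured
    option http-server-close
    timeout client 86400000
    mode http
    option forwardfor
    bind 0.0.0.0:443 ssl crt /etc/haproxy/yourcert.pem
    default_backend www_backend

backend www_backend
    mode http
    option forwardfor
    # rotate new clients across the socket.io processes
    balance roundrobin
    # pin each client to one process with a SERVERID cookie
    cookie SERVERID insert indirect nocache
    server server1 127.0.0.1:8001 cookie server1 weight 1 maxconn 1024
    server server2 127.0.0.1:8002 cookie server2 weight 1 maxconn 1024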
There are a couple differences to the HAProxy backend in the config file.
- We first have to add balance roundrobin. Round robin rotates requests among the backend servers; each server also accepts a weight parameter which specifies its relative weight.
- Next we must add the cookie SERVERID insert indirect nocache statement. This tells HAProxy to add a new parameter called SERVERID to our user’s cookie. This line is integral for ensuring that our user “sticks” to the same socket.io process - which in turn maintains our user’s state.
- Our keep-alive functionality (line 18) gives us a speed boost by keeping our client’s connection to the web server open.
- The only other thing we need to add is our second server, giving both of our servers a cookie name - as seen in lines 32 & 33 of the .cfg.
5. Test it!
Now, how do we know our load balancer is actually working? Using Google Chrome’s developer tools, open the Network tab and click on your latest GET request. From there, click on “Headers”. In the headers you will see your cookie, and within it a parameter called SERVERID= (look familiar?).
This is HAProxy marking which socket.io process it should route your requests to. Want to test the round-robining? Change the weight of your servers, clear your cache, and see what you get for SERVERID.
Once you have demonstrated you can scale two app servers, scaling out horizontally is a snap. Keep running more nodes - and keep letting HAProxy know where they live!
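For example, a third node.js process (say, on port 8003 - the port and names here are placeholders) is just one more server line in the backend:

```haproxy
backend www_backend
    mode http
    option forwardfor
    balance roundrobin
    cookie SERVERID insert indirect nocache
    server server1 127.0.0.1:8001 cookie server1 weight 1 maxconn 1024
    server server2 127.0.0.1:8002 cookie server2 weight 1 maxconn 1024
    server server3 127.0.0.1:8003 cookie server3 weight 1 maxconn 1024
```

Reload HAProxy, and new clients get rotated across all three processes, while existing clients stay stuck to the server named in their SERVERID cookie.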
Funny enough, there are more similarities between Aqueducts and HAProxy than I originally thought!
Some other HAProxy links: