caustik's blog

programming and music

Scaling node.js to 100k concurrent connections!

with 26 comments

UPDATE: Broke the 250k barrier, too :]

The node.js powered sprites fun continues, with a new milestone:

That’s right, 100,004 active connections! Note the low %CPU and %MEM numbers in the picture. To be fair, the CPU usage does wander between about 5% and 40% – but it’s also not a very beefy box. This is on a $0.12/hr rackspace 2GB cloud server.

Each connection simulates sending a single sprite every 5 seconds. The destination for each sprite is randomized, with an equal distribution across all nodes. This means traffic of 20,000 sprites per second, which amounts to 40,000 JSON packets per second. That doesn’t even include the keep-alive pings, which occur on a 2-minute interval per connection.
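As a quick sanity check on those figures (a tiny sketch, not part of the sprites server itself):

```javascript
// Back-of-envelope check of the traffic numbers above.
const connections = 100000;
const sendIntervalSec = 5;
const spritesPerSec = connections / sendIntervalSec; // 20,000 sprites/s
const jsonPacketsPerSec = spritesPerSec * 2;         // each sprite: one packet in, one out
console.log(spritesPerSec, jsonPacketsPerSec);
```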

At this scale, the sprite network topology remains very responsive. Testing with my desktop PC right next to my laptop, a sprite thrown off the desktop’s screen arrives at the laptop so fast that I can’t gauge any latency at all.

Here are a few key tweaks which contribute to this performance:

1) Nagle’s algorithm is disabled

If you’re familiar at all with real-time network programming, you’ll recognize this algorithm as a common socket tweak. Disabling it makes each response leave the server much more quickly.

The tweak is available through the node.js API “socket.setNoDelay”, which is set on each long-poll COMET connection’s socket.

2) V8’s idle garbage collection is disabled via “--nouse-idle-notification”

This was critical, as the server pre-allocates over 2 million JS Objects for the network topology. If you don’t disable idle garbage collection, you’ll see a full second of delay every few seconds, which would be an intolerable bottleneck to scalability and responsiveness. The delay appears to be caused by the garbage collector traversing this list of objects, even though none of them are actually candidates for garbage collection.
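For reference, the flag is passed when launching node (“server.js” is a placeholder for the actual entry point):

```shell
# Launch with V8's idle garbage collection disabled.
node --nouse-idle-notification server.js
```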

I’m eager to experiment further by scaling this up to 250k connections. The only thing keeping that test from being run is the quota on my amazon EC2 account, which is limiting the number of simulated clients I can run simultaneously. They have responded to my request to increase quota, but sadly it hasn’t taken effect yet.

The sprites source code, both client and server, is available via Subversion. The repository URLs are provided on the sprites web site.

http://sprites.caustik.com/

For more information about the testing and tweaks involved in scaling the server, check my previous post Node.js scalability testing with EC2.

Written by caustik

April 8th, 2012 at 9:09 am

26 Responses to 'Scaling node.js to 100k concurrent connections!'

Subscribe to comments with RSS or TrackBack to 'Scaling node.js to 100k concurrent connections!'.

  1. This is awesome stuff. I work on a lot of “traditional” stacks that often struggle with this scenario, especially if customers are flooding in requests or needing responses at a high rate. This has definitely piqued my interest, and I look forward to your future posts on this topic.

    Just looking for the “validation” for Node in our systems.

    Adrian Pomilio

    8 Apr 12 at 8:17 pm

  2. Wow, did not know the “nouse-idle-notification” option. How do you do GC then?

    Nico

    9 Apr 12 at 5:26 pm

  3. You can use the flag “--expose_gc” to make the JS function “gc();” available. That triggers garbage collection at your whim, so for example it could run every hour or so via setTimeout or setInterval.

    I’m trying to find a way to detach JS Objects from the heap, for the dual purpose of excluding them from the time consuming GC traversal, and to exclude them from the address space limitations imposed by the heap.
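    For example, a minimal sketch (the function name and interval are illustrative):

```javascript
// Sketch: when node is launched with --expose_gc, global.gc becomes a
// callable function; trigger collection on a fixed schedule instead of
// relying on V8's idle heuristic.
function scheduleManualGC(intervalMs) {
  if (typeof global.gc !== 'function') {
    return null; // not launched with --expose_gc
  }
  const timer = setInterval(() => global.gc(), intervalMs);
  timer.unref(); // don't keep the process alive just for GC sweeps
  return timer;
}
```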

    caustik

    9 Apr 12 at 6:08 pm

  4. Charlie

    9 Apr 12 at 9:16 pm

  5. I love you

    kapouer

    9 Apr 12 at 9:20 pm

  6. Charlie — I have been able to hit 250k concurrent with Node.js – the only current limitation appears to be V8’s heap addressing limitation (1GB) – combined with the overhead in JS heap per-connection in Node.js — basically, once you have 250k connections going, your JS heap is riiiiight at capacity and the garbage collector isn’t having any of that. The GC just starts churning away hopelessly trying to scrounge up enough memory to continue.

    I think it would be possible to skirt this issue using “cluster” in Node.js, since each child will have its own heap address space. I have a few other things to investigate, but if none of those pan out I’m going to go that route.

    caustik

    9 Apr 12 at 9:46 pm

  7. […] (continued from the sprites scalability experiments [250k] [100k]) […]

  8. […] With all the hype going on about Node.js and how well it can scale out […]

  9. Thank you for the advice! We did a real-time multiple-content-type stream last year that was struggling at 30k with no apparent reason. Will definitely try your way this year.

    Corentin

    9 Aug 12 at 9:30 am

  10. […] http://blog.caustik.com/2012/04/08/scaling-node-js-to-100k-concurrent-connections/ Share this:TwitterFacebookLike this:LikeBe the first to like this. Posted in: programming […]

  11. Kernel parameters or it didn’t happen.

  12. Great Job!!!

    GreatJob

    9 Aug 12 at 1:53 pm

  13. aside from ulimit -n, no kernel parameters were needed for this scale of connections

    caustik

    18 Aug 12 at 3:56 am

  14. Thanks for the reply !

  15. […] know why a thousand calls are stressing my system so much, though, since node.js handles many more clients perfectly; it must be true that setInterval() is CPU intensive (or I am doing something […]

  16. […] –nouse-idle-notifications. Playing with some of these tweaks seemed to help our performance. Scaling Node to 100k connections Scaling Node to 250k connections Escaping the 1.4Gb Heap […]

  17. What tool do you use for testing the Node.js workload? Is it a Firefox plugin?

    YongGang

    30 Oct 12 at 3:23 am

  18. Hrm I’d be interested to see a before/after result on a mature web app like Etherpad Lite. Would make for an interesting blog post too

    John McLear

    29 Jan 13 at 12:18 pm

  19. I’m new to node.js and I found your project very interesting. I couldn’t find the source code (trunk is 404 not found); is there any chance you could provide the node.js code somewhere (like github)? I also wonder whether the part where you fake 100k connections is in the code or not.

    Thanks
    Zareh

  20. Zareh – here’s a link to the sprite’s node.js source code http://pastebin.com/CkaGzzg1

    The connections are real tcp connections from external servers.

    caustik

    29 Jan 13 at 12:42 pm

  21. thank you

  22. […] by Scaling node.js to 100k concurrent connections! and Node.js w/250k concurrent connections!. I did some test for […]

  23. Hi, the testing environment is a static page where the client doesn’t usually change or reload the page, right?
    Did you ever test with a dynamic environment where the client changes pages much more often, like a content/forum website? Then the socket closes/opens every time the client changes page.

    canhnm

    17 May 13 at 3:24 am

  24. it would have been nice if there were some solid scripts that would let me fully reproduce the outcome. (it’s the cold fusion thing all over again). Trust but verify.

    richard

    9 Jul 13 at 8:43 pm

  25. yea, that would have been nice. honestly, I just didn’t pour a ton of time into this; it was mostly for my own curiosity.

    caustik

    12 Jul 13 at 3:27 pm

  26. I would like to download the client and server parts to try them out with 100k sessions; would you please show me how? Thanks so much

    Tuan

    16 Aug 14 at 11:56 am

Leave a Reply