Niko's Project Corner

HTTP API load tester

Description HTTP API load tester
Languages C++
Tags FastCGI
Duration Fall 2013
Modified 3rd November 2013

When developing RESTful APIs, it is important to know how many requests per minute the endpoint is able to serve. Because of my interest in Nginx, FastCGI and multi-threaded C++, I decided to develop my own in-browser HTTP load tester, which supports easy configuration, any number of parallel load-generating worker threads and real-time graphing based on the jQuery-powered Highcharts library.

The project started as a simple command line utility written in C++, which used pthreads for multi-threading and happyhttp for sending and receiving HTTP requests and responses. Configuration is done by command line parameters, and the final output is written into a text file. The implemented configuration flags are:

  • -accept: A list of accepted first characters of a valid response, default '*' (matches everything)
  • -hostname: Host name of the target, default 'localhost'
  • -incremental: Incremental route generation (an advanced feature), default no
  • -limit: The worker thread's query limit (x / minute), default 100
  • -output: Path of the final results file, default '/dev/null'
  • -port: Connection port, default 80
  • -route: Request route (appended after the host name), default '/'
  • -time: How long the test should be run (in seconds), default 2
  • -verbose: Display verbose output, default no
  • -workers: Number of parallel worker threads, default 1

The total query rate is "queries / minute / worker" × "number of workers". For example, setting "limit = 100" and "workers = 24" sets the target query rate at 2400 queries / minute (or 40 / second). The number of worker threads has little effect on memory usage, because the configuration and other shared objects are passed to the workers via const pointers.
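The arithmetic above can be written down as a small sketch. The helper names are hypothetical and not taken from the tool's source, and the even spacing of requests within a minute is an assumption, since the post does not say how each worker paces itself.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helpers; the tool's real internals are not shown in the post.

// Total target rate in queries per minute: per-worker limit times worker count.
constexpr long total_rate_per_minute(long limit_per_worker, long workers) {
    return limit_per_worker * workers;
}

// Delay between two consecutive requests of a single worker,
// ASSUMING the worker spreads its queries evenly over the minute.
inline std::chrono::microseconds request_interval(long limit_per_worker) {
    assert(limit_per_worker > 0);
    return std::chrono::microseconds(60000000L / limit_per_worker);
}
```

With "limit = 100" and "workers = 24", total_rate_per_minute(100, 24) is 2400, and under the even-spacing assumption each worker would send one request every 600 milliseconds.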

The program supports quite flexible route definitions, most importantly random parameter generation. The syntax resembles regular expressions: you define a range of characters to use and the token length. For example "/?search=[a-z]{4}_[0-9]{3}" would generate routes such as "/?search=jiec_568", "/?search=bbdw_968" and "/?search=coux_937". The route has a fixed pattern, with four random characters from a - z, an underscore and three random characters from 0 - 9. Because each request uses a different route, the measurements are not distorted by caching effects.
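A minimal sketch of how such a route pattern could be expanded is shown below. It is not the tool's actual parser: the function names are hypothetical, and it only handles the two constructs used in the example, a bracketed character class and an optional "{n}" repeat count.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>

// Expand a class body such as "a-z" or "a-f0-9" into the characters it covers.
static std::string expand_class(const std::string& body) {
    std::string chars;
    for (std::size_t i = 0; i < body.size(); ++i) {
        if (i + 2 < body.size() && body[i + 1] == '-') {
            for (int c = body[i]; c <= body[i + 2]; ++c) chars += static_cast<char>(c);
            i += 2;  // skip the '-' and the range end
        } else {
            chars += body[i];
        }
    }
    return chars;
}

// Generate one concrete route from a pattern like "/?search=[a-z]{4}_[0-9]{3}".
std::string generate_route(const std::string& pattern, std::mt19937& rng) {
    std::string route;
    std::size_t i = 0;
    while (i < pattern.size()) {
        if (pattern[i] != '[') {  // literal character, copied as-is
            route += pattern[i++];
            continue;
        }
        const std::size_t close = pattern.find(']', i);
        assert(close != std::string::npos && "unterminated character class");
        const std::string chars = expand_class(pattern.substr(i + 1, close - i - 1));
        assert(!chars.empty());
        i = close + 1;

        std::size_t count = 1;  // "{n}" is optional; default is one character
        if (i < pattern.size() && pattern[i] == '{') {
            const std::size_t end = pattern.find('}', i);
            assert(end != std::string::npos && "unterminated repeat count");
            count = std::stoul(pattern.substr(i + 1, end - i - 1));
            i = end + 1;
        }
        std::uniform_int_distribution<std::size_t> pick(0, chars.size() - 1);
        for (std::size_t n = 0; n < count; ++n) route += chars[pick(rng)];
    }
    return route;
}
```

Passing the random engine in by reference would let each worker thread own its generator, which avoids sharing mutable state between threads.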

Another simulated case is a user typing in a search term, in which case the API sees incrementally lengthening query terms. This is achieved by setting incremental=true and adding a special token (vertical pipe |) to the route definition, for example "/?q=|[a-z]{2}_[0-9]". This may generate the following HTTP GET sequence: "/?q=d", "/?q=dl", "/?q=dl_", "/?q=dl_5", "/?q=g", "/?q=gq", "/?q=gq_", "/?q=gq_6". The randomly generated parts are dl_5 and gq_6, but they are incrementally constructed and queried.
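The incremental sequence can be sketched as follows, assuming the random part has already been generated and the pipe is still present in the route as a marker. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Given a concrete route with a '|' marker, e.g. "/?q=|dl_5", return the
// incrementally lengthening routes a typing user would trigger.
std::vector<std::string> incremental_routes(const std::string& route) {
    const std::size_t bar = route.find('|');
    assert(bar != std::string::npos && "route needs a '|' marker");
    const std::string prefix = route.substr(0, bar);  // fixed part before the pipe
    const std::string tail = route.substr(bar + 1);   // part that is "typed in"

    std::vector<std::string> routes;
    for (std::size_t len = 1; len <= tail.size(); ++len)
        routes.push_back(prefix + tail.substr(0, len));
    return routes;
}
```

For "/?q=|dl_5" this yields "/?q=d", "/?q=dl", "/?q=dl_" and "/?q=dl_5", matching the first half of the sequence above.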

This is an example summary of the percentiles for two workers running for 20 seconds:
Worker 1, executed 34 requests (50th 0.032 sec, 95th 0.063 sec, 99th 0.063 sec)
Worker 2, executed 33 requests (50th 0.028 sec, 95th 0.051 sec, 99th 0.051 sec)
Finished after 19.8643 seconds, totalled 67 success and 0 failures.

Total summary:
25th 0.025 sec
50th 0.030 sec
90th 0.040 sec
95th 0.051 sec
99th 0.063 sec
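Percentiles like these can be computed from the recorded response times in a few lines. The sketch below uses the nearest-rank definition, which is an assumption: the post does not state which percentile method the tool uses, so the numbers above need not match this function exactly.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it. Takes the samples by value
// so that the caller's vector is left unsorted.
double percentile(std::vector<double> samples, double p) {
    assert(!samples.empty() && p > 0.0 && p <= 100.0);
    std::sort(samples.begin(), samples.end());
    const std::size_t rank =
        static_cast<std::size_t>(std::ceil(p * samples.size() / 100.0));
    return samples[rank - 1];  // rank is 1-based
}
```

With only about 34 samples per worker the upper percentiles are determined by the few largest samples, which is consistent with the 95th and 99th percentiles coinciding in the per-worker lines above.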

To make the tool more user-friendly and to provide graphing capabilities, I decided to develop a FastCGI wrapper for it. It was easy to configure the Nginx HTTP server to route requests to the FastCGI process, which generates either HTML or JSON responses. The basic FastCGI functionality is achieved by using the libfcgi library and the spawn-fcgi utility program.

The first challenge was to implement per-tab sessions, because I wanted any number of users to have any number of independent sessions active simultaneously. This is achieved by not using the standard session approach (per-domain cookies), but instead carrying the SESSID attribute in GET and POST parameters. Additionally, each session times out in a mere five seconds, and it is kept alive by a once-a-second AJAX call. Each AJAX call has a unique AJAXID as well; a new id is generated on the server side upon each request and sent back to the client program. When a session dies out, all session-specific memory needs to be correctly freed to avoid memory leaks.
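The keep-alive and timeout logic can be sketched with a small session table. This is an illustration under assumptions, not the project's code: the class and method names are hypothetical, the AJAXID handling is left out, and the current time is passed in explicitly so that the expiry rule is easy to test.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <unordered_map>

using Clock = std::chrono::steady_clock;

// Hypothetical per-tab session table keyed by the SESSID request parameter.
class SessionStore {
public:
    explicit SessionStore(std::chrono::seconds timeout) : timeout_(timeout) {
        assert(timeout.count() > 0);
    }

    // Called for every request that carries a SESSID: creates or refreshes it.
    void touch(const std::string& sessid, Clock::time_point now) {
        last_seen_[sessid] = now;
    }

    bool alive(const std::string& sessid) const {
        return last_seen_.count(sessid) != 0;
    }

    // Drop every session not touched within the timeout. Erasing the entry
    // is where session-specific resources would be released.
    std::size_t expire(Clock::time_point now) {
        std::size_t removed = 0;
        for (auto it = last_seen_.begin(); it != last_seen_.end();) {
            if (now - it->second > timeout_) {
                it = last_seen_.erase(it);
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }

private:
    std::chrono::seconds timeout_;
    std::unordered_map<std::string, Clock::time_point> last_seen_;
};
```

In a multi-threaded server the table would also need a mutex around touch and expire. That is omitted here to keep the sketch short.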

When the client starts the load tester, new threads are started on the server side and run in the background. After that, the client's only remaining task is to poll for new results. The user may even refresh the browser window without affecting the background test execution.

Another challenge was to choose and utilize a suitable HTML template engine. I had prior experience with Mustache, a cross-language, cross-markup, logic-less template system. It is already implemented for C++ (see Plustache), but that implementation depended on Boost's Regex and was missing a few features. I developed a simple Mustache tokenizer and output generator without external dependencies in ~440 lines of code.
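To give an idea of what a dependency-free tokenizer and output generator involve, here is a deliberately tiny sketch. It is not the ~440-line implementation mentioned above: the names are hypothetical, and it only handles plain "{{name}}" variables, without sections, partials or the HTML escaping that Mustache applies by default.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct Token {
    enum Kind { Text, Variable } kind;
    std::string value;  // literal text, or the variable name
};

// Split a template into literal text and {{variable}} tokens.
std::vector<Token> tokenize(const std::string& tpl) {
    std::vector<Token> tokens;
    std::size_t pos = 0;
    while (pos < tpl.size()) {
        const std::size_t open = tpl.find("{{", pos);
        if (open == std::string::npos) {  // no more tags: the rest is text
            tokens.push_back({Token::Text, tpl.substr(pos)});
            break;
        }
        if (open > pos) tokens.push_back({Token::Text, tpl.substr(pos, open - pos)});
        const std::size_t close = tpl.find("}}", open + 2);
        assert(close != std::string::npos && "unterminated tag");
        tokens.push_back({Token::Variable, tpl.substr(open + 2, close - open - 2)});
        pos = close + 2;
    }
    return tokens;
}

// Produce the output by substituting variables; unknown names render as empty.
std::string render(const std::vector<Token>& tokens,
                   const std::map<std::string, std::string>& values) {
    std::string out;
    for (const Token& t : tokens) {
        if (t.kind == Token::Text) {
            out += t.value;
            continue;
        }
        const auto it = values.find(t.value);
        if (it != values.end()) out += it->second;
    }
    return out;
}
```

Separating tokenizing from rendering means a template only has to be parsed once and can then be rendered for every request.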

The HTML input form is automatically generated from the load tester's internal configuration class, which is also used for parsing the command line arguments. This ensures that there is only one implementation of the program configuration, and that the two interfaces are internally identical and compatible. The form is shown in Figure 1. It has one additional configuration flag, "continuous", which determines whether the result graphs at the bottom are updated in real time or only when the test finishes.

Figure 1: The load tester configuration panel.
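The idea of driving both the command line parser and the HTML form from one option table can be sketched as follows. The structure, the function names and the "-name=value" argument syntax are all assumptions made for illustration, and only three of the flags listed earlier are included.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical single source of truth for the configuration options.
struct Option {
    std::string name;
    std::string default_value;
    std::string help;
};

const std::vector<Option>& options() {
    static const std::vector<Option> opts = {
        {"hostname", "localhost", "Host name of the target"},
        {"port", "80", "Connection port"},
        {"workers", "1", "Number of parallel worker threads"},
    };
    return opts;
}

// Command line side: start from the defaults, then apply "-name=value"
// arguments (ASSUMED syntax). Unknown names are ignored.
std::map<std::string, std::string> parse_args(const std::vector<std::string>& args) {
    std::map<std::string, std::string> cfg;
    for (const Option& o : options()) cfg[o.name] = o.default_value;
    for (const std::string& a : args) {
        const std::size_t eq = a.find('=');
        if (a.size() < 2 || a[0] != '-' || eq == std::string::npos) continue;
        const std::string name = a.substr(1, eq - 1);
        if (cfg.count(name) != 0) cfg[name] = a.substr(eq + 1);
    }
    return cfg;
}

// Web side: the same table produces one labelled input field per option.
std::string html_form() {
    std::string html = "<form method=\"post\">\n";
    for (const Option& o : options()) {
        assert(!o.name.empty());
        html += "  <label>" + o.help + " <input name=\"" + o.name +
                "\" value=\"" + o.default_value + "\"></label>\n";
    }
    return html + "</form>\n";
}
```

Adding a new flag to the table makes it appear in both interfaces at once, which is the property the single configuration class is meant to guarantee.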

The biggest advantage over the command line interface is the possibility of utilizing jQuery and any of the numerous graphing libraries out there. Earlier I used Google Charts, but this time I wanted to try out the newer Highcharts library. The current version of the project provides three views on the performance results, and the first of them is shown in Figure 2. It displays each individual request's response time on a linear time axis and a logarithmic response time axis.

Figure 2: Plotting response times of each individual request on a time axis.

The second graph is shown in Figure 3. From this graph we can confirm that the target request rate per minute was indeed achieved. There are small random deviations, but overall we can see that the achieved request rate was very close to the target.

Figure 3: Plotting the target and achieved request rate per minute.

The final graph is shown in Figure 4. It summarizes the response times on a histogram with fixed bins, ranging from 0 to 1000 milliseconds. From this we can easily estimate the median to be around 25 - 30 milliseconds, and see that a small percentage of responses took 50 - 100 milliseconds.
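Binning the response times for such a histogram is straightforward. The sketch below uses equal-width bins and clamps out-of-range values into the first or last bin. Both choices are assumptions, as the post does not describe the bin layout beyond the 0 to 1000 millisecond range, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Count response times (in milliseconds) into `bins` equal-width bins
// covering [lo, hi). Values outside the range are clamped to the edge bins.
std::vector<std::size_t> histogram(const std::vector<double>& times_ms,
                                   double lo, double hi, std::size_t bins) {
    assert(bins > 0 && hi > lo);
    std::vector<std::size_t> counts(bins, 0);
    const double width = (hi - lo) / static_cast<double>(bins);
    for (double t : times_ms) {
        if (t < lo) t = lo;  // clamp underflow into the first bin
        std::size_t idx = static_cast<std::size_t>((t - lo) / width);
        if (idx >= bins) idx = bins - 1;  // clamp overflow into the last bin
        ++counts[idx];
    }
    return counts;
}
```

The resulting counts could then be serialized to JSON and handed to the charting library as a column series.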

Figure 4: The summary of the response times in a histogram form.
