
Niko's Project Corner

Service discovery with Docker, Consul and Registrator

Description Automated and reliable service catalog + key-value store.
Languages Bash
Tags Architecture
Docker
Databases
Nginx
Duration Spring 2016
Modified 28th August 2016
GitHub nikonyrh/docker-scripts

Traditionally computers were given fixed names and were not easily replaced when one broke down. Server software listened on a hard-coded port, and to link pieces together these machine names and service ports were hard-coded into other software's configuration files. In the era of cloud computing and service-oriented architecture this is no longer an adequate solution, so elastic scaling and service discovery are becoming the norm. One easy solution is to combine the powers of Docker, Consul and Registrator.

Docker is the most popular platform for building and deploying software in well-isolated containers without having to install its dependencies on the host OS. It has made it easy to horizontally scale individual services by simply starting more instances of the same image, each bound to a different port. The question remains how clients know which IPs and ports to connect to, as this information may change at any time. At larger scale, where orchestration and auto-scaling are needed, projects such as Kubernetes and Apache Mesos provide great value. They typically have a built-in solution for service discovery, but they might be overkill for simpler scenarios. Another common pattern is to route all requests through a load balancer which is registered to DNS. In its simplest form this results in an undesired single point of failure. On the other hand Nginx is very robust software which should keep working as long as the network and the instance run without interruptions.
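The load-balancer pattern can be illustrated with a minimal Nginx configuration sketch; the service name, IPs and ports below are hypothetical, and in practice the upstream entries would be regenerated from the service catalog whenever instances come and go:

```nginx
upstream api_backend {
    # Hypothetical instances of the same container image,
    # bound to different host ports.
    server 10.0.0.11:8001;
    server 10.0.0.12:8002;
}

server {
    listen 80;
    location / {
        # Nginx spreads incoming requests over the upstream instances.
        proxy_pass http://api_backend;
    }
}
```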

As mentioned in the introduction, this article is about Consul and Registrator. Consul is a distributed key-value store with emphasis on the distributed nature of modern architectures. It also has a service-discovery-oriented HTTP API, which makes integration with other software trivial. It can be started in either server or client mode, the difference being that clients only act as a "gateway" to the information stored in the cluster but do not store a copy of the data. To start, a client needs to know the IP of any member of the cluster; from there it learns which other nodes exist, which one is the leader and so forth. If there is a Consul server or client on each computer instance then it is trivial for other software on that machine to query the database, as they can always just connect to 127.0.0.1.
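Since every machine runs a local agent, discovery queries can always target 127.0.0.1. A sketch of such queries, assuming Consul's HTTP API is on its default port 8500 and a hypothetical service named "web" has been registered:

```shell
# List all services known to the catalog (requires a running local Consul agent).
curl -s http://127.0.0.1:8500/v1/catalog/services

# Look up the address and port of every instance of the "web" service.
curl -s http://127.0.0.1:8500/v1/catalog/service/web

# Consul answers the same question over DNS on its default DNS port 8600.
dig @127.0.0.1 -p 8600 web.service.consul SRV
```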

Server nodes can join and leave the cluster at any time, although you shouldn't abruptly remove too many server nodes at once (without de-registering them first) as the cluster would lose the required quorum. When a new cluster is created one has to tell the first Consul agent the expected number of nodes via the -bootstrap-expect parameter. Then other agents are told to connect to the 1st agent. If Consul is run inside a container the docker run command grows quite verbose, so I wrote the startConsulContainer.sh wrapper script. It supports starting Consul server, Consul client and Registrator containers.
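The wrapper script hides the details, but the underlying docker run commands look roughly like the following sketch; the three-node cluster size and the IPs are illustrative assumptions, using the official consul image:

```shell
# On the first machine: a Consul server that waits for a three-node quorum.
docker run -d --name consul-server --net host consul \
    agent -server -bootstrap-expect=3 -bind=10.0.0.11

# On the other machines: servers that join via the first agent's IP.
docker run -d --name consul-server --net host consul \
    agent -server -retry-join=10.0.0.11 -bind=10.0.0.12

# On machines that only need to query the cluster: a client-mode agent.
docker run -d --name consul-client --net host consul \
    agent -retry-join=10.0.0.11 -bind=10.0.0.21
```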

Registrator is used to auto-register all running containers to Consul's service catalog. It also supports other backends such as etcd and ZooKeeper. To know which containers are running, the Docker daemon's socket has to be mounted into the container as a volume via the -v /var/run/docker.sock:/tmp/docker.sock command line argument; startConsulContainer.sh does this automatically. Registrator inspects the environment variables of each container and uses them to auto-generate tags and other metadata in Consul's service catalog. With good conventions it is easy to find all services of type X, in environment Y (like test, qa or production) or for client Z. Given all this it is easy to implement service discovery, auto-configure load balancers, set up health monitoring and so forth. And with Docker as its only dependency it works the same on on-premises hardware as well as in the cloud.
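Outside of the wrapper script, launching Registrator amounts to mounting the Docker socket and pointing it at the local Consul agent; the sketch below uses the gliderlabs/registrator image and Registrator's SERVICE_NAME / SERVICE_TAGS conventions, while the "api" service itself is hypothetical:

```shell
# Registrator watches the Docker socket and registers every container
# it sees into the local Consul agent's service catalog.
docker run -d --name registrator \
    -v /var/run/docker.sock:/tmp/docker.sock \
    gliderlabs/registrator consul://127.0.0.1:8500

# Containers can override the registered metadata via environment variables,
# e.g. a hypothetical "api" service tagged with its environment:
docker run -d -p 8080:80 \
    -e SERVICE_NAME=api -e SERVICE_TAGS=production \
    nginx
```

With such tags in place, the catalog can be filtered by service type and environment, which is what makes the auto-configured load balancers mentioned above practical.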


Related blog posts:

AnalyticsPlatform
NginxBridge
BenchmarkTaxiridesEsSql
InternalNetwork
CljTaxirides