Zookeeper is an Apache project that was designed and implemented to help other developers build distributed applications. Many of the services that Zookeeper provides are a common set of services that all distributed applications require. Zookeeper implements them for you so that you don't have to implement them yourself, letting you focus on your application rather than on distributed infrastructure components like naming, configuration management, synchronization, group services, etc.
Note: This is the first story in our encounter with Zookeeper. The following story can be found here.
This post is a brief introduction to Zookeeper. We will show you how to install, configure, start and work with a Zookeeper server.
Then we will also show you how to interact with the Zookeeper server using Ruby.
Let's start.
Note: For those who work in OS X and who like to use brew, installing Zookeeper might be as simple as brew install zookeeper.
You can download Zookeeper from any site that is listed in the download page. Pick up a stable release by downloading the file inside the stable folder.
I have downloaded the file zookeeper-3.4.10.tar.gz.
Then, I have unzipped/untarred the file into the folder ~/Documents/zookeeper-3.4.10.
Before we can start the Zookeeper server, we will have to specify a minimum configuration file. Let's create the file conf/zoo.cfg inside the folder where you have your Zookeeper installation. This file can be created as a copy of the existing sample file.
In the Zookeeper folder:
$ cp conf/zoo_sample.cfg conf/zoo.cfg
The minimum configuration that will allow you to start 1 Zookeeper server is the following:
tickTime=2000
dataDir=/Users/pmatsino/Documents/zookeeper-data
clientPort=2181
As you can see, I have specified the dataDir to be a directory in my Documents folder. This folder, /Users/pmatsino/Documents/zookeeper-data, needs to be present. Go ahead and create this folder before you continue. This directory is going to keep the memory snapshots and the transaction log for the updates of the database. Note that Zookeeper keeps its state in memory in order to be efficient, but it flushes it into snapshots inside the dataDir. Also, the updates are atomic, and the transaction log is kept inside the dataDir too.
The tickTime is given in milliseconds and is the basic time unit used by Zookeeper. It is used to implement heartbeats, and it also determines the minimum session timeout, which is going to be twice the tickTime.
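To make the arithmetic concrete, here is a minimal Ruby sketch that reads the three settings above and derives the minimum session timeout. The helper name parse_zoo_cfg is made up for this illustration and is not part of Zookeeper; the sketch only assumes that the file consists of simple key=value lines.

```ruby
# Hypothetical helper: parse "key=value" lines (as in conf/zoo.cfg) into a Hash.
# Blank lines and lines starting with '#' are skipped.
def parse_zoo_cfg(text)
  text.each_line.with_object({}) do |line, settings|
    line = line.strip
    next if line.empty? || line.start_with?('#')
    next unless line.include?('=')

    key, value = line.split('=', 2)
    settings[key.strip] = value.strip
  end
end

# The minimum configuration from this post, inlined so the sketch is self-contained.
cfg = parse_zoo_cfg(<<~CFG)
  tickTime=2000
  dataDir=/Users/pmatsino/Documents/zookeeper-data
  clientPort=2181
CFG

tick_time_ms = Integer(cfg.fetch('tickTime'))
# The minimum session timeout is twice the tickTime.
min_session_timeout_ms = 2 * tick_time_ms

puts "tickTime: #{tick_time_ms} ms"
puts "minimum session timeout: #{min_session_timeout_ms} ms"
```

With a tickTime of 2000, the minimum session timeout comes out at 4000 milliseconds.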
The clientPort is the port the Zookeeper server is going to listen on for client connections.
With the above settings in place, it is very easy to start a Zookeeper server:
In the Zookeeper folder:
$ bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /Users/pmatsino/Documents/zookeeper-3.4.10/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
$
Zookeeper's data model is based on nodes, which are called znodes. Each node has a unique name, which is composed of the path parts needed to reach that node, just like folders and files in a file system. The root node is /. Then you can create as many child nodes as you like, and then children of children, and so on. Here is another node: /foo/bar. The node bar is a child of node /foo, which in turn is a child of the root node (/).
Besides that, each node may have data attached to it. In fact, a node either has children nodes or data or both. But it cannot be without either of them. However, the data might be an empty string.
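The path-based naming can be illustrated with a few lines of Ruby. This is only a sketch of how a name like /foo/bar breaks down into the chain of nodes leading to it; the method name znode_ancestry is invented for the example and has nothing to do with the Zookeeper API.

```ruby
# Return every znode on the way from the root to +path+, root first.
def znode_ancestry(path)
  parts = path.split('/').reject(&:empty?)
  parts.each_with_object(['/']) do |part, nodes|
    parent = nodes.last
    nodes << (parent == '/' ? "/#{part}" : "#{parent}/#{part}")
  end
end

puts znode_ancestry('/foo/bar').inspect # prints ["/", "/foo", "/foo/bar"]
```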
Now we can use the client tool that is provided with the Zookeeper installation in order to connect to the Zookeeper server. Here is how:
In the Zookeeper folder:
$ bin/zkCli.sh -server localhost:2181
...
2017-08-29 13:58:40,109 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x15e2de052c30000, negotiated timeout = 30000
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0]
After lots of lines being printed on your terminal, you reach the [zk: localhost:2181(CONNECTED) 0] which is the prompt that you can use to send commands from the client to the server.
The help command will list the commands that you can use:
[zk: localhost:2181(CONNECTED) 0] help
ZooKeeper -server host:port cmd args
stat path [watch]
set path data [version]
ls path [watch]
delquota [-n|-b] path
ls2 path [watch]
setAcl path acl
setquota -n|-b val path
history
redo cmdno
printwatches on|off
delete path [version]
sync path
listquota path
rmr path
get path [watch]
create [-s] [-e] path data acl
addauth scheme auth
quit
getAcl path
close
connect host:port
[zk: localhost:2181(CONNECTED) 1]
Now that we are using the command line interface, let's list the current nodes:
[zk: localhost:2181(CONNECTED) 1] ls /
[zookeeper]
[zk: localhost:2181(CONNECTED) 2]
As you can see, the root node / has one child node, named zookeeper.
Let's create a new node, and then list the nodes again. Note that when we create a new node we give the data to attach to the node. In the following example, "bar" is a piece of string data to attach to node /foo.
[zk: localhost:2181(CONNECTED) 4] create /foo bar
Created /foo
[zk: localhost:2181(CONNECTED) 5] ls /
[zookeeper, foo]
And we can get the details of a node:
[zk: localhost:2181(CONNECTED) 6] get /foo
bar
... ( more output here ) ...
dataLength = 3
numChildren = 0
[zk: localhost:2181(CONNECTED) 7]
Do you see the first line? bar. It is the data associated to the node /foo.
We can update the details of a node with the set command:
[zk: localhost:2181(CONNECTED) 7] set /foo mary
... ( more output here ) ...
dataLength = 4
numChildren = 0
[zk: localhost:2181(CONNECTED) 8] get /foo
mary
... ( more output here ) ...
dataLength = 4
numChildren = 0
[zk: localhost:2181(CONNECTED) 9]
With set /foo mary we update the data for the node /foo to be the string mary. We then confirm with get /foo.
Let's now delete the /foo node:
[zk: localhost:2181(CONNECTED) 9] delete /foo
[zk: localhost:2181(CONNECTED) 10] ls /
[zookeeper]
[zk: localhost:2181(CONNECTED) 11]
The delete /foo command deletes the /foo node. We then confirm with ls /.
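If it helps to see the session above as a data structure, here is a toy, in-memory Ruby model of the same create, ls, get, set and delete steps. It is not Zookeeper and does not talk to a server: the class ToyZnodeTree and its methods are invented for this sketch, and it leaves out sessions, watches, versions and replication. It does mirror two rules of the real thing: a node can only be created under an existing parent, and a node that still has children cannot be deleted with delete.

```ruby
# Toy in-memory model of a znode tree. NOT Zookeeper: no sessions,
# watches, versions or replication. Names are hypothetical.
class ToyZnodeTree
  def initialize
    @data = { '/' => '' } # the root node always exists
  end

  def create(path, data = '')
    raise ArgumentError, 'path must start with /' unless path.start_with?('/')
    raise ArgumentError, "#{path} already exists" if @data.key?(path)
    raise ArgumentError, "parent of #{path} does not exist" unless @data.key?(parent_of(path))

    @data[path] = data
    path
  end

  # Names of the direct children of +path+.
  def ls(path)
    prefix = path == '/' ? '/' : "#{path}/"
    @data.keys
         .select { |key| key != '/' && key.start_with?(prefix) }
         .map { |key| key[prefix.length..-1] }
         .reject { |name| name.include?('/') }
  end

  def get(path)
    @data.fetch(path)
  end

  def set(path, data)
    raise ArgumentError, "#{path} does not exist" unless @data.key?(path)

    @data[path] = data
  end

  def delete(path)
    raise ArgumentError, 'cannot delete the root' if path == '/'
    raise ArgumentError, "#{path} has children" unless ls(path).empty?

    @data.delete(path)
  end

  private

  def parent_of(path)
    index = path.rindex('/')
    index.zero? ? '/' : path[0...index]
  end
end

tree = ToyZnodeTree.new
tree.create('/foo', 'bar')
puts tree.ls('/').inspect # ["foo"]
puts tree.get('/foo')     # bar
tree.set('/foo', 'mary')
puts tree.get('/foo')     # mary
tree.delete('/foo')
puts tree.ls('/').inspect # []
```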
Things are getting more interesting, of course, when you have multiple Zookeeper servers replicating your data. Working with 1 server is good while doing development. On the other hand, your production system will have to have more Zookeeper servers working in a replicated configuration.
Let's see how we can start 3 Zookeeper servers locally.
First, let's stop the server that is running at the moment:
In the Zookeeper folder:
$ bin/zkServer.sh stop
ZooKeeper JMX enabled by default
Using config: /Users/pmatsino/Documents/zookeeper-3.4.10/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
$
Create 3 different data directories, one for each of the Zookeeper servers.
In the ~/Documents folder:
$ mkdir zookeeper-data-1
$ mkdir zookeeper-data-2
$ mkdir zookeeper-data-3
myid files
Inside the data directories, you need to create a myid file with the id of each server. Let's keep it very simple for our demo:
In the ~/Documents folder:
$ echo '1' > zookeeper-data-1/myid
$ echo '2' > zookeeper-data-2/myid
$ echo '3' > zookeeper-data-3/myid
The ids of our servers will be 1, 2 and 3 respectively.
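For those who prefer to script this step, the following Ruby sketch creates the data directories and their myid files in one go. The method name write_myid_files is made up for the example. To stay harmless, the sketch writes into a temporary directory; pointing it at your real ~/Documents folder is left to you.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: create zookeeper-data-<id>/myid under +base_dir+
# for each id, and return the paths of the files written.
def write_myid_files(base_dir, ids)
  ids.map do |id|
    data_dir = File.join(base_dir, "zookeeper-data-#{id}")
    FileUtils.mkdir_p(data_dir)
    myid_path = File.join(data_dir, 'myid')
    File.write(myid_path, "#{id}\n")
    myid_path
  end
end

# Demo in a throwaway directory that is removed when the block ends.
Dir.mktmpdir do |tmp|
  write_myid_files(tmp, [1, 2, 3]).each do |path|
    puts "#{path.sub(tmp + '/', '')}: #{File.read(path).strip}"
  end
end
```

Each myid file holds exactly the id of its own server, which is what the three echo commands are meant to achieve.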
Now, let's go to our conf folder and create three different configurations. One for each of the servers. We will use these files to start each server accordingly:
conf/1.cfg
The configuration file for the first server, with id 1 (create it inside ~/Documents/zookeeper-3.4.10/conf):
tickTime=2000
dataDir=/Users/pmatsino/Documents/zookeeper-data-1
clientPort=2181
initLimit=10
syncLimit=5
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
Pay attention to the following. Since we are starting all servers in the same machine:
- dataDir is different per server. Here we specify the dataDir for the server with id 1.
- clientPort is different per server.
- quorum and leader election ports are different for each server. Also, the configuration of a server needs to know the quorum and leader election ports for the other servers too. Here we specify the 2888 and 3888 for the first server with id 1. Then 2889 and 3889 for server with id 2. Finally, 2890 and 3890 for server with id 3.
Having said the above, let's create the configuration files for the other 2 servers:
conf/2.cfg
The configuration file for the server with id 2 (create it inside ~/Documents/zookeeper-3.4.10/conf):
tickTime=2000
dataDir=/Users/pmatsino/Documents/zookeeper-data-2
clientPort=2182
initLimit=10
syncLimit=5
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
conf/3.cfg
The configuration file for the server with id 3 (create it inside ~/Documents/zookeeper-3.4.10/conf):
tickTime=2000
dataDir=/Users/pmatsino/Documents/zookeeper-data-3
clientPort=2183
initLimit=10
syncLimit=5
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
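The three files above differ only in their dataDir and clientPort, while the server.N lines are identical in all of them. A short Ruby sketch makes that pattern explicit by generating the text of each file from one table of ports. The constant SERVERS and the method zoo_cfg are invented for this illustration; the values are the ones used in this post.

```ruby
# Ports per server id, as used in this post (all servers run on localhost).
SERVERS = {
  1 => { client_port: 2181, quorum_port: 2888, election_port: 3888 },
  2 => { client_port: 2182, quorum_port: 2889, election_port: 3889 },
  3 => { client_port: 2183, quorum_port: 2890, election_port: 3890 }
}.freeze

# Hypothetical helper: build the configuration text for the server with +id+.
def zoo_cfg(id, servers, data_root)
  lines = [
    'tickTime=2000',
    "dataDir=#{data_root}/zookeeper-data-#{id}",
    "clientPort=#{servers.fetch(id)[:client_port]}",
    'initLimit=10',
    'syncLimit=5'
  ]
  # Every server lists the quorum and leader election ports of all servers.
  servers.each do |server_id, ports|
    lines << "server.#{server_id}=localhost:#{ports[:quorum_port]}:#{ports[:election_port]}"
  end
  lines.join("\n") + "\n"
end

puts zoo_cfg(2, SERVERS, '/Users/pmatsino/Documents')
```

Calling zoo_cfg with ids 1 and 3 gives the text of the other two files.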
start all script
Everything is ready. However, let's make our life a little bit easier by creating the start_all.sh script inside the folder ~/Documents/zookeeper-3.4.10/bin:
#!/usr/bin/env bash
bin/zkServer.sh start ~/Documents/zookeeper-3.4.10/conf/1.cfg
bin/zkServer.sh start ~/Documents/zookeeper-3.4.10/conf/2.cfg
bin/zkServer.sh start ~/Documents/zookeeper-3.4.10/conf/3.cfg
As you can see, we start each Zookeeper server by passing, as an argument, the configuration file it needs to use.
Make sure that the script is executable:
In the Zookeeper folder:
$ chmod +x bin/start_all.sh
stop all script
Similarly, let's create the stop_all.sh script inside the folder ~/Documents/zookeeper-3.4.10/bin:
#!/usr/bin/env bash
bin/zkServer.sh stop ~/Documents/zookeeper-3.4.10/conf/1.cfg
bin/zkServer.sh stop ~/Documents/zookeeper-3.4.10/conf/2.cfg
bin/zkServer.sh stop ~/Documents/zookeeper-3.4.10/conf/3.cfg
Don't forget to make it executable:
In the Zookeeper folder:
$ chmod +x bin/stop_all.sh
Let's now kick-off our replicated Zookeeper servers:
In the Zookeeper folder:
$ bin/start_all.sh
ZooKeeper JMX enabled by default
Using config: /Users/pmatsino/Documents/zookeeper-3.4.10/conf/1.cfg
Starting zookeeper ... STARTED
ZooKeeper JMX enabled by default
Using config: /Users/pmatsino/Documents/zookeeper-3.4.10/conf/2.cfg
Starting zookeeper ... STARTED
ZooKeeper JMX enabled by default
Using config: /Users/pmatsino/Documents/zookeeper-3.4.10/conf/3.cfg
Starting zookeeper ... STARTED
$
Now, let's connect to server 1 and create some data and then quit:
In the Zookeeper folder:
$ bin/zkCli.sh -server 127.0.0.1:2181
...
WatchedEvent state:SyncConnected type:None path:null
[zk: 127.0.0.1:2181(CONNECTED) 0] create /replicated_demo three_servers
Created /replicated_demo
[zk: 127.0.0.1:2181(CONNECTED) 1] quit
$
Now, let's connect to server 3 and confirm that we have access to the same data:
In the Zookeeper folder:
$ bin/zkCli.sh -server 127.0.0.1:2183
...
WatchedEvent state:SyncConnected type:None path:null
[zk: 127.0.0.1:2183(CONNECTED) 0] ls /
[zookeeper, test, replicated_demo]
[zk: 127.0.0.1:2183(CONNECTED) 1] get /replicated_demo
three_servers
... ( more output here ) ...
dataLength = 13
numChildren = 0
[zk: 127.0.0.1:2183(CONNECTED) 2] quit
$
Bingo! The same data is available via all the servers. Try the second one too. And this is the idea behind the replicated Zookeeper.
Let's now stop all servers:
In the Zookeeper folder:
$ bin/stop_all.sh
ZooKeeper JMX enabled by default
Using config: /Users/pmatsino/Documents/zookeeper-3.4.10/conf/1.cfg
Stopping zookeeper ... STOPPED
ZooKeeper JMX enabled by default
Using config: /Users/pmatsino/Documents/zookeeper-3.4.10/conf/2.cfg
Stopping zookeeper ... STOPPED
ZooKeeper JMX enabled by default
Using config: /Users/pmatsino/Documents/zookeeper-3.4.10/conf/3.cfg
Stopping zookeeper ... STOPPED
$
Zookeeper provides client bindings for Java and C. But you can also use the zookeeper gem, which allows you to access a Zookeeper server using Ruby. Let's see an example:
Start the single-instance Zookeeper server again:
In the Zookeeper folder:
$ bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /Users/pmatsino/Documents/zookeeper-3.4.10/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
$
zookeeper gem
In the Zookeeper folder:
$ gem install zookeeper --no-ri --no-rdoc
Building native extensions. This could take a while...
Successfully installed zookeeper-1.4.11
1 gem installed
$
Start irb and issue commands to Zookeeper server using Ruby:
In the Zookeeper folder:
$ irb
irb(main):001:0> require 'zookeeper'
=> true
irb(main):002:0> zookeeper = Zookeeper.new('127.0.0.1:2181')
=> #<Zookeeper::Client:0x007fc2b9cc3c78 @host="127.0.0.1:2181", @chroot_path="", @req_registry=#<Zookeeper::RequestRegistry...>>>
irb(main):003:0> zookeeper.get_children(path: '/')
=> {:req_id=>0, :rc=>0, :children=>["zookeeper"], :stat=>#<Zookeeper::Stat:0x007fc2b9ca9788 @exists=true, @czxid=0, @mzxid=0, @ctime=0, @mtime=0, @version=0, @cversion=3, @aversion=0, @ephemeralOwner=0, @dataLength=0, @numChildren=1, @pzxid=22>}
irb(main):004:0>
Listing the children of a node is done with the method #get_children, which takes the path key as input. Do you see the result containing :children=>["zookeeper"]?
irb(main):004:0> zookeeper.create(path: '/foo', data: 'bar')
=> {:req_id=>1, :rc=>0, :path=>"/foo"}
irb(main):005:0> zookeeper.get_children(path: '/')
=> {:req_id=>8, :rc=>0, :children=>["zookeeper", "foo"], :stat=>#<Zookeeper::Stat:0x007fc2b9bf98b0 @exists=true, @czxid=0, @mzxid=0, @ctime=0, @mtime=0, @version=0, @cversion=6, @aversion=0, @ephemeralOwner=0, @dataLength=0, @numChildren=2, @pzxid=28>}
irb(main):012:0>
Do you see that the children now has value ["zookeeper", "foo"]?
irb(main):005:0> zookeeper.get(path: '/foo')
=> {:req_id=>2, :rc=>0, :data=>"bar", :stat=>#<Zookeeper::Stat:0x007fc2b9c82e08 @exists=true, @czxid=25, @mzxid=25, @ctime=1504016747266, @mtime=1504016747266, @version=0, @cversion=0, @aversion=0, @ephemeralOwner=0, @dataLength=3, @numChildren=0, @pzxid=25>}
irb(main):006:0>
irb(main):006:0> zookeeper.set(path: '/foo', data: 'mary')
=> {:req_id=>3, :rc=>0, :stat=>#<Zookeeper::Stat:0x007fc2b9c6bca8 @exists=true, @czxid=25, @mzxid=26, @ctime=1504016747266, @mtime=1504016802433, @version=1, @cversion=0, @aversion=0, @ephemeralOwner=0, @dataLength=4, @numChildren=0, @pzxid=25>}
irb(main):007:0> zookeeper.get(path: '/foo')
=> {:req_id=>4, :rc=>0, :data=>"mary", :stat=>#<Zookeeper::Stat:0x007fc2b9c58158 @exists=true, @czxid=25, @mzxid=26, @ctime=1504016747266, @mtime=1504016802433, @version=1, @cversion=0, @aversion=0, @ephemeralOwner=0, @dataLength=4, @numChildren=0, @pzxid=25>}
irb(main):008:0>
irb(main):008:0> zookeeper.delete(path: '/foo')
=> {:req_id=>5, :rc=>0}
irb(main):009:0> zookeeper.get_children(path: '/')
=> {:req_id=>6, :rc=>0, :children=>["zookeeper"], :stat=>#<Zookeeper::Stat:0x007fc2b9c325e8 @exists=true, @czxid=0, @mzxid=0, @ctime=0, @mtime=0, @version=0, @cversion=5, @aversion=0, @ephemeralOwner=0, @dataLength=0, @numChildren=1, @pzxid=27>}
irb(main):010:0>
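Every call in the session above returns a Hash that carries a result code under the :rc key next to the actual payload (:children, :data and so on), and every successful call shown here has :rc=>0. Here is a small sketch of a helper that checks the code before handing back the payload. The names ZOK_CODE, ZookeeperCallFailed and unwrap are invented for this example and are not part of the zookeeper gem; the hashes are hand-written copies of the shapes printed above, so the sketch runs without a server. The failing hash uses an arbitrary non-zero code for illustration only.

```ruby
# Assumption for this sketch: 0 is the success code, as seen in the
# successful results of the irb session.
ZOK_CODE = 0

# Hypothetical error class, not provided by the zookeeper gem.
class ZookeeperCallFailed < StandardError; end

# Return result[key] if the call succeeded, raise otherwise.
def unwrap(result, key)
  rc = result.fetch(:rc)
  raise ZookeeperCallFailed, "call failed with rc=#{rc}" unless rc == ZOK_CODE

  result.fetch(key)
end

# Hand-written hashes shaped like the results in the irb session.
children_result = { req_id: 0, rc: 0, children: ['zookeeper'] }
get_result      = { req_id: 2, rc: 0, data: 'bar' }

puts unwrap(children_result, :children).inspect
puts unwrap(get_result, :data)

# An illustrative failed result with an arbitrary non-zero code.
failed_result = { req_id: 9, rc: -1 }
begin
  unwrap(failed_result, :data)
rescue ZookeeperCallFailed => e
  puts e.message
end
```

With a live connection you would pass the return value of a call such as zookeeper.get(path: '/foo') to unwrap instead of a hand-written hash.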
Zookeeper will take a lot of the burden off your back when designing and developing a distributed application. This was a first introduction to Zookeeper, covering only the very basics.
Thank you for reading this blog post. And don't forget that your comments below are more than welcome. I am willing to answer any questions that you may have and give you feedback on any comments that you may post. I would like to have your feedback because I learn from you as much as you learn from me.
Panayotis Matsinopoulos works as Development Lead at Simply Business and, in his free time, enjoys giving and taking classes about Web development at Tech Career Booster.