Single Node Cluster
In this section, we describe setting up a CitusDB cluster with one master and two worker databases. To avoid configuring authentication settings, we initialize these databases on a single node and give our examples in that context. This single node cluster executes exactly the same logic as a multi-node one; we later describe the steps needed to set up authentication between multiple nodes.
To get started with this cluster, you first need to download the corresponding CitusDB package from the downloads page. Alternatively, you can download the package from the command line if you already know its name. For example, if you are running Fedora 12+ or Ubuntu 10.04+ on a 64-bit machine:
localhost# wget http://packages.citusdata.com/readline-6.0/citusdb-2.0.1-1.x86_64.rpm
localhost# sudo rpm --install citusdb-2.0.1-1.x86_64.rpm

OR

localhost# wget http://packages.citusdata.com/readline-6.0/citusdb-2.0.1-1.amd64.deb
localhost# sudo dpkg --install citusdb-2.0.1-1.amd64.deb
When you install the package, our installer puts all binaries and libraries under /opt/citusdb/2.0, and also creates a data subdirectory to store the newly initialized database's contents. The installer then sets the owner of both directories to the currently logged-in user. If the installer cannot determine the current user, it creates a new postgres user that owns the database directories.
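If you want to confirm the install layout and directory ownership described above, a quick check like the following can help. This is a sketch that assumes the default install prefix /opt/citusdb/2.0; it simply skips any directory that does not exist.

```shell
# Hypothetical sanity check, assuming the default install prefix.
for dir in /opt/citusdb/2.0 /opt/citusdb/2.0/data; do
  if [ -d "$dir" ]; then
    # Print each directory's owner to confirm the installer set it correctly
    ls -ld "$dir"
  fi
done
```

If the owner column shows an unexpected user, you can fix it with chown before initializing any databases.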
In distributed setups, one database per node is enough. To simulate distributed behavior on a single node, however, we need at least two more databases. We therefore assume that the already installed database will act as the master, and go ahead and initialize two more worker databases:
localhost# /opt/citusdb/2.0/bin/initdb -D /opt/citusdb/2.0/data.9700
localhost# /opt/citusdb/2.0/bin/initdb -D /opt/citusdb/2.0/data.9701
With these commands, we now have one master and two worker databases installed. We next need to tell the master database about the workers. To do so, open the master's membership file and append the worker database names to it:
localhost# emacs -nw /opt/citusdb/2.0/data/pg_worker_list.conf

# HOSTNAME [PORT] [RACK]
localhost 9700
localhost 9701
Note that certain OS versions refer to localhost as localhost.localdomain; the actual name can be checked by running the hostname command. Once you have set up the membership file, you can start one master and two worker databases on different ports. In the following, we manually pass the port number as part of the startup options, but you can also edit the port setting in the postgresql.conf configuration file.
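To check the name this machine reports for itself before editing the membership file, run:

```shell
# Print this machine's hostname; use this value in pg_worker_list.conf
# if "localhost" does not resolve as expected on your OS.
hostname
```

If the output is localhost.localdomain rather than localhost, use that name in pg_worker_list.conf instead.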
localhost# /opt/citusdb/2.0/bin/pg_ctl -D /opt/citusdb/2.0/data -l logfile start
localhost# /opt/citusdb/2.0/bin/pg_ctl -D /opt/citusdb/2.0/data.9700 -o "-p 9700" \
             -l logfile.9700 start
localhost# /opt/citusdb/2.0/bin/pg_ctl -D /opt/citusdb/2.0/data.9701 -o "-p 9701" \
             -l logfile.9701 start
Now, you can connect to the master database and start issuing queries against your cluster. We cover some example queries in the next section.
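As a quick sanity check, you can connect to the master with psql and run a simple query. This sketch assumes the master listens on the default PostgreSQL port (5432) and that a postgres database exists; it skips the check if the psql binary is not found at the default install path.

```shell
# Hypothetical connectivity check against the master database.
PSQL=/opt/citusdb/2.0/bin/psql
if [ -x "$PSQL" ]; then
  # Connect to the master (default port 5432) and run a trivial query
  "$PSQL" -h localhost -d postgres -c "SELECT version();"
fi
```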