Installation and Configuration

Planning resources

The hardware configuration needed to run the DLT depends on the required throughput and latency. Of the components, Cassandra has the highest CPU, disk IO, and memory requirements, and in most cases performance will be constrained by either CPU usage or disk IO latency. We recommend monitoring CPU usage and disk IO latency on the Cassandra instances to make sure the servers have enough CPU and disk bandwidth. Server-attached disks provide higher throughput than network-attached disks because of their lower latency.
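
To spot-check these metrics on a Cassandra node, commands along the following lines can be used. This is a minimal sketch and assumes the standard sysstat package is installed.

# Disk IO latency (await, in ms) and device saturation (%util),
# refreshed every 5 seconds (requires the sysstat package)
$ iostat -dxm 5

# Per-CPU utilisation, refreshed every 5 seconds
$ mpstat -P ALL 5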

Core and Apollo have very modest system requirements and can be installed together on the same instance. However, we recommend running these applications on instances separate from the Cassandra instances.

Installation and configuration

This section describes how to install the DLT stack. The procedures that follow do not assume any particular cloud provider and should be applicable to any infrastructure provider as well as bare-metal setups.

As previously mentioned, there are three layers in the DLT, and installation instructions for each layer are provided below. The instructions in this guide can be used to create automated scripts that set up the stack in a particular infrastructure environment. Such a set of scripts for AWS is available here.

Step 1: Install Cassandra

Cassandra version 2.2.11 is used in the DLT stack. The following snippets can be used to install Cassandra on ALL instances in the cluster.

# Add the Apache Cassandra repository
$ cat > /etc/yum.repos.d/cassandra.repo <<EOF
[cassandra]
name=Apache Cassandra
baseurl=https://www.apache.org/dist/cassandra/redhat/22x/
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://www.apache.org/dist/cassandra/KEYS
EOF

# Install Cassandra
$ yum install -y cassandra

Step 2: Configure Cassandra

Once Cassandra is installed on all instances, it needs to be configured. Choose the seed nodes before configuring: one or two instances from the Cassandra cluster can be designated as seed nodes. Note the IP address of each seed node. Next, change the following values in the Cassandra configuration file /etc/cassandra/conf/cassandra.yaml on each Cassandra instance. All other properties in the configuration file can be left at their default values.

cluster_name: Orb DLT
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "<Seed node 1 IP>,<Seed node 2 IP>"
listen_address: <private node ip>
rpc_address: <private node ip>
endpoint_snitch: GossipingPropertyFileSnitch
commitlog_sync: batch
commitlog_sync_batch_window_in_ms: 2
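
Because listen_address and rpc_address differ on every node, an automated setup typically patches them in place. The following is a minimal sketch, assuming the node's private IP is the first address reported by hostname -I; adapt the lookup to your provider's metadata service if needed.

# Sketch: substitute this node's private IP into cassandra.yaml
# (hostname -I is an assumption; a cloud metadata service can be used instead)
$ PRIVATE_IP=$(hostname -I | awk '{print $1}')
$ sed -i "s/^listen_address:.*/listen_address: $PRIVATE_IP/" /etc/cassandra/conf/cassandra.yaml
$ sed -i "s/^rpc_address:.*/rpc_address: $PRIVATE_IP/" /etc/cassandra/conf/cassandra.yaml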

After the configuration has been completed, start Cassandra on the seed nodes one at a time. Once the seed nodes are up, the remaining nodes can all be started at once (a scripted sketch of this sequencing is shown after the command below).

$ sudo service cassandra start
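
In an automated setup, this sequencing might be scripted along the following lines. The IP placeholders, SSH access, and the 60-second settle time are assumptions.

# Sketch: start the seed nodes one at a time, then the remaining nodes
# (IP lists, SSH access and the settle time are assumptions)
$ for ip in <seed 1 IP> <seed 2 IP>; do
      ssh $ip "sudo service cassandra start"
      sleep 60   # let the seed join the ring before starting the next node
  done
$ for ip in <other node IPs>; do
      ssh $ip "sudo service cassandra start" &
  done; wait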

After the service has been started on all nodes, the status of the cluster can be checked from any node using the nodetool status command. The expected output looks like this:

$ nodetool status
  Datacenter: datacenter1
  ===============
  Status=Up/Down
  |/ State=Normal/Leaving/Joining/Moving
  --  Address    Load       Tokens    Owns     Host ID                               Rack
  UN  10.0.0.10  103.26 KB  256       100.0%   8cb0b05d-f55d-4792-8512-cdbabeabe0b1  rack1
  UN  10.0.0.20  101.91 KB  256       100.0%   5775d3d3-e4be-4e80-9113-85920426ca98  rack1
  UN  10.0.0.30  102.11 KB  256       100.0%   8fb8b86d-67fa-4de7-b9e6-13fd04b6ab92  rack1

Step 3: Install Apollo

The next step is to install Apollo. We recommend installing Apollo on instances separate from those running Cassandra.
Apollo is a Java application, so Java must be installed as a prerequisite. We recommend Oracle JDK 1.8. The following steps provide an example of how to install Java along with some other prerequisites.

# Install Java

$ S3_STORE="s3://imagine-orb-com-k8s-state-store"
$ aws s3 cp $S3_STORE/jdk/jdk-8u144-linux-x64.tar.gz /tmp/
$ mkdir -p /usr/lib/jdk
$ tar -xzf /tmp/jdk-8u144-linux-x64.tar.gz -C /usr/lib/jdk
$ ln -sf /usr/lib/jdk/jdk1.8.0_144/bin/java /usr/bin/java
$ rm /tmp/jdk-8u144-linux-x64.tar.gz

Now, download the Apollo jars and configuration files from S3. In the future, these files may be available from other sources as well. The $VERSION variable below should be set to the desired Apollo release.

$ S3_STORE="s3://imagine-orb-com-k8s-state-store"
$ tar_name=apollo.tar.gz

# Configuration files
$ aws s3 cp --recursive $S3_STORE/applications/apollo/conf/ /etc/apollo/conf/

# Apollo jars
$ aws s3 cp $S3_STORE/applications/apollo/$VERSION/$tar_name /usr/local/bin/apollo/$VERSION/
$ tar -xzf /usr/local/bin/apollo/$VERSION/$tar_name -C /usr/local/bin/apollo/$VERSION/
$ mkdir -p /apollo
$ ln -sf /usr/local/bin/apollo/$VERSION/jars /apollo/

Lastly, configure the Cassandra node IP addresses in the /etc/apollo/conf/storage.yaml file. The file should look something like this:

# Configurations in Apollo Storage (/etc/apollo/conf/storage.yaml)

storage_class: com.orb.apollo.storage.api.CassandraStorage

endpoints:
    - 10.0.0.10
    - 10.0.0.20
    - 10.0.0.30

Apollo is now ready to start. We recommend running Apollo under an upstart/systemd service, depending on your Linux distribution; a sketch systemd unit is shown after the example below. By default, the Apollo server listens on port 9090. The expected output after starting the server is:

$	java -cp "/apollo/jars/netty-all-4.0.42.Final.jar:/apollo/jars/*" com.orb.apollo.thrift.Server
	2018-01-17 14:30:49,728 [INFO  com.orb.apollo.thrift.Server] Apollo Thrift server is started.
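
For the systemd recommendation above, a unit file along the following lines could be used. This is a sketch only: the unit name, restart policy and paths are assumptions to adapt to your environment.

# Sketch of a systemd unit for Apollo (unit name and paths are assumptions)
$ cat > /etc/systemd/system/apollo.service <<EOF
[Unit]
Description=Apollo Thrift Server
After=network.target

[Service]
ExecStart=/usr/bin/java -cp "/apollo/jars/netty-all-4.0.42.Final.jar:/apollo/jars/*" com.orb.apollo.thrift.Server
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

$ systemctl daemon-reload
$ systemctl enable --now apollo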

Step 4: Install Core

The final step is to install Core. Core is a single binary application; the same binary can run as Coin Core, Account Core or Query Core depending on the arguments passed to it at run time. Core has no external dependencies.

Download the binary from AWS S3 using the following steps. The $VERSION variable should be chosen from the list of releases available from the DevOps team.

$ S3_STORE="s3://imagine-orb-com-k8s-state-store"

# Download the core binary
$ mkdir -p /usr/local/bin/core/$VERSION
$ aws s3 cp $S3_STORE/applications/core/$VERSION/core /usr/local/bin/core/$VERSION/
$ chmod +x /usr/local/bin/core/$VERSION/core
$ ln -sf /usr/local/bin/core/$VERSION/core /usr/bin/core

Step 5: Configure Core

After Core has been installed, configuration files for Coin, Account and Query need to be made available. A set of such configuration files is available in S3. The following steps show how to download them from S3, although you can supply your own configuration files. Note that, at present, the configuration files cannot be changed once the DLT stack has been made operational with a particular set of them.

# Download configuration
$ mkdir -p /etc/core/conf
$ aws s3 cp --recursive $S3_STORE/applications/core/conf/ /etc/core/conf/

The binary can be used to start the Coin Core, Account Core and Query Core processes as follows. If you have installed the Apollo server on the same instance, <apollo ip:port> will default to localhost:9090. Note that all three examples below bind to port 8080, so if more than one Core process runs on the same instance, give each a distinct -listen port. As with Apollo, we recommend writing systemd/upstart service files for these processes.

# Start Coin Core
$ /usr/bin/core coin -tls=false -config=/etc/core/conf -listen=0.0.0.0:8080 -log=warning -storage-host=<apollo ip:port> -keyspace=orb_coin

# Start Account Core
$ /usr/bin/core account -tls=false -config=/etc/core/conf -listen=0.0.0.0:8080 -log=warning -storage-host=<apollo ip:port> -keyspace=orb_coin

# Start Query Core
$ /usr/bin/core query -tls=false -config=/etc/core/conf -listen=0.0.0.0:8080 -log=warning -storage-host=<apollo ip:port> -keyspace=orb_coin

Cassandra Operations

All databases need periodic checks and maintenance. We advise running repair and backup procedures for your Cassandra cluster regularly. Repair ensures that all nodes in the cluster stay in sync with each other, while backups ensure you don't lose data in the unlikely event of a major failure of the Cassandra cluster. The following sections describe each of these operations in detail.

Repair

Because Cassandra is a distributed database, there can be scenarios in which the data on a node falls out of sync with the rest of the cluster, for example if the node was down for some time or was unreachable due to network issues. Although Cassandra is designed to handle such scenarios automatically, it is recommended to run the repair command on each node periodically. This command syncs the node's data with the cluster.

The repair command can be run as follows on a node running Cassandra. Make sure to run it on every node, at least once every 10 days. cron can be used to schedule the repair; a sketch crontab entry is shown after the command.

$	nodetool repair
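
Because the repair process is resource intensive (see below), only one node should repair at a time. The crontab entry below is a sketch: it runs repair on the 1st, 11th and 21st of each month at 02:00, i.e. roughly every 10 days, and the days should be offset on each node (for example 2, 12 and 22 on the next one) so repairs never overlap. The log path is an assumption.

# Sketch crontab entry: repair roughly every 10 days at 02:00; offset the
# day-of-month per node so that only one node repairs at a time
0 2 1,11,21 * * /usr/bin/nodetool repair >> /var/log/cassandra/repair.log 2>&1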

The repair process is resource intensive, so it is advisable to run it on only one node at a time. To manage repairs across the cluster, we recommend using Cassandra Reaper. Reaper requires remote JMX to be enabled on Cassandra, which can be done by disabling local JMX in the /etc/cassandra/conf/cassandra-env.sh file. Next, install Reaper and configure it to repair the DLT Cassandra cluster. Reaper has a web-based UI; SSH tunneling can be used to access it for configuration.

Backup

The Cassandra database should be backed up periodically. Backups make it possible to restore data, including point-in-time restores. To set up the backup procedure, download the backup script tar on every Cassandra node. Note that each node needs to be backed up separately.

After downloading the backup scripts, configure them using the file conf/backup.yaml, as follows:

# Application name
app_name: ORB_DLT

# Keyspaces to back up
cass_keyspaces:
  system:
  orb_coin:

# Cassandra's config file path
cass_conf_path: /etc/cassandra/conf/cassandra.yaml

# Period (in seconds) between incremental backup uploads (86400 = daily)
cass_upload_upload_period_incremental_backup: 86400

The backup scripts back up the Cassandra data files on the node and upload them to AWS S3. A backup is triggered using the following command:

$	AWS_S3_BUCKET_NAME=<bucket_name> backup.sh SNAP

Use cron to schedule the backup every week; a sketch crontab entry follows.
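
For example, assuming the backup scripts were unpacked to /opt/backup (the path and log file are placeholders), a weekly crontab entry might look like this:

# Sketch crontab entry: full snapshot backup every Sunday at 03:00
# (/opt/backup and the log path are assumptions for illustration)
0 3 * * 0 AWS_S3_BUCKET_NAME=<bucket_name> /opt/backup/backup.sh SNAP >> /var/log/cassandra-backup.log 2>&1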

Monitoring

We recommend configuring scripts to monitor application health for all three components of the DLT stack. AWS CloudWatch can be used to store and view various statistics about the Cassandra, Apollo and Core applications. The following parameters can be tracked:

  1. Cassandra: Cassandra is the only stateful application in the stack; it stores the entire ledger data. The key parameters to monitor are process liveness, disk statistics and CPU usage:
  • Service status: Ideally, all nodes should be up at all times. However, if you run Cassandra at replication factor 3, the cluster remains operational even with one node down. Use the service status metric to alert you when a node goes down; this lowers the chances of a second node failing unnoticed at the same time.
  • Remaining disk space: Nodes should stay below 60-70% disk usage, which is required for backups to run smoothly. If usage climbs above that, either move to larger instances or increase the disk space.
  • Disk read and write IO statistics: Disk IOPS (read and write) can become a bottleneck for ledger transactions. We have observed roughly 2K write IOPS at 20 ledger transactions per second. If you run into throughput issues, check whether the disk IOPS are close to the available limits.
  • CPU usage: This is another possible bottleneck to high throughput. If you are constantly running at high CPU load, either increase the number of Cassandra nodes or use larger servers.
  2. Apollo and Core
  • Memory usage: Largely needed to make sure the applications do not have memory leaks.
  • CPU usage: Both Apollo and Core have very modest CPU requirements; combined, they use about 10% CPU on a t2.large AWS instance at 20 transactions per second.

AWS CloudWatch is an easy way to set up monitoring for the above parameters. It provides basic metrics such as CPU usage without any setup on the developer's part. For more custom parameters, the scripts here can be used to upload metrics to CloudWatch; a sketch of uploading one such metric follows. The AWS CloudWatch console can then be used to build a dashboard for viewing these metrics.
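
As an illustration, a custom metric such as disk usage could be pushed with the AWS CLI along these lines; the namespace, metric name and data path below are assumptions.

# Sketch: push disk usage of the Cassandra data volume to CloudWatch
# (namespace, metric name and data path are assumptions)
$ USED=$(df --output=pcent /var/lib/cassandra | tail -1 | tr -dc '0-9')
$ aws cloudwatch put-metric-data \
      --namespace "DLT/Cassandra" \
      --metric-name DiskUsedPercent \
      --dimensions Node=$(hostname) \
      --value "$USED" --unit Percent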

Wrapping Up

To summarise, the DLT stack consists of Core, Apollo and Cassandra. This guide has explained how to install and connect each of these components, and has described how to keep the DLT setup running smoothly. Even though the guide largely focuses on AWS, it is not difficult to adapt the steps described here to any other cloud platform.