In this tutorial, we'll explain how to install the Apache Cassandra NoSQL database on AlmaLinux 8.
Apache Cassandra is a highly scalable, distributed NoSQL database designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure.
Known for its high availability, fault tolerance, and ability to scale horizontally, Cassandra is ideal for applications that require high performance, reliability, and seamless data distribution across many nodes. It uses a decentralized, peer-to-peer architecture, making it well-suited for handling massive, real-time data workloads across geographically distributed environments.
Note: We tried to install Apache Cassandra on AlmaLinux 9, but the cqlsh command did not work, and we could not find a reliable guide to resolve the issue. AlmaLinux 8 ships with Python 3.6 by default, which is deprecated for use with Cassandra.
Prerequisites
Before you begin, ensure you have the following:
- A dedicated server or KVM VPS running AlmaLinux 8.
- A user with sudo privileges.
- At least 2GB of RAM. For this tutorial, we used a server with 4GB of RAM.
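The memory prerequisite can be checked from the shell before you start. The following is a minimal sketch, assuming a Linux system that exposes /proc/meminfo. The helper name mem_total_mb and the 1700 MiB threshold are our own choices for illustration, not part of any tool.

```shell
# mem_total_mb [FILE]: print total RAM in MiB from a meminfo-style file.
# Hypothetical helper; FILE defaults to /proc/meminfo.
mem_total_mb() {
  awk '/^MemTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# A "2GB" server usually reports a little less than 2048 MiB because the
# kernel reserves some memory, so this sketch assumes a 1700 MiB threshold.
if [ -r /proc/meminfo ]; then
  total=$(mem_total_mb)
  if [ "$total" -lt 1700 ]; then
    echo "Only ${total} MiB of RAM detected; below the suggested 2GB." >&2
  else
    echo "${total} MiB of RAM detected."
  fi
fi
```

Passing a file name as the first argument makes the helper easy to try against a saved copy of /proc/meminfo from another machine.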
Step 1: Update the System
Start by updating your package lists and upgrading your existing packages to the latest versions.
sudo dnf update -y
Step 2: Install Java
Cassandra requires Java to run. Install OpenJDK, which is a free and open-source implementation of the Java Platform.
sudo dnf install java-17-openjdk-devel -y
Verify the Java installation:
java -version
Next, set Java 17 as the default. On our AlmaLinux server, Java 1.8 was already installed and selected as the default. Apache Cassandra 5.0 does not support Java 8 (it supports Java 11 and 17), so switch the default to Java 17.
sudo update-alternatives --config java
Output:
There are 2 programs which provide 'java'.
  Selection    Command
-----------------------------------------------
*+ 1           java-1.8.0-openjdk.x86_64 (/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.422.b05-2.el9.x86_64/jre/bin/java)
   2           java-17-openjdk.x86_64 (/usr/lib/jvm/java-17-openjdk-17.0.12.0.7-2.el9.x86_64/bin/java)
Enter to keep the current selection[+], or type selection number: 2
After selecting Java 17, run java -version again. You should see output similar to:
openjdk version "17.0.12" 2024-07-16 LTS
OpenJDK Runtime Environment (Red_Hat-17.0.12.0.7-1) (build 17.0.12+7-LTS)
OpenJDK 64-Bit Server VM (Red_Hat-17.0.12.0.7-1) (build 17.0.12+7-LTS, mixed mode, sharing)
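The version check can also be scripted. The sketch below parses the first line of the java -version output and prints the major version, treating the legacy 1.x scheme (for example 1.8.0_422) as major version 8. The helper name java_major is hypothetical, and the parsing assumes the version string appears in double quotes as in the output above.

```shell
# java_major: read `java -version` output on stdin, print the major version.
# Hypothetical helper; assumes a line such as: openjdk version "17.0.12" ...
java_major() {
  awk -F'"' '/version/ {
    split($2, v, ".")
    # Legacy scheme: "1.8.0_422" means Java 8.
    print (v[1] == "1" ? v[2] : v[1])
    exit
  }'
}

# java -version writes to stderr, so redirect it into the pipe.
if command -v java >/dev/null 2>&1; then
  major=$(java -version 2>&1 | java_major)
  if [ "$major" = "17" ]; then
    echo "Default Java is 17."
  else
    echo "Default Java is ${major:-unknown}; this tutorial uses 17." >&2
  fi
fi
```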
Step 3: Install Apache Cassandra
Apache Cassandra is not available in the default AlmaLinux repositories, so you need to add the official Cassandra repository to your system. First, check the official download page for the latest stable version.
Create a new repository file:
sudo vi /etc/yum.repos.d/cassandra.repo
Add the following content to the file:
[cassandra]
name=Apache Cassandra
baseurl=https://redhat.cassandra.apache.org/50x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS
Save and close the file.
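If you prefer not to edit the file interactively, the same content can be written from the shell. This is a minimal sketch: the function name cassandra_repo is our own, and the repository settings are copied from the listing above.

```shell
# cassandra_repo: print the repository definition shown above to stdout.
# Hypothetical helper name; the content matches the cassandra.repo listing.
cassandra_repo() {
  cat <<'EOF'
[cassandra]
name=Apache Cassandra
baseurl=https://redhat.cassandra.apache.org/50x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS
EOF
}

# Usage (writes the file as root, so it is left as a comment here):
#   cassandra_repo | sudo tee /etc/yum.repos.d/cassandra.repo
```

Printing to stdout and piping into sudo tee avoids the problem that a shell function cannot be run directly through sudo.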
Once the repository is added, update your package lists and install Cassandra.
sudo dnf install cassandra -y
After the installation is complete, start the Cassandra service and enable it to start on boot.
sudo service cassandra start
sudo systemctl enable cassandra
Check the status of the Cassandra service to ensure it's running properly:
sudo systemctl status cassandra
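A service that systemd reports as active may still need some time before Cassandra is ready to answer requests. A small retry loop can help in scripts. The following is a generic sketch: the retry helper is hypothetical, and the attempt count and delay in the usage comment are arbitrary values, not recommendations from the Cassandra project.

```shell
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds or ATTEMPTS run out.
# Hypothetical helper; returns 0 on the first success, 1 if every try fails.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    # Sleep only if another attempt remains.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Usage: poll nodetool status (covered in Step 5) for up to about a minute.
#   retry 12 5 sudo nodetool status
```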
Step 4: Configure Cassandra
Cassandra’s main configuration file is located at /etc/cassandra/conf/cassandra.yaml. You can customize it according to your needs.
Open the configuration file:
sudo vi /etc/cassandra/conf/cassandra.yaml
Here are some key configurations you might want to adjust:
- cluster_name: The name of your Cassandra cluster.
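- seeds: A comma-separated list of seed node addresses that new nodes contact when joining the cluster.
- listen_address: The IP address or hostname that Cassandra binds to for communication with other nodes.
- rpc_address: The address that Cassandra binds to for client connections, such as cqlsh.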
After making changes, save and close the file, then restart the service (sudo systemctl restart cassandra) so the changes take effect.
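To review the settings listed above without opening an editor, you can filter them out of the file. This is a minimal sketch, assuming the keys are uncommented and that seeds appears as a list item (as it does under seed_provider in the stock file). The function name show_key_settings is hypothetical.

```shell
# show_key_settings FILE: print the active (uncommented) lines for the four
# settings discussed above. Hypothetical helper name.
show_key_settings() {
  grep -E '^[[:space:]]*(-[[:space:]]*)?(cluster_name|seeds|listen_address|rpc_address):' "$1"
}

# Path taken from this tutorial; skipped if the file is not readable.
conf=/etc/cassandra/conf/cassandra.yaml
if [ -r "$conf" ]; then
  show_key_settings "$conf"
fi
```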
Step 5: Verify Cassandra Installation
To verify that Cassandra is installed and running, use the nodetool utility, which is included with the Cassandra installation.
sudo nodetool status
You should see output similar to the following, where UN in the first column means the node is Up and in the Normal state:
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load        Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.1  114.71 KiB  16      100.0%            224eda87-fac9-4b24-a61d-417d00e17fd5  rack1
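The status output can be checked in a script as well. The sketch below counts node lines whose two-letter code is anything other than UN, based on the legend in the output above (first letter U or D, second letter N, L, J, or M). The helper name nodes_not_un is hypothetical.

```shell
# nodes_not_un: read `nodetool status` output on stdin and print how many
# node lines have a status/state code other than UN. Hypothetical helper.
nodes_not_un() {
  awk '/^[UD][NLJM] / && $1 != "UN" { n++ } END { print n + 0 }'
}

if command -v nodetool >/dev/null 2>&1; then
  bad=$(nodetool status | nodes_not_un)
  if [ "$bad" -eq 0 ]; then
    echo "All listed nodes are Up/Normal."
  else
    echo "$bad node(s) are not in the UN state." >&2
  fi
fi
```

On the single-node setup in this tutorial the count should be 0 once Cassandra has finished starting.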
Step 6: Connect to the Cassandra Database
Cassandra comes with a command-line tool called cqlsh (Cassandra Query Language Shell), which you can use to connect to your Cassandra database.
To connect to your Cassandra database, run:
cqlsh
You will be dropped into the cqlsh shell, where you can execute Cassandra Query Language (CQL) commands.
Step 7: Create a Keyspace
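In Cassandra, a keyspace is a namespace that groups tables and defines how their data is replicated across nodes. Let's create a simple keyspace with a replication factor of 1, which is suitable for a single-node test setup: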
CREATE KEYSPACE mykeyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 1};
To verify that your keyspace was created, you can list all keyspaces:
DESCRIBE keyspaces;
Step 8: Create a Table
Now that you have a keyspace, you can create a table within it. Here’s an example of creating a simple users table:
USE mykeyspace;
CREATE TABLE users (
    user_id UUID PRIMARY KEY,
    name text,
    age int,
    email text
);
Step 9: Insert Data into the Table
You can now insert data into the table using the INSERT statement:
INSERT INTO users (user_id, name, age, email)
VALUES (uuid(), 'John Doe', 30, 'johndoe@example.com');
Step 10: Query the Data
Finally, you can query the data from the table to verify that everything is working correctly:
SELECT * FROM users;
You should see the data you inserted displayed in the output.
 user_id                              | age | email               | name
--------------------------------------+-----+---------------------+----------
 73dc58ac-766c-43b9-a434-02240b6b2837 |  30 | johndoe@example.com | John Doe
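Steps 7 to 10 can also be run non-interactively, since cqlsh accepts a file of statements through its -f option. The following is a sketch under a few assumptions: the function name demo_cql is our own, and IF NOT EXISTS has been added to the CREATE statements so the script can be re-run without errors. Each run still inserts a new row, because uuid() generates a fresh key.

```shell
# demo_cql: print the CQL statements from Steps 7-10 to stdout.
# Hypothetical helper; IF NOT EXISTS is our addition for re-runnability.
demo_cql() {
  cat <<'EOF'
CREATE KEYSPACE IF NOT EXISTS mykeyspace
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
USE mykeyspace;
CREATE TABLE IF NOT EXISTS users (
  user_id UUID PRIMARY KEY,
  name text,
  age int,
  email text
);
INSERT INTO users (user_id, name, age, email)
  VALUES (uuid(), 'John Doe', 30, 'johndoe@example.com');
SELECT * FROM users;
EOF
}

# Write the statements to a temporary file and run them with cqlsh -f.
if command -v cqlsh >/dev/null 2>&1; then
  script=$(mktemp)
  demo_cql > "$script"
  cqlsh -f "$script"
  rm -f "$script"
fi
```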
Conclusion
You have now learned how to install the Apache Cassandra NoSQL database on an AlmaLinux 8 server. You can now start building scalable, high-performance applications that leverage Cassandra’s distributed architecture.