ETCD Cluster in Kubernetes: Review for the CKA Exam

On the journey to becoming a certified Kubernetes professional, ETCD emerges as a critical component that every CKA exam candidate must thoroughly understand. In this article, I’ll explain the ETCD cluster in Kubernetes and go through some of the topics I’m using in my own studies and revision for the CKA certification.

Kubernetes ETCD is much more than a data storage component in the Kubernetes ecosystem. It represents the distributed and reliable brain that maintains the consistent state of the entire cluster. Mastering its concepts is not just a requirement for the exam but an essential skill for anyone working with cloud-native infrastructure.

[Image: CKA curriculum]

What is etcd in Kubernetes?

Concept

ETCD is a simple, secure and fast distributed key-value database that stores all the information about the state of the Kubernetes cluster. Developed by CoreOS, ETCD plays a key role in managing the metadata and state of the cluster. It is responsible for:

  • Ensuring high availability and consistency.
  • Storing cluster configurations.
  • Recording the location of the nodes.

To put it simply, imagine the ETCD as the “brain” of the Kubernetes cluster. Without it, Kubernetes would have no way of knowing what to run or how to react to changes. That’s why understanding how it works is essential, especially in the CKA exam, which requires practical knowledge.

[Image: Kubernetes master & application nodes]

Additional details

The ETCD is made up of three main components:

  • Raft: this is the consensus algorithm used by the ETCD to guarantee data consistency.
  • Store: this is the component responsible for storing ETCD data.
  • API: this is the component that exposes the ETCD application programming interface (API) to clients; an optional proxy can sit in front of it.

The ETCD datastore stores information about the cluster, such as:

  • Nodes
  • Pods
  • Configs
  • Secrets
  • Accounts
  • Roles
  • Bindings
  • Others

Two more points to remember:

  • All the information returned by the kubectl get command ultimately comes from the ETCD server, which the kube-apiserver reads on kubectl’s behalf.
  • The default port on which the ETCD “listens” for client requests is port 2379, as we can see from the ETCD Pod configuration: --advertise-client-urls https://${INTERNAL_IP}:2379

We can also check by looking at the running process, as shown in the image below:

[Image: ETCD port]
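
The same check can be done from a shell on the node where ETCD runs. This is only a minimal sketch: etcd_client_port is a helper name I made up (it is not part of etcd), the IP address in the sample line is invented, and it assumes the usual grep and sed tools are available.

```shell
# Hypothetical helper: read a process listing (or any etcd command line)
# on stdin and print the port found in the --advertise-client-urls flag.
etcd_client_port() {
  grep -oE -- '--advertise-client-urls[= ][^ ,]+' | sed 's/.*://'
}

# Example with a sample command line (the IP address is made up):
echo 'etcd --advertise-client-urls=https://10.0.0.5:2379 --data-dir=/var/lib/etcd' \
  | etcd_client_port

# On a real node, feed it the process list instead:
#   ps -ef | etcd_client_port
```

With the sample line the helper prints 2379, the client port discussed above.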

ETCD architecture

Data storage model

ETCD uses a simple but powerful key-value storage model. Each entry is stored as a key-value pair, allowing for fast and efficient retrievals.

Example of key-value storage:

{
    "name": "John",
    "age": 45,
    "location": "New York",
    "salary": 5000
}
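
In ETCD itself there are no nested documents, only keys and values, so a record like this is usually spread over several keys that share a prefix. The sketch below shows one way to do that with etcdctl; the /people/john/... key names and the two function names are my own illustration, and it assumes etcdctl can reach a running ETCD (for example the local one installed later in this article).

```shell
# Store the record above as individual keys under a common prefix.
# The /people/john/... key names are made up for illustration.
store_person() {
  etcdctl put /people/john/name "John"
  etcdctl put /people/john/age "45"
  etcdctl put /people/john/location "New York"
  etcdctl put /people/john/salary "5000"
}

# Read back every key that starts with the prefix.
show_person() {
  etcdctl get /people/john --prefix
}

# Usage, against a reachable ETCD:
#   store_person && show_person
```

Kubernetes follows the same idea: the API server writes its objects under the /registry prefix, so a --prefix lookup there (with the TLS flags shown later) lists what the cluster has stored.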

Raft consensus protocol

The Raft protocol guarantees distributed consistency and is essential for the election of leaders and the replication of data between multiple nodes.

ETCD uses the Raft algorithm to ensure that all nodes have the same view of the data. The nodes elect a leader by vote, and a write is only committed once a majority of nodes has acknowledged it.

ETCD installation steps

Prerequisites

  • Linux system (tested on Ubuntu and CentOS)
  • Internet connection
  • Root or sudo user permissions

1. Download the binaries

Download the latest version of ETCD using the curl command:

# Check the latest version at https://github.com/etcd-io/etcd/releases
VERSION="v3.5.9"
ARCH="amd64"
DOWNLOAD_URL="https://github.com/etcd-io/etcd/releases/download/${VERSION}/etcd-${VERSION}-linux-${ARCH}.tar.gz"

curl -L ${DOWNLOAD_URL} -o etcd-${VERSION}-linux-${ARCH}.tar.gz

2. Extract the files

Use the tar command to extract the compressed file:

tar xzvf etcd-${VERSION}-linux-${ARCH}.tar.gz
cd etcd-${VERSION}-linux-${ARCH}/

3. Install the binaries

Copy the binaries to a directory in PATH:

sudo cp etcd etcdctl /usr/local/bin/

4. Create system user

For security purposes, create a dedicated user for the ETCD service:

sudo useradd --no-create-home --shell /bin/false etcd

5. Create configuration directories

sudo mkdir -p /etc/etcd /var/lib/etcd
sudo chown -R etcd:etcd /var/lib/etcd

6. Configure the systemd service

Create a systemd service file:

sudo vi /etc/systemd/system/etcd.service

Paste the following content:

[Unit]
Description=etcd key-value store
Documentation=https://github.com/etcd-io/etcd
After=network.target

[Service]
User=etcd
Type=notify
ExecStart=/usr/local/bin/etcd \
  --name etcd-server \
  --data-dir /var/lib/etcd \
  --listen-client-urls http://localhost:2379 \
  --advertise-client-urls http://localhost:2379

Restart=always
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

7. Start and enable the service

sudo systemctl daemon-reload
sudo systemctl enable etcd
sudo systemctl start etcd

8. Check service status

sudo systemctl status etcd

9. Basic test

Verify that ETCD is working properly:

etcdctl version
etcdctl put myKey "Test Value"
etcdctl get myKey

Advanced settings

For more complex configurations, such as distributed clusters, consult the official ETCD documentation.
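
To give an idea of what a distributed setup involves, here is a rough sketch of how the --initial-cluster value for a static three-member cluster could be assembled. The member names and IP addresses are hypothetical, initial_cluster is a helper I made up, and plain http is used only to keep the example short; the official documentation remains the reference for a real deployment.

```shell
# Build the value for etcd's --initial-cluster flag from name=ip pairs.
# Peer traffic uses port 2380 by default; plain http keeps the sketch short.
initial_cluster() {
  out=""
  for member in "$@"; do
    name=${member%%=*}
    ip=${member#*=}
    out="${out:+$out,}${name}=http://${ip}:2380"
  done
  printf '%s\n' "$out"
}

# Three hypothetical members:
CLUSTER=$(initial_cluster etcd-1=10.0.0.11 etcd-2=10.0.0.12 etcd-3=10.0.0.13)
echo "$CLUSTER"

# Each member would then start with its own name and URLs, for example on etcd-1:
#   etcd --name etcd-1 \
#     --initial-advertise-peer-urls http://10.0.0.11:2380 \
#     --listen-peer-urls http://10.0.0.11:2380 \
#     --listen-client-urls http://10.0.0.11:2379,http://127.0.0.1:2379 \
#     --advertise-client-urls http://10.0.0.11:2379 \
#     --initial-cluster "$CLUSTER" \
#     --initial-cluster-state new
```

The point to notice is the split between the two ports: 2379 is for clients such as etcdctl and the kube-apiserver, while 2380 is for the members talking to each other.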

CKA Test Strategies

Main ETCD Topics for Study

One of the areas where etcd is most explored in the CKA exam is in the troubleshooting part. You may be asked to identify problems in etcd or even restore a backup. Here are some important topics to review:

  1. Backup and recovery (this is one of the most popular topics)
    • etcdctl snapshot save commands
    • Cluster restoration procedures
  2. Cluster configuration
    • Understanding peers and endpoints
    • Security settings
      • etcd uses TLS for secure communication, and you need to know how to configure this.
  3. Troubleshooting
    • Log analysis
    • Cluster health check
      • Learn to diagnose common problems such as communication failures or data corruption.

Commands for ETCD

During the exam, you will need to perform tasks in the ETCD. Here are some of the most important commands:

# Check ETCD status
ETCDCTL_API=3 etcdctl endpoint status --write-out=table \
 --cacert=/etc/kubernetes/pki/etcd/ca.crt \
 --cert=/etc/kubernetes/pki/etcd/server.crt \
 --key=/etc/kubernetes/pki/etcd/server.key

# Back up ETCD
ETCDCTL_API=3 etcdctl snapshot save snapshot.db \
 --cacert=/etc/kubernetes/pki/etcd/ca.crt \
 --cert=/etc/kubernetes/pki/etcd/server.crt \
 --key=/etc/kubernetes/pki/etcd/server.key

# Restore ETCD from a backup
ETCDCTL_API=3 etcdctl snapshot restore snapshot.db \
 --data-dir=/var/lib/etcd-restored

# General etcdctl commands
etcdctl snapshot save
etcdctl endpoint health
etcdctl get
etcdctl put
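
Typing the three certificate flags on every command costs time in the exam. One option, sketched below, is to set them once as environment variables: etcdctl reads its flags from ETCDCTL_-prefixed variables. The paths are the ones used in this article and the endpoint assumes ETCD is listening locally; adjust both if your cluster differs.

```shell
# etcdctl reads each flag from an ETCDCTL_-prefixed environment variable,
# so the TLS options only need to be set once per shell session.
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key

# The commands above then shrink to:
#   etcdctl endpoint health
#   etcdctl snapshot save snapshot.db
```

Some etcdctl versions complain when the same option is given both as a flag and as a variable, so I would pick one style per session.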

Good Practices for CKA Testing

Essential tips

  • Practice ETCD management commands.
  • Simulate recovery scenarios.
  • Develop a deep understanding of the cluster architecture.
  • etcdctl is used to insert or retrieve data from etcd. It supports API versions 2 and 3. It is important to set the version with export ETCDCTL_API=3, as commands differ between versions (etcdctl 3.4 and later already default to version 3).
  • Make sure the paths to the certificate files that etcdctl uses to authenticate against the ETCD server are correct. The certificate files are available on the etcd-master node under the following paths:
--cacert /etc/kubernetes/pki/etcd/ca.crt
--cert /etc/kubernetes/pki/etcd/server.crt
--key /etc/kubernetes/pki/etcd/server.key
  • ETCD in HA (high availability) elects a leader.
    • The leader processes write requests and replicates them to the other members.
    • If the other members stop receiving heartbeats from the leader for longer than the election timeout, the Raft algorithm starts a new election and a member that is still online takes over as leader.
    • Odd member counts such as 3, 5 or 7 are good choices for etcd in HA:
      • 3 supports 1 inactive instance
      • 5 supports 2 inactive instances
      • 7 supports 3 inactive instances.
  • Quorum calculation in ETCD clusters, to understand fault tolerance:
    • The quorum in distributed systems follows the formula floor(N/2) + 1, where N is the total number of nodes and the division is rounded down. See how this impacts different cluster configurations:

1 Node:

  • Calculation: floor(1/2) + 1 = 1
  • Result: Requires 1 node for write operations
  • Limitation: No fault tolerance

2 Nodes:

  • Calculation: floor(2/2) + 1 = 2
  • Result: Requires 2 nodes to write
  • Drawback: No gain in fault tolerance over a single node
  • If 1 node fails, write operations are interrupted

3 Nodes:

  • Calculation: floor(3/2) + 1 = 2
  • Result: Requires 2 nodes to write
  • Benefit: Tolerates the failure of 1 node
  • If 1 node fails, the remaining 2 nodes ensure continuity

4 Nodes:

  • Calculation: floor(4/2) + 1 = 3
  • Result: Requires 3 nodes to write
  • Feature: Tolerates the failure of 1 node, the same as a 3-node cluster
  • Limitation: Losing 2 nodes leaves the cluster without quorum

5 Nodes:

  • Calculation: floor(5/2) + 1 = 3
  • Result: Requires 3 nodes to write
  • Advantage: Tolerates the failure of 2 nodes
  • Even with 2 nodes down, the cluster keeps working
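
The arithmetic above is easy to reproduce in the shell, which helps when memorising it. This is just a small sketch; quorum and tolerance are function names I chose for the example.

```shell
# quorum N: members needed for a majority, floor(N/2) + 1.
# Shell arithmetic does integer division, which gives the floor for free.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# tolerance N: members that can fail while a majority remains.
tolerance() {
  echo $(( $1 - $1 / 2 - 1 ))
}

for n in 1 2 3 4 5 6 7; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(tolerance "$n")"
done
```

The loop makes the pattern visible: going from 3 to 4 members (or from 5 to 6) raises the quorum without raising the number of failures the cluster survives, which is why odd sizes are preferred.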

Recommended study resources

Use case – Snapshot recovery

To back up and restore ETCD in Kubernetes, the two main commands are:

# Back up
ETCDCTL_API=3 etcdctl snapshot save snapshot.db

# Restore the snapshot
ETCDCTL_API=3 etcdctl snapshot restore snapshot.db

However, the full procedure involves a few more commands and details, which I’ll cover in the steps below.

Prerequisites

Before you start, make sure you have:

  • Access to a Kubernetes cluster
  • etcdctl installed
  • Administrator permissions on the cluster

Backup procedure

1. Backing up the ETCD

# Set ETCD API version
export ETCDCTL_API=3

# Basic backup command
etcdctl snapshot save snapshot.db

# Backup with authentication parameters
etcdctl snapshot save /tmp/snapshot.db \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key

2. Checking snapshot status

etcdctl snapshot status snapshot.db
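
The two steps can be combined so that every snapshot gets a dated file name and is checked right after it is written. This is a sketch under assumptions: backup_etcd and the default /var/backups/etcd directory are my own choices, and the endpoint and certificate paths are the ones used elsewhere in this article.

```shell
# Save a snapshot under a dated file name and verify it right away.
# The function name and default directory are assumptions; the endpoint
# and TLS paths are the ones used throughout this article.
backup_etcd() {
  backup_dir=${1:-/var/backups/etcd}
  file="$backup_dir/snapshot-$(date +%Y%m%d-%H%M%S).db"
  mkdir -p "$backup_dir" || return 1
  ETCDCTL_API=3 etcdctl snapshot save "$file" \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key || return 1
  # Print hash, revision, key count and size of the file just written.
  ETCDCTL_API=3 etcdctl snapshot status "$file" --write-out=table
}

# Usage on the node running ETCD:
#   backup_etcd /tmp/etcd-backups
```

Keeping the date in the file name avoids overwriting the previous snapshot, which is handy when practising several restores in a row.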

Restore procedure

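Step-by-step restoration (these steps assume etcd and kube-apiserver run as systemd services; on kubeadm clusters they run as static Pods defined in /etc/kubernetes/manifests instead):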

1. Stop kube-apiserver

service kube-apiserver stop

2. Restore a snapshot

etcdctl snapshot restore snapshot.db \
  --data-dir /var/lib/etcd-backup

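3. Point ETCD at the restored directory and reload systemd

Edit /etc/systemd/system/etcd.service so that --data-dir is /var/lib/etcd-backup, give the etcd user ownership of that directory, then reload the unit files:

systemctl daemon-reload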

4. Restart the etcd service

service etcd restart

5. Start kube-apiserver

service kube-apiserver start
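
On a kubeadm cluster there is no etcd service to restart, because ETCD runs as a static Pod. A common approach there is to change the hostPath of the data volume in the ETCD manifest so that it points at the restored directory. The helper below is only a sketch of that idea: use_restored_dir is a made-up name, it assumes GNU sed and a manifest that contains a line of the form "path: /var/lib/etcd", so check your own etcd.yaml before relying on it.

```shell
# Hypothetical helper: rewrite the data-volume hostPath in a kubeadm-style
# etcd manifest so the Pod mounts the restored directory.
# Assumes GNU sed (-i) and a line of the form "path: /var/lib/etcd".
use_restored_dir() {
  manifest=$1
  new_dir=$2
  sed -i "s#^\( *path: \)/var/lib/etcd\$#\1${new_dir}#" "$manifest"
}

# On a kubeadm control-plane node, after the snapshot restore:
#   use_restored_dir /etc/kubernetes/manifests/etcd.yaml /var/lib/etcd-backup
# The kubelet notices the manifest change and recreates the etcd Pod.
```

Only the exact /var/lib/etcd path is rewritten, so other hostPath entries in the manifest, such as the certificate directory, are left alone.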

ETCD vs Other storage

Comparing ETCD with the main storage solutions:

Feature       ETCD     Consul    ZooKeeper
Consistency   Strong   Strong    Eventual
Performance   High     Average   Average
Scalability   High     Average   Low

Frequently Asked Questions (FAQs)

Why is ETCD important in Kubernetes?

The ETCD stores the complete state of the cluster, configurations and metadata, and is essential for the stability and recovery of the environment.

How to protect ETCD?

Use TLS, configure certificates correctly and limit access to authorized components only.

What is the ideal backup frequency?

Periodic backups are recommended, preferably with each significant change in the cluster.


Conclusion

Learning more about ETCD is not just a requirement for the CKA exam, but a fundamental skill for professionals who are going to work with Kubernetes. Spend time practicing, exploring its architecture and understanding its internal mechanisms, carrying out simulations and tests in the lab, before getting hands-on in production or applying your knowledge in the CKA exam.

Next steps

📚 Explore more content about Kubernetes.

🚀 See how to install Kubeadm on WSL2 to help with labs and studying for the CKA.

Good luck with your CKA exam! 💪


Cover image by gstudioimagen on Freepik

Fernando Müller Junior

I am Fernando Müller, a Tech Lead SRE with 16 years of experience in IT. I currently work at Appmax, a fintech located in Brazil. I am passionate about Cloud Native architectures and applications, Open Source tools and everything in the SRE world, and I am always looking to develop and learn (lifelong learning) while working on innovative projects!
