[CKA] Node question types

1) Draining a specific worker node

Problem

Set the node named ek8s-node-1 as unavailable and reschedule all the pods running on it.

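Solution

We will work through this problem in the playground at https://killercoda.com/killer-shell-cka/scenario/playground.
The drain command marks the node as unschedulable so that no new pods are assigned to it, and evicts the pods already running there so that their controllers recreate them on other nodes.

kubectl drain {node-name} --ignore-daemonsets --delete-emptydir-data --force

Searching the official documentation for "Safely Drain a Node" shows how the --ignore-daemonsets option is used. The --delete-emptydir-data flag is the current name of the deprecated --delete-local-data.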
- https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/

Reverting the change

kubectl uncordon {node-name}
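
The drain and uncordon commands above usually bracket some maintenance task. A minimal sketch of that pattern follows, assuming a POSIX shell. The function name with_drained_node and the KUBECTL override variable are hypothetical helpers introduced here for illustration; they are not part of kubectl.

```shell
# KUBECTL is a hypothetical override: set KUBECTL=echo to preview
# the commands without touching a cluster.
KUBECTL="${KUBECTL:-kubectl}"

# with_drained_node NODE CMD [ARGS...]
# Drains NODE, runs CMD, then uncordons NODE even if CMD fails.
# The body runs in a subshell, so its variables and exit do not leak.
with_drained_node() (
  node="$1"; shift
  $KUBECTL drain "$node" --ignore-daemonsets --delete-emptydir-data --force || exit 1
  "$@"
  status=$?
  # Always make the node schedulable again.
  $KUBECTL uncordon "$node"
  exit "$status"
)
```

For example, KUBECTL=echo with_drained_node node01 echo maintenance prints the drain command, the word maintenance, and the uncordon command in that order.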

Practice

$ k drain --ignore-daemonsets controlplane
node/controlplane cordoned
Warning: ignoring DaemonSet-managed Pods: kube-system/canal-zl4tq, kube-system/kube-proxy-2mfwz
evicting pod local-path-storage/local-path-provisioner-6c5cff8948-2x89z
evicting pod default/nginx-deply-57df7df4-qg6pg
evicting pod kube-system/calico-kube-controllers-94fb6bc47-rxh7x
pod/nginx-deply-57df7df4-qg6pg evicted
pod/local-path-provisioner-6c5cff8948-2x89z evicted
pod/calico-kube-controllers-94fb6bc47-rxh7x evicted
node/controlplane drained


$ k get pod -o wide
NAME                         READY   STATUS    RESTARTS   AGE   IP            NODE     NOMINATED NODE   READINESS GATES
nginx-deply-57df7df4-66rgh   1/1     Running   0          8s    192.168.1.5   node01   <none>           <none>


$ k get nodes
NAME           STATUS                     ROLES           AGE   VERSION
controlplane   Ready,SchedulingDisabled   control-plane   12d   v1.31.0
node01         Ready                      <none>          12d   v1.31.0


$ kubectl uncordon controlplane
node/controlplane uncordoned


$ kubectl get node
NAME           STATUS   ROLES           AGE   VERSION
controlplane   Ready    control-plane   12d   v1.31.0
node01         Ready    <none>          12d   v1.31.0

drain vs cordon vs uncordon

drain

Cordons the target node and evicts the pods on it so that they are rescheduled onto other nodes.

cordon

Keeps the pods already running on the node but blocks any new pods from being scheduled onto it.

uncordon

Clears the unschedulable state left by drain or cordon so that pods can be scheduled onto the node again.
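
One way to see the difference in practice is to list the pods bound to a node before and after each command. The sketch below assumes a POSIX shell; pods_on_node and the KUBECTL override variable are hypothetical names used only for this illustration.

```shell
# KUBECTL is a hypothetical override: set KUBECTL=echo to preview the command.
KUBECTL="${KUBECTL:-kubectl}"

# pods_on_node NODE
# Lists pods in every namespace that are bound to NODE.
pods_on_node() {
  $KUBECTL get pods --all-namespaces -o wide --field-selector "spec.nodeName=$1"
}
```

After cordon the list should be unchanged. After a drain with --ignore-daemonsets, typically only DaemonSet-managed pods and static pods remain on the node.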

2) etcd backup and restore

Problem

Create a snapshot of the etcd instance running at https://127.0.0.1:2379, saving the snapshot to the file path /srv/data/etcd-snapshot.db.

The following TLS certificates/key are supplied for connecting to the server with etcdctl:

CA certificate: /opt/KUCM00302/ca.crt
Client certificate: /opt/KUCM00302/etcd-client

Solution

https://etcd.io/docs/v3.5/tutorials/how-to-save-database/
https://etcd.io/docs/v3.5/op-guide/security/#example-2-client-to-server-authentication-with-https-client-certificates
The commands from the two pages above probably need to be combined.

# Backup (etcdctl snapshot save)
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
        --cacert=/opt/KUCM00302/ca.crt \
        --cert=/opt/KUCM00302/etcd-client.crt \
        --key=/opt/KUCM00302/etcd-client.key \
        snapshot save /srv/data/etcd-snapshot.db

# Restore (etcdctl snapshot restore)
ETCDCTL_API=3 etcdctl --data-dir <data-dir-location> snapshot restore snapshot.db
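
Restoring into a new data directory does not by itself make etcd use it. On a kubeadm-style cluster, where etcd runs as a static pod, one common follow-up is to point the etcd-data hostPath volume in the manifest at the restored directory. The sketch below is only an illustration under stated assumptions: that the manifest is /etc/kubernetes/manifests/etcd.yaml and that its current hostPath is /var/lib/etcd. Neither is given in the problem, and set_etcd_hostpath is a hypothetical helper.

```shell
# set_etcd_hostpath MANIFEST NEW_DIR
# Rewrites the hostPath line "path: /var/lib/etcd" (assumed default) to NEW_DIR.
# Lines such as "mountPath: /var/lib/etcd" and the pki path are left alone
# because the pattern requires the line to start with optional spaces + "path:".
set_etcd_hostpath() (
  manifest="$1"; new_dir="$2"
  tmp="$(mktemp)" || exit 1
  sed "s|^\([[:space:]]*\)path: /var/lib/etcd\$|\1path: ${new_dir}|" "$manifest" > "$tmp" &&
    cat "$tmp" > "$manifest"   # overwrite in place, keeping owner and mode
  status=$?
  rm -f "$tmp"
  exit "$status"
)
```

For example, after restoring with --data-dir /var/lib/etcd-restore (a hypothetical path), running set_etcd_hostpath /etc/kubernetes/manifests/etcd.yaml /var/lib/etcd-restore as root changes the manifest, and kubelet recreates the etcd pod from it.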

Additional resources

According to the blog post at https://cumulus.tistory.com/95, this task should be done as root, switching with sudo -i.
Backup - https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/#backing-up-an-etcd-cluster
Restore - https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/#restoring-an-etcd-cluster

# Backup (etcdctl snapshot save)
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=<trusted-ca-file> --cert=<cert-file> --key=<key-file> \
  snapshot save <backup-file-location>

# Restore (etcdctl snapshot restore)
ETCDCTL_API=3 etcdctl --data-dir <data-dir-location> snapshot restore snapshot.db

3) Creating a namespace

Problem

Create a pod as follows:

Name: mongo
Using Image: mongo
In a new Kubernetes namespace named: my-website

Solution

# Create the namespace
kubectl create ns my-website

# Create a pod named "mongo" from the mongo image in the "my-website" namespace
kubectl run mongo --image=mongo -n my-website
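
The two commands above can be combined with a final check that the pod landed in the intended namespace. This is a minimal sketch assuming a POSIX shell; run_pod_in_new_namespace and the KUBECTL override variable are hypothetical names introduced for illustration.

```shell
# KUBECTL is a hypothetical override: set KUBECTL=echo to preview the commands.
KUBECTL="${KUBECTL:-kubectl}"

# run_pod_in_new_namespace NAMESPACE POD IMAGE
# Creates NAMESPACE, starts POD from IMAGE inside it, then shows the pod.
run_pod_in_new_namespace() (
  ns="$1"; pod="$2"; image="$3"
  $KUBECTL create namespace "$ns" || exit 1
  $KUBECTL run "$pod" --image="$image" --namespace "$ns" || exit 1
  # Confirm the pod exists in the intended namespace.
  $KUBECTL get pod "$pod" --namespace "$ns"
)
```

For this problem the call would be run_pod_in_new_namespace my-website mongo mongo.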

4) Fixing a node with a kubelet issue

Problem 1

Given a partially-functioning Kubernetes cluster, identify symptoms of failure on the cluster. Determine the node, the failing service, and take actions to bring up the failed service and restore the health of the cluster. Ensure that any changes are made permanently.

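Solution 1

Find the node whose kubelet has failed and bring it back to a healthy state.

# Switch to root on the affected node
sudo -i
# Inspect the kubelet configuration and correct it if it is wrong
vim /var/lib/kubelet/config.yaml
# Restart kubelet, and enable it so the fix survives a reboot
systemctl restart kubelet
systemctl enable kubelet
# Confirm the node is Ready again
kubectl get nodes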
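
The problem asks for a permanent fix, which is why the solution both restarts and enables kubelet: restart affects only the running system, while enable makes the unit start at boot. A small sketch for checking both conditions follows, assuming a systemd host; kubelet_report and the SYSTEMCTL override variable are hypothetical names used only here.

```shell
# SYSTEMCTL is a hypothetical override, useful for trying the function
# on a machine without systemd (e.g. SYSTEMCTL=true).
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# kubelet_report
# Prints whether kubelet is running now and whether it is enabled at boot.
kubelet_report() {
  if $SYSTEMCTL is-active --quiet kubelet; then
    echo "active: yes"
  else
    echo "active: no"
  fi
  if $SYSTEMCTL is-enabled --quiet kubelet; then
    echo "enabled: yes"
  else
    echo "enabled: no"
  fi
}
```

If the unit is not active, journalctl -u kubelet usually shows the reason.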

Problem 2

A Kubernetes worker node, named. Investigate why this is the case, and perform any appropriate steps to bring the node to a state, ensuring that any changes are made permanent.

Solution 2

kubectl get nodes
# Find the node whose status is NotReady and SSH into it.

sudo -i
systemctl restart kubelet
systemctl enable kubelet

# Exit the SSH session and check the node status again.

kubectl get nodes
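
On a cluster with many nodes, the NotReady node can be picked out of the listing mechanically. The sketch below assumes the default column layout of kubectl get nodes (NAME, STATUS, ...) and a POSIX shell with awk; not_ready_nodes is a hypothetical helper.

```shell
# not_ready_nodes
# Reads `kubectl get nodes --no-headers` output on stdin and prints the
# names of nodes whose STATUS column does not start with "Ready".
# "Ready,SchedulingDisabled" still counts as Ready; "NotReady" does not.
not_ready_nodes() {
  awk '$2 !~ /^Ready/ { print $1 }'
}
```

Typical usage would be kubectl get nodes --no-headers | not_ready_nodes.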

5) Upgrading the master node

Problem

Given an existing Kubernetes cluster running version 1.20.0, upgrade all of the Kubernetes control plane and node components on the master node only to version 1.20.1. Be sure to drain the master node before upgrading it and uncordon it after the upgrade.

Solution

https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/

# Example from the docs
sudo apt-mark unhold kubeadm && \
sudo apt-get update && sudo apt-get install -y kubeadm='1.31.x-*' && \
sudo apt-mark hold kubeadm

sudo apt-mark unhold kubelet kubectl && \
sudo apt-get update && sudo apt-get install -y kubelet='1.31.x-*' kubectl='1.31.x-*' && \
sudo apt-mark hold kubelet kubectl

sudo kubeadm upgrade apply v1.31.x


# Upgrade the master node to 1.20.1
# Drain the node first
ssh ek8s
kubectl cordon k8s-master

kubectl drain k8s-master --delete-emptydir-data --ignore-daemonsets --force

apt-get update && apt-get install -y --allow-change-held-packages kubeadm=1.20.1-00 kubelet=1.20.1-00 kubectl=1.20.1-00

kubeadm upgrade apply v1.20.1 --etcd-upgrade=false

# Upgrade complete: restart kubelet and uncordon the node
systemctl daemon-reload
systemctl restart kubelet
kubectl uncordon k8s-master
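
After the uncordon, it is worth confirming that the node actually reports the target version. A minimal sketch, assuming a POSIX shell; kubelet_version and the KUBECTL override variable are hypothetical names introduced for illustration.

```shell
# KUBECTL is a hypothetical override: set KUBECTL=echo to preview the command.
KUBECTL="${KUBECTL:-kubectl}"

# kubelet_version NODE
# Prints the kubelet version that NODE reports to the API server.
kubelet_version() {
  $KUBECTL get node "$1" -o jsonpath='{.status.nodeInfo.kubeletVersion}'
}
```

For this problem, kubelet_version k8s-master should print v1.20.1 once the kubelet has been restarted.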
