Kubernetes

Personal environment setup

Getting started with minikube: https://opensource.com/article/18/10/getting-started-minikube

minikube official docs: https://minikube.sigs.k8s.io/docs/

minikube

sudo -iu liuliancao
sudo usermod -aG docker $USER && newgrp docker
minikube start --driver=docker --image-repository='registry.cn-hangzhou.aliyuncs.com/google_containers'

Installing with kubeadm

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

Three servers

master1 2u2g, slave0 1u2g, slave1 1u2g

Set up hosts

192.168.122.239 master1
192.168.122.106 slave0
192.168.122.149 slave1

Passwordless ssh setup is omitted here.

Install Docker; see https://docs.docker.com/engine/install/debian/ for details.

Run on all nodes

Preparation
sudo apt-get update && sudo apt-get install -y apt-transport-https ca-certificates curl
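Every step in this section has to run on master1, slave0, and slave1 alike. A minimal batch-run helper over ssh might look like this — a sketch that relies on the passwordless ssh and hosts entries set up above, not a real deployment tool:

```shell
# Run the same command on every node over ssh.
# Assumes passwordless ssh and the hostnames from the hosts file above.
HOSTS="master1 slave0 slave1"

run_all() {
  for h in $HOSTS; do
    echo "== $h =="
    ssh "$h" "$1"
  done
}

# e.g. run_all 'sudo apt-get update'
```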
Add the apt source and install kubelet, kubeadm and kubectl. The repo is signed, so fetch the signing key first (the path must match the signed-by entry below; this step was missing here):

sudo mkdir -p /etc/apt/keyrings
sudo curl -fsSLo /etc/apt/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
Confirm the cgroup driver

See https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/configure-cgroup-driver/ for details.

Docker cgroup driver:
sudo docker info|grep -i 'Cgroup Driver'
 Cgroup Driver: systemd

Kubernetes defaults to systemd as well, so test with this first.

Initializing with kubeadm

Just follow the guide: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/

Log in to master1 and run kubeadm init.

Preflight check:
root@master1:~# kubeadm init phase preflight
[preflight] Running pre-flight checks
	[WARNING SystemVerification]: missing optional cgroups: blkio
error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR CRI]: container runtime is not running: output: E1201 00:10:25.268012    3225 remote_runtime.go:948] "Status from runtime service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
time="2022-12-01T00:10:25-05:00" level=fatal msg="getting status of runtime: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

Because containerd disables the CRI plugin by default, run the following on every server:

sudo sed -i 's/disabled_plugins/# disabled_plugins/' /etc/containerd/config.toml
sudo systemctl restart containerd

If containerd simply isn't running, systemctl start containerd should be enough.

containerd configuration

Without this configuration the kubelet may restart in a loop and port 6443 never becomes reachable.

root@master1:~/study/first# cat /etc/containerd/config.toml
version = 2
[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
   [plugins."io.containerd.grpc.v1.cri".containerd]
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
          runtime_type = "io.containerd.runc.v2"
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
            SystemdCgroup = true

systemctl restart containerd
systemctl restart docker
The actual install:
root@master1:~# kubeadm init --apiserver-advertise-address=192.168.122.239 --image-repository registry.aliyuncs.com/google_containers --kubernetes-version=v1.25.4 --service-cidr=10.96.0.0/12  --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.25.4
[preflight] Running pre-flight checks
	[WARNING SystemVerification]: missing optional cgroups: blkio
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master1] and IPs [10.96.0.1 192.168.122.239]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost master1] and IPs [192.168.122.239 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost master1] and IPs [192.168.122.239 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 6.505005 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node master1 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node master1 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: trrma6.d54mmbgrpzldgvsy
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.122.239:6443 --token trrma6.d54mmbgrpzldgvsy \
	--discovery-token-ca-cert-hash sha256:67c3636e914b68643e2ee64c50425418700e2cf4311ac4041a080fd8937f359d

Run the commands it printed:

root@master1:~#   mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config
root@master1:~#   export KUBECONFIG=/etc/kubernetes/admin.conf

Initialize the network and add a CNI config file

  On master, download https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml, then apply it:
  root@master1:~# kubectl  apply -f kube-flannel.yml
  namespace/kube-flannel created
  clusterrole.rbac.authorization.k8s.io/flannel created
  clusterrolebinding.rbac.authorization.k8s.io/flannel created
  serviceaccount/flannel created
  configmap/kube-flannel-cfg created
  daemonset.apps/kube-flannel-ds created
  root@master1:~# cat << EOF | tee /etc/cni/net.d/10-containerd-net.conflist
{
 "cniVersion": "1.0.0",
 "name": "containerd-net",
 "plugins": [
   {
     "type": "bridge",
     "bridge": "cni0",
     "isGateway": true,
     "ipMasq": true,
     "promiscMode": true,
     "ipam": {
       "type": "host-local",
       "ranges": [
         [{
           "subnet": "10.88.0.0/16"
         }],
         [{
           "subnet": "2001:db8:4860::/64"
         }]
       ],
       "routes": [
         { "dst": "0.0.0.0/0" },
         { "dst": "::/0" }
       ]
     }
   },
   {
     "type": "portmap",
     "capabilities": {"portMappings": true}
   }
 ]
}
EOF

This step (writing the CNI config) must be done on all three servers.

Add shell completion

root@master1:~# echo 'source <(kubectl completion bash)' >>~/.bashrc
root@master1:~# source ~/.bashrc
root@master1:~# kubectl completion bash >/etc/bash_completion.d/kubectl

Joining the worker nodes

  # find the join command printed by kubeadm init, switch to root, and run it
  kubeadm join 192.168.122.239:6443 --token trrma6.d54mmbgrpzldgvsy        --discovery-token-ca-cert-hash sha256:67c3636e914b68643e2ee64c50425418700e2cf4311ac4041a080fd8937f359d
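If the bootstrap token has expired, running `kubeadm token create --print-join-command` on the control plane prints a fresh join command. The discovery hash can also be recomputed from the cluster CA certificate; a small helper wrapping the openssl pipeline from the kubeadm docs:

```shell
# Recompute the value for --discovery-token-ca-cert-hash from a CA cert.
# On a kubeadm cluster the certificate lives at /etc/kubernetes/pki/ca.crt.
ca_cert_hash() {
  openssl x509 -pubkey -noout -in "$1" \
    | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex \
    | sed 's/^.* //'
}

# e.g. echo "sha256:$(ca_cert_hash /etc/kubernetes/pki/ca.crt)"
```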

Finally everything is Ready:

root@master1:~# kubectl cluster-info
Kubernetes control plane is running at https://192.168.122.239:6443
CoreDNS is running at https://192.168.122.239:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
root@master1:~# kubectl get nodes
NAME      STATUS   ROLES           AGE    VERSION
master1   Ready    control-plane   2d5h   v1.25.4
slave0    Ready    <none>          2d4h   v1.25.4
slave1    Ready    <none>          2d4h   v1.25.4

kubectl commands

Useful options:
kubectl explain statefulset

Use describe a lot, and learn the resource abbreviations:
kubectl describe

Configure kubectl autocompletion:
echo 'source <(kubectl completion bash)' >> /etc/profile

Generate YAML from a live object:
kubectl get statefulset/xxx -o yaml > test.yaml  # --export was removed in kubectl 1.18

Viewing cluster information

cluster-info shows cluster information:
kubectl cluster-info

get componentstatus shows component status (deprecated since v1.19, but still works):
kubectl get componentstatus

List pods:
kubectl get po

pod

https://kubernetes.io/docs/concepts/workloads/pods/

A Pod is a kind of workload in Kubernetes: the smallest deployable unit, comparable to a logical host.

Containers in a Pod share the network namespace and other resources such as cgroups.
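To see the shared network namespace in action, here is a hypothetical two-container Pod where one container curls the other over localhost (names and image tags are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-net-demo
spec:
  containers:
  - name: web
    image: nginx:1.14.2
    ports:
    - containerPort: 80
  - name: probe
    image: curlimages/curl:8.4.0
    # same network namespace, so nginx answers on localhost from this container
    command: ["sh", "-c", "sleep 5 && curl -s http://localhost:80 && sleep 3600"]
```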

How to create a Pod

Create it from the command line
Create it from YAML
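For the command-line route, kubectl run creates a single Pod directly; with --dry-run=client -o yaml it only prints the manifest, which is a handy way to bootstrap a YAML file (pod name and image here are just examples):

```shell
# create the pod directly
kubectl run nginx --image=nginx:1.14.2 --port=80

# or just generate the manifest, to edit and apply later
kubectl run nginx --image=nginx:1.14.2 --port=80 --dry-run=client -o yaml > simple-pod.yaml
```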

simple-pod.yaml from the official docs at https://kubernetes.io/docs/concepts/workloads/pods/ :

  apiVersion: v1
  kind: Pod
  metadata:
    name: nginx
  spec:
    containers:
    - name: nginx
      image: nginx:1.14.2
      ports:
      - containerPort: 80

Let's apply it and see what happens:

root@master1:~/study/pod# cat simple-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80
root@master1:~/study/pod# kubectl  apply -f simple-pod.yaml 
pod/nginx created

root@master1:~/study/pod# kubectl get pod nginx
NAME    READY   STATUS              RESTARTS   AGE
nginx   0/1     ContainerCreating   0          21s

Now look at the full YAML the cluster stores for it:

root@master1:~/study/pod# kubectl  get pod nginx -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"nginx","namespace":"default"},"spec":{"containers":[{"image":"nginx:1.14.2","name":"nginx","ports":[{"containerPort":80}]}]}}
  creationTimestamp: "2023-01-01T03:13:28Z"
  name: nginx
  namespace: default
  resourceVersion: "4404"
  uid: 3a86b3c8-2df2-4c20-b602-5ddf18da5847
spec:
  containers:
  - image: nginx:1.14.2
    imagePullPolicy: IfNotPresent
    name: nginx
    ports:
    - containerPort: 80
      protocol: TCP
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-9jl84
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: slave1
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: kube-api-access-9jl84
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-01-01T03:13:28Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-01-01T03:13:28Z"
    message: 'containers with unready status: [nginx]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-01-01T03:13:28Z"
    message: 'containers with unready status: [nginx]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-01-01T03:13:28Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - image: nginx:1.14.2
    imageID: ""
    lastState: {}
    name: nginx
    ready: false
    restartCount: 0
    started: false
    state:
      waiting:
        reason: ContainerCreating
  hostIP: 192.168.122.149
  phase: Pending
  qosClass: BestEffort
  startTime: "2023-01-01T03:13:28Z"

Note the first line: apiVersion: v1 means the object uses the v1 Kubernetes API.

kind says the resource type being created is Pod.

metadata holds the name, namespace, labels, and other information about the object.

spec is the actual description of the Pod's contents: its containers, volumes, and other data.

status holds the Pod's current information, such as its phase and internal IP; you do not provide it at creation time.
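In short, every Kubernetes object follows the same skeleton (a schematic, not a runnable manifest):

```yaml
apiVersion: v1        # which API version the object speaks
kind: Pod             # what type of resource this is
metadata:             # name, namespace, labels, annotations
  name: example
spec:                 # desired state: containers, volumes, ...
  ...
# status is filled in by the cluster; you never write it yourself
```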

You can test the port with kubectl port-forward:

root@master1:~/study/pod# kubectl port-forward nginx 8899:80
Forwarding from 127.0.0.1:8899 -> 80
Forwarding from [::1]:8899 -> 80

Handling connection for 8899

Open another terminal:

liuliancao@master1:~$ curl localhost:8899
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

Press Ctrl-C to stop, then check the pod logs:

root@master1:~/study/pod# kubectl logs nginx
127.0.0.1 - - [01/Jan/2023:08:35:34 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.74.0" "-"
Pod label management
Creating labels
root@master1:~/study/pod# cat kubia-manual-with-labels.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: kubia-manual-with-label
  labels:
    creation_method: manual
    env: prod
spec:
  containers:
  - name: kubia
    image: luksa/kubia
    ports:
    - containerPort: 8080
      protocol: TCP

Just add labels under metadata. View them with kubectl get pods --show-labels, or show a specific label column with kubectl get pods -L LABELNAME.

root@master1:~/study/pod# kubectl get po --show-labels 
NAME                      READY   STATUS    RESTARTS   AGE     LABELS
kubia-manual-with-label   1/1     Running   0          48s     creation_method=manual,env=prod
nginx                     1/1     Running   0          7h21m   <none>

root@master1:~/study/pod# kubectl get po -L env,creation_method
NAME                      READY   STATUS    RESTARTS   AGE     ENV    CREATION_METHOD
kubia-manual-with-label   1/1     Running   0          87s     prod   manual
nginx                     1/1     Running   0          7h22m
Modifying labels

Use kubectl label pod POD LABEL=VALUE:

root@master1:~/study/pod# kubectl label po kubia-manual-with-label env=test --overwrite 
pod/kubia-manual-with-label labeled

root@master1:~/study/pod# kubectl label po nginx env=test creation_method=auto
pod/nginx labeled

root@master1:~/study/pod# kubectl get po -L env
NAME                      READY   STATUS    RESTARTS   AGE     ENV
kubia-manual-with-label   1/1     Running   0          5m56s   test
nginx                     1/1     Running   0          7h26m   test
Filtering pods by label

Use kubectl get pod -l LABEL=VALUE:

root@master1:~/study/pod# kubectl get po -l env=test
NAME                      READY   STATUS    RESTARTS   AGE
kubia-manual-with-label   1/1     Running   0          6m26s
nginx                     1/1     Running   0          7h27m
root@master1:~/study/pod# kubectl get po -l creation_method=auto
NAME    READY   STATUS    RESTARTS   AGE
nginx   1/1     Running   0          7h27m
root@master1:~/study/pod# kubectl get po -l creation_method=test
No resources found in default namespace.

Note that label selectors can also be expressions:

kubectl get pod -l '!env'                  # pods without the env label
kubectl get po -l 'env notin (prod,devel)' # env is neither prod nor devel
kubectl get po -l 'env in (prod,devel)'    # env is prod or devel
kubectl get po -l 'env!=test'              # env is not test

Adding labels to nodes

Real worker nodes may be compute-optimized, storage-heavy, GPU-equipped, and so on, so we also want pods scheduled onto the matching nodes.

root@master1:~/study/pod# kubectl label node slave1 gpu=true
node/slave1 labeled
root@master1:~/study/pod# kubectl get nodes -L gpu
NAME      STATUS     ROLES           AGE   VERSION   GPU
master1   NotReady   control-plane   8h    v1.25.4   
slave0    Ready      <none>          8h    v1.25.4   
slave1    Ready      <none>          8h    v1.25.4   true

We can pick nodes with nodeSelector:

  apiVersion: v1
  kind: Pod
  metadata:
    name: kubia-gpu
  spec:
    nodeSelector:
      gpu: "true"
    containers:
    - name: kubia
      image: luksa/kubia
      ports:
      - containerPort: 8080
        protocol: TCP
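nodeSelector only does exact label matches. The more expressive form is node affinity; a sketch of the same gpu constraint (same label as above, pod name illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: kubia-gpu-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: gpu
            operator: In
            values: ["true"]
  containers:
  - name: kubia
    image: luksa/kubia
```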

Create kubia-gpu:

root@master1:~/study/pod# kubectl create -f kubia-gpu.yaml
pod/kubia-gpu created

root@master1:~/study/pod# kubectl describe pod kubia-gpu |grep Node
Node:             slave1/192.168.122.149
Node-Selectors:              gpu=true

root@master1:~/study/pod# kubectl get nodes -L gpu
NAME      STATUS     ROLES           AGE   VERSION   GPU
master1   NotReady   control-plane   8h    v1.25.4   
slave0    Ready      <none>          8h    v1.25.4   
slave1    Ready      <none>          8h    v1.25.4   true

It was indeed scheduled onto slave1.

Adding annotations to a Pod
root@master1:~/study/pod# kubectl  describe pod kubia-gpu |grep -i anno
Annotations:      <none>

We can add annotations with kubectl annotate:

root@master1:~/study/pod# kubectl annotate pod kubia-gpu liuliancao.com/tips="this is for node selector test"
pod/kubia-gpu annotated
root@master1:~/study/pod# kubectl  describe pod kubia-gpu |grep -i anno
Annotations:      liuliancao.com/tips: this is for node selector test

CKA exam notes

Helm

What is Helm

Official reference video

Helm is commonly used to deploy applications: a Helm Chart defines the application's parameters, and deploying the chart stands the application up on the cluster.

It is a package manager for Kubernetes: install and deploy applications in one step.
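A typical session looks like this (repo and release names are illustrative; bitnami is a commonly used public chart repo):

```shell
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm install my-nginx bitnami/nginx   # install a chart as a release
helm list                             # see installed releases
helm uninstall my-nginx               # remove it again
```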

Argo

The Argo site says: "Open source tools for Kubernetes to run workflows, manage clusters, and do GitOps right." Argo is a family of products; this section covers Argo Workflows. See the Argo project documentation for more.

Argo Workflow

The Argo Workflows docs are at https://argoproj.github.io/argo-workflows/ and installation follows the quick start at https://github.com/argoproj/argo-workflows/blob/master/docs/quick-start.md

install argo cli

Find the latest release at https://github.com/argoproj/argo-workflows/releases/latest

[root@minikube tmp]# wget https://ghproxy.com/https://github.com/argoproj/argo-workflows/releases/download/v3.4.9/argo-linux-amd64.gz
[root@minikube tmp]# gzip -d argo-linux-amd64.gz
[root@minikube tmp]# cp argo-linux-amd64 /usr/local/bin/
[root@minikube tmp]# chmod 755 /usr/local/bin/argo-linux-amd64
[root@minikube tmp]# argo-linux-amd64 version
argo: v3.4.9
  BuildDate: 2023-07-20T15:07:55Z
  GitCommit: b76329f3a2dedf4c76a9cac5ed9603ada289c8d0
  GitTreeState: clean
  GitTag: v3.4.9
  GoVersion: go1.20.6
  Compiler: gc
  Platform: linux/amd64
install argo controller and server
[admin@minikube ~]$ kubectl create namespace argo
namespace/argo created
[admin@minikube argo_project]$ wget https://ghproxy.com/github.com/argoproj/argo-workflows/releases/download/v3.4.9/install.yaml
[admin@minikube argo_project]$ kubectl apply -n argo -f install.yaml
...
service/argo-server created
priorityclass.scheduling.k8s.io/workflow-controller created
deployment.apps/argo-server created
deployment.apps/workflow-controller created
[admin@minikube argo_project]$ kubectl get pods -n argo
NAME                                   READY   STATUS             RESTARTS   AGE
argo-server-747fbdc4f8-wdmgp           1/1     Running            0          2m6s
workflow-controller-7c49cc575f-dphh9   0/1     ImagePullBackOff   0          2m6s
# workflow-controller failed to pull its image...; applying again fixed it
[admin@minikube argo_project]$ kubectl get pods -n argo
NAME                                   READY   STATUS    RESTARTS   AGE
argo-server-747fbdc4f8-wdmgp           1/1     Running   0          2m42s
workflow-controller-7c49cc575f-dphh9   1/1     Running   0          2m42s

Here you can change type: ClusterIP to type: NodePort:

kubectl edit svc argo-server -n argo
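Instead of editing interactively, the same change can be made non-interactively with kubectl patch:

```shell
kubectl patch svc argo-server -n argo -p '{"spec":{"type":"NodePort"}}'
```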

Note: if you put this behind nginx, be sure to write https in proxy_pass (proxy_pass https://xxx:xxx;), otherwise you get an empty-reply error.

If you are on minikube, the address is the minikube IP:

[admin@minikube ~]$ kubectl get svc -n argo
NAME          TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
argo-server   NodePort   10.110.161.183   <none>        2746:32364/TCP   47m
[admin@minikube ~]$ curl http://$(minikube ip):32364/
curl: (52) Empty reply from server

[admin@minikube ~]$ curl -k https://$(minikube ip):32364/
<!doctype html><html lang="en"><head><meta charset="UTF-8"><title>Argo</title><base href="/"><meta name="viewport" content="width=device-width,initial-scale=1"><meta name="robots" content="noindex"><link rel="icon" type="image/png" href="assets/favicon/favicon-32x32.png" sizes="32x32"><link rel="icon" type="image/png" href="assets/favicon/favicon-16x16.png" sizes="16x16"></head><body><div id="app"></div><script src="main.9fb6cb6bddc82ba7f31d.js"></script></body></html>

Likewise, following the quick start:

[admin@minikube ~]$ kubectl patch deployment \
  argo-server \
  --namespace argo \
  --type='json' \
  -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/args", "value": [
  "server",
  "--auth-mode=server"
]}]'
deployment.apps/argo-server patched

Run this to skip authentication.

Finally it is installed: ../images/k8s/argo/argo-first.png

Each option in the UI corresponds to a document worth a look.

hello world example

More examples are in the argo-workflows examples.

cat <<EOF > hello-world.yml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-
  labels:
    workflows.argoproj.io/archive-strategy: "false"
  annotations:
    workflows.argoproj.io/description: |
      This is a simple hello world example.
spec:
  entrypoint: whalesay
  templates:
  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [cowsay]
      args: ["hello world"]
EOF



[admin@minikube argo_project]$ argo-linux-amd64 -n argo submit --watch hello-world.yml
Name:                hello-world-7k9x5
Namespace:           argo
ServiceAccount:      unset (will run with the default ServiceAccount)
Status:              Succeeded
Conditions:          
 PodRunning          False
 Completed           True
Created:             Tue Aug 01 17:37:59 +0800 (46 seconds ago)
Started:             Tue Aug 01 17:37:59 +0800 (46 seconds ago)
Finished:            Tue Aug 01 17:38:45 +0800 (now)
Duration:            46 seconds
Progress:            1/1
ResourcesDuration:   21s*(1 cpu),21s*(100Mi memory)

STEP                  TEMPLATE  PODNAME            DURATION  MESSAGE
 ✔ hello-world-7k9x5  whalesay  hello-world-7k9x5  37s  

[admin@minikube argo_project]$ kubectl logs hello-world-7k9x5 -n argo
 _____________ 
< hello world >
 ------------- 
    \
     \
      \     
                    ##        .            
              ## ## ##       ==            
           ## ## ## ##      ===            
       /""""""""""""""""___/ ===        
  ~~~ {~~ ~~~~ ~~~ ~~~~ ~~ ~ /  ===- ~~~   
       \______ o          __/            
        \    \        __/             
          \____\______/   
time="2023-08-01T09:38:36.179Z" level=info msg="sub-process exited" argo=true error="<nil>"

At this point the hello-world argo workflow shows up: ../images/k8s/argo/argo-hello-world.png

The run is also visible from the command line:

[admin@minikube argo_project]$ argo-linux-amd64 -n argo list
NAME                STATUS      AGE   DURATION   PRIORITY   MESSAGE
hello-world-7k9x5   Succeeded   9m    46s        0

GitOps and Kubernetes

This part records notes for understanding CI/CD in Kubernetes; the book is from 2021. ../images/k8s/gitops/gitops-cover.png

Notes

The book's resources can be cloned first: https://github.com/gitopsbook/resources

➜  projects git clone https://github.com/gitopsbook/resources
Cloning into 'resources'...
remote: Enumerating objects: 396, done.
remote: Counting objects: 100% (46/46), done.
remote: Compressing objects: 100% (28/28), done.
remote: Total 396 (delta 22), reused 25 (delta 18), pack-reused 350 (from 1)
Receiving objects: 100% (396/396), 59.79 KiB | 362.00 KiB/s, done.
Resolving deltas: 100% (161/161), done.

Environment requirements

  • Kubectl (v1.16 or later); installing minikube covers this
  • Minikube (v1.4 or later)
  • Bash or the Windows Subsystem for Linux (WSL)

The book's overall structure: ../images/k8s/gitops/gitops-arch.png

Chapter1 Why GitOps

What is GitOps

The book frames GitOps as covering infrastructure configuration and software deployment. To me, the difference from traditional ops is that GitOps defines the deployment configuration and strings the whole flow together, whereas traditionally you cut a release and hand it to ops, or push it through a release platform.

With GitOps, once a version is released, all the subsequent resource preparation and rollout to machines runs through the predefined pipeline.

In the traditional CD flow, many updates need sign-off for production stability, you must be careful not to ship the wrong version, and there is coordination overhead: each team, including QA, CI, DEV, and OPS, does its own part. That suits teams with very high stability requirements, highly complex systems, or a cautious, traditional culture. ../images/k8s/gitops/gitops-traditional.png

This tends to lower release frequency: every release carries communication cost, work is usually routed through tickets, each stage follows its own process, and problems in the middle also take time to feed back.

What is DevOps

DevOps joins Dev and Ops. It is a mindset that connects CI and CD: the ops team no longer takes part in releasing and deploying code; developers or DevOps engineers design and own that distribution.

Benefits of DevOps include:

  • Better collaboration between development and deployment
  • Higher product quality
  • More releases and versions
  • Shorter time to ship new features
  • Lower design, development, and operations cost

The figure below compares the traditional and DevOps styles of operations: ../images/k8s/gitops/tradition-and-devops.png

Teams become small teams, which eases modularization and ownership: each team can do its own testing and releasing, reducing coupling.

The book sees GitOps as a subset of DevOps, with its own characteristics:

  • Best practices for managing, monitoring, and releasing containerized workloads
  • Developer-centric: development and release operations run through pipelines
  • Code and infrastructure changes are tracked through git

In ideal GitOps, no change is manual: a change is deployed automatically after the owner of the module reviews and approves it, and there is a single entry point.

GitOps is a philosophy; concretely, a tool needs to:

  • Keep every state in Git
  • Diff between states
  • Converge to any desired state

IaC

The book then introduces IaC (Infrastructure as Code). Its benefits: repeatable, stable, efficient, cheaper, and the infrastructure is visible.

Self-service

With a gitops-operator, resources that used to require a ticket can be provisioned quickly, supporting fast develop-and-release cycles.

Code reviews

Having code reviewed by other colleagues, including leads and the team, makes it more robust, spreads knowledge, keeps the design consistent, and strengthens team cohesion.

The reviewer does not have to be a person; it can be tooling such as syntax checkers, static analysis, or security scanners.

Git pull requests

The PR we often talk about: a request to merge that creates a temporary branch and a link sent to the maintainer for approval; once approved, the branch is deleted automatically.

../images/k8s/gitops/git-branch-pr.png

Observability

Observability means being able to inspect the current running state of the system and alert when needed. Production must be observable; we can see how the environment is running through APIs, GUIs, or dashboards.

When something goes wrong and the desired and actual state diverge, the diff is plain to see in git. This divergence is called configuration drift.
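A crude way to spot drift by hand is to dump the live object and diff it against the manifest committed to git (a sketch; manifests/myapp.yaml is a hypothetical file in the repo — real GitOps tools such as Argo CD do this continuously):

```shell
# dump the live state of a deployment...
kubectl get deployment myapp -o yaml > /tmp/live.yaml
# ...and compare it with the desired state committed to git
git diff --no-index manifests/myapp.yaml /tmp/live.yaml
```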

Chapter2 Kubernetes and GitOps

Kubernetes

Kubernetes is a container orchestration platform (2014). Docker handled system isolation, but left some things uncovered:

  • How containers communicate with each other
  • How traffic is routed between containers
  • How to scale out when load increases
  • How to scale the cluster's underlying machines

Similar tools include Docker Compose and Apache Mesos.

minikube

I will not repeat minikube setup here; it is easy to look up.

Note: clone https://github.com/gitopsbook/resources first.

➜  /tmp minikube start
😄  minikube v1.31.1 on Debian 12.7
🎉  minikube 1.34.0 is available! Download it: https://github.com/kubernetes/minikube/releases/tag/v1.34.0
💡  To disable this notice, run: 'minikube config set WantUpdateNotification false'

✨  Automatically selected the docker driver. Other choices: qemu2, ssh, none
📌  Using Docker driver with root privileges
👍  Starting control plane node minikube in cluster minikube
🚜  Pulling base image ...
💾  Downloading Kubernetes v1.27.3 preload ...
    > preloaded-images-k8s-v18-v1...:  393.19 MiB / 393.19 MiB  100.00% 1.13 Mi
    > gcr.io/k8s-minikube/kicbase...:  447.62 MiB / 447.62 MiB  100.00% 1.18 Mi
🔥  Creating docker container (CPUs=2, Memory=3900MB) ...
🐳  Preparing Kubernetes v1.27.3 on Docker 24.0.4 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔗  Configuring bridge CNI (Container Networking Interface) ...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🔎  Verifying Kubernetes components...
🌟  Enabled addons: default-storageclass, storage-provisioner
💡  kubectl not found. If you need it, try: 'minikube kubectl -- get pods -A'
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default

Install kubectl and list the pods:

➜  /tmp minikube kubectl -- get pods -A
    > kubectl.sha256:  64 B / 64 B [-------------------------] 100.00% ? p/s 0s
    > kubectl:  46.98 MiB / 46.98 MiB [--------------] 100.00% 2.15 MiB p/s 22s
NAMESPACE     NAME                               READY   STATUS    RESTARTS        AGE
kube-system   coredns-5d78c9869d-rdc9g           1/1     Running   0               2m31s
kube-system   etcd-minikube                      1/1     Running   0               3m
kube-system   kube-apiserver-minikube            1/1     Running   0               2m44s
kube-system   kube-controller-manager-minikube   1/1     Running   1 (2m54s ago)   3m
kube-system   kube-proxy-7qjvh                   1/1     Running   0               2m32s
kube-system   kube-scheduler-minikube            1/1     Running   0               3m
kube-system   storage-provisioner                1/1     Running   1 (107s ago)    2m37s

# for convenience, append this alias to your shell config; I use zsh, so ~/.zshrc
alias kubectl="minikube kubectl --"

Enter the gitopsbook/resources/chapter-02 directory and create the nginx-pod.yaml resource:

➜  chapter-02 git:(master) kubectl apply -f nginx-pod.yaml 
pod/nginx created
➜  chapter-02 git:(master) kubectl get pods
NAME    READY   STATUS     RESTARTS   AGE
nginx   0/1     Init:0/1   0          31s

Now the nginx pod is being prepared; you can use kubectl describe pod nginx to check its status:

➜  chapter-02 git:(master) kubectl describe pod nginx
Name:             nginx
Namespace:        default
Priority:         0
Service Account:  default
Node:             minikube/192.168.49.2
Start Time:       Tue, 10 Sep 2024 11:39:17 +0800
Labels:           <none>
Annotations:      <none>
Status:           Pending
IP:               
IPs:              <none>
Init Containers:
  nginx-init:
    Container ID:  
    Image:         docker/whalesay
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
    Args:
      echo "<pre>$(cowsay -b 'Hello Kubernetes')</pre>" > /data/index.html
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /data from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-pc5g7 (ro)
Containers:
  nginx:
    Container ID:   
    Image:          nginx:1.11
    Image ID:       
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /usr/share/nginx/html from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-pc5g7 (ro)
Conditions:
  Type              Status
  Initialized       False 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  data:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  kube-api-access-pc5g7:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  92s   default-scheduler  Successfully assigned default/nginx to minikube
  Normal  Pulling    88s   kubelet            Pulling image "docker/whalesay"
  Normal  Pulled     8s    kubelet            Successfully pulled image "docker/whalesay" in 1m19.596736044s (1m19.596743758s including waiting)