Etcd

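What is etcd

etcd is a distributed key-value store built on the Raft consensus protocol. It can serve as a configuration center for our services or as a simple KV store, and consumers can watch etcd to detect configuration changes.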

Use cases

Service discovery

Message publishing and subscription

Load balancing

Distributed locks

Installing etcd

See the official etcd installation documentation for details.

CentOS family

sudo yum -y install etcd

Debian family

sudo apt-get install etcd

Logs

Once the service is up, view the logs with journalctl -f -t etcd or journalctl -u etcd.

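Cluster setup

I recommend reading the configuration at https://etcd.io/docs/v3.3/demo/ first. I use three servers as the example here. Note that two machines do not give you a fault-tolerant cluster: an N-member etcd cluster tolerates at most (N-1)/2 failed members, rounded down, so with 2 members the cluster stops working as soon as either one goes down.

You can use lower-spec machines and scale out to 3 or 5 nodes.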
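
The (N-1)/2 rule above is easy to tabulate. The following is a minimal sketch, not anything shipped with etcd: the function names quorum and fault_tolerance are made up for illustration, and the arithmetic is plain majority quorum.

```shell
#!/usr/bin/env bash
# quorum N: number of members that must be up for the cluster to make progress.
quorum() { echo $(( $1 / 2 + 1 )); }

# fault_tolerance N: number of members that may fail, i.e. (N-1)/2 rounded down.
fault_tolerance() { echo $(( ($1 - 1) / 2 )); }

# Print a small table for typical cluster sizes.
for n in 1 2 3 4 5; do
  printf 'members=%d quorum=%d tolerated_failures=%d\n' \
    "$n" "$(quorum "$n")" "$(fault_tolerance "$n")"
done
```

The table shows why 2 members tolerate no failure, and why 4 members tolerate no more failures than 3, which is the usual argument for odd cluster sizes.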

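Be careful here: if the machines were created from an image or by cloning, you may hit all sorts of errors, most likely because the clone carries over the original machine's etcd data. A better approach is to run the following on the cloned machine (if etcd then fails with the CHDIR error described in the troubleshooting notes further down, move only the member directory instead)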

mv /var/lib/etcd /var/lib/etcd.bak

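Here is my configuration. After configuring, run systemctl start etcd. It may hang for a while and returns once the other members have come up.

If you see errors, check the output of journalctl -f -t etcd and journalctl -u etcd.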

[root@terraform-server01 ~]# cat /etc/etcd/etcd.conf | egrep -v '^$|^#'
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="http://172.31.253.73:2380"
ETCD_LISTEN_CLIENT_URLS="http://127.0.0.1:2379,http://172.31.253.73:2379"
ETCD_NAME="terraform-server01"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.31.253.73:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://172.31.253.73:2379"
ETCD_INITIAL_CLUSTER="terraform-server01=http://172.31.253.73:2380,terraform-server02=http://172.31.253.75:2380,terraform-server03=http://172.31.253.76:2380"
ETCD_INITIAL_CLUSTER_TOKEN="battle-etcd"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_ENABLE_V2="true"
ETCD_DEBUG="true"
ETCD_LOG_OUTPUT="default"

[root@terraform-server02 ~]# cat /etc/etcd/etcd.conf | egrep -v '^$|^#'
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="http://172.31.253.75:2380"
ETCD_LISTEN_CLIENT_URLS="http://172.31.253.75:2379,http://127.0.0.1:2379"
ETCD_NAME="terraform-server02"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.31.253.75:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://172.31.253.75:2379"
ETCD_INITIAL_CLUSTER="terraform-server01=http://172.31.253.73:2380,terraform-server02=http://172.31.253.75:2380,terraform-server03=http://172.31.253.76:2380"
ETCD_INITIAL_CLUSTER_TOKEN="battle-etcd"
ETCD_INITIAL_CLUSTER_STATE="existing"
ETCD_ENABLE_V2="true"
ETCD_DEBUG="true"

[root@terraform-server03 ~]# cat /etc/etcd/etcd.conf | egrep -v '^$|^#'
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="http://172.31.253.76:2380"
ETCD_LISTEN_CLIENT_URLS="http://172.31.253.76:2379,http://127.0.0.1:2379"
ETCD_NAME="terraform-server03"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.31.253.76:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://172.31.253.76:2379"
ETCD_INITIAL_CLUSTER="terraform-server01=http://172.31.253.73:2380,terraform-server02=http://172.31.253.75:2380,terraform-server03=http://172.31.253.76:2380"
ETCD_INITIAL_CLUSTER_TOKEN="battle-etcd"
ETCD_INITIAL_CLUSTER_STATE="existing"
ETCD_ENABLE_V2="true"
ETCD_DEBUG="true"

[root@terraform-server ~]# etcdctl member list
52b8eb31b4811ac, started, terraform-server-bak, http://172.31.253.75:2380, http://172.31.253.75:2379
e526595207a29ee6, started, terraform-server, http://172.31.253.73:2380, http://172.31.253.73:2379
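
The ETCD_INITIAL_CLUSTER value has to be identical on every member, and typing it by hand three times is an easy place to make a mistake. Here is a small sketch that builds the string from name=ip pairs. The helper name initial_cluster is hypothetical, and it assumes every peer uses http and port 2380 as in the configs above.

```shell
#!/usr/bin/env bash
# initial_cluster NAME=IP...: print the ETCD_INITIAL_CLUSTER value.
# Assumption: all peers listen on http://IP:2380.
initial_cluster() {
  local out="" pair name ip
  for pair in "$@"; do
    name="${pair%%=*}"   # text before the first '='
    ip="${pair#*=}"      # text after the first '='
    out="${out:+$out,}${name}=http://${ip}:2380"
  done
  echo "$out"
}

# Same three members as in the configuration above.
initial_cluster \
  terraform-server01=172.31.253.73 \
  terraform-server02=172.31.253.75 \
  terraform-server03=172.31.253.76
```

The output can be pasted into /etc/etcd/etcd.conf on each node, so all members are guaranteed to agree on the member list.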

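Common etcd commands

Make sure etcd is running. Most etcd commands go through etcdctl.

A quick look at the command line

Run etcdctl with no arguments to see the available commands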

liuliancao@liuliancao:~/Documents/study$ etcdctl
NAME:
        etcdctl - A simple command line client for etcd3.

USAGE:
        etcdctl [flags]

VERSION:
        3.3.25

API VERSION:
        3.3


COMMANDS:
        alarm disarm		Disarms all alarms # clear alarms
        alarm list		Lists all alarms # list alarms
        auth disable		Disables authentication # turn authentication off
        auth enable		Enables authentication # turn authentication on
        check perf		Check the performance of the etcd cluster # benchmark QPS
        compaction		Compacts the event history in etcd # discard old revision history
        defrag			Defragments the storage of the etcd members with given endpoints # reclaim storage space
        del			Removes the specified key or range of keys [key, range_end) # delete one key or a range of keys
        elect			Observes and participates in leader election # leader election
        endpoint hashkv		Prints the KV history hash for each endpoint in --endpoints # print the hash of the KV history
        endpoint health		Checks the healthiness of endpoints specified in `--endpoints` flag # endpoint health check
        endpoint status		Prints out the status of endpoints specified in `--endpoints` flag # endpoint status
        get			Gets the key or a range of keys # read the value of one or more keys
        help			Help about any command
        lease grant		Creates leases # create a lease with a TTL
        lease keep-alive	Keeps leases alive (renew) # renew a lease
        lease list		List all active leases # list all leases
        lease revoke		Revokes leases # cancel a lease
        lease timetolive	Get lease information # remaining time of a lease
        lock			Acquires a named lock # acquire a lock
        make-mirror		Makes a mirror at the destination etcd cluster # mirror a cluster
        member add		Adds a member into the cluster # add a cluster member
        member list		Lists all members in the cluster # list cluster members
        member remove		Removes a member from the cluster # remove a cluster member
        member update		Updates a member in the cluster # update a member's information
        migrate			Migrates keys in a v2 store to a mvcc store # migrate v2 data to MVCC
        move-leader		Transfers leadership to another etcd cluster member. # switch to a new leader
        put			Puts the given key into the store # write a key
        role add		Adds a new role # add a role
        role delete		Deletes a role # delete a role
        role get		Gets detailed information of a role # show a role
        role grant-permission	Grants a key to a role # grant a permission to a role
        role list		Lists all roles # list all roles
        role revoke-permission	Revokes a key from a role # remove a permission from a role
        snapshot restore	Restores an etcd member snapshot to an etcd directory # restore a snapshot
        snapshot save		Stores an etcd node backend snapshot to a given file # save a snapshot
        snapshot status		Gets backend snapshot status of a given file # snapshot status
        txn			Txn processes all the requests in one transaction # DSL (or SQL-like syntax) for etcd v3
        user add		Adds a new user # add a user
        user delete		Deletes a user # delete a user
        user get		Gets detailed information of a user # show a user
        user grant-role		Grants a role to a user # grant a role to a user
        user list		Lists all users # list all users
        user passwd		Changes password of user # set a user's password
        user revoke-role	Revokes a role from a user # remove a role from a user
        version			Prints the version of etcdctl # version
        watch			Watches events stream on keys or prefixes # watch for value changes in etcd

OPTIONS:
      --cacert=""				verify certificates of TLS-enabled secure servers using this CA bundle
      --cert=""					identify secure client using this TLS certificate file
      --command-timeout=5s			timeout for short running command (excluding dial timeout)
      --debug[=false]				enable client-side debug logging
      --dial-timeout=2s				dial timeout for client connections
  -d, --discovery-srv=""			domain name to query for SRV records describing cluster endpoints
      --endpoints=[127.0.0.1:2379]		gRPC endpoints
  -h, --help[=false]				help for etcdctl
      --hex[=false]				print byte strings as hex encoded strings
      --insecure-discovery[=true]		accept insecure SRV records describing cluster endpoints
      --insecure-skip-tls-verify[=false]	skip server certificate verification (CAUTION: this option should be enabled only for testing purposes)
      --insecure-transport[=true]		disable transport security for client connections
      --keepalive-time=2s			keepalive time for client connections
      --keepalive-timeout=6s			keepalive timeout for client connections
      --key=""					identify secure client using this TLS key file
      --user=""					username[:password] for authentication (prompt if password is not supplied)
  -w, --write-out="simple"			set the output format (fields, json, protobuf, simple, table) # to keep a particular format permanently, edit the corresponding systemd service file

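Like a lot of software, etcd changed substantially between v2 and v3. etcdctl defaults to the v2 API here (3.3 and earlier default to v2, 3.4 and later default to v3), so remember to add this line to /etc/profile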

echo export ETCDCTL_API=3 >> /etc/profile
export ETCDCTL_API=3

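Part of what follows comes from the official documentation. You can also work through the official docs yourself, starting with the official demo.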

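Checking cluster health

cluster-health is a v2 command, so it needs ETCDCTL_API=2; with the v3 API the equivalent check is etcdctl endpoint health.

liuliancao@liuliancao:~/Documents/study$ ETCDCTL_API=2 etcdctl cluster-health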
member 8e9e05c52164694d is healthy: got healthy result from http://localhost:2379
cluster is healthy

Adding a key

liuliancao@liuliancao:~/Documents/study$ etcdctl put liuliancao-os debian-bookworm
OK

Reading a key

liuliancao@liuliancao:~/Documents/study$ etcdctl get liuliancao-os
liuliancao-os
debian-bookworm

Inspecting the backend

etcdctl endpoint status -w table

Getting the current revision

etcdctl get revisiontestkey -w json
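
The revision sits in the header of the JSON response, even when the key itself does not exist. The following is a rough sketch for pulling the number out with standard text tools. The function name current_revision and the ETCDCTL override variable are hypothetical conveniences, and it assumes ETCDCTL_API=3 is exported and that the JSON is compact (no spaces after colons), as in the output shown in this post.

```shell
#!/usr/bin/env bash
# current_revision [KEY]: print the store revision from the response header.
# ETCDCTL may point at another binary or a stub; it defaults to etcdctl.
current_revision() {
  "${ETCDCTL:-etcdctl}" get "${1:-revisiontestkey}" -w json \
    | grep -o '"revision":[0-9]*' \
    | head -n 1 \
    | cut -d: -f2
}

# Usage against a running cluster:
#   export ETCDCTL_API=3
#   current_revision
```

The pattern starts with a double quote, so it matches the header field "revision" but not "mod_revision" or "create_revision".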

How to read historical values

etcdctl get --rev=REVISION KEY. Let's look at a set of examples first

liuliancao@liuliancao-dev:~$ etcdctl get revisiontestkey  -w json
{"header":{"cluster_id":14841639068965178418,"member_id":10276657743932975437,"revision":8961,"raft_term":2}}
liuliancao@liuliancao-dev:~$ etcdctl put a 1
OK
liuliancao@liuliancao-dev:~$ etcdctl get revisiontestkey  -w json
{"header":{"cluster_id":14841639068965178418,"member_id":10276657743932975437,"revision":8962,"raft_term":2}}
liuliancao@liuliancao-dev:~$ etcdctl put b 1
OK
liuliancao@liuliancao-dev:~$ etcdctl get revisiontestkey  -w json
{"header":{"cluster_id":14841639068965178418,"member_id":10276657743932975437,"revision":8963,"raft_term":2}}
liuliancao@liuliancao-dev:~$ etcdctl put a 0
OK
liuliancao@liuliancao-dev:~$ etcdctl get revisiontestkey  -w json
{"header":{"cluster_id":14841639068965178418,"member_id":10276657743932975437,"revision":8964,"raft_term":2}}
liuliancao@liuliancao-dev:~$ etcdctl put b 0
OK
liuliancao@liuliancao-dev:~$ etcdctl get revisiontestkey  -w json
{"header":{"cluster_id":14841639068965178418,"member_id":10276657743932975437,"revision":8965,"raft_term":2}}
liuliancao@liuliancao-dev:~$ etcdctl get a --rev=8960
liuliancao@liuliancao-dev:~$ etcdctl get a --rev=8961
liuliancao@liuliancao-dev:~$ etcdctl get a --rev=8962
a
1
liuliancao@liuliancao-dev:~$ etcdctl get a --rev=8963
a
1
liuliancao@liuliancao-dev:~$ etcdctl get a --rev=8964
a
0
liuliancao@liuliancao-dev:~$ etcdctl get a --rev=8965
a
0
liuliancao@liuliancao-dev:~$ etcdctl get a --rev=0
a
0
liuliancao@liuliancao-dev:~$ etcdctl get a --rev=-1
a
0
liuliancao@liuliancao-dev:~$ etcdctl get a --rev=-2
a
0
liuliancao@liuliancao-dev:~$ etcdctl get a --rev=-3
a
0

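From this we can conclude the following

  • The revision is store-wide: every change to the store produces a new revision.
  • --rev=NUMBER: 0 means the latest (in the run above, negative values behaved the same way). If the requested revision has already been compacted, the read fails with a message saying so. For a revision after the compaction point, the result is empty if a had not been created yet. Once a has been created, a put issued while the store is at revision R becomes visible at revision R+1.

So how do we get the previous value of a key? I can think of two approaches, plus one open question
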

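  • Add a watch on the server side, etcdctl watch --prev-kv KEY or etcdctl watch --rev=xx KEY, and write the history to a log. To get the previous value, search backwards through the log for that key.
  • Use etcdctl to get the current revision, then walk backwards through earlier revisions until a different value turns up.
  • Does etcdctl have built-in logic to fetch the value of the previous version?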
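
A shortcut for the second idea: the v3 JSON output of get includes the key's mod_revision, the revision of its last change, so reading the key at mod_revision - 1 returns the value from just before that change without walking every revision. Below is a rough sketch under a few assumptions: ETCDCTL_API=3 is exported, the needed revision has not been compacted, and the function names mod_revision and previous_value plus the ETCDCTL override variable are made up for this example.

```shell
#!/usr/bin/env bash
# ETCDCTL may be overridden with another binary or a stub; defaults to etcdctl.
ETCDCTL="${ETCDCTL:-etcdctl}"

# mod_revision KEY: print the revision at which KEY was last modified.
mod_revision() {
  "$ETCDCTL" get "$1" -w json \
    | grep -o '"mod_revision":[0-9]*' \
    | head -n 1 \
    | cut -d: -f2
}

# previous_value KEY: print the value KEY had one revision before its last change.
# Prints nothing if the key did not exist at that revision.
previous_value() {
  local key="$1" rev
  rev="$(mod_revision "$key")"
  if [ -z "$rev" ]; then
    echo "key not found: $key" >&2
    return 1
  fi
  "$ETCDCTL" get "$key" --rev="$((rev - 1))" --print-value-only
}

# Usage against a running cluster:
#   export ETCDCTL_API=3
#   previous_value a
```

In the session above, a was last written at revision 8964, so this would read a at revision 8963.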

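Creating a key only if it does not exist

etcdctl mk KEY VALUE

If the value of the key is "hello, etcd", replace it with "goodbye, etcd":

etcdctl set --swap-with-value "hello, etcd" /message "goodbye, etcd"

Both mk and set are v2 commands, so they need ETCDCTL_API=2.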
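
With the v3 API, mk and set are not available; the closest equivalent I know of is etcdctl txn, which takes compare lines, a blank line, the operations to run on success, a blank line, the operations to run on failure, and a final blank line. The sketch below only builds that text. The helper names create_if_absent_txn and swap_txn are hypothetical, and keys or values containing double quotes are not handled.

```shell
#!/usr/bin/env bash
# create_if_absent_txn KEY VALUE: txn body that writes KEY only if it does not
# exist yet (a key that does not exist has version 0).
create_if_absent_txn() {
  printf 'version("%s") = "0"\n\nput %s "%s"\n\n\n' "$1" "$1" "$2"
}

# swap_txn KEY OLD NEW: txn body that writes NEW only if KEY currently holds OLD.
swap_txn() {
  printf 'value("%s") = "%s"\n\nput %s "%s"\n\n\n' "$1" "$2" "$1" "$3"
}

# Show what would be sent for the example from the text.
swap_txn /message "hello, etcd" "goodbye, etcd"

# Usage against a running cluster (requires ETCDCTL_API=3):
#   create_if_absent_txn /message "hello, etcd" | etcdctl txn
#   swap_txn /message "hello, etcd" "goodbye, etcd" | etcdctl txn
```

The failure branch is left empty, so when the comparison does not hold nothing is written.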

Problems you may run into

cluster ID mismatch

12月 22 14:06:41 terraform-server-bak etcd[3943]: request sent was ignored (cluster ID mismatch: peer[e526595207a29ee6]=cdf818194e3a8c32, local=13a387fc610c3b
12月 22 14:06:41 terraform-server-bak etcd[3943]: failed to dial e526595207a29ee6 on stream Message (cluster ID mismatch)

In this case, back up the data directory on the master side and then delete it. After that the member can register.

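etcdctl member list never shows the other member

There are many possible causes:

  • Check the security groups.
  • Check the logs with journalctl -u etcd.
  • Check that the configuration matches the official docs and nothing is mistyped. Test both ports, 2380 and 2379, and check whether etcd is listening only on localhost.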

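Error: dial tcp 127.0.0.1:2379: connect: connection refused when running put

etcd is not listening on 127.0.0.1, so telnet localhost 2379 fails as well.

The fix is to change the client listen URL to 0.0.0.0:2379 (not recommended) or to list two URLs (recommended): http://your_ip:2379,http://localhost:2379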
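
A quick way to spot this misconfiguration before starting etcd is to check whether the client listen URLs include a loopback or wildcard address. This is only an illustrative sketch: the function name listens_on_loopback is made up, and the sample address stands in for your_ip.

```shell
#!/usr/bin/env bash
# listens_on_loopback URLS: succeed if the comma-separated URL list contains
# a loopback or wildcard address, so that 127.0.0.1:2379 is reachable locally.
listens_on_loopback() {
  case "$1" in
    *"://127.0.0.1:"*|*"://localhost:"*|*"://0.0.0.0:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Compare a single-address value with the recommended two-address value.
for urls in "http://172.31.253.73:2379" \
            "http://172.31.253.73:2379,http://127.0.0.1:2379"; do
  if listens_on_loopback "$urls"; then
    echo "ok:   $urls"
  else
    echo "fail: $urls (a local etcdctl would get connection refused)"
  fi
done
```

On a real node the argument would be the ETCD_LISTEN_CLIENT_URLS value from /etc/etcd/etcd.conf.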

Failed at step CHDIR spawning /bin/bash: No such file or directory

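Don't go overboard when backing up the directory. Renaming the member directory under /var/lib/etcd to a .bak name is enough. If you run mv /var/lib/etcd /var/lib/etcd.bak, you get this error, presumably because the systemd unit's working directory no longer exists.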
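
The safer backup described above can be wrapped in a small function so the parent directory is never touched. This is a sketch only: backup_member_dir is a hypothetical helper, the timestamp suffix is my own choice, and the data directory path in the usage comment is the ETCD_DATA_DIR from the configs in this post.

```shell
#!/usr/bin/env bash
# backup_member_dir DATA_DIR: rename DATA_DIR/member to member.bak.<timestamp>
# and print the new path. DATA_DIR itself stays in place.
# Stop etcd before calling this.
backup_member_dir() {
  local data_dir="$1" backup
  if [ ! -d "$data_dir/member" ]; then
    echo "no member directory under $data_dir" >&2
    return 1
  fi
  backup="$data_dir/member.bak.$(date +%Y%m%d%H%M%S)"
  mv "$data_dir/member" "$backup" && echo "$backup"
}

# Usage:
#   systemctl stop etcd
#   backup_member_dir /var/lib/etcd/default.etcd
#   systemctl start etcd
```

Because only the member subdirectory moves, /var/lib/etcd still exists when systemd starts the service.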

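Last resort when nothing else works

On every node, run rm -rf /var/lib/etcd/default.etcd, then run systemctl start etcd on all of them at the same time. This wipes all data in the cluster, so only do it when the data is disposable or backed up.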