ElasticSearch

ELK preparation

Adding the package repository

For details, see https://www.elastic.co/guide/en/logstash/7.16/installing-logstash.html#_yum

Debian-based systems

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
sudo apt-get install apt-transport-https
sudo sh -c 'echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" > /etc/apt/sources.list.d/elastic-7.x.list'

CentOS-based systems

sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
sudo tee /etc/yum.repos.d/elastic.repo <<EOF
[logstash-7.x]
name=Elastic repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF

Logstash

Installation

Debian-based systems

sudo apt-get update && sudo apt-get install logstash

CentOS-based systems

sudo yum -y install logstash

ElasticSearch

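Introduction

Reference: https://www.elastic.co/guide/cn/elasticsearch/guide/current/getting-started.html

Elasticsearch is a real-time, distributed search and analytics engine built on the Lucene library. It is mainly used for full-text search, structured search, analytics, and combinations of the three.

Common use cases include system log analysis, application data analysis, security auditing, and keyword search.

ES is document-oriented, so complex objects (for example geographic information and dates) can be stored as they are. This is where it has an advantage over relational databases.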

Installation

# on CentOS
yum -y install elasticsearch
# on Debian
apt-get install elasticsearch

Starting the service

systemctl start elasticsearch
systemctl enable elasticsearch
lsof -i:9200
COMMAND   PID          USER   FD   TYPE   DEVICE SIZE/OFF NODE NAME
java    32481 elasticsearch  280u  IPv4 29805357      0t0  TCP localhost:wap-wsp (LISTEN)

Possible errors:

If startup fails with failed; error='Not enough space' (errno=12), reduce the heap size in the ES JVM options:

liuliancao@liuliancao-dev:~/projects/lion$ sudo cat /etc/elasticsearch/jvm.options|grep Xm
## -Xms4g
## -Xmx4g
-Xms200m
-Xmx200m

Startup takes a while, about 20 seconds on my virtual machine.

Reference JVM options for production

-Xms4g
-Xmx4g
8-13:-XX:+UseConcMarkSweepGC
8-13:-XX:CMSInitiatingOccupancyFraction=75
8-13:-XX:+UseCMSInitiatingOccupancyOnly
14-:-XX:+UseG1GC
14-:-XX:G1ReservePercent=25
14-:-XX:InitiatingHeapOccupancyPercent=30
-Djava.io.tmpdir=${ES_TMPDIR}
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/data/lib/elasticsearch
-XX:ErrorFile=/data/log/elasticsearch/hs_err_pid%p.log
8:-XX:+PrintGCDetails
8:-XX:+PrintGCDateStamps
8:-XX:+PrintTenuringDistribution
8:-XX:+PrintGCApplicationStoppedTime
8:-Xloggc:/data/log/elasticsearch/gc.log
8:-XX:+UseGCLogFileRotation
8:-XX:NumberOfGCLogFiles=32
8:-XX:GCLogFileSize=64m
9-:-Xlog:gc*,gc+age=trace,safepoint:file=/data/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m
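
The numeric prefixes in the options above (8:, 8-13:, 14-:) restrict a flag to certain Java major versions. The following is a minimal sketch of how such prefixes can be read, not the parser Elasticsearch itself uses; the function names option_for and effective_options are made up for this illustration, and SAMPLE is an abbreviated copy of the list above.

```python
import re
from typing import List, Optional

# Optional Java-version prefix: "8:", "8-13:" or "14-:".
_PREFIX = re.compile(r"^(\d+)(-)?(\d+)?:(.*)$")


def option_for(line: str, java_major: int) -> Optional[str]:
    """Return the JVM flag on `line` if it applies to `java_major`, else None."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank lines and comments carry no option
    match = _PREFIX.match(line)
    if match is None:
        return line  # no prefix: the flag applies to every Java version
    low, dash, high, flag = match.groups()
    lower = int(low)
    if dash is None:
        upper = lower          # "8:"    -> exactly Java 8
    elif high is None:
        upper = float("inf")   # "14-:"  -> Java 14 and later
    else:
        upper = int(high)      # "8-13:" -> Java 8 through 13
    return flag if lower <= java_major <= upper else None


def effective_options(text: str, java_major: int) -> List[str]:
    """Collect the flags from `text` that apply to `java_major`."""
    options = []
    for line in text.splitlines():
        flag = option_for(line, java_major)
        if flag is not None:
            options.append(flag)
    return options


# Abbreviated sample of the options listed above.
SAMPLE = """\
-Xms4g
8-13:-XX:+UseConcMarkSweepGC
14-:-XX:+UseG1GC
8:-XX:+PrintGCDetails
"""

if __name__ == "__main__":
    for version in (8, 11, 17):
        print(version, effective_options(SAMPLE, version))
```

Under this reading, Java 11 gets the CMS flag but not the Java 8 GC logging flag, while Java 17 gets G1.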

Testing

cat logstash-first.conf
input { stdin { } }
output {
  elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => rubydebug }
}
# logstash -f logstash-first.conf
Using bundled JDK: /usr/share/logstash/jdk
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
hello, world!
[INFO ] 2021-07-07 11:09:32.287 [Ruby-0-Thread-10: :1] elasticsearch - Installing ILM policy {"policy"=>{"phases"=>{"hot"=>{"actions"=>{"rollover"=>{"max_size"=>"50gb", "max_age"=>"30d"}}}}}} {:name=>"logstash-policy"}
{
      "@version" => "1",
       "message" => "hello, world!",
          "host" => "xxx",
    "@timestamp" => 2021-07-07T03:09:32.187Z
}

This output means the event was written to ES successfully.

Cluster setup

See the cluster setup reference.

Three servers

RESTful API with JSON over HTTP

All interaction goes through port 9200.

liuliancao@liuliancao-dev:~/projects/lion$ sudo lsof -i:9200
COMMAND   PID          USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
java    50762 elasticsearch  284u  IPv6 141518      0t0  TCP localhost:9200 (LISTEN)
java    50762 elasticsearch  285u  IPv6 141519      0t0  TCP localhost:9200 (LISTEN)

Clients are available for Curl, Groovy, JavaScript, .NET, PHP, Perl, Python, and Ruby (https://www.elastic.co/guide/en/elasticsearch/client/index.html)

Curl

curl -X<VERB> '<PROTOCOL>://<HOST>:<PORT>/<PATH>?<QUERY_STRING>' -d '<BODY>'

Counting the documents in the cluster

curl -XGET 'http://localhost:9200/_count?pretty' -d '
{
  "query": {
    "match_all": {}
  }
}
'

The actual result was:

liuliancao@liuliancao-dev:~/projects/lion$ curl -XGET 'http://localhost:9200/_count?pretty' -d ' { "query": { "match_all": {} } } '
{
  "error" : "Content-Type header [application/x-www-form-urlencoded] is not supported",
  "status" : 406
}

The Content-Type header has to be set explicitly. With the header in place, the result below shows that no shards or documents exist yet:

liuliancao@liuliancao-dev:~/projects/lion$ curl -XGET -H 'Content-Type: application/json' 'http://localhost:9200/_count?pretty' -d ' { "query": { "match_all": {} } } '
{
  "count" : 0,
  "_shards" : {
    "total" : 0,
    "successful" : 0,
    "skipped" : 0,
    "failed" : 0
  }
}
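
The same count request can be sketched with Python's standard library, which makes the Content-Type requirement explicit. This is only an illustration: the function names are made up here, and the default http://localhost:9200 assumes the single local node used in these notes.

```python
import json
import urllib.request


def build_count_request(base_url: str = "http://localhost:9200") -> urllib.request.Request:
    """Build (but do not send) a _count request with a JSON body."""
    body = json.dumps({"query": {"match_all": {}}}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/_count?pretty",
        data=body,
        # Without this header ES answers 406, as seen in the curl output.
        headers={"Content-Type": "application/json"},
        method="GET",
    )


def count_documents(base_url: str = "http://localhost:9200") -> int:
    """Send the request and return the "count" field of the response."""
    with urllib.request.urlopen(build_count_request(base_url)) as response:
        return json.load(response)["count"]
```

build_count_request() can be inspected without a server; count_documents() needs a reachable node.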

Objects are stored as JSON
Some ES concepts
Index
Type
Field
Checking cluster status
# curl -XGET 'http://localhost:9200/_cluster/health'
{"cluster_name":"web","status":"red","timed_out":false,"number_of_nodes":6,"number_of_data_nodes":3,"active_primary_shards":4416,"active_shards":4416,"relocating_shards":0,"initializing_shards":12,"unassigned_shards":34046,"delayed_unassigned_shards":0,"number_of_pending_tasks":66,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":907745,"active_shards_percent_as_number":11.477881166502053}
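
As a rough aid for reading that health response, the sketch below parses it and recomputes the active shard percentage. The helper name summarize is made up for this example, and treating the total as active + initializing + unassigned is an assumption that happens to reproduce the reported percentage for this particular response (which has no relocating shards).

```python
import json

# Response copied from the health check above.
HEALTH_JSON = (
    '{"cluster_name":"web","status":"red","timed_out":false,'
    '"number_of_nodes":6,"number_of_data_nodes":3,'
    '"active_primary_shards":4416,"active_shards":4416,'
    '"relocating_shards":0,"initializing_shards":12,'
    '"unassigned_shards":34046,"delayed_unassigned_shards":0,'
    '"number_of_pending_tasks":66,"number_of_in_flight_fetch":0,'
    '"task_max_waiting_in_queue_millis":907745,'
    '"active_shards_percent_as_number":11.477881166502053}'
)


def summarize(health: dict) -> dict:
    """Reduce a _cluster/health response to the numbers worth watching."""
    total = (
        health["active_shards"]
        + health["initializing_shards"]
        + health["unassigned_shards"]
    )
    return {
        "status": health["status"],
        "total_shards": total,
        "unassigned_shards": health["unassigned_shards"],
        "active_percent": 100 * health["active_shards"] / total,
    }


if __name__ == "__main__":
    print(summarize(json.loads(HEALTH_JSON)))
```

Read this way, the cluster is red because 34046 of its 38474 shards are unassigned.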

Listing all indices

curl -X GET "localhost:9200/_cat/indices?v"

Deleting indices by wildcard pattern

DELETE /your-index-pattern*

If you prefer a UI, indices can also be deleted from Index Management in Kibana.
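
Because a wildcard delete cannot be undone, it can help to preview which names a pattern would hit first. The sketch below does this locally with Python's fnmatch, which only approximates Elasticsearch's wildcard handling for simple trailing-* patterns; the index names are hypothetical, and in practice they would come from the _cat/indices listing.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def preview_matches(pattern: str, index_names: Iterable[str]) -> List[str]:
    """Return the index names that a simple wildcard pattern would select."""
    return sorted(name for name in index_names if fnmatchcase(name, pattern))


# Hypothetical index names, for illustration only.
INDEX_NAMES = [
    "logstash-2023.05.02",
    "logstash-2023.05.01",
    "app-2023.05.01",
]

if __name__ == "__main__":
    print(preview_matches("logstash-*", INDEX_NAMES))
```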

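Setting number_of_replicas and number_of_shards on an index

Recently our cluster ran out of shards, so a colleague and I looked at the relevant settings. Index settings are split into static and dynamic ones:

https://www.elastic.co/guide/en/elasticsearch/reference/6.5/index-modules.html#_static_index_settings

https://www.elastic.co/guide/en/elasticsearch/reference/6.5/index-modules.html#dynamic-index-settings

For context, my indices all start with logstash-. If you do not have a matching template, you need to create one. My main goal was to lower number_of_shards and number_of_replicas.

number_of_shards is a static setting, so it cannot be changed on an existing index with a PUT to that index's settings. It can only be set in a template, which affects indices created afterwards. To change existing indices, you have to reindex them.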

Modifying the template

  PUT /_template/logstash
  {
    "index_patterns": ["logstash-*"],
    "settings": {
      "number_of_replicas": 0,
      "number_of_shards": 3
    }
  }
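
To see why lowering these two values relieves shard pressure, note that each index holds number_of_shards primaries plus number_of_replicas copies of each. The sketch below works through that arithmetic and also builds the request body for changing number_of_replicas on existing indices, which is possible because it is a dynamic setting. The function names are made up for this example; the body would be sent with PUT /logstash-*/_settings.

```python
import json


def shards_per_index(number_of_shards: int, number_of_replicas: int) -> int:
    """Total shards one index occupies: primaries plus all replica copies."""
    return number_of_shards * (1 + number_of_replicas)


def replicas_update_body(number_of_replicas: int) -> str:
    """JSON body for an update-settings request (e.g. PUT /logstash-*/_settings)."""
    return json.dumps({"index": {"number_of_replicas": number_of_replicas}})


if __name__ == "__main__":
    # 5 primaries and 1 replica were the defaults before ES 7.0.
    print(shards_per_index(5, 1))   # shards per index with the old defaults
    print(shards_per_index(3, 0))   # shards per index with the template above
    print(replicas_update_body(0))
```

With the template above, every new index occupies 3 shards instead of the 10 it would take with 5 primaries and 1 replica.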

Example of running a reindex and creating an alias

POST _reindex
{
  "source": {
    "index": "xxx-2023.05-x"
  },
  "dest": {
    "index": "xxx-2023.05-x-new"
  }
}
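
The alias step is not spelled out above. One possible way to do it, sketched here as plain request bodies, is to remove the old index and point an alias with the old name at the new index in a single POST /_aliases call. The function names are made up for this example, and the remove_index action deletes the old index, so it should only run after the reindex has been verified.

```python
import json


def reindex_body(source_index: str, dest_index: str) -> dict:
    """Body for POST _reindex, copying source_index into dest_index."""
    return {"source": {"index": source_index}, "dest": {"index": dest_index}}


def alias_swap_body(old_index: str, new_index: str) -> dict:
    """Body for POST /_aliases: delete old_index, then alias its name to new_index."""
    return {
        "actions": [
            {"remove_index": {"index": old_index}},  # destructive: drops the old index
            {"add": {"index": new_index, "alias": old_index}},
        ]
    }


if __name__ == "__main__":
    print(json.dumps(reindex_body("xxx-2023.05-x", "xxx-2023.05-x-new"), indent=2))
    print(json.dumps(alias_swap_body("xxx-2023.05-x", "xxx-2023.05-x-new"), indent=2))
```

After the swap, queries that still use the old index name reach the new index through the alias.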

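Afterwards the cluster was still red. Checking for unassigned shards showed that some remained; after the red indices were deleted, the cluster recovered.

GET _cluster/allocation/explain?pretty

The output reported the allocation problems. The errors are mostly the ones listed at https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-allocation-explain.html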

DSL

Queries

A typical query (see https://www.elastic.co/guide/en/elasticsearch/reference/current/query-filter-context.html):

  GET /_search
  {
    "query": {
      "bool": {
        "must": [
          { "match": { "title":   "Search"        }},
          { "match": { "content": "Elasticsearch" }}
        ],
        "filter": [
          { "term":  { "status": "published" }},
          { "range": { "publish_date": { "gte": "2015-01-01" }}}
        ]
      }
    }
  }
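
In this query the must clauses run in query context and contribute to the relevance score, while the filter clauses only include or exclude documents. The same body can be assembled programmatically; the helper name bool_query below is made up for this sketch.

```python
import json
from typing import List


def bool_query(must: List[dict], filters: List[dict]) -> dict:
    """Combine scored clauses (must) with non-scoring clauses (filter)."""
    return {"query": {"bool": {"must": must, "filter": filters}}}


# The typical query shown above, rebuilt from its parts.
TYPICAL = bool_query(
    must=[
        {"match": {"title": "Search"}},
        {"match": {"content": "Elasticsearch"}},
    ],
    filters=[
        {"term": {"status": "published"}},
        {"range": {"publish_date": {"gte": "2015-01-01"}}},
    ],
)

if __name__ == "__main__":
    print(json.dumps(TYPICAL, indent=2))
```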

Regular expression matching

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html

  GET /_search
  {
    "query": {
      "regexp": {
        "user.id": {
          "value": "k.*y",
          "flags": "ALL",
          "case_insensitive": true,
          "max_determinized_states": 10000,
          "rewrite": "constant_score"
        }
      }
    }
  }
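
One detail that is easy to miss: the regexp query's pattern is anchored, so it has to match the whole term rather than a substring. The sketch below imitates that with Python's re.fullmatch. Python's regex syntax is not Lucene's, so this is only a reasonable stand-in for simple patterns such as k.*y; the function name regexp_matches is made up here.

```python
import re


def regexp_matches(term: str, pattern: str = "k.*y", case_insensitive: bool = True) -> bool:
    """Approximate an anchored regexp match against a single indexed term."""
    flags = re.IGNORECASE if case_insensitive else 0
    return re.fullmatch(pattern, term, flags) is not None


if __name__ == "__main__":
    for term in ("kimchy", "KY", "kimchys", "akimchy"):
        print(term, regexp_matches(term))
```

Terms such as kimchys or akimchy contain a match but are not matched as a whole, so they are rejected.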

Aggregation queries

Sorting buckets by count within an aggregation

  "aggs": {
    "hours data": {
      "date_histogram": {
        "field": "@timestamp",
        "calendar_interval": "1m",
        "time_zone": "Asia/Shanghai",
        "min_doc_count": 100,
        "order": {
          "_count": "desc"
        }
      }
    }
  }
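
To make the effect of min_doc_count and the _count ordering concrete, here is a small local imitation of the aggregation: it buckets timestamps per minute, drops buckets below a threshold, and sorts by count descending. It is a sketch, not how Elasticsearch computes the result. The events are invented, time zone handling is left out, and breaking ties by bucket key is an assumption of this example.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, List, Tuple


def minute_histogram(
    timestamps: Iterable[datetime], min_doc_count: int
) -> List[Tuple[datetime, int]]:
    """Bucket timestamps per minute, filter small buckets, sort by count desc."""
    counts = Counter(ts.replace(second=0, microsecond=0) for ts in timestamps)
    buckets = [(minute, n) for minute, n in counts.items() if n >= min_doc_count]
    # Highest count first; equal counts fall back to the earlier minute.
    return sorted(buckets, key=lambda bucket: (-bucket[1], bucket[0]))


# Invented events, for illustration only.
EVENTS = [
    datetime(2021, 7, 7, 10, 2, 1),
    datetime(2021, 7, 7, 10, 0, 5),
    datetime(2021, 7, 7, 10, 0, 30),
    datetime(2021, 7, 7, 10, 1, 10),
    datetime(2021, 7, 7, 10, 0, 59),
    datetime(2021, 7, 7, 10, 2, 2),
]

if __name__ == "__main__":
    for minute, count in minute_histogram(EVENTS, min_doc_count=2):
        print(minute.isoformat(), count)
```

With min_doc_count set to 2, the 10:01 bucket (one event) disappears and the 10:00 bucket (three events) comes first.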

kibana

Trying it out

Open the server address on port 5601 in a browser. Putting Kibana behind nginx with SSL is recommended for better security.

FAQ

ES reports errors and Kibana cannot start

shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_inforeason=ALLOCATION_FAILED], at[2024-03-24T12:14:02.651Z], failed_attempts[5], failed_nodes[[joxyW01nTNCGvFW1IjPQMQ, JaEcQBEZTOiztZdlj-iZBw, delayed=false, details[failed shard on node [JaEcQBEZTOiztZdlj-iZBw]: failed recovery, failure RecoveryFailedExceptionlogstash-overseas-ssjj2-hall-server_accesslog-2024.03; nested: CircuitBreakingExceptionparent] Data too large, data for [internal:index/shard/recovery/start_recovery] would be [4212820374/3.9gb], which is larger than the limit of [4080218931/3.7gb], real usage: [4212805912/3.9gb], new bytes reserved: [14462/14.1kb], usages [request=0/0b, fielddata=259024/252.9kb, in_flight_requests=23636/23kb, model_inference=0/0b, eql_sequence=0/0b, accounting=449937262/429mb; ], allocation_status[no_attempt]]]

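As a result, Kibana kept going down, the ES status stayed abnormal, and the active shard percentage never reached 100%.

Fix: add the following to elasticsearch.yml

indices.breaker.total.use_real_memory: false
indices.breaker.total.limit: 70%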

After adding these settings, the cluster status turned green.
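
The numbers in the error message can be checked by hand. The sketch below assumes a 4 GiB heap (the -Xmx4g from the production options, which is also what the logged limit implies). With use_real_memory left at its default of true, the parent breaker limit defaults to 95% of the heap and is compared against real heap usage. With use_real_memory set to false, only the memory reserved by the child breakers counts. The variable names are made up for this example, and the byte counts are copied from the log.

```python
GIB = 1024 ** 3


def parent_breaker_limit(heap_bytes: int, percent: int) -> int:
    """Parent circuit breaker limit as a whole-number percentage of the heap."""
    return heap_bytes * percent // 100


HEAP = 4 * GIB  # assumption: -Xms4g / -Xmx4g

# Default behaviour (use_real_memory: true): 95% of heap vs. real usage.
LIMIT_REAL_MEMORY = parent_breaker_limit(HEAP, 95)
WOULD_BE = 4212820374  # "would be" value from the log

# After the fix (use_real_memory: false, limit 70%): only reserved bytes count.
LIMIT_70 = parent_breaker_limit(HEAP, 70)
RESERVED = (
    0            # request
    + 259024     # fielddata
    + 23636      # in_flight_requests
    + 449937262  # accounting
    + 14462      # new bytes reserved
)

if __name__ == "__main__":
    print("95% limit:", LIMIT_REAL_MEMORY, "tripped:", WOULD_BE > LIMIT_REAL_MEMORY)
    print("70% limit:", LIMIT_70, "tripped:", RESERVED > LIMIT_70)
```

Under these assumptions the 95% limit equals the 4080218931 bytes in the log, and the reserved total stays far below the 70% limit, which is consistent with the shard recovery succeeding after the change.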