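Environment: Ubuntu 12.04, Curator 4.1.0
Once an ELK logging stack has been running for a while, the data stored in Elasticsearch keeps growing. Because that data is log output, we usually only need to keep it for a limited period rather than forever. Curator is a small tool that handles exactly this kind of scheduled cleanup, and this post shows how to use it to delete old Elasticsearch data automatically.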
Installing Curator
Add the public signing key:
wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
Add the package source:
# Create the file /etc/apt/sources.list.d/curator.list
sudo touch /etc/apt/sources.list.d/curator.list
# Add the following line to that file and save it
deb http://packages.elastic.co/curator/4/debian stable main
Then update the package index and install:
sudo apt-get update && sudo apt-get install python-elasticsearch-curator
Once installation finishes, check the version with the following command:
curator --version
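Configuring Curator
By default, Curator looks for its configuration file at ~/.curator/curator.yml
We need to create the configuration file ourselves. It can live in any directory, as long as its path is passed with the --config option.
Create a configuration file like this: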
# Remember, leave a key empty if there is no value.  None will be a string,
# not a Python "NoneType"
client:
  # Address of the Elasticsearch service
  hosts:
    - 127.0.0.1
  # Port of the Elasticsearch service
  port: 9200
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  aws_key:
  aws_secret_key:
  aws_region:
  ssl_no_validate: False
  http_auth:
  timeout: 30
  master_only: False
logging:
  loglevel: INFO
  logfile:
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']
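Before pointing Curator at the cluster, it can help to confirm that the host and port in the client section are reachable and to see which indices currently exist. The following is a minimal sketch that uses Elasticsearch's _cat/indices endpoint through curl. The function name list_indices is made up for illustration, and the host, port, and pattern are assumed to match the configuration above.

```shell
#!/bin/sh
# Hypothetical helper: list the names of indices matching a pattern.
# Arguments: host, port, index pattern (for example 'topbeat-*').
list_indices() {
    host=$1
    port=$2
    pattern=$3
    # h=index limits the _cat output to the index name column.
    curl -s "http://$host:$port/_cat/indices/$pattern?h=index"
}
```

For example, `list_indices 127.0.0.1 9200 'topbeat-*'` should print one index name per line if the service is up.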
Creating the index deletion actions
Create a file named delete_indeces.yml:
# Remember, leave a key empty if there is no value.  None will be a string,
# not a Python "NoneType"
#
# Also remember that all examples have 'disable_action' set to True.  If you
# want to use this action as a template, be sure to set this to False after
# copying it.
actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 2 days (based on index name), for topbeat-
      prefixed indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: false
    filters:
    - filtertype: pattern
      kind: prefix
      value: topbeat-
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 2
      exclude:
  2:
    action: delete_indices
    description: >-
      Delete indices older than 2 days (based on index name), for filebeat-
      prefixed indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: false
    filters:
    - filtertype: pattern
      kind: prefix
      value: filebeat-
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 2
      exclude:
The file above defines two deletion actions, which remove topbeat- and filebeat- indices that are more than two days old.
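To make the combination of the prefix and age filters more concrete, here is a rough shell sketch of the selection logic. It is not Curator's implementation, only an illustration under a few assumptions: index names end in a date formatted as %Y.%m.%d, a date command that accepts -d (such as GNU date) is available, and the helper name older_than is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: read index names on stdin and print those that start
# with the given prefix and whose date suffix is older than N days.
# Arguments: prefix, number of days, reference time ("YYYY-MM-DD hh:mm:ss", UTC).
older_than() {
    prefix=$1
    days=$2
    now=$3
    # Cutoff in epoch seconds: reference time minus N days.
    cutoff=$(( $(date -u -d "$now" +%s) - days * 86400 ))
    while read -r name; do
        # Skip names that do not start with the prefix.
        case $name in
            "$prefix"*) ;;
            *) continue ;;
        esac
        # Strip the prefix and turn 2016.09.24 into 2016-09-24.
        stamp=${name#"$prefix"}
        iso=$(printf '%s' "$stamp" | tr . -)
        # Skip names whose suffix is not a parseable date.
        ts=$(date -u -d "$iso 00:00:00" +%s 2>/dev/null) || continue
        if [ "$ts" -lt "$cutoff" ]; then
            printf '%s\n' "$name"
        fi
    done
}
```

Feeding it topbeat-2016.09.24 through topbeat-2016.09.27 with a retention of 2 days and a reference time of "2016-09-27 13:11:46" prints topbeat-2016.09.24 and topbeat-2016.09.25, because both of those dates fall before the cutoff of 2016-09-25 13:11:46.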
Running the deletion actions
With the configuration in place, test it with the following command:
curator --config /path/to/curator.yml /path/to/delete_indeces.yml
If you see deletion log output like the following, the configuration works:
2016-09-27 13:11:46,913 INFO Preparing Action ID: 1, "delete_indices"
2016-09-27 13:11:46,922 INFO Trying Action ID: 1, "delete_indices": Delete indices older than 2 days (based on index name), for topbeat- prefixed indices. Ignore the error if the filter does not result in an actionable list of indices (ignore_empty_list) and exit cleanly.
2016-09-27 13:11:46,952 INFO Deleting selected indices: [u'topbeat-2016.09.24', u'topbeat-2016.09.25', u'topbeat-2016.09.22', u'topbeat-2016.09.23']
2016-09-27 13:11:46,952 INFO ---deleting index topbeat-2016.09.24
2016-09-27 13:11:46,952 INFO ---deleting index topbeat-2016.09.25
2016-09-27 13:11:46,952 INFO ---deleting index topbeat-2016.09.22
2016-09-27 13:11:46,952 INFO ---deleting index topbeat-2016.09.23
2016-09-27 13:11:47,100 INFO Action ID: 1, "delete_indices" completed.
2016-09-27 13:11:47,100 INFO Preparing Action ID: 2, "delete_indices"
2016-09-27 13:11:47,102 INFO Trying Action ID: 2, "delete_indices": Delete indices older than 2 days (based on index name), for filebeat- prefixed indices. Ignore the error if the filter does not result in an actionable list of indices (ignore_empty_list) and exit cleanly.
2016-09-27 13:11:47,132 INFO Skipping action "delete_indices" due to empty list: <class 'curator.exceptions.NoIndices'>
2016-09-27 13:11:47,132 INFO Action ID: 2, "delete_indices" completed.
2016-09-27 13:11:47,132 INFO Job completed.
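Because the command above deletes data immediately, it may be worth previewing the result first. Curator's command line accepts a --dry-run flag that logs what each action would do without changing anything. The wrapper below is only a sketch: the function name preview_curator and the CURATOR_BIN override are assumptions added for illustration.

```shell
#!/bin/sh
# Hypothetical helper: show what an action file would do without deleting.
# Arguments: path to curator.yml, path to the action file.
# CURATOR_BIN can be set to the full path of the curator executable.
preview_curator() {
    config=$1
    actions=$2
    "${CURATOR_BIN:-curator}" --dry-run --config "$config" "$actions"
}
```

A call such as `preview_curator /path/to/curator.yml /path/to/delete_indeces.yml` should log the indices that would be selected, which can be compared against expectations before running the real deletion.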
Scheduling automatic runs so the system keeps only a fixed window of logs
- Create the deletion script:
schedule_script.sh
#!/bin/sh
# Periodically delete log data older than the retention period
date=`date "+%Y-%m-%d %H:%M"`
echo "============================= begin at: $date ===============================" >> /path/to/logfile.log
/usr/local/bin/curator --config /path/to/curator.yml /path/to/delete_indeces.yml >> /path/to/logfile.log
echo "==================================== end =====================================" >> /path/to/logfile.log
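> Note that every path in this script is absolute. cron starts jobs with a minimal environment and a different working directory, so relative paths would keep the script from running correctly under crontab.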
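The script above records only standard output and does not note whether curator succeeded. One possible variation, shown here as a sketch rather than a replacement, wraps any command so that both output streams and the exit status end up in the log. The helper name run_logged is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: run a command and append its output, a begin banner,
# and an end banner with the exit status to a log file.
# Arguments: log file path, then the command and its arguments.
run_logged() {
    logfile=$1
    shift
    {
        echo "===== begin at: $(date '+%Y-%m-%d %H:%M') ====="
        # Capture stderr as well as stdout.
        "$@" 2>&1
        status=$?
        echo "===== end (exit status $status) ====="
    } >> "$logfile"
    return "$status"
}
```

In the scheduled script this could be called as `run_logged /path/to/logfile.log /usr/local/bin/curator --config /path/to/curator.yml /path/to/delete_indeces.yml`, still using absolute paths throughout.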
- Set up crontab
Run crontab -e, then add the last line shown below (the commented lines are the default header of the file):
```shell
# Edit this file to introduce tasks to be run by cron.
#
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
#
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').
#
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
#
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
#
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
#
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h  dom mon dow   command
05 0 * * * sh /home/deployeryu/curator/schedule_script.sh >/dev/null 2>&1
```
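The entry above schedules the Elasticsearch data deletion to run every day at 00:05.
The trailing redirection on that line discards the script's output. By default, cron mails a job's output to the owner of the crontab, but a stock Ubuntu install has no mail service, so cron reports:
No MTA installed, discarding output
To avoid this, be sure to append >/dev/null 2>&1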
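On Ubuntu, cron logging is not enabled by default. To see cron's execution log, enable it manually by opening the rsyslog configuration:
sudo vim /etc/rsyslog.d/50-default.conf
In that file, uncomment the line #cron.* /var/log/cron.log
and then restart rsyslog: sudo service rsyslog restart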
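The same edit can be made without opening an editor. The sketch below assumes the commented line begins with #cron.* as in the stock file and that sed supports in-place editing with -i. The function name enable_cron_log is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: uncomment the cron.* rule in an rsyslog config file.
# Argument: path to the config file (for example
# /etc/rsyslog.d/50-default.conf, which requires root to modify).
enable_cron_log() {
    conf=$1
    # Remove the leading '#' only from the line that starts with '#cron.*'.
    sed -i 's/^#[[:space:]]*\(cron\.\*\)/\1/' "$conf"
}
```

Since the real file is owned by root, the function would need to run as root, or the same sed expression can be run directly with sudo sed -i, followed by the rsyslog restart.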
Summary
With this setup in place, we no longer need to worry about the logs growing without bound.