[Doris Monitoring] - Monitoring and Alerting on Stream Load with OpenResty

Data analysis, monitoring

2021-01-21


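In Doris, real-time data import is done through Stream Load. In our experience, both the frequency of Stream Load requests and the amount of data in each load have a major impact on the stability of the Doris service.
As of the latest Doris release, 0.13.0, the recommendation is still to import in micro-batches through Stream Load at minute-level intervals. That makes monitoring and alerting a necessity.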
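To make the micro-batch recommendation concrete, here is a minimal Python sketch of a client-side buffer that collects rows and hands them over as one batch once an interval has elapsed. The `MicroBatcher` class, its parameters, and the `flush` callback are hypothetical names introduced for illustration; the callback is where an actual Stream Load request would be issued, which is not shown here.

```python
import time


class MicroBatcher:
    """Buffers rows and flushes them as one batch per interval (hypothetical helper)."""

    def __init__(self, flush, interval_seconds=60.0, clock=time.monotonic):
        self._flush = flush              # callback that receives a list of rows
        self._interval = interval_seconds
        self._clock = clock              # injectable so the logic can be tested
        self._rows = []
        self._last_flush = clock()

    def add(self, row):
        """Buffer one row; flush if the interval has elapsed since the last flush."""
        self._rows.append(row)
        if self._clock() - self._last_flush >= self._interval:
            self.flush()

    def flush(self):
        """Hand the buffered rows to the callback and restart the interval."""
        if self._rows:
            batch, self._rows = self._rows, []
            self._flush(batch)
        self._last_flush = self._clock()
```

With a 60-second interval, rows added within the same minute end up in a single batch, so the number of Stream Load requests stays bounded regardless of how many rows arrive. Note that this sketch only checks the interval when a row is added; a production version would probably also flush on a timer and on a size limit.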

Architecture diagram

Grafana: view the monitoring data
Prometheus: alert on abnormal metrics
Doris itself: compute statistics such as Stream Load frequency and counts

OpenResty vs Nginx

OpenResty is an application-layer platform built on nginx. It bundles many useful features, such as direct support for Lua scripts. More information: https://openresty.org/cn/

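Which metrics can be monitored and alerted on

1. Stream Load frequency per user
2. Stream Load frequency per table
3. Stream Load frequency per db
4. Duration and result of each Stream Load
5. Stream Load counts aggregated by day and other dimensions
6. Stream Load rate limiting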

Install OpenResty

wget https://openresty.org/package/centos/openresty.repo
sudo mv openresty.repo /etc/yum.repos.d/
sudo yum check-update
sudo yum install -y openresty

Default installation directory: /usr/local/openresty/

Configure OpenResty

  • Configure nginx.conf
    vim /usr/local/openresty/nginx/conf/nginx.conf
    
    Contents:
    ```
    worker_processes auto;
    error_log /data/logs/nginx/error.log; ## adjust this path
    events {
        worker_connections 10240;
    }

    http {
        include mime.types;
        default_type application/octet-stream;
        include /etc/nginx/conf.d/*.conf; ## adjust this path

        sendfile           on;
        keepalive_timeout  65;
        server {
            listen          80;
            server_name     localhost;
            location / {
                root   html;
                index  index.html index.htm;
            }
            error_page   500 502 503 504  /50x.html;
            location = /50x.html {
                root   html;
            }
        }
    }


    ```

  • Configure Stream Load forwarding
    Add a configuration file in the corresponding conf.d/ directory: vim /etc/nginx/conf.d/doris_stream_load.conf

    ```
    upstream normal_fe {
        server fe_ip:fe_http_port; ## replace with the FE IP and FE HTTP port
    }
    underscores_in_headers on;
    # Standard access-log fields plus upstream time, request time, body size,
    # the Stream Load label header, and the captured response body.
    log_format load_access_log_format '$remote_addr - $remote_user [$time_local] "$request" '
        '$status $body_bytes_sent "$http_referer" '
        '"$http_user_agent" "$http_x_forwarded_for" '
        '$upstream_response_time $request_time $content_length $http_label $resp_body';

    server {
        listen 9001;
        access_log /data/logs/nginx/access.log load_access_log_format; ## adjust this directory
        error_log /data/logs/nginx/error.log; ## adjust this directory
        client_max_body_size 100000M;
        proxy_connect_timeout 300;
        proxy_send_timeout 300;
        proxy_read_timeout 300;
        send_timeout 300;
        underscores_in_headers on;

        # Collect the response body (at most 1000 bytes per chunk) into $resp_body
        # so that it can be written to the access log.
        set $resp_body "";
        lua_need_request_body on;
        body_filter_by_lua '
            local resp_body = string.sub(ngx.arg[1], 1, 1000)
            ngx.ctx.buffered = (ngx.ctx.buffered or "") .. resp_body
            if ngx.arg[2] then
                ngx.var.resp_body = ngx.ctx.buffered
            end
        ';

        location / {
            proxy_pass http://normal_fe;
            proxy_set_header Expect '100-continue';
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

            # The FE answers with a redirect; follow it inside nginx instead of
            # returning it to the client.
            proxy_intercept_errors on;
            error_page 301 302 307 = @mirrorredirect;
        }

        location @mirrorredirect {
            set $redirect_uri '$upstream_http_location';
            proxy_pass $redirect_uri;
            proxy_set_header Expect '100-continue';
            proxy_pass_request_body on;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }


    ```

Start OpenResty

Send a Stream Load request; the OpenResty access log then contains the monitoring fields, as in the following entry:

```
1.1.3.65 - admin [10/Dec/2020:22:35:36 +0800] "PUT /api/test/tbl_stream_load_mirror_02/_stream_load HTTP/1.1" 200 447 "-" "curl/7.29.0" "-" 0.003 : 0.032 3.034 390 stream_load_mirror_009 {\x0A \x22TxnId\x22: 14613236,\x0A \x22Label\x22: \x22stream_load_mirror_009\x22,\x0A \x22Status\x22: \x22Success\x22,\x0A \x22Message\x22: \x22OK\x22,\x0A \x22NumberTotalRows\x22: 10,\x0A \x22NumberLoadedRows\x22: 10,\x0A \x22NumberFilteredRows\x22: 0,\x0A \x22NumberUnselectedRows\x22: 0,\x0A \x22LoadBytes\x22: 390,\x0A \x22LoadTimeMs\x22: 31,\x0A \x22BeginTxnTimeMs\x22: 1,\x0A \x22StreamLoadPutTimeMs\x22: 1,\x0A \x22ReadDataTimeMs\x22: 0,\x0A \x22WriteDataTimeMs\x22: 14,\x0A \x22CommitAndPublishTimeMs\x22: 14\x0A}
```
Each field corresponds one-to-one to the load_access_log_format definition. Once this data is loaded into Doris, all the monitoring data we want is available.
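As an illustration of that field-by-field correspondence, the following Python sketch splits one access-log line into named fields with a regular expression written against load_access_log_format. The names `LOG_PATTERN`, `PATH_PATTERN`, and `parse_line` are hypothetical, and `SAMPLE_LINE` is the entry above with its response body abbreviated. The pattern assumes that `$upstream_response_time` may hold several values joined by `:` or `,`, as it does in the sample after the internal redirect.

```python
import re

# One named group per variable in load_access_log_format, in the same order.
LOG_PATTERN = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)" '
    r'"(?P<http_x_forwarded_for>[^"]*)" '
    r'(?P<upstream_response_time>[\d.\-]+(?:\s?[:,]\s[\d.\-]+)*) '
    r'(?P<request_time>[\d.]+) (?P<content_length>\d+|-) '
    r'(?P<label>\S+) (?P<resp_body>.*)$'
)

# Stream Load URLs have the form /api/<db>/<table>/_stream_load.
PATH_PATTERN = re.compile(r'^/api/(?P<db>[^/]+)/(?P<table>[^/]+)/_stream_load')

SAMPLE_LINE = (
    r'1.1.3.65 - admin [10/Dec/2020:22:35:36 +0800] '
    r'"PUT /api/test/tbl_stream_load_mirror_02/_stream_load HTTP/1.1" '
    r'200 447 "-" "curl/7.29.0" "-" 0.003 : 0.032 3.034 390 '
    r'stream_load_mirror_009 {\x0A \x22Status\x22: \x22Success\x22\x0A}'
)


def parse_line(line):
    """Return a dict of log fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["request_time"] = float(record["request_time"])

    # "$request" is "<method> <path> <protocol>".
    parts = record["request"].split(" ")
    record["method"] = parts[0]
    record["path"] = parts[1] if len(parts) > 1 else ""

    path_match = PATH_PATTERN.match(record["path"])
    record["db"] = path_match.group("db") if path_match else None
    record["table"] = path_match.group("table") if path_match else None
    return record
```

Calling `parse_line(SAMPLE_LINE)` yields the user, db, table, label, timings, and raw response body as separate keys, which is the shape needed before loading the rows into Doris.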
The trailing JSON is the Stream Load response. If we replace:
\x0A with \n
\x22 with "
we get the normal, decoded result.
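The replacement described above can be sketched in Python as follows. Rather than handling only `\x0A` and `\x22`, this version decodes every `\xHH` sequence, on the assumption that nginx's default log escaping writes any escaped byte in that form. The function names and `SAMPLE_BODY` (an abbreviated copy of the response in the log entry) are introduced here for illustration.

```python
import json
import re

# Matches nginx-style escapes such as \x0A or \x22 in the logged body.
_HEX_ESCAPE = re.compile(rb'\\x([0-9A-Fa-f]{2})')

SAMPLE_BODY = (
    r'{\x0A \x22TxnId\x22: 14613236,\x0A \x22Label\x22: '
    r'\x22stream_load_mirror_009\x22,\x0A \x22Status\x22: \x22Success\x22,'
    r'\x0A \x22LoadTimeMs\x22: 31\x0A}'
)


def decode_nginx_escapes(text):
    """Turn \\xHH sequences back into the bytes they stand for."""
    raw = _HEX_ESCAPE.sub(
        lambda m: bytes([int(m.group(1), 16)]),
        text.encode("utf-8"),
    )
    return raw.decode("utf-8", errors="replace")


def parse_stream_load_response(escaped_body):
    """Decode the logged body and parse it as JSON; None if it is not valid JSON."""
    try:
        return json.loads(decode_nginx_escapes(escaped_body))
    except json.JSONDecodeError:
        return None
```

Returning `None` on a parse failure matters here because the Lua filter in the configuration keeps only the first 1000 bytes of each response chunk, so a long response may be logged incomplete.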

After parsing the access log and loading it into Doris, we can compute all kinds of metrics to use as monitoring data.
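The post computes these statistics inside Doris. Purely to illustrate the kind of aggregation involved, the sketch below counts Stream Load requests per minute for a chosen dimension (user, db, or table) in plain Python. The records in `SAMPLE_RECORDS` are made-up examples, and the function names are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Made-up parsed log records, holding only the fields used below.
SAMPLE_RECORDS = [
    {"time_local": "10/Dec/2020:22:35:36 +0800", "remote_user": "admin",
     "db": "test", "table": "tbl_a"},
    {"time_local": "10/Dec/2020:22:35:50 +0800", "remote_user": "admin",
     "db": "test", "table": "tbl_a"},
    {"time_local": "10/Dec/2020:22:36:02 +0800", "remote_user": "etl",
     "db": "test", "table": "tbl_b"},
]


def minute_bucket(time_local):
    """Truncate an nginx $time_local value to the minute."""
    ts = datetime.strptime(time_local, "%d/%b/%Y:%H:%M:%S %z")
    return ts.strftime("%Y-%m-%d %H:%M")


def load_frequency(records, key):
    """Count requests per (minute, value of `key`), e.g. key="table"."""
    counts = Counter()
    for record in records:
        counts[(minute_bucket(record["time_local"]), record[key])] += 1
    return counts
```

For example, `load_frequency(SAMPLE_RECORDS, "table")` groups the first two records into the same minute for `tbl_a`, while `key="remote_user"` or `key="db"` gives the per-user and per-db frequencies from the metric list.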

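After parsing the access log and pushing the results to Prometheus, we can alert on any of these metrics. For how to configure the alerts, see: /article/21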
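The post does not say how the parsed data reaches Prometheus. As one possible intermediate step, this sketch renders request counts in the Prometheus text exposition format using only the standard library. The metric name `doris_stream_load_requests_total`, the label names, and the helper functions are assumptions made for illustration.

```python
def escape_label_value(value):
    """Escape backslash, double quote, and newline as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_counter(name, help_text, samples):
    """Render one counter metric.

    `samples` maps a tuple of (label_name, label_value) pairs to a number.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in sorted(samples.items()):
        label_text = ",".join(
            f'{key}="{escape_label_value(val)}"' for key, val in labels
        )
        lines.append(f"{name}{{{label_text}}} {value}")
    return "\n".join(lines) + "\n"
```

The resulting text could then be served on an HTTP endpoint for Prometheus to scrape, or delivered some other way; which route fits depends on the deployment and is outside what the post describes.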


白老虎
