Source:
https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/db/influxdb_http?at=release/6.4
InfluxDB by HTTP
Overview
This template is designed for the effortless deployment of InfluxDB monitoring by Zabbix via HTTP and doesn't require any external scripts.
Requirements
Zabbix version: 6.4 and higher.
Tested versions
This template has been tested on:
InfluxDB 2.0
Configuration
Zabbix should be configured according to the instructions in the Templates out of the box section.
Setup
This template works with self-hosted InfluxDB instances. Internal service metrics are collected from the InfluxDB /metrics endpoint.
Organization discovery requires authorization with an API token. See the docs:
https://docs.influxdata.com/influxdb/v2.0/security/tokens/
Don't forget to change the macros {$INFLUXDB.URL} and {$INFLUXDB.API.TOKEN}.
Also, see the Macros section for a list of macros used to set trigger values.
NOTE: Some metrics may not be collected depending on your InfluxDB instance version and configuration.
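As a rough illustration of the token-based authorization described above, the sketch below builds (but does not send) the two kinds of HTTP requests involved. The `Authorization: Token ...` header is the scheme InfluxDB 2.x uses for API tokens. The `/api/v2/orgs` path for organization discovery, the helper name `build_request`, and the example URL and token are assumptions made for this example and are not taken from the template.

```python
from urllib.request import Request


def build_request(base_url: str, path: str, api_token: str = "") -> Request:
    """Build a GET request against an InfluxDB 2.x instance (nothing is sent here)."""
    request = Request(base_url.rstrip("/") + path, method="GET")
    if api_token:
        # InfluxDB 2.x API tokens are passed as "Authorization: Token <token>".
        request.add_header("Authorization", "Token " + api_token)
    return request


# Placeholder values standing in for {$INFLUXDB.URL} and {$INFLUXDB.API.TOKEN}.
metrics_request = build_request("http://localhost:8086", "/metrics")
# Assumed endpoint for listing organizations; hypothetical token value.
orgs_request = build_request("http://localhost:8086/", "/api/v2/orgs", "example-token")
```

Passing either request to `urllib.request.urlopen` would perform the actual call; that step is left out so the sketch runs without a live instance.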
Macros used
Name
Description
Default
{$INFLUXDB.ORG_NAME.NOT_MATCHES}
Filter to exclude discovered organizations.
CHANGE_IF_NEEDED
{$INFLUXDB.TASK.RUN.FAIL.MAX.WARN}
Maximum number of task run failures for the trigger expression.
{$INFLUXDB.REQ.FAIL.MAX.WARN}
Maximum number of query request failures for the trigger expression.
Items
Name
Description
Type
Key and additional info
InfluxDB: Get instance metrics
HTTP agent
influx.get_metrics
Preprocessing
InfluxDB: Instance status
Get the health of an instance.
HTTP agent
influx.healthcheck
Preprocessing
-
Check for not supported value
⛔️Custom on fail: Set value to:
{"status":"fail"}]}
-
JavaScript:
return JSON.parse(value).status == 'pass' ? 1: 0
-
Discard unchanged with heartbeat:
30m
InfluxDB: Boltdb reads, rate
Total number of boltdb reads per second.
Dependent item
influxdb.boltdb_reads.rate
Preprocessing
InfluxDB: Boltdb writes, rate
Total number of boltdb writes per second.
Dependent item
influxdb.boltdb_writes.rate
Preprocessing
InfluxDB: Buckets, total
Number of total buckets on the server.
Dependent item
influxdb.buckets.total
Preprocessing
-
JSON Path:
$[?(@.name=="influxdb_buckets_total")].value.first()
⛔️Custom on fail: Discard value
-
Discard unchanged with heartbeat:
30m
InfluxDB: Dashboards, total
Number of total dashboards on the server.
Dependent item
influxdb.dashboards.total
Preprocessing
-
JSON Path:
$[?(@.name=="influxdb_dashboards_total")].value.first()
⛔️Custom on fail: Discard value
-
Discard unchanged with heartbeat:
30m
InfluxDB: Organizations, total
Number of total organizations on the server.
Dependent item
influxdb.organizations.total
Preprocessing
-
JSON Path:
$[?(@.name=="influxdb_organizations_total")].value.first()
⛔️Custom on fail: Discard value
-
Discard unchanged with heartbeat:
30m
InfluxDB: Scrapers, total
Number of total scrapers on the server.
Dependent item
influxdb.scrapers.total
Preprocessing
-
JSON Path:
$[?(@.name=="influxdb_scrapers_total")].value.first()
⛔️Custom on fail: Discard value
-
Discard unchanged with heartbeat:
30m
InfluxDB: Telegraf plugins, total
Number of individual telegraf plugins configured.
Dependent item
influxdb.telegraf_plugins.total
Preprocessing
-
JSON Path:
$[?(@.name=="influxdb_telegraf_plugins_count")].value.sum()
⛔️Custom on fail: Discard value
-
Discard unchanged with heartbeat:
30m
InfluxDB: Telegrafs, total
Number of total telegraf configurations on the server.
Dependent item
influxdb.telegrafs.total
Preprocessing
-
JSON Path:
$[?(@.name=="influxdb_telegrafs_total")].value.first()
⛔️Custom on fail: Discard value
-
Discard unchanged with heartbeat:
30m
InfluxDB: Tokens, total
Number of total tokens on the server.
Dependent item
influxdb.tokens.total
Preprocessing
-
JSON Path:
$[?(@.name=="influxdb_tokens_total")].value.first()
⛔️Custom on fail: Discard value
-
Discard unchanged with heartbeat:
30m
InfluxDB: Users, total
Number of total users on the server.
Dependent item
influxdb.users.total
Preprocessing
-
JSON Path:
$[?(@.name=="influxdb_users_total")].value.first()
⛔️Custom on fail: Discard value
-
Discard unchanged with heartbeat:
30m
InfluxDB: Version
Version of the InfluxDB instance.
Dependent item
influxdb.version
Preprocessing
InfluxDB: Uptime
InfluxDB process uptime in seconds.
Dependent item
influxdb.uptime
Preprocessing
InfluxDB: Workers currently running
Total number of workers currently running tasks.
Dependent item
influxdb.task_executor_runs_active.total
Preprocessing
InfluxDB: Workers busy, pct
Percent of total available workers that are currently busy.
Dependent item
influxdb.task_executor_workers_busy.pct
Preprocessing
InfluxDB: Task runs failed, rate
Total number of failed runs across all tasks.
Dependent item
influxdb.task_executor_complete.failed.rate
Preprocessing
InfluxDB: Task runs successful, rate
Total number of runs successfully completed across all tasks.
Dependent item
influxdb.task_executor_complete.successful.rate
Preprocessing
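The Instance status item above reduces the health check response to 1 or 0. A minimal Python sketch of that pipeline follows, mirroring the JavaScript step `JSON.parse(value).status == 'pass' ? 1 : 0`. Representing a "not supported" value as `None`, the function name `instance_status`, and treating the fallback as the plain JSON object `{"status":"fail"}` are assumptions of this example.

```python
import json
from typing import Optional

# Assumed fallback payload substituted when the check itself is not supported.
FALLBACK_PAYLOAD = '{"status":"fail"}'


def instance_status(raw_value: Optional[str]) -> int:
    """Return 1 when the health payload reports "pass", otherwise 0."""
    # Step 1: a missing (not supported) value is replaced by the fallback payload.
    payload = FALLBACK_PAYLOAD if raw_value is None else raw_value
    # Step 2: same comparison as the JavaScript preprocessing step.
    return 1 if json.loads(payload).get("status") == "pass" else 0
```

Because the fallback always maps to 0, an unreachable instance and an unhealthy one look the same to the health check trigger.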
Triggers
Name
Description
Expression
Severity
Dependencies and additional info
InfluxDB: Health check was failed
The InfluxDB instance is not available or unhealthy.
last(/InfluxDB by HTTP/influx.healthcheck)=0
InfluxDB: Version has changed
InfluxDB version has changed. Acknowledge to close the problem manually.
last(/InfluxDB by HTTP/influxdb.version,#1)<>last(/InfluxDB by HTTP/influxdb.version,#2) and length(last(/InfluxDB by HTTP/influxdb.version))>0
Manual close: Yes
InfluxDB: has been restarted
Uptime is less than 10 minutes.
last(/InfluxDB by HTTP/influxdb.uptime)<10m
Manual close: Yes
InfluxDB: Too many tasks failure runs
The number of failed runs completed across all tasks is too high.
min(/InfluxDB by HTTP/influxdb.task_executor_complete.failed.rate,5m)>{$INFLUXDB.TASK.RUN.FAIL.MAX.WARN}
Warning
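The "Too many tasks failure runs" expression compares the minimum of the failure rate over the last 5 minutes with a threshold macro. The sketch below shows that logic in Python under stated assumptions: samples are `(timestamp, value)` pairs in seconds, the window boundary handling is a guess at reasonable behavior rather than Zabbix's exact rule, and the thresholds in the usage are made-up values, not the macro's default.

```python
from typing import Iterable, Tuple


def min_over_window(samples: Iterable[Tuple[float, float]], now: float,
                    window_seconds: float) -> float:
    """Smallest value among (timestamp, value) samples in the window ending at `now`."""
    recent = [value for timestamp, value in samples
              if now - window_seconds < timestamp <= now]
    if not recent:
        raise ValueError("no samples in the evaluation window")
    return min(recent)


def too_many_failures(samples: Iterable[Tuple[float, float]], now: float,
                      threshold: float, window_seconds: float = 300.0) -> bool:
    """True when every sample in the window exceeds the threshold."""
    return min_over_window(samples, now, window_seconds) > threshold
```

Using `min` rather than `last` means a single spike does not fire the trigger; the rate has to stay above the threshold for the whole window.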
Item prototypes for Organizations discovery
InfluxDB: [{#ORG_NAME}] Query requests bytes, success
Count of bytes received with status 200 per second.
Dependent item
influxdb.org.query_request_bytes.success.rate["{#ORG_NAME}"]
Preprocessing
InfluxDB: [{#ORG_NAME}] Query requests bytes, failed
Count of bytes received with status not 200 per second.
Dependent item
influxdb.org.query_request_bytes.failed.rate["{#ORG_NAME}"]
Preprocessing
InfluxDB: [{#ORG_NAME}] Query requests, failed
Total number of query requests with status not 200 per second.
Dependent item
influxdb.org.query_request.failed.rate["{#ORG_NAME}"]
Preprocessing
InfluxDB: [{#ORG_NAME}] Query requests, success
Total number of query requests with status 200 per second.
Dependent item
influxdb.org.query_request.success.rate["{#ORG_NAME}"]
Preprocessing
InfluxDB: [{#ORG_NAME}] Query response bytes, success
Count of bytes returned with status 200 per second.
Dependent item
influxdb.org.http_query_response_bytes.success.rate["{#ORG_NAME}"]
Preprocessing
InfluxDB: [{#ORG_NAME}] Query response bytes, failed
Count of bytes returned with status not 200 per second.
Dependent item
influxdb.org.http_query_response_bytes.failed.rate["{#ORG_NAME}"]
Preprocessing
Trigger prototypes for Organizations discovery
Name
Description
Expression
Severity
Dependencies and additional info
InfluxDB: [{#ORG_NAME}]: Too many requests failures
Too many query requests failed.
min(/InfluxDB by HTTP/influxdb.org.query_request.failed.rate["{#ORG_NAME}"],5m)>{$INFLUXDB.REQ.FAIL.MAX.WARN}
Warning
Source:
https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/db/influxdb_http?at=release/6.2
InfluxDB by HTTP
Overview
For Zabbix version: 6.2 and higher.
This template monitors InfluxDB with Zabbix and works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Template InfluxDB by HTTP collects metrics by HTTP agent from the InfluxDB /metrics endpoint.
This template was tested on:
InfluxDB, version 2.0
Setup
See Zabbix template operation for basic instructions.
This template works with self-hosted InfluxDB instances. Internal service metrics are collected from the InfluxDB /metrics endpoint.
Organization discovery requires authorization with an API token. See the docs:
https://docs.influxdata.com/influxdb/v2.0/security/tokens/
Don't forget to change the macros {$INFLUXDB.URL} and {$INFLUXDB.API.TOKEN}.
Also, see the Macros section for a list of macros used to set trigger values.
NOTE: Some metrics may not be collected depending on your InfluxDB instance version and configuration.
Zabbix configuration
No specific Zabbix configuration is required.
Macros used
Name
Description
Default
{$INFLUXDB.ORG_NAME.NOT_MATCHES}
Filter to exclude discovered organizations.
CHANGE_IF_NEEDED
{$INFLUXDB.REQ.FAIL.MAX.WARN}
Maximum number of query request failures for the trigger expression.
{$INFLUXDB.TASK.RUN.FAIL.MAX.WARN}
Maximum number of task run failures for the trigger expression.
{$INFLUXDB.URL}
InfluxDB instance URL
http://localhost:8086
Template links
There are no template links in this template.
Discovery rules
Group
Name
Description
Type
Key and additional info
InfluxDB
InfluxDB: Instance status
Get the health of an instance.
HTTP_AGENT
influx.healthcheck
Preprocessing
:
- CHECK_NOT_SUPPORTED
⛔️ON_FAIL:
CUSTOM_VALUE -> {"status":"fail"}]}
- JAVASCRIPT:
return JSON.parse(value).status == 'pass' ? 1: 0
- DISCARD_UNCHANGED_HEARTBEAT:
30m
InfluxDB
InfluxDB: Boltdb reads, rate
Total number of boltdb reads per second.
DEPENDENT
influxdb.boltdb_reads.rate
Preprocessing
:
- JSONPATH:
$[?(@.name=="boltdb_reads_total")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- CHANGE_PER_SECOND
InfluxDB
InfluxDB: Boltdb writes, rate
Total number of boltdb writes per second.
DEPENDENT
influxdb.boltdb_writes.rate
Preprocessing
:
- JSONPATH:
$[?(@.name=="boltdb_writes_total")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- CHANGE_PER_SECOND
InfluxDB
InfluxDB: Buckets, total
Number of total buckets on the server.
DEPENDENT
influxdb.buckets.total
Preprocessing
:
- JSONPATH:
$[?(@.name=="influxdb_buckets_total")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- DISCARD_UNCHANGED_HEARTBEAT:
30m
InfluxDB
InfluxDB: Dashboards, total
Number of total dashboards on the server.
DEPENDENT
influxdb.dashboards.total
Preprocessing
:
- JSONPATH:
$[?(@.name=="influxdb_dashboards_total")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- DISCARD_UNCHANGED_HEARTBEAT:
30m
InfluxDB
InfluxDB: Organizations, total
Number of total organizations on the server.
DEPENDENT
influxdb.organizations.total
Preprocessing
:
- JSONPATH:
$[?(@.name=="influxdb_organizations_total")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- DISCARD_UNCHANGED_HEARTBEAT:
30m
InfluxDB
InfluxDB: Scrapers, total
Number of total scrapers on the server.
DEPENDENT
influxdb.scrapers.total
Preprocessing
:
- JSONPATH:
$[?(@.name=="influxdb_scrapers_total")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- DISCARD_UNCHANGED_HEARTBEAT:
30m
InfluxDB
InfluxDB: Telegraf plugins, total
Number of individual telegraf plugins configured.
DEPENDENT
influxdb.telegraf_plugins.total
Preprocessing
:
- JSONPATH:
$[?(@.name=="influxdb_telegraf_plugins_count")].value.sum()
⛔️ON_FAIL:
DISCARD_VALUE ->
- DISCARD_UNCHANGED_HEARTBEAT:
30m
InfluxDB
InfluxDB: Telegrafs, total
Number of total telegraf configurations on the server.
DEPENDENT
influxdb.telegrafs.total
Preprocessing
:
- JSONPATH:
$[?(@.name=="influxdb_telegrafs_total")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- DISCARD_UNCHANGED_HEARTBEAT:
30m
InfluxDB
InfluxDB: Tokens, total
Number of total tokens on the server.
DEPENDENT
influxdb.tokens.total
Preprocessing
:
- JSONPATH:
$[?(@.name=="influxdb_tokens_total")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- DISCARD_UNCHANGED_HEARTBEAT:
30m
InfluxDB
InfluxDB: Users, total
Number of total users on the server.
DEPENDENT
influxdb.users.total
Preprocessing
:
- JSONPATH:
$[?(@.name=="influxdb_users_total")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- DISCARD_UNCHANGED_HEARTBEAT:
30m
InfluxDB
InfluxDB: Version
Version of the InfluxDB instance.
DEPENDENT
influxdb.version
Preprocessing
:
- JSONPATH:
$[?(@.name=="influxdb_info")].labels.version.first()
- DISCARD_UNCHANGED_HEARTBEAT:
3h
InfluxDB
InfluxDB: Uptime
InfluxDB process uptime in seconds.
DEPENDENT
influxdb.uptime
Preprocessing
:
- JSONPATH:
$[?(@.name=="influxdb_uptime_seconds")].value.first()
InfluxDB
InfluxDB: Workers currently running
Total number of workers currently running tasks.
DEPENDENT
influxdb.task_executor_runs_active.total
Preprocessing
:
- JSONPATH:
$[?(@.name=="task_executor_total_runs_active")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
InfluxDB
InfluxDB: Workers busy, pct
Percent of total available workers that are currently busy.
DEPENDENT
influxdb.task_executor_workers_busy.pct
Preprocessing
:
- JSONPATH:
$[?(@.name=="task_executor_workers_busy")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
InfluxDB
InfluxDB: Task runs failed, rate
Total number of failed runs across all tasks.
DEPENDENT
influxdb.task_executor_complete.failed.rate
Preprocessing
:
- JSONPATH:
$[?(@.name=="task_executor_total_runs_complete" && @.labels.status == "failed")].value.sum()
⛔️ON_FAIL:
DISCARD_VALUE ->
- CHANGE_PER_SECOND
InfluxDB
InfluxDB: Task runs successful, rate
Total number of runs successfully completed across all tasks.
DEPENDENT
influxdb.task_executor_complete.successful.rate
Preprocessing
:
- JSONPATH:
$[?(@.name=="task_executor_total_runs_complete" && @.labels.status == "success")].value.sum()
⛔️ON_FAIL:
DISCARD_VALUE ->
- CHANGE_PER_SECOND
InfluxDB
InfluxDB: [{#ORG_NAME}] Query requests bytes, success
Count of bytes received with status 200 per second.
DEPENDENT
influxdb.org.query_request_bytes.success.rate["{#ORG_NAME}"]
Preprocessing
:
- JSONPATH:
$[?(@.name=="http_query_request_bytes" && @.labels.status == "200" && @.labels.endpoint == "/api/v2/query" && @.labels.org_id == "{#ORG_ID}") ].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- CHANGE_PER_SECOND
InfluxDB
InfluxDB: [{#ORG_NAME}] Query requests bytes, failed
Count of bytes received with status not 200 per second.
DEPENDENT
influxdb.org.query_request_bytes.failed.rate["{#ORG_NAME}"]
Preprocessing
:
- JSONPATH:
$[?(@.name=="http_query_request_bytes" && @.labels.status != "200" && @.labels.endpoint == "/api/v2/query" && @.labels.org_id == "{#ORG_ID}") ].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- CHANGE_PER_SECOND
InfluxDB
InfluxDB: [{#ORG_NAME}] Query requests, failed
Total number of query requests with status not 200 per second.
DEPENDENT
influxdb.org.query_request.failed.rate["{#ORG_NAME}"]
Preprocessing
:
- JSONPATH:
$[?(@.name=="http_query_request_count" && @.labels.status != "200" && @.labels.endpoint == "/api/v2/query" && @.labels.org_id == "{#ORG_ID}") ].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- CHANGE_PER_SECOND
InfluxDB
InfluxDB: [{#ORG_NAME}] Query requests, success
Total number of query requests with status 200 per second.
DEPENDENT
influxdb.org.query_request.success.rate["{#ORG_NAME}"]
Preprocessing
:
- JSONPATH:
$[?(@.name=="http_query_request_count" && @.labels.status == "200" && @.labels.endpoint == "/api/v2/query" && @.labels.org_id == "{#ORG_ID}") ].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- CHANGE_PER_SECOND
InfluxDB
InfluxDB: [{#ORG_NAME}] Query response bytes, success
Count of bytes returned with status 200 per second.
DEPENDENT
influxdb.org.http_query_response_bytes.success.rate["{#ORG_NAME}"]
Preprocessing
:
- JSONPATH:
$[?(@.name=="http_query_response_bytes" && @.labels.status == "200" && @.labels.endpoint == "/api/v2/query" && @.labels.org_id == "{#ORG_ID}") ].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- CHANGE_PER_SECOND
InfluxDB
InfluxDB: [{#ORG_NAME}] Query response bytes, failed
Count of bytes returned with status not 200 per second.
DEPENDENT
influxdb.org.http_query_response_bytes.failed.rate["{#ORG_NAME}"]
Preprocessing
:
- JSONPATH:
$[?(@.name=="http_query_response_bytes" && @.labels.status != "200" && @.labels.endpoint == "/api/v2/query" && @.labels.org_id == "{#ORG_ID}") ].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- CHANGE_PER_SECOND
Zabbix raw items
InfluxDB: Get instance metrics
HTTP_AGENT
influx.get_metrics
Preprocessing
:
- CHECK_NOT_SUPPORTED
⛔️ON_FAIL:
DISCARD_VALUE ->
- PROMETHEUS_TO_JSON
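The items above follow one pattern: the raw item converts the Prometheus text from /metrics into JSON, and each dependent item picks a value out of it with a JSONPath filter. The following Python sketch imitates both steps on a tiny made-up payload. It is a simplification under assumptions: the parser handles only basic sample lines, the row shape keeps just `name`, `labels`, and `value`, and the sample numbers and the `task_type` label are invented for illustration.

```python
import re
from typing import Dict, List

# Matches: name, optional {labels}, value, optional integer timestamp.
SAMPLE_LINE = re.compile(
    r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$')
LABEL_PAIR = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')

EXAMPLE_METRICS = """\
# HELP influxdb_buckets_total Number of total buckets on the server
# TYPE influxdb_buckets_total counter
influxdb_buckets_total 4
task_executor_total_runs_complete{status="failed",task_type="system"} 3
task_executor_total_runs_complete{status="failed",task_type="user"} 2
task_executor_total_runs_complete{status="success",task_type="user"} 40
"""


def prometheus_to_rows(text: str) -> List[Dict]:
    """Turn Prometheus text lines into dicts with name, labels and value."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue  # ignore anything this simplified parser cannot read
        name, label_text, value = match.groups()
        labels = dict(LABEL_PAIR.findall(label_text or ""))
        rows.append({"name": name, "labels": labels, "value": value})
    return rows


def matching_values(rows: List[Dict], name: str, **labels: str) -> List[float]:
    """Values of rows whose name and given labels all match."""
    return [float(row["value"]) for row in rows
            if row["name"] == name
            and all(row["labels"].get(k) == v for k, v in labels.items())]


def first_value(rows: List[Dict], name: str, **labels: str) -> float:
    """Rough equivalent of $[?(@.name=="...")].value.first()."""
    values = matching_values(rows, name, **labels)
    if not values:
        raise LookupError("no metric named " + name)  # the item would discard the value
    return values[0]


def sum_values(rows: List[Dict], name: str, **labels: str) -> float:
    """Rough equivalent of $[?(@.name=="..." && @.labels.x == "y")].value.sum()."""
    values = matching_values(rows, name, **labels)
    if not values:
        raise LookupError("no metric named " + name)
    return sum(values)


rows = prometheus_to_rows(EXAMPLE_METRICS)
```

Here `first_value(rows, "influxdb_buckets_total")` plays the role of the Buckets item, and `sum_values(rows, "task_executor_total_runs_complete", status="failed")` plays the role of the failed task runs item before the per-second step is applied.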
Triggers
Name
Description
Expression
Severity
Dependencies and additional info
InfluxDB: Health check was failed
The InfluxDB instance is not available or unhealthy.
last(/InfluxDB by HTTP/influx.healthcheck)=0
InfluxDB: Version has changed
InfluxDB version has changed. Acknowledge to close.
last(/InfluxDB by HTTP/influxdb.version,#1)<>last(/InfluxDB by HTTP/influxdb.version,#2) and length(last(/InfluxDB by HTTP/influxdb.version))>0
Manual close: YES
InfluxDB: has been restarted
Uptime is less than 10 minutes.
last(/InfluxDB by HTTP/influxdb.uptime)<10m
Manual close: YES
InfluxDB: Too many tasks failure runs
The number of failed runs completed across all tasks is too high.
min(/InfluxDB by HTTP/influxdb.task_executor_complete.failed.rate,5m)>{$INFLUXDB.TASK.RUN.FAIL.MAX.WARN}
WARNING
InfluxDB: [{#ORG_NAME}]: Too many requests failures
Too many query requests failed.
min(/InfluxDB by HTTP/influxdb.org.query_request.failed.rate["{#ORG_NAME}"],5m)>{$INFLUXDB.REQ.FAIL.MAX.WARN}
WARNING
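The "Version has changed" expression compares the two most recent version values and requires the latest one to be non-empty. A small Python sketch of that comparison is shown here; the function name and the decision to return `False` when fewer than two values exist are assumptions of the example.

```python
from typing import Sequence


def version_changed(history: Sequence[str]) -> bool:
    """`history` holds collected version strings, oldest first."""
    if len(history) < 2:
        return False  # assumption: nothing to compare against yet
    latest, previous = history[-1], history[-2]
    # Mirrors: last(#1) <> last(#2) and length(last()) > 0
    return latest != previous and len(latest) > 0
```

The length check keeps an empty value from being reported as a version change.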
Source:
https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/db/influxdb_http?at=release/6.0
InfluxDB by HTTP
Overview
This template is designed for the effortless deployment of InfluxDB monitoring by Zabbix via HTTP and doesn't require any external scripts.
Requirements
Zabbix version: 6.0 and higher.
Tested versions
This template has been tested on:
InfluxDB 2.0
Configuration
Zabbix should be configured according to the instructions in the Templates out of the box section.
Setup
This template works with self-hosted InfluxDB instances. Internal service metrics are collected from the InfluxDB /metrics endpoint.
Organization discovery requires authorization with an API token. See the docs:
https://docs.influxdata.com/influxdb/v2.0/security/tokens/
Don't forget to change the macros {$INFLUXDB.URL} and {$INFLUXDB.API.TOKEN}.
Also, see the Macros section for a list of macros used to set trigger values.
NOTE: Some metrics may not be collected depending on your InfluxDB instance version and configuration.
Macros used
Name
Description
Default
{$INFLUXDB.ORG_NAME.NOT_MATCHES}
Filter to exclude discovered organizations.
CHANGE_IF_NEEDED
{$INFLUXDB.TASK.RUN.FAIL.MAX.WARN}
Maximum number of task run failures for the trigger expression.
{$INFLUXDB.REQ.FAIL.MAX.WARN}
Maximum number of query request failures for the trigger expression.
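The {$INFLUXDB.ORG_NAME.NOT_MATCHES} macro excludes organizations from discovery. The sketch below assumes, based on the macro name, that it is applied as a "does not match" regular expression against each organization name. Python's `re` module stands in for the regular expression engine Zabbix uses, so edge cases may differ, and the organization names are invented.

```python
import re
from typing import Iterable, List


def filter_organizations(names: Iterable[str],
                         not_matches: str = "CHANGE_IF_NEEDED") -> List[str]:
    """Keep organizations whose name does NOT match the exclusion pattern."""
    pattern = re.compile(not_matches)
    return [name for name in names if pattern.search(name) is None]
```

With the default placeholder value, no realistic organization name matches, so every discovered organization is kept until the macro is changed.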
Items
Name
Description
Type
Key and additional info
InfluxDB: Get instance metrics
HTTP agent
influx.get_metrics
Preprocessing
InfluxDB: Instance status
Get the health of an instance.
HTTP agent
influx.healthcheck
Preprocessing
-
Check for not supported value
⛔️Custom on fail: Set value to:
{"status":"fail"}]}
-
JavaScript:
return JSON.parse(value).status == 'pass' ? 1: 0
-
Discard unchanged with heartbeat:
30m
InfluxDB: Boltdb reads, rate
Total number of boltdb reads per second.
Dependent item
influxdb.boltdb_reads.rate
Preprocessing
InfluxDB: Boltdb writes, rate
Total number of boltdb writes per second.
Dependent item
influxdb.boltdb_writes.rate
Preprocessing
InfluxDB: Buckets, total
Number of total buckets on the server.
Dependent item
influxdb.buckets.total
Preprocessing
-
JSON Path:
$[?(@.name=="influxdb_buckets_total")].value.first()
⛔️Custom on fail: Discard value
-
Discard unchanged with heartbeat:
30m
InfluxDB: Dashboards, total
Number of total dashboards on the server.
Dependent item
influxdb.dashboards.total
Preprocessing
-
JSON Path:
$[?(@.name=="influxdb_dashboards_total")].value.first()
⛔️Custom on fail: Discard value
-
Discard unchanged with heartbeat:
30m
InfluxDB: Organizations, total
Number of total organizations on the server.
Dependent item
influxdb.organizations.total
Preprocessing
-
JSON Path:
$[?(@.name=="influxdb_organizations_total")].value.first()
⛔️Custom on fail: Discard value
-
Discard unchanged with heartbeat:
30m
InfluxDB: Scrapers, total
Number of total scrapers on the server.
Dependent item
influxdb.scrapers.total
Preprocessing
-
JSON Path:
$[?(@.name=="influxdb_scrapers_total")].value.first()
⛔️Custom on fail: Discard value
-
Discard unchanged with heartbeat:
30m
InfluxDB: Telegraf plugins, total
Number of individual telegraf plugins configured.
Dependent item
influxdb.telegraf_plugins.total
Preprocessing
-
JSON Path:
$[?(@.name=="influxdb_telegraf_plugins_count")].value.sum()
⛔️Custom on fail: Discard value
-
Discard unchanged with heartbeat:
30m
InfluxDB: Telegrafs, total
Number of total telegraf configurations on the server.
Dependent item
influxdb.telegrafs.total
Preprocessing
-
JSON Path:
$[?(@.name=="influxdb_telegrafs_total")].value.first()
⛔️Custom on fail: Discard value
-
Discard unchanged with heartbeat:
30m
InfluxDB: Tokens, total
Number of total tokens on the server.
Dependent item
influxdb.tokens.total
Preprocessing
-
JSON Path:
$[?(@.name=="influxdb_tokens_total")].value.first()
⛔️Custom on fail: Discard value
-
Discard unchanged with heartbeat:
30m
InfluxDB: Users, total
Number of total users on the server.
Dependent item
influxdb.users.total
Preprocessing
-
JSON Path:
$[?(@.name=="influxdb_users_total")].value.first()
⛔️Custom on fail: Discard value
-
Discard unchanged with heartbeat:
30m
InfluxDB: Version
Version of the InfluxDB instance.
Dependent item
influxdb.version
Preprocessing
InfluxDB: Uptime
InfluxDB process uptime in seconds.
Dependent item
influxdb.uptime
Preprocessing
InfluxDB: Workers currently running
Total number of workers currently running tasks.
Dependent item
influxdb.task_executor_runs_active.total
Preprocessing
InfluxDB: Workers busy, pct
Percent of total available workers that are currently busy.
Dependent item
influxdb.task_executor_workers_busy.pct
Preprocessing
InfluxDB: Task runs failed, rate
Total number of failed runs across all tasks.
Dependent item
influxdb.task_executor_complete.failed.rate
Preprocessing
InfluxDB: Task runs successful, rate
Total number of runs successfully completed across all tasks.
Dependent item
influxdb.task_executor_complete.successful.rate
Preprocessing
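Several of the counters above use "Discard unchanged with heartbeat: 30m", which stores a value only when it changes or when the heartbeat period has elapsed since the last stored value. A minimal Python sketch of that behavior follows; the class name and the exact boundary handling (`>=`) are assumptions of this example.

```python
from typing import Any, Optional


class HeartbeatFilter:
    """Pass a value through only if it changed or the heartbeat expired."""

    def __init__(self, heartbeat_seconds: float) -> None:
        self.heartbeat_seconds = heartbeat_seconds
        self._last_value: Any = None
        self._last_stored_at: Optional[float] = None

    def accept(self, timestamp: float, value: Any) -> bool:
        """Return True when the value should be stored, False when discarded."""
        changed = self._last_stored_at is None or value != self._last_value
        expired = (self._last_stored_at is not None
                   and timestamp - self._last_stored_at >= self.heartbeat_seconds)
        if changed or expired:
            self._last_value = value
            self._last_stored_at = timestamp
            return True
        return False
```

For slowly changing totals such as buckets or users, this keeps history small while still recording a data point at least every 30 minutes.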
Triggers
Name
Description
Expression
Severity
Dependencies and additional info
InfluxDB: Health check was failed
The InfluxDB instance is not available or unhealthy.
last(/InfluxDB by HTTP/influx.healthcheck)=0
InfluxDB: Version has changed
InfluxDB version has changed. Acknowledge to close the problem manually.
last(/InfluxDB by HTTP/influxdb.version,#1)<>last(/InfluxDB by HTTP/influxdb.version,#2) and length(last(/InfluxDB by HTTP/influxdb.version))>0
Manual close: Yes
InfluxDB: has been restarted
Uptime is less than 10 minutes.
last(/InfluxDB by HTTP/influxdb.uptime)<10m
Manual close: Yes
InfluxDB: Too many tasks failure runs
The number of failed runs completed across all tasks is too high.
min(/InfluxDB by HTTP/influxdb.task_executor_complete.failed.rate,5m)>{$INFLUXDB.TASK.RUN.FAIL.MAX.WARN}
Warning
Item prototypes for Organizations discovery
InfluxDB: [{#ORG_NAME}] Query requests bytes, success
Count of bytes received with status 200 per second.
Dependent item
influxdb.org.query_request_bytes.success.rate["{#ORG_NAME}"]
Preprocessing
InfluxDB: [{#ORG_NAME}] Query requests bytes, failed
Count of bytes received with status not 200 per second.
Dependent item
influxdb.org.query_request_bytes.failed.rate["{#ORG_NAME}"]
Preprocessing
InfluxDB: [{#ORG_NAME}] Query requests, failed
Total number of query requests with status not 200 per second.
Dependent item
influxdb.org.query_request.failed.rate["{#ORG_NAME}"]
Preprocessing
InfluxDB: [{#ORG_NAME}] Query requests, success
Total number of query requests with status 200 per second.
Dependent item
influxdb.org.query_request.success.rate["{#ORG_NAME}"]
Preprocessing
InfluxDB: [{#ORG_NAME}] Query response bytes, success
Count of bytes returned with status 200 per second.
Dependent item
influxdb.org.http_query_response_bytes.success.rate["{#ORG_NAME}"]
Preprocessing
InfluxDB: [{#ORG_NAME}] Query response bytes, failed
Count of bytes returned with status not 200 per second.
Dependent item
influxdb.org.http_query_response_bytes.failed.rate["{#ORG_NAME}"]
Preprocessing
Trigger prototypes for Organizations discovery
Name
Description
Expression
Severity
Dependencies and additional info
InfluxDB: [{#ORG_NAME}]: Too many requests failures
Too many query requests failed.
min(/InfluxDB by HTTP/influxdb.org.query_request.failed.rate["{#ORG_NAME}"],5m)>{$INFLUXDB.REQ.FAIL.MAX.WARN}
Warning
Source:
https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/db/influxdb_http?at=release/5.4
InfluxDB by HTTP
Overview
For Zabbix version: 5.4 and higher.
This template monitors InfluxDB with Zabbix and works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.
Template InfluxDB by HTTP collects metrics by HTTP agent from the InfluxDB /metrics endpoint.
This template was tested on:
InfluxDB, version 2.0
Setup
See Zabbix template operation for basic instructions.
This template works with self-hosted InfluxDB instances. Internal service metrics are collected from the InfluxDB /metrics endpoint.
Organization discovery requires authorization with an API token. See the docs:
https://docs.influxdata.com/influxdb/v2.0/security/tokens/
Don't forget to change the macros {$INFLUXDB.URL} and {$INFLUXDB.API.TOKEN}.
Also, see the Macros section for a list of macros used to set trigger values.
NOTE: Some metrics may not be collected depending on your InfluxDB instance version and configuration.
Zabbix configuration
No specific Zabbix configuration is required.
Macros used
Name
Description
Default
{$INFLUXDB.ORG_NAME.NOT_MATCHES}
Filter to exclude discovered organizations.
CHANGE_IF_NEEDED
{$INFLUXDB.REQ.FAIL.MAX.WARN}
Maximum number of query request failures for the trigger expression.
{$INFLUXDB.TASK.RUN.FAIL.MAX.WARN}
Maximum number of task run failures for the trigger expression.
{$INFLUXDB.URL}
InfluxDB instance URL
http://localhost:8086
Template links
There are no template links in this template.
Discovery rules
Group
Name
Description
Type
Key and additional info
InfluxDB
InfluxDB: Instance status
Get the health of an instance.
HTTP_AGENT
influx.healthcheck
Preprocessing
:
- CHECK_NOT_SUPPORTED
⛔️ON_FAIL:
CUSTOM_VALUE -> {"status":"fail"}]}
- JAVASCRIPT:
return JSON.parse(value).status == 'pass' ? 1: 0
- DISCARD_UNCHANGED_HEARTBEAT:
30m
InfluxDB
InfluxDB: Boltdb reads, rate
Total number of boltdb reads per second.
DEPENDENT
influxdb.boltdb_reads.rate
Preprocessing
:
- JSONPATH:
$[?(@.name=="boltdb_reads_total")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- CHANGE_PER_SECOND
InfluxDB
InfluxDB: Boltdb writes, rate
Total number of boltdb writes per second.
DEPENDENT
influxdb.boltdb_writes.rate
Preprocessing
:
- JSONPATH:
$[?(@.name=="boltdb_writes_total")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- CHANGE_PER_SECOND
InfluxDB
InfluxDB: Buckets, total
Number of total buckets on the server.
DEPENDENT
influxdb.buckets.total
Preprocessing
:
- JSONPATH:
$[?(@.name=="influxdb_buckets_total")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- DISCARD_UNCHANGED_HEARTBEAT:
30m
InfluxDB
InfluxDB: Dashboards, total
Number of total dashboards on the server.
DEPENDENT
influxdb.dashboards.total
Preprocessing
:
- JSONPATH:
$[?(@.name=="influxdb_dashboards_total")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- DISCARD_UNCHANGED_HEARTBEAT:
30m
InfluxDB
InfluxDB: Organizations, total
Number of total organizations on the server.
DEPENDENT
influxdb.organizations.total
Preprocessing
:
- JSONPATH:
$[?(@.name=="influxdb_organizations_total")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- DISCARD_UNCHANGED_HEARTBEAT:
30m
InfluxDB
InfluxDB: Scrapers, total
Number of total scrapers on the server.
DEPENDENT
influxdb.scrapers.total
Preprocessing
:
- JSONPATH:
$[?(@.name=="influxdb_scrapers_total")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- DISCARD_UNCHANGED_HEARTBEAT:
30m
InfluxDB
InfluxDB: Telegraf plugins, total
Number of individual telegraf plugins configured.
DEPENDENT
influxdb.telegraf_plugins.total
Preprocessing
:
- JSONPATH:
$[?(@.name=="influxdb_telegraf_plugins_count")].value.sum()
⛔️ON_FAIL:
DISCARD_VALUE ->
- DISCARD_UNCHANGED_HEARTBEAT:
30m
InfluxDB
InfluxDB: Telegrafs, total
Number of total telegraf configurations on the server.
DEPENDENT
influxdb.telegrafs.total
Preprocessing
:
- JSONPATH:
$[?(@.name=="influxdb_telegrafs_total")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- DISCARD_UNCHANGED_HEARTBEAT:
30m
InfluxDB
InfluxDB: Tokens, total
Number of total tokens on the server.
DEPENDENT
influxdb.tokens.total
Preprocessing
:
- JSONPATH:
$[?(@.name=="influxdb_tokens_total")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- DISCARD_UNCHANGED_HEARTBEAT:
30m
InfluxDB
InfluxDB: Users, total
Number of total users on the server.
DEPENDENT
influxdb.users.total
Preprocessing
:
- JSONPATH:
$[?(@.name=="influxdb_users_total")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- DISCARD_UNCHANGED_HEARTBEAT:
30m
InfluxDB
InfluxDB: Version
Version of the InfluxDB instance.
DEPENDENT
influxdb.version
Preprocessing
:
- JSONPATH:
$[?(@.name=="influxdb_info")].labels.version.first()
- DISCARD_UNCHANGED_HEARTBEAT:
3h
InfluxDB
InfluxDB: Uptime
InfluxDB process uptime in seconds.
DEPENDENT
influxdb.uptime
Preprocessing
:
- JSONPATH:
$[?(@.name=="influxdb_uptime_seconds")].value.first()
InfluxDB
InfluxDB: Workers currently running
Total number of workers currently running tasks.
DEPENDENT
influxdb.task_executor_runs_active.total
Preprocessing
:
- JSONPATH:
$[?(@.name=="task_executor_total_runs_active")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
InfluxDB
InfluxDB: Workers busy, pct
Percent of total available workers that are currently busy.
DEPENDENT
influxdb.task_executor_workers_busy.pct
Preprocessing
:
- JSONPATH:
$[?(@.name=="task_executor_workers_busy")].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
InfluxDB
InfluxDB: Task runs failed, rate
Total number of failed runs across all tasks.
DEPENDENT
influxdb.task_executor_complete.failed.rate
Preprocessing
:
- JSONPATH:
$[?(@.name=="task_executor_total_runs_complete" && @.labels.status == "failed")].value.sum()
⛔️ON_FAIL:
DISCARD_VALUE ->
- CHANGE_PER_SECOND
InfluxDB
InfluxDB: Task runs successful, rate
Total number of runs successfully completed across all tasks.
DEPENDENT
influxdb.task_executor_complete.successful.rate
Preprocessing
:
- JSONPATH:
$[?(@.name=="task_executor_total_runs_complete" && @.labels.status == "success")].value.sum()
⛔️ON_FAIL:
DISCARD_VALUE ->
- CHANGE_PER_SECOND
InfluxDB
InfluxDB: [{#ORG_NAME}] Query requests bytes, success
Count of bytes received with status 200 per second.
DEPENDENT
influxdb.org.query_request_bytes.success.rate["{#ORG_NAME}"]
Preprocessing
:
- JSONPATH:
$[?(@.name=="http_query_request_bytes" && @.labels.status == "200" && @.labels.endpoint == "/api/v2/query" && @.labels.org_id == "{#ORG_ID}") ].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- CHANGE_PER_SECOND
InfluxDB
InfluxDB: [{#ORG_NAME}] Query requests bytes, failed
Count of bytes received with status not 200 per second.
DEPENDENT
influxdb.org.query_request_bytes.failed.rate["{#ORG_NAME}"]
Preprocessing
:
- JSONPATH:
$[?(@.name=="http_query_request_bytes" && @.labels.status != "200" && @.labels.endpoint == "/api/v2/query" && @.labels.org_id == "{#ORG_ID}") ].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- CHANGE_PER_SECOND
InfluxDB
InfluxDB: [{#ORG_NAME}] Query requests, failed
Total number of query requests with status not 200 per second.
DEPENDENT
influxdb.org.query_request.failed.rate["{#ORG_NAME}"]
Preprocessing
:
- JSONPATH:
$[?(@.name=="http_query_request_count" && @.labels.status != "200" && @.labels.endpoint == "/api/v2/query" && @.labels.org_id == "{#ORG_ID}") ].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- CHANGE_PER_SECOND
InfluxDB
InfluxDB: [{#ORG_NAME}] Query requests, success
Total number of query requests with status 200 per second.
DEPENDENT
influxdb.org.query_request.success.rate["{#ORG_NAME}"]
Preprocessing
:
- JSONPATH:
$[?(@.name=="http_query_request_count" && @.labels.status == "200" && @.labels.endpoint == "/api/v2/query" && @.labels.org_id == "{#ORG_ID}") ].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- CHANGE_PER_SECOND
InfluxDB
InfluxDB: [{#ORG_NAME}] Query response bytes, success
Count of bytes returned with status 200 per second.
DEPENDENT
influxdb.org.http_query_response_bytes.success.rate["{#ORG_NAME}"]
Preprocessing
:
- JSONPATH:
$[?(@.name=="http_query_response_bytes" && @.labels.status == "200" && @.labels.endpoint == "/api/v2/query" && @.labels.org_id == "{#ORG_ID}") ].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- CHANGE_PER_SECOND
InfluxDB
InfluxDB: [{#ORG_NAME}] Query response bytes, failed
Count of bytes returned with status not 200 per second.
DEPENDENT
influxdb.org.http_query_response_bytes.failed.rate["{#ORG_NAME}"]
Preprocessing
:
- JSONPATH:
$[?(@.name=="http_query_response_bytes" && @.labels.status != "200" && @.labels.endpoint == "/api/v2/query" && @.labels.org_id == "{#ORG_ID}") ].value.first()
⛔️ON_FAIL:
DISCARD_VALUE ->
- CHANGE_PER_SECOND
Zabbix_raw_items
InfluxDB: Get instance metrics
HTTP_AGENT
influx.get_metrics
Preprocessing
:
- CHECK_NOT_SUPPORTED
⛔️ON_FAIL:
DISCARD_VALUE ->
- PROMETHEUS_TO_JSON
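The rate items above end with a CHANGE_PER_SECOND step, which turns an ever-growing counter into a per-second rate by dividing the value difference by the time difference between two consecutive samples. The Python sketch below illustrates the idea; the class name is hypothetical, and returning `None` for the first sample and for a counter that went down reflects the usual behavior of dropping such values rather than a verified detail of this template.

```python
from typing import Optional, Tuple


class ChangePerSecond:
    """Convert consecutive counter samples into a per-second rate."""

    def __init__(self) -> None:
        self._previous: Optional[Tuple[float, float]] = None

    def update(self, timestamp: float, counter: float) -> Optional[float]:
        """Return the rate since the previous sample, or None if it cannot be computed."""
        previous, self._previous = self._previous, (timestamp, counter)
        if previous is None:
            return None  # first sample: no earlier value to compare with
        prev_time, prev_counter = previous
        if timestamp <= prev_time or counter < prev_counter:
            return None  # counter reset or non-increasing clock: drop this value
        return (counter - prev_counter) / (timestamp - prev_time)
```

A counter that restarts from zero, for example after InfluxDB is restarted, therefore produces a gap of one sample instead of a large negative rate.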
Triggers
Name
Description
Expression
Severity
Dependencies and additional info
InfluxDB: Health check was failed
The InfluxDB instance is not available or unhealthy.
last(/InfluxDB by HTTP/influx.healthcheck)=0
InfluxDB: Version has changed (new version: {ITEM.VALUE})
InfluxDB version has changed. Acknowledge to close.
last(/InfluxDB by HTTP/influxdb.version,#1)<>last(/InfluxDB by HTTP/influxdb.version,#2) and length(last(/InfluxDB by HTTP/influxdb.version))>0
Manual close: YES
InfluxDB: has been restarted (uptime < 10m)
Uptime is less than 10 minutes.
last(/InfluxDB by HTTP/influxdb.uptime)<10m
Manual close: YES
InfluxDB: Too many tasks failure runs (over {$INFLUXDB.TASK.RUN.FAIL.MAX.WARN} for 5m)
The number of failed runs completed across all tasks is too high.
min(/InfluxDB by HTTP/influxdb.task_executor_complete.failed.rate,5m)>{$INFLUXDB.TASK.RUN.FAIL.MAX.WARN}
WARNING
InfluxDB: [{#ORG_NAME}]: Too many requests failures (over {$INFLUXDB.REQ.FAIL.MAX.WARN} for 5m)
Too many query requests failed.
min(/InfluxDB by HTTP/influxdb.org.query_request.failed.rate["{#ORG_NAME}"],5m)>{$INFLUXDB.REQ.FAIL.MAX.WARN}
WARNING