Analysing some log files using AWS CloudWatch Logs Insights, I can plot a count aggregated into time bins with:
| stats count(*) by bin(1h)
This produces a graph, as expected, aggregating all logs in each time bin.
I want to split this data by a 'group' field, with values A and B.
| stats count(*) by group, bin(1h)
This returns log counts across time bins as expected, but the visualisation tab says 'No visualisation available.' I would like it to return a time series plot with a series for group A and a series for group B.
Where am I going wrong, or is this simply not possible?
Alright, I found a very janky way to solve this. It appears that no, you cannot do
| stats count(*) by group, bin(1h)
However, you can use parse to artificially create new variables, like so:
parse "[E*]" as @error
| parse "[I*]" as @info
| parse "[W*]" as @warning
| filter ispresent(@warning) or ispresent(@error) or ispresent(@info)
| stats count(@error) as error, count(@info) as info, count(@warning) as warning by bin(15m)
Here, I'm looking at the distribution of log types over time. I have three types of log messages, with the formats "[ERROR]", "[INFO]" and "[WARNING]".
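Applied to the original question, the same idea would look roughly like the sketch below. This assumes the raw messages contain literal text such as group=A / group=B (a hypothetical format, since the question doesn't show the messages themselves):
parse "group=A*" as @group_a
| parse "group=B*" as @group_b
| stats count(@group_a) as count_A, count(@group_b) as count_B by bin(1h)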
Expanding on @smth's example, I usually do it a little differently. With this query I follow the trend of status codes, aggregated over time, in a standard nginx access log:
fields @timestamp, @message
| parse @message '* - * [*] "* * *" * * "-" "*"' as host, identity, dateTimeString, httpVerb, url, protocol, status, bytes, useragent
| stats count(*) as all, sum(status <= 299) as c_s200, sum(status >= 300 and status <= 399) as c_s300, sum(status >= 400 and status <= 499) as c_s400, sum(status >= 500) as c_s500 by bin(1m)
The trick here is that an expression like status > 499 returns 0 when false and 1 when true, so summing it over each time bucket simulates a 'count if [condition]'.
The resulting graph on the visualisation tab then shows one series per status-code class.
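The same 0-or-1 trick answers the original question directly. A minimal sketch, assuming the group field holds the literal string values A and B, and that the string comparison evaluates to 0/1 the same way the numeric comparisons above do:
| stats sum(group = "A") as count_A, sum(group = "B") as count_B by bin(1h)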
Here is another variation, based on the answers by @smth et al. Not being able to build a chart automatically from the field's values is very annoying, so you need to mention each value explicitly.
fields @timestamp, @message
| filter namespace like "audit-log"
| fields coalesce(`audit.client-version`, "legacy") as version
| parse version "legacy" as v0
| parse version "886ac066*" as v1
| parse version "93e021e2*" as v2
| stats count(*), count(v0), count(v1), count(v2) by bin(5m)
Result: a time-series chart with one line per counted column (count(*), v0, v1, v2).
There is also an extension of the workaround described by @smth that can support more complicated statistics than count(). Here's an example that graphs CPU usage across different instances over time:
| fields (instance_id like "i-instance_1") as is_instance_1, (instance_id like "i-instance_2") as is_instance_2
| stats sum(cpu * is_instance_1) as cpu_1, sum(cpu * is_instance_2) as cpu_2 by bin(5m)
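If you need a per-instance average rather than a sum, the same masking idea can be extended by dividing by the per-group count. A rough sketch, reusing the hypothetical instance_id values above (and assuming Logs Insights accepts arithmetic between aggregate results; bins with no matching samples would need care, since the divisor is zero there):
fields (instance_id like "i-instance_1") as is_instance_1, (instance_id like "i-instance_2") as is_instance_2
| stats sum(cpu * is_instance_1) / sum(is_instance_1) as avg_cpu_1, sum(cpu * is_instance_2) / sum(is_instance_2) as avg_cpu_2 by bin(5m)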