I am parsing a set of data into an ELK stack for some non-tech folks to view. As part of this, I want to remove all fields except a specific, known subset from the events before sending them to Elasticsearch.

I can explicitly specify each field to drop in a mutate filter like so:

filter {
    mutate {
        remove_field => [ "throw_away_field1", "throw_away_field2" ]
    }
}

In this case, any time a new field is added to the input data (which can happen often, since the data is pulled from a queue and used by multiple systems for multiple purposes), the filter would need to be updated, which is overhead I would like to avoid. Worse, if some sensitive data made it through between the time the input streams were updated and the time the filter was updated, that could be bad.

Is there a way, using Logstash filters, to iterate over each field of an event and call remove_field if the field is not in a provided list of field names? Or would I have to write a custom filter to do this? Basically, for every single event, I just want to keep 8 specific fields and toss absolutely everything else.

It looks like only very minimal logic of the if ![field] =~ /^value$/ type is available in the logstash.conf file, and I don't see any examples that iterate over the fields themselves in a for-each style and compare each field name to a list of values.

Answer:

After upgrading Logstash to 1.5.0 to be able to use plugin extensions such as prune, the solution ended up looking like this:

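filter {
    prune {
        interpolate => true
        # whitelist_names entries are regular expressions; anchor them with ^ and $ to match exact field names
        whitelist_names => ["^fieldtokeep1$","^fieldtokeep2$"]
    }
}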

I had to upgrade Logstash to get this working, hence the delay, but this does exactly what I am looking for. Thanks for the quick answer! Accepted :)
– redstonemercury
Oct 29, 2015 at 17:45
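
One detail worth knowing about the prune answer above: the entries in whitelist_names are treated as regular expressions, not literal names, so an unanchored entry also matches longer field names that merely contain it. The following is a minimal Ruby sketch of that matching behaviour only. It uses plain Regexp and made-up field names, and it is not the plugin's actual code.

```ruby
# Patterns as they might be written in whitelist_names (placeholder names).
unanchored = Regexp.new("fieldtokeep1")
anchored   = Regexp.new("^fieldtokeep1$")

# Hypothetical incoming field names, invented for this illustration.
names = %w[fieldtokeep1 fieldtokeep10 my_fieldtokeep1 other]

# An unanchored pattern matches any name that contains the text.
loose_matches = names.select { |name| name.match?(unanchored) }

# An anchored pattern matches only the exact name.
exact_matches = names.select { |name| name.match?(anchored) }

puts loose_matches.inspect  # ["fieldtokeep1", "fieldtokeep10", "my_fieldtokeep1"]
puts exact_matches.inspect  # ["fieldtokeep1"]
```

If the goal is to keep exactly the listed fields and nothing that merely resembles them, anchoring each pattern is the safer choice.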

Another option would be to move the parsed JSON into a new field and then use mutate, e.g.:

filter {
   json {
      source => "json"
      target => "parsed_json"
   }
   mutate {
      add_field => {"nested_field" => "%{[parsed_json][nested_field]}"}
      remove_field => [ "json", "parsed_json" ]
   }
}

This is a great alternative solution, and it would have saved me from having to upgrade Logstash to get the prune filter installed.
– redstonemercury
Aug 25, 2016 at 16:39

@redstonemercury I think you can install the logstash-filter-prune plugin instead of upgrading Logstash.
– oivoodoo
Nov 2, 2016 at 14:16

If there is a problem with prune, this can be quite a good plan B, as it was for me. mutate is part of the core package.
– horoyoi o
Jun 29, 2020 at 5:29
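
For completeness, the "iterate over every field and drop anything not on a list" logic asked about in the question can also be written directly in Ruby. The sketch below is only an illustration under stated assumptions: FakeEvent is a hypothetical stand-in class written for this example, mimicking just the to_hash and remove calls that a Logstash event offers, and KEEP and keep_only are names made up here.

```ruby
# Hypothetical stand-in for a Logstash event, for illustration only.
# It mimics just the two calls used below: to_hash and remove.
class FakeEvent
  def initialize(fields)
    @fields = fields.dup
  end

  # Returns a copy of the fields, so callers can remove while iterating.
  def to_hash
    @fields.dup
  end

  def remove(name)
    @fields.delete(name)
  end
end

# Exact field names to keep (placeholder names from the answers above).
KEEP = %w[fieldtokeep1 fieldtokeep2].freeze

# Removes every field whose name is not in the keep list.
def keep_only(event, keep = KEEP)
  event.to_hash.keys.each do |name|
    event.remove(name) unless keep.include?(name)
  end
  event
end

event = FakeEvent.new({
  "fieldtokeep1" => "a",
  "fieldtokeep2" => "b",
  "throw_away_field1" => "secret"
})
puts keep_only(event).to_hash.keys.inspect  # ["fieldtokeep1", "fieldtokeep2"]
```

Inside Logstash, the body of keep_only is roughly what one might put in the code option of a ruby filter, assuming the installed version exposes event.to_hash and event.remove. Because the comparison is by exact name, any built-in fields you still need (for example @timestamp) would have to be added to the keep list.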