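Composite ([kəmˈpɑːzət]) aggregation is a bucket aggregation.
It creates composite buckets from different sources and makes it possible to page through the results of a multi-level aggregation. It provides a way to stream all the buckets of a specific aggregation, similar to what scroll does for documents.
The composite aggregation is not currently compatible with pipeline aggregations.
Composite buckets are built from the combinations of the values extracted from each document; each combination is treated as one composite bucket.
For example, take the following document: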
{
  "keyword": ["foo", "bar"],
  "number": [23, 65, 76]
}
Using keyword and number as the sources of a composite aggregation yields the following composite buckets:
{ "keyword": "foo", "number": 23 }
{ "keyword": "foo", "number": 65 }
{ "keyword": "foo", "number": 76 }
{ "keyword": "bar", "number": 23 }
{ "keyword": "bar", "number": 65 }
{ "keyword": "bar", "number": 76 }
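The six combinations above are the Cartesian product of the document's values for each source. As a rough local illustration (a Python sketch, not how Elasticsearch computes buckets internally; the helper name composite_keys is made up for this example), the list can be reproduced like this:

```python
from itertools import product


def composite_keys(doc, source_fields):
    """Return one key per combination of the document's values for the given fields."""
    value_lists = []
    for field in source_fields:
        values = doc[field]
        # Treat a scalar as a single-valued field.
        if not isinstance(values, list):
            values = [values]
        value_lists.append(values)
    # product() varies the last field fastest, matching the listing above.
    return [dict(zip(source_fields, combo)) for combo in product(*value_lists)]


doc = {"keyword": ["foo", "bar"], "number": [23, 65, 76]}
keys = composite_keys(doc, ["keyword", "number"])
for key in keys:
    print(key)
```

The sketch keeps the values in document order; it does not model the key-based sorting that the aggregation applies to its buckets.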
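sources: the list of value sources used to build the composite buckets. Each source name must be unique.
missing_bucket: set on an individual source. Defaults to false, in which case documents that have no value for that source are ignored, so if no document has a value the composite aggregation returns an empty bucket list ([]). When set to true, those documents are kept and the key for that source is null, while the other sources are output as usual.
size: the maximum number of composite buckets returned per request. Defaults to 10.
after: the starting point of the current page, that is, the key of the last bucket on the previous page.
Four aggregation types can be used as sources: terms, histogram, date_histogram, and geotile_grid.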
GET kibana_sample_data_flights/_search
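{
  "track_total_hits": true,
  "size": 0,
  "aggs": {
    "composite_FlightTimeMin": {
      "composite": {
        "sources": [
          { "terms_FlightTimeMin": { "terms": { "field": "FlightTimeMin" } } }
        ]
      }
    }
  }
}
With a single terms source, this groups documents by the same values as a plain terms aggregation, except that the buckets are sorted by key and can be paged.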
GET kibana_sample_data_flights/_search
{
  "track_total_hits": true,
  "size": 0,
  "runtime_mappings": {
    "FlightTimeMinChanged": {
      "type": "double",
      "script": { "source": """emit(doc['FlightTimeMin'].value / 10)""" }
    }
  },
  "aggs": {
    "composite_FlightTimeMinChanged": {
      "composite": {
        "sources": [
          { "terms_FlightTimeMinChanged": { "terms": { "field": "FlightTimeMinChanged" } } }
        ]
      }
    }
  }
}
The request above uses a runtime field as the source of the composite buckets.
GET kibana_sample_data_flights/_search
{
  "track_total_hits": true,
  "size": 0,
  "aggs": {
    "composite_FlightTimeMin": {
      "composite": {
        "sources": [
          { "histogram_FlightTimeMin": { "histogram": { "field": "FlightTimeMin", "interval": 10 } } }
        ]
      }
    }
  }
}
GET kibana_sample_data_flights/_search
{
  "track_total_hits": true,
  "size": 0,
  "aggs": {
    "composite_timestamp": {
      "composite": {
        "sources": [
          {
            "date_histogram_timestamp": {
              "date_histogram": { "field": "timestamp", "calendar_interval": "1d", "format": "yyyy-MM-dd" }
            }
          }
        ]
      }
    }
  }
}
GET kibana_sample_data_flights/_search
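{
  "track_total_hits": true,
  "size": 0,
  "aggs": {
    "composite_timestamp_FlightTimeMin": {
      "composite": {
        "sources": [
          {
            "date_histogram_timestamp": {
              "date_histogram": { "field": "timestamp", "calendar_interval": "1d", "format": "yyyy-MM-dd" }
            }
          },
          { "terms_FlightTimeMin": { "terms": { "field": "FlightTimeMin" } } }
        ]
      }
    }
  }
}
Buckets are sorted by the first source, then by the second, and so on. Each source sorts in ascending order by default; the order parameter on a source changes the direction, as in the next request.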
GET kibana_sample_data_flights/_search
{
  "track_total_hits": true,
  "size": 0,
  "aggs": {
    "composite_timestamp_FlightTimeMin": {
      "composite": {
        "sources": [
          {
            "date_histogram_timestamp": {
              "date_histogram": { "field": "timestamp", "calendar_interval": "1d", "format": "yyyy-MM-dd", "order": "desc" }
            }
          },
          { "terms_FlightTimeMin": { "terms": { "field": "FlightTimeMin", "order": "asc" } } }
        ]
      }
    }
  }
}
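To make the ordering rule concrete, here is a small Python sketch that sorts a handful of made-up composite keys the same way: first source first, each source with its own direction. The key names day and minutes and the sample values are hypothetical, and the code only imitates the resulting order; it is not Elasticsearch's implementation.

```python
def sort_composite_keys(keys, orders):
    """Sort keys by each (source_name, direction) pair, first source most significant."""
    ordered = list(keys)
    # Stable sorts applied from the last source to the first produce
    # a first-source-major ordering with an independent direction per source.
    for name, direction in reversed(orders):
        ordered.sort(key=lambda k: k[name], reverse=(direction == "desc"))
    return ordered


keys = [
    {"day": "2022-08-27", "minutes": 40.0},
    {"day": "2022-08-28", "minutes": 32.5},
    {"day": "2022-08-27", "minutes": 12.0},
    {"day": "2022-08-28", "minutes": 13.0},
]

# Mirrors the request above: date source descending, terms source ascending.
sorted_keys = sort_composite_keys(keys, [("day", "desc"), ("minutes", "asc")])
for key in sorted_keys:
    print(key)
```

The most recent day comes first, and within each day the minutes values increase.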
First, use a composite aggregation with two terms sources, one on the OriginCountry field and one on the DestCountry field.
GET kibana_sample_data_flights/_search
{
  "track_total_hits": true,
  "size": 0,
  "aggs": {
    "composite_OriginCountry_DestCountry": {
      "composite": {
        "sources": [
          { "terms_OriginCountry": { "terms": { "field": "OriginCountry" } } },
          { "terms_DestCountry": { "terms": { "field": "DestCountry" } } }
        ]
      }
    }
  }
}
The result is as follows:
"aggregations" : {
  "composite_OriginCountry_DestCountry" : {
    "after_key" : { "terms_OriginCountry" : "AE", "terms_DestCountry" : "CA" },
    "buckets" : [
      { "key" : { "terms_OriginCountry" : "AE", "terms_DestCountry" : "AE" }, "doc_count" : 9 },
      { "key" : { "terms_OriginCountry" : "AE", "terms_DestCountry" : "AR" }, "doc_count" : 10 },
      ...
For comparison, run the same grouping with a terms sub-aggregation.
GET kibana_sample_data_flights/_search
{
  "track_total_hits": true,
  "size": 0,
  "aggs": {
    "terms_OriginCountry": {
      "terms": { "field": "OriginCountry" },
      "aggs": {
        "terms_DestCountry": { "terms": { "field": "DestCountry" } }
      }
    }
  }
}
The result is as follows:
"aggregations" : {
  "terms_OriginCountry" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 4114,
    "buckets" : [
      {
        "key" : "IT",
        "doc_count" : 2278,
        "terms_DestCountry" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 513,
          "buckets" : [
            { "key" : "IT", "doc_count" : 459 },
            { "key" : "US", "doc_count" : 328 },
            { "key" : "CN", "doc_count" : 195 },
            { "key" : "CA", "doc_count" : 192 },
            ...
The composite aggregation returns a flat list of buckets keyed by both fields and sorted by key, while the nested terms aggregation returns a tree of buckets sorted by doc_count.
In the second source, specify a field that does not exist, FlightTimeMin2, and change the value of missing_bucket to compare its effect.
GET kibana_sample_data_flights/_search
{
  "track_total_hits": true,
  "size": 0,
  "aggs": {
    "composite_timestamp_FlightTimeMin": {
      "composite": {
        "sources": [
          {
            "date_histogram_timestamp": {
              "date_histogram": { "field": "timestamp", "calendar_interval": "1d", "format": "yyyy-MM-dd", "order": "desc" }
            }
          },
          { "terms_FlightTimeMin": { "terms": { "field": "FlightTimeMin2", "order": "asc", "missing_bucket": false } } }
        ]
      }
    }
  }
}
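The effect of missing_bucket can be modelled locally with a short Python sketch. It is a simplification under stated assumptions rather than Elasticsearch's implementation: the (name, field, missing_bucket) tuples, the source names day and minutes, and the sample document are made up for illustration, and the date rounding done by date_histogram is left out.

```python
from itertools import product


def composite_keys(doc, sources):
    """Build composite keys for one document.

    sources is a list of (name, field, missing_bucket) tuples.
    """
    value_lists = []
    for name, field, missing_bucket in sources:
        values = doc.get(field)
        if values is None or values == []:
            if not missing_bucket:
                # No value for this source: the document contributes no bucket.
                return []
            # missing_bucket=True: keep the document, with a null key for this source.
            values = [None]
        elif not isinstance(values, list):
            values = [values]
        value_lists.append(values)
    names = [name for name, _, _ in sources]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


doc = {"timestamp": "2022-08-28", "FlightTimeMin": 32.96}

strict = composite_keys(
    doc, [("day", "timestamp", False), ("minutes", "FlightTimeMin2", False)]
)
lenient = composite_keys(
    doc, [("day", "timestamp", False), ("minutes", "FlightTimeMin2", True)]
)
print(strict)
print(lenient)
```

With missing_bucket set to false the document produces no key at all, which is why the whole bucket list comes back empty when the field does not exist in any document. With true, the key carries None (null) for the missing source and the other source is unaffected.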
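The after_key of the previous page gives the key of its last bucket:
"after_key" : {
  "date_histogram_timestamp" : "2022-08-28",
  "terms_FlightTimeMin" : 32.9625244140625
}
Next, set the after parameter to the after_key above, so that the next page starts right after the last bucket of the previous page.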
GET kibana_sample_data_flights/_search
{
  "track_total_hits": true,
  "size": 0,
  "aggs": {
    "composite_timestamp_FlightTimeMin": {
      "composite": {
        "size": 5,
        "after": { "date_histogram_timestamp": "2022-08-28", "terms_FlightTimeMin": 32.9625244140625 },
        "sources": [
          {
            "date_histogram_timestamp": {
              "date_histogram": { "field": "timestamp", "calendar_interval": "1d", "format": "yyyy-MM-dd" }
            }
          },
          { "terms_FlightTimeMin": { "terms": { "field": "FlightTimeMin", "missing_bucket": true } } }
        ]
      }
    }
  }
}
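Repeating this step until no buckets come back walks through every composite bucket. The loop can be sketched in Python as below. The search callable is a stand-in for whatever sends the request and returns the composite aggregation's part of the response; fake_search and its seven sample keys are invented here so the sketch runs without a cluster.

```python
def iter_composite_buckets(search, page_size):
    """Yield every bucket by feeding each page's after_key into the next request."""
    after = None
    while True:
        result = search(page_size, after)
        buckets = result["buckets"]
        if not buckets:
            break
        yield from buckets
        # after_key is the key of the last bucket on this page.
        after = result.get("after_key")
        if after is None:
            break


# Hypothetical in-memory backend: seven buckets, already sorted by key.
ALL_KEYS = [{"n": i} for i in range(7)]


def fake_search(size, after):
    start = 0 if after is None else ALL_KEYS.index(after) + 1
    page = ALL_KEYS[start:start + size]
    response = {"buckets": [{"key": key, "doc_count": 1} for key in page]}
    if page:
        response["after_key"] = page[-1]
    return response


collected = [b["key"]["n"] for b in iter_composite_buckets(fake_search, 3)]
print(collected)
```

Against a real cluster, search would issue the request shown above with size and after filled in, and return the object found under the aggregation's name in the response.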
The composite aggregation also accepts sub-aggregations, which are computed for each composite bucket. The following request adds a stats aggregation:
GET kibana_sample_data_flights/_search
{
  "track_total_hits": true,
  "size": 0,
  "aggs": {
    "composite_timestamp_FlightTimeMin": {
      "composite": {
        "size": 2,
        "after": { "date_histogram_timestamp": "2022-08-28", "terms_FlightTimeMin": 13.010112762451172 },
        "sources": [
          {
            "date_histogram_timestamp": {
              "date_histogram": { "field": "timestamp", "calendar_interval": "1d", "format": "yyyy-MM-dd" }
            }
          },
          { "terms_FlightTimeMin": { "terms": { "field": "FlightTimeMin", "missing_bucket": true } } }
        ]
      },
      "aggs": {
        "stats_FlightTimeMin": { "stats": { "field": "FlightTimeMin" } }
      }
    }
  }
}
The result is as follows:
"aggregations" : {
  "composite_timestamp_FlightTimeMin" : {
    "after_key" : { "date_histogram_timestamp" : "2022-08-28", "terms_FlightTimeMin" : 17.2014217376709 },
    "buckets" : [
      {
        "key" : { "date_histogram_timestamp" : "2022-08-28", "terms_FlightTimeMin" : 16.21676254272461 },
        "doc_count" : 1,
        "stats_FlightTimeMin" : { "count" : 1, "min" : 16.21676254272461, "max" : 16.21676254272461, "avg" : 16.21676254272461, "sum" : 16.21676254272461 }
      },
      {
        "key" : { "date_histogram_timestamp" : "2022-08-28", "terms_FlightTimeMin" : 17.2014217376709 },
        "doc_count" : 1,
        "stats_FlightTimeMin" : { "count" : 1, "min" : 17.2014217376709, "max" : 17.2014217376709, "avg" : 17.2014217376709, "sum" : 17.2014217376709 }
      }
    ]
  }
}