
개발잡부

[es] aggregation test 2


닉의네임 2022. 7. 17. 17:47

Let's run aggregations over float, array, string, and nested data.

 

Create an index with the mapping structure below.

 

Create the index

PUT aggs_doo
{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "number_of_replicas": 0
    }
  },
  "mappings": {
    "properties": {
      "benefit": {
        "type": "keyword"
      },
      "grade": {
        "type": "float"
      },
      "mallType": {
        "type": "keyword"
      },
      "resellers": {
        "type": "nested",
        "properties": {
          "reseller": {
            "type": "keyword"
          },
          "price": {
            "type": "double"
          }
        }
      }
    }
  }
}

Index the data

POST _bulk?refresh
{"index":{"_index":"aggs_doo","_id":"1"}}
{"benefit":["AAA","BBB"],"grade":1.2,"mallType":"TD","resellers":[{"reseller":"companyA","price":350},{"reseller":"companyB","price":100}]}
{"index":{"_index":"aggs_doo","_id":"2"}}
{"benefit":["CCC","EEE"],"grade":2.5,"mallType":"TD","resellers":[{"reseller":"companyA","price":360},{"reseller":"companyC","price":210}]}
{"index":{"_index":"aggs_doo","_id":"3"}}
{"benefit":["AAA","BBB"],"grade":3.1,"mallType":"TD","resellers":[{"reseller":"companyC","price":450},{"reseller":"companyD","price":100}]}
{"index":{"_index":"aggs_doo","_id":"4"}}
{"benefit":["AAA","CCC"],"grade":4.8,"mallType":"TD","resellers":[{"reseller":"companyE","price":500},{"reseller":"companyF","price":400}]}
{"index":{"_index":"aggs_doo","_id":"5"}}
{"benefit":["AAA","BBB"],"grade":1.7,"mallType":"TD","resellers":[{"reseller":"companyG","price":150},{"reseller":"companyA","price":260}]}
{"index":{"_index":"aggs_doo","_id":"6"}}
{"benefit":["EEE","CCC"],"grade":2.9,"mallType":"TD","resellers":[{"reseller":"companyH","price":50},{"reseller":"companyP","price":210}]}
{"index":{"_index":"aggs_doo","_id":"7"}}
{"benefit":["AAA","CCC"],"grade":1.1,"mallType":"DS","resellers":[{"reseller":"companyI","price":230},{"reseller":"companyB","price":430}]}
{"index":{"_index":"aggs_doo","_id":"8"}}
{"benefit":["DDD","EEE"],"grade":5.3,"mallType":"TD","resellers":[{"reseller":"companyK","price":190},{"reseller":"companyD","price":220}]}
{"index":{"_index":"aggs_doo","_id":"9"}}
{"benefit":["EEE","AAA"],"grade":4.1,"mallType":"TD","resellers":[{"reseller":"companyM","price":300},{"reseller":"companyF","price":200}]}
{"index":{"_index":"aggs_doo","_id":"10"}}
{"benefit":["AAA","BBB"],"grade":1,"mallType":"DS","resellers":[{"reseller":"companyP","price":450},{"reseller":"companyB","price":190}]}

 

The created index

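Only 10 documents were indexed, but the index reports 30 docs. This is most likely because of the nested structure: each object in a nested field is stored as a separate hidden Lucene document, so 10 parent documents plus 20 resellers entries would add up to 30.

I still need to verify this.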
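The figure of 30 can be reproduced with simple arithmetic. This is a minimal sketch, assuming that every object in a nested field is stored as its own hidden Lucene document next to its parent, and that the docs count is taken at the Lucene level. The class and method names are hypothetical and are not part of Elasticsearch.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Elasticsearch: estimates the Lucene-level
// document count of an index whose documents carry one nested field.
class NestedDocCount {

    // Assumption: one Lucene document per parent, plus one hidden Lucene
    // document for every object inside the parent's nested array.
    static int luceneDocCount(List<Integer> nestedObjectsPerParent) {
        int total = 0;
        for (int nestedObjects : nestedObjectsPerParent) {
            total += 1 + nestedObjects;
        }
        return total;
    }

    public static void main(String[] args) {
        // Each of the ten bulk-indexed documents has two resellers entries.
        List<Integer> resellersPerDoc = Collections.nCopies(10, 2);
        System.out.println(luceneDocCount(resellersPerDoc)); // prints 30
    }
}
```

The 20 nested entries counted here should also equal the doc_count that a nested aggregation on the resellers path reports.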

 

Test in Kibana with the DSL below

GET aggs_doo/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "match_all": {
            "boost": 1
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  },
  "aggregations": {
    "MALL_TYPE": {
      "terms": {
        "field": "mallType",
        "size": 10,
        "min_doc_count": 1,
        "shard_min_doc_count": 0,
        "show_term_doc_count_error": false,
        "order": [
          {
            "_count": "desc"
          },
          {
            "_key": "asc"
          }
        ]
      }
    },
    "BENEFIT": {
      "terms": {
        "field": "benefit",
        "size": 10,
        "min_doc_count": 1,
        "shard_min_doc_count": 0,
        "show_term_doc_count_error": false,
        "order": [
          {
            "_count": "desc"
          },
          {
            "_key": "asc"
          }
        ]
      }
    },
    "GRADE": {
      "range": {
        "field": "grade",
        "ranges": [
          {
            "to": 1
          },
          {
            "from": 1,
            "to": 2
          },
          {
            "from": 2,
            "to": 3
          },
          {
            "from": 3,
            "to": 4
          },
          {
            "from": 4,
            "to": 5
          },
          {
            "from": 5
          }
        ],
        "keyed": false
      }
    },
    "RESELLERS": {
      "nested": {
        "path": "resellers"
      },
      "aggregations": {
        "RESELLERS_RESELLER": {
          "terms": {
            "field": "resellers.reseller",
            "size": 10,
            "min_doc_count": 1,
            "shard_min_doc_count": 0,
            "show_term_doc_count_error": false,
            "order": [
              {
                "_count": "desc"
              },
              {
                "_key": "asc"
              }
            ]
          }
        }
      }
    }
  }
}
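In a range aggregation each range includes its from value and excludes its to value, which matters for a grade of exactly 1. The sketch below models that rule in plain Java over the ten grade values from the bulk request. It is only an illustration: the class name, the method name, and the key format built here are assumptions, and no Elasticsearch API is involved.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical plain-Java model of the GRADE range aggregation.
class GradeRangeSketch {

    // Counts values per range; "from" is inclusive and "to" is exclusive.
    static Map<String, Long> bucketCounts(double[] grades) {
        double[] from = {Double.NEGATIVE_INFINITY, 1, 2, 3, 4, 5};
        double[] to = {1, 2, 3, 4, 5, Double.POSITIVE_INFINITY};
        Map<String, Long> counts = new LinkedHashMap<>();
        for (int i = 0; i < from.length; i++) {
            // Build a key such as "1.0-2.0", with "*" for an open end.
            String lower = Double.isInfinite(from[i]) ? "*" : String.valueOf(from[i]);
            String upper = Double.isInfinite(to[i]) ? "*" : String.valueOf(to[i]);
            long count = 0;
            for (double grade : grades) {
                if (grade >= from[i] && grade < to[i]) {
                    count++;
                }
            }
            counts.put(lower + "-" + upper, count);
        }
        return counts;
    }

    public static void main(String[] args) {
        // grade values of documents 1 to 10 in the bulk request
        double[] grades = {1.2, 2.5, 3.1, 4.8, 1.7, 2.9, 1.1, 5.3, 4.1, 1};
        System.out.println(bucketCounts(grades));
    }
}
```

The grade of exactly 1 (document 10) lands in the 1.0-2.0 range rather than in the unbounded range ending at 1, so the first range stays empty.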

Convert the DSL above to Java with the High Level REST Client

 

package com.bbongdoo.doo.service;

import com.bbongdoo.doo.model.response.CommonResult;
import lombok.Getter;
import lombok.RequiredArgsConstructor;
import lombok.Setter;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.rest.RestStatus;
import org.elasticsearch.search.aggregations.Aggregations;
import org.elasticsearch.search.aggregations.bucket.nested.Nested;
import org.elasticsearch.search.aggregations.bucket.nested.NestedAggregationBuilder;
import org.elasticsearch.search.aggregations.bucket.range.Range;
import org.elasticsearch.search.aggregations.bucket.range.RangeAggregationBuilder;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.springframework.stereotype.Service;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

@Service
@RequiredArgsConstructor
public class AggsService {

    private final ResponseService responseService;
    private final RestHighLevelClient client;
    private static final String ALIAS = "aggs_doo";

    public CommonResult getProducts() {

        SearchRequest searchRequest = new SearchRequest();
        searchRequest.indices(ALIAS);
        try {
            SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

            QueryBuilder queryBuilder = QueryBuilders.boolQuery().must(
                    QueryBuilders.matchAllQuery()
            );

            searchSourceBuilder.query(queryBuilder);
            searchSourceBuilder.size(0);

            // Terms aggregations on the keyword fields (benefit holds an array of keywords).
            TermsAggregationBuilder mallType = new TermsAggregationBuilder("MALL_TYPE", null).field("mallType");
            searchSourceBuilder.aggregation(mallType);
            TermsAggregationBuilder benefit = new TermsAggregationBuilder("BENEFIT", null).field("benefit");
            searchSourceBuilder.aggregation(benefit);

            // Range aggregation on the float field; each range includes "from" and excludes "to".
            RangeAggregationBuilder grade = new RangeAggregationBuilder("GRADE").field("grade")
                    .addUnboundedTo(1)
                    .addRange(1, 2)
                    .addRange(2, 3)
                    .addRange(3, 4)
                    .addRange(4, 5)
                    .addUnboundedFrom(5);
            searchSourceBuilder.aggregation(grade);

            // Nested aggregation: step into the "resellers" path, then run terms on the nested keyword field.
            NestedAggregationBuilder reseller = new NestedAggregationBuilder("RESELLERS", "resellers");
            reseller.subAggregation(new TermsAggregationBuilder("RESELLERS_RESELLER", null).field("resellers.reseller"));
            searchSourceBuilder.aggregation(reseller);

            searchRequest.source(searchSourceBuilder);

            System.out.println(searchRequest.source());
            SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
            RestStatus status = searchResponse.status();

            List<ReturnAggs> returnAggs = new ArrayList<>();
            if (status == RestStatus.OK) {
                Aggregations aggregations = searchResponse.getAggregations();

                Terms benefitAggs = aggregations.get("BENEFIT");
                for (Terms.Bucket bucketBenefit : benefitAggs.getBuckets()) {
                    returnAggs.add(new ReturnAggs(bucketBenefit.getKey().toString(), bucketBenefit.getDocCount()));
                }

                Terms mallTypeAggs = aggregations.get("MALL_TYPE");
                for (Terms.Bucket bucketMallType : mallTypeAggs.getBuckets()) {
                    returnAggs.add(new ReturnAggs(bucketMallType.getKey().toString(), bucketMallType.getDocCount()));
                }

                Range gradeAggs = aggregations.get("GRADE");
                for (Range.Bucket bucketGrade : gradeAggs.getBuckets()) {
                    returnAggs.add(new ReturnAggs(bucketGrade.getKey().toString(), bucketGrade.getDocCount()));
                }

                Nested resellersAggs=aggregations.get("RESELLERS");
                Terms resellerAggs=resellersAggs.getAggregations().get("RESELLERS_RESELLER");
                for (Terms.Bucket bucketReseller : resellerAggs.getBuckets()) {
                    returnAggs.add(new ReturnAggs(bucketReseller.getKey().toString(), bucketReseller.getDocCount()));
                }

            }

            return responseService.getListResult(returnAggs);

        } catch (IOException e) {
            e.printStackTrace();
        }

        return new CommonResult();
    }

}

@Getter
@Setter
@RequiredArgsConstructor
class ReturnAggs {
    private final String key;
    private final long count;
}

 

DSL result

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "BENEFIT" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "AAA",
          "doc_count" : 7
        },
        {
          "key" : "BBB",
          "doc_count" : 4
        },
        {
          "key" : "CCC",
          "doc_count" : 4
        },
        {
          "key" : "EEE",
          "doc_count" : 4
        },
        {
          "key" : "DDD",
          "doc_count" : 1
        }
      ]
    },
    "RESELLERS" : {
      "doc_count" : 20,
      "RESELLERS_RESELLER" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 2,
        "buckets" : [
          {
            "key" : "companyA",
            "doc_count" : 3
          },
          {
            "key" : "companyB",
            "doc_count" : 3
          },
          {
            "key" : "companyC",
            "doc_count" : 2
          },
          {
            "key" : "companyD",
            "doc_count" : 2
          },
          {
            "key" : "companyF",
            "doc_count" : 2
          },
          {
            "key" : "companyP",
            "doc_count" : 2
          },
          {
            "key" : "companyE",
            "doc_count" : 1
          },
          {
            "key" : "companyG",
            "doc_count" : 1
          },
          {
            "key" : "companyH",
            "doc_count" : 1
          },
          {
            "key" : "companyI",
            "doc_count" : 1
          }
        ]
      }
    },
    "MALL_TYPE" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "TD",
          "doc_count" : 8
        },
        {
          "key" : "DS",
          "doc_count" : 2
        }
      ]
    },
    "GRADE" : {
      "buckets" : [
        {
          "key" : "*-1.0",
          "to" : 1.0,
          "doc_count" : 0
        },
        {
          "key" : "1.0-2.0",
          "from" : 1.0,
          "to" : 2.0,
          "doc_count" : 4
        },
        {
          "key" : "2.0-3.0",
          "from" : 2.0,
          "to" : 3.0,
          "doc_count" : 2
        },
        {
          "key" : "3.0-4.0",
          "from" : 3.0,
          "to" : 4.0,
          "doc_count" : 1
        },
        {
          "key" : "4.0-5.0",
          "from" : 4.0,
          "to" : 5.0,
          "doc_count" : 2
        },
        {
          "key" : "5.0-*",
          "from" : 5.0,
          "doc_count" : 1
        }
      ]
    }
  }
}
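The RESELLERS_RESELLER buckets show sum_other_doc_count 2 because the terms aggregation returns only the top 10 of the 12 distinct resellers. The sketch below is a rough single-process model of that behaviour: it counts each document once per distinct value, sorts by count descending and then key ascending, keeps the first size buckets, and sums the rest. The class and method names are hypothetical, and a real cluster merges per-shard results, which this model ignores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-process model of a terms aggregation; not Elasticsearch code.
class TermsSketch {

    static final class Bucket {
        final String key;
        final long docCount;
        Bucket(String key, long docCount) { this.key = key; this.docCount = docCount; }
    }

    static final class Result {
        final List<Bucket> buckets;
        final long sumOtherDocCount;
        Result(List<Bucket> buckets, long sumOtherDocCount) {
            this.buckets = buckets;
            this.sumOtherDocCount = sumOtherDocCount;
        }
    }

    // valuesPerDoc holds the field values of each document (or nested object).
    static Result terms(List<List<String>> valuesPerDoc, int size) {
        Map<String, Long> counts = new TreeMap<>();
        for (List<String> values : valuesPerDoc) {
            // A document counts once per distinct value, even for array fields.
            for (String value : new HashSet<>(values)) {
                counts.merge(value, 1L, Long::sum);
            }
        }
        List<Bucket> all = new ArrayList<>();
        for (Map.Entry<String, Long> entry : counts.entrySet()) {
            all.add(new Bucket(entry.getKey(), entry.getValue()));
        }
        // Same ordering as the DSL: _count desc, then _key asc.
        all.sort((a, b) -> {
            int byCount = Long.compare(b.docCount, a.docCount);
            return byCount != 0 ? byCount : a.key.compareTo(b.key);
        });
        List<Bucket> top = new ArrayList<>(all.subList(0, Math.min(size, all.size())));
        long other = 0;
        for (Bucket bucket : all.subList(top.size(), all.size())) {
            other += bucket.docCount;
        }
        return new Result(top, other);
    }

    public static void main(String[] args) {
        // The 20 nested reseller values, in bulk-request order.
        List<List<String>> resellers = new ArrayList<>();
        for (String suffix : "A B A C C D E F G A H P I B K D M F P B".split(" ")) {
            resellers.add(Arrays.asList("company" + suffix));
        }
        Result result = terms(resellers, 10);
        for (Bucket bucket : result.buckets) {
            System.out.println(bucket.key + " " + bucket.docCount);
        }
        System.out.println("sum_other_doc_count " + result.sumOtherDocCount);
    }
}
```

With size 10, companyK and companyM (one nested document each) fall outside the returned buckets, which accounts for the 2. Feeding the ten benefit arrays into the same method gives AAA 7, BBB 4, CCC 4, EEE 4, DDD 1 with nothing left over.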