개발잡부
Elasticsearch Image Similarity Search
Search page
Image selection
Search results
Development environment
- macOS
- Java 14
- Spring Boot
- Thymeleaf
- axios
- OpenCV 4.5.0
- MySQL
- Elasticsearch 7.13
- Kibana 7.13
- Docker
- Bootstrap
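Indexing
- Extract the image vector
- Index it as a dense_vector type
Search
- Extract the image vector
- Search with cosineSimilarity
Indexing
Crawl product information and images, then extract the image vectors
- Crawling carries a risk of being blocked, so the crawler sleeps 1 second on list pages and 1 second on product detail pages before running each request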
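The pause between requests described above can be sketched as follows. The post does not show the crawler itself, so `ThrottledFetcher`, `fetchAll`, and the `fetch` callback are hypothetical names used only for illustration; the real crawler may be structured quite differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper: fetches pages one at a time with a fixed pause between requests. */
class ThrottledFetcher {
    private final long delayMillis;

    ThrottledFetcher(long delayMillis) {
        this.delayMillis = delayMillis;
    }

    /**
     * Applies fetch to each URL in order and sleeps delayMillis before every
     * request except the first, so the target site is never hit in a burst.
     */
    <T> List<T> fetchAll(List<String> urls, Function<String, T> fetch) {
        List<T> results = new ArrayList<>();
        for (int i = 0; i < urls.size(); i++) {
            if (i > 0) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop crawling.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("crawl interrupted", e);
                }
            }
            results.add(fetch.apply(urls.get(i)));
        }
        return results;
    }
}
```

With a delay of 1000 ms, one instance could be used for list pages and another for product detail pages, which would match the 1-second pauses mentioned in the post.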
Download the image files from the image URLs obtained by crawling and extract the descriptors
SIFT.create().detectAndCompute(image, mask, keyPointOfImages, discripters);
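Convert the extracted descriptors into an image vector in the form of a Vector<Double>. The dense_vector type supports up to 2048 dimensions,
but because the index will use 128 dimensions, a 128-dimensional vector is
saved to the DB (MySQL)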
[1.096066591E9, -4.86193633E8, -7.84079329E8, -4.54113761E8, 1.470957087E9, 1.79930655E8, 4.14565919E8, -6.95482849E8, -1.348975073E9, 1.279280671E9, -1.664539105E9, 5.51831071E8, -7.43569889E8, 1.605699103E9, -1.602460129E9, 1.988363807E9, 5.96960799E8, 7.3360927E7, -1.36198625E8, -1.454496225E9, -1.13572321E8, 1.733838367E9, -1.239661025E9, -2.5803233E7, 5.15868191E8, -3.00906977E8, 2.052744735E9, 1.535551007E9, 1.777026591E9, 9.61332767E8, 1.337984543E9, 2.100684319E9, -1.778145761E9, -1.778956769E9, 7.49889055E8, 5.98181407E8, 6.10280991E8, 1.454466591E9, 1.070728735E9, -9.17658081E8, 1.996834335E9, -2.23615457E8, 2.065516063E9, -7.26800865E8, 1.770120735E9, 1.233159711E9, -8.83128801E8, 2.137490975E9, 6.51404831E8, -1.267382753E9, 1.97576223E8, -1.653832161E9, 1.179018783E9, -1.061067233E9, -1.048836577E9, -1.499126241E9, -8.59216353E8, -1.27457761E8, 6.24510495E8, -8.27013601E8, -1.624111585E9, -1.155578337E9, -6.6337249E7, 3.94233375E8, -6.97604577E8, 1.30410015E8, -3.65025761E8, 7.86638367E8, -1.912715745E9, -4.2400225E7, -1.033976289E9, -1.926085089E9, -8.27906529E8, 2.040710687E9, -1.853233633E9, 1.934632479E9, 4.04284959E8, -1.82122977E8, 1.842693663E9, 6145567.0, 3.02188063E8, 1.350575647E9, -1.30980321E8, 1.295123999E9, 1.779664415E9, -1.883429345E9, 1.024665119E9, 8.90988063E8, 2.009630239E9, -2.021489121E9, -1.304197601E9, -1.063795169E9, -3.86349537E8, -2.107062753E9, -1.815124449E9, 1.652860447E9, -1.78518497E8, -1.406712289E9, 1.979803167E9, -1.874098657E9, 1.853933087E9, 9.21126431E8, 6.69042207E8, -1.975835105E9, -1.842403809E9, -1.816402401E9, 1.597883935E9, -1.018485217E9, -7.24711905E8, 9.09829663E8, 1.334281759E9, -2.24213473E8, -3.07968481E8, 7.82280223E8, -4.35689953E8, 3.42885919E8, 2.10642463E8, 2.22938655E8, 2.091910687E9, 6.98279455E8, -2.5016801E7, -3.19642081E8, 5.06840607E8, -1.119623649E9, -8.95867361E8, -2.90765281E8, -9.46928097E8, -2.08329185E8]
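SIFT produces one 128-dimensional descriptor per keypoint, so an image yields an N x 128 matrix rather than a single vector. The post does not show how that matrix becomes the single 128-dimensional vector stored above. The sketch below uses mean pooling purely as an assumption, not as the author's method; `DescriptorPooling` and `meanPool` are hypothetical names, and the descriptor rows are assumed to have already been copied out of the OpenCV `Mat` into a `float[][]`.

```java
import java.util.Vector;

/** Hypothetical helper: collapses an N x 128 descriptor matrix into one 128-dimensional vector. */
class DescriptorPooling {
    // Must match "dims" in the dense_vector mapping.
    static final int DIMS = 128;

    /** Averages the descriptor rows column by column (mean pooling, an assumed technique). */
    static Vector<Double> meanPool(float[][] descriptors) {
        if (descriptors.length == 0) {
            throw new IllegalArgumentException("no descriptors to pool");
        }
        double[] sums = new double[DIMS];
        for (float[] row : descriptors) {
            if (row.length != DIMS) {
                throw new IllegalArgumentException("expected " + DIMS + " values per descriptor");
            }
            for (int i = 0; i < DIMS; i++) {
                sums[i] += row[i];
            }
        }
        Vector<Double> pooled = new Vector<>(DIMS);
        for (double sum : sums) {
            pooled.add(sum / descriptors.length);
        }
        return pooled;
    }
}
```

Whatever reduction is used, the result has to contain exactly 128 values, because a dense_vector field rejects documents whose vector length differs from the mapped `dims`.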
Create the Elasticsearch index
Mapping structure: GET shop/_mapping
{
"shop-2021-09-18" : {
"mappings" : {
"properties" : {
"brand" : {
"type" : "keyword"
},
"category" : {
"type" : "text"
},
"category1" : {
"type" : "keyword"
},
"category2" : {
"type" : "keyword"
},
"category3" : {
"type" : "keyword"
},
"category4" : {
"type" : "keyword"
},
"category5" : {
"type" : "keyword"
},
"created_time" : {
"type" : "date",
"format" : "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis||strict_date_hour_minute_second_millis||strict_date_optional_time"
},
"id" : {
"type" : "long"
},
"image" : {
"type" : "text"
},
"image_vector" : {
"type" : "dense_vector",
"dims" : 128
},
"keyword" : {
"type" : "keyword"
},
"name" : {
"type" : "text",
"analyzer" : "sample-nori-analyzer"
},
"price" : {
"type" : "long"
},
"type" : {
"type" : "keyword"
},
"updated_time" : {
"type" : "date",
"format" : "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis||strict_date_hour_minute_second_millis||strict_date_optional_time"
}
}
}
}
}
Indexed data
{
"_index": "shop-2021-09-18",
"_type": "_doc",
"_id": "51",
"_version": 1,
"_score": 1,
"_source": {
"id": 51,
"keyword": "프라다",
"name": "프라다 레더 숄더백 ZO6 V OOO 1BD294",
"brand": "",
"price": 2090000,
"category": "패션잡화 여성가방 숄더백",
"category1": "패션잡화",
"category2": "여성가방",
"category3": "숄더백",
"category4": null,
"category5": null,
"image": "https://shopping-phinf.pstatic.net/main_2632500/26325008854.20210311194441.jpg?type=f640",
"image_vector": [
941370719
,
1675029855
,
684723551
,
-362771105
,
-45511329
,
-2071958177
,
-578392737
,
-1699009185
,
1547734367
,
853519711
,
-637325985
,
-421704353
,
82333023
,
141192543
,
-727913121
,
-909972129
,
947711327
,
-764949153
,
-1679200929
,
1351380319
,
1585663327
,
497233247
,
1040223583
,
1803906399
,
-1597559457
,
795192671
,
-2050036385
,
1899769183
,
-1418875553
,
-1665766049
,
949931359
,
933023071
,
1065012575
,
1507831135
,
-1855918753
,
-1475302049
,
510381407
,
-1792815777
,
472395103
,
-62026401
,
-1795977889
,
-588239521
,
-1930523297
,
-824873633
,
1530678623
,
-1553273505
,
-1603883681
,
-197079713
,
-1782641313
,
1830120799
,
-79925921
,
-553633
,
263703903
,
2021191007
,
116239711
,
1325518175
,
-1713967777
,
1001885023
,
1097403743
,
-332796577
,
1940925791
,
805072223
,
10267999
,
-1645179553
,
-336761505
,
1033956703
,
1474424159
,
-307974817
,
-2090807969
,
-1096454817
,
-1801687713
,
-1397093025
,
1826344287
,
-719073953
,
1080339807
,
-2045727393
,
-2111156897
,
1576979807
,
-849089185
,
543862111
,
-343159457
,
1605324127
,
-396300961
,
-1795363489
,
-408482465
,
148548959
,
2111941983
,
-2137698977
,
1069149535
,
1826606431
,
48409951
,
475311455
,
-743969441
,
-747147937
,
-1047597729
,
-186258081
,
358526303
,
1835167071
,
-2119406241
,
552004959
,
248663391
,
196906335
,
149892447
,
-408441505
,
2082295135
,
-1241764513
,
-1800950433
,
-369119905
,
-876941985
,
-867758753
,
867298655
,
-529879713
,
-280613537
,
1717259615
,
-422556321
,
1932324191
,
1843015007
,
-205656737
,
1988767071
,
-1635160737
,
1479912799
,
733539679
,
315837791
,
308456799
,
1496788319
,
1661578591
,
1142353247
,
-1774441121
],
"type": "",
"created_time": "2021-09-05T00:21:09.000Z",
"updated_time": "2021-09-05T00:21:09.000Z"
}
}
Search
Save the uploaded image, then extract a vector of the same form using the same algorithm as in indexing
cosineSimilarity search
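import java.util.HashMap;
import java.util.Map;
import java.util.Vector;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.functionscore.ScriptScoreQueryBuilder;
import org.elasticsearch.script.Script;
import org.elasticsearch.search.builder.SearchSourceBuilder;

// Extract the query vector from the uploaded image with the same extractor used at index time
Vector<Double> vectors = ImageToVectorOpenCV.getVector(VectorDTO.builder().dirPath(tempDir).file(imageSearchDTO.getFile()).build());
SearchRequest searchRequest = new SearchRequest();
searchRequest.indices(ALIAS);
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// Pass the query vector to the script as a parameter
Map<String, Object> map = new HashMap<>();
map.put("imageVector", vectors);
// script_score query: score every document by its cosine similarity to the query vector
ScriptScoreQueryBuilder functionScoreQueryBuilder = new ScriptScoreQueryBuilder(
QueryBuilders.matchAllQuery(),
new Script(
Script.DEFAULT_SCRIPT_TYPE,
Script.DEFAULT_SCRIPT_LANG,
"cosineSimilarity(params.imageVector, 'image_vector') + 1.0", map)
);
searchSourceBuilder.query(functionScoreQueryBuilder);
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);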
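The `+ 1.0` in the script is needed because Elasticsearch does not accept negative scores from a script_score query, while cosine similarity ranges from -1 to 1. Adding 1.0 shifts the score into the range 0 to 2. The plain-Java sketch below reproduces that arithmetic outside Elasticsearch to make the score values easier to interpret; `CosineScore` and its methods are hypothetical names, not part of the Elasticsearch client.

```java
import java.util.List;

/** Hypothetical helper mirroring the script "cosineSimilarity(query, doc) + 1.0". */
class CosineScore {
    /** Cosine of the angle between two vectors of equal length, in the range [-1, 1]. */
    static double cosineSimilarity(List<Double> a, List<Double> b) {
        if (a.size() != b.size()) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.size(); i++) {
            double x = a.get(i);
            double y = b.get(i);
            dot += x * y;
            normA += x * x;
            normB += y * y;
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Shifted score in the range [0, 2]; higher means more similar. */
    static double score(List<Double> query, List<Double> doc) {
        return cosineSimilarity(query, doc) + 1.0;
    }
}
```

Under this scoring, a document whose vector points in the same direction as the query scores 2.0, an orthogonal one scores 1.0, and an opposite one scores 0.0. Vector magnitude does not affect the result.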