Posts in the 'ElasticStack8' category

List: ElasticStack8 (38)

개발잡부

[BERT] Loanword classification

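Uses koBert to build a loanword_classifier.
Process
Log extraction, then filtering: exclude whitespace-separated combinations of one or more words, words made up only of digits, and words containing special characters.
/Users/doo/doo_py/homeplus/season_keyword/log_extrect_clean.py (log extraction script)
result/log_result_clean.txt (log file)
Loanword classification
/Users/doo/doo_py/homeplus/new_nlp/loanword_inference.py (loanword classification)
/result/loanword/loanword_list.csv (loanwords)
native_list.csv (native words)
API lookup
/Users/doo/doo_py/homeplus/homeplus_api/search.py (..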
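The filtering rules in this excerpt could be sketched roughly as follows. This is not the contents of log_extrect_clean.py, which the excerpt does not show. The function names are hypothetical, and two readings are assumptions: "combination of words" is taken to mean a keyword containing whitespace, and "special character" is taken to mean anything other than Hangul syllables, ASCII letters, and digits.

```python
import re

# Assumption: anything outside Hangul syllables, ASCII letters and digits
# counts as a "special character".
_ALLOWED = re.compile(r"[0-9A-Za-z\uac00-\ud7a3]+")
_DIGITS_ONLY = re.compile(r"[0-9]+")


def keep_keyword(keyword: str) -> bool:
    """Return True if the keyword survives the three exclusion rules."""
    kw = keyword.strip()
    if not kw:
        return False
    # Rule 1 (assumed reading): drop keywords made of several
    # whitespace-separated words.
    if len(kw.split()) > 1:
        return False
    # Rule 2: drop keywords that consist only of digits.
    if _DIGITS_ONLY.fullmatch(kw):
        return False
    # Rule 3: drop keywords that contain any special character.
    return _ALLOWED.fullmatch(kw) is not None


def clean_keywords(keywords):
    """Filter an iterable of raw log keywords, preserving order."""
    return [kw.strip() for kw in keywords if keep_keyword(kw)]
```

For example, `clean_keywords(["바나나", "바나나 우유", "123", "콜라!", "cola"])` keeps only the first and last entries.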

ElasticStack8/NLP 2025. 7. 3. 08:30

[NLP] BERT - Food vs. non-food / four-season inference

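BERT
Load the klue/bert-base model (a model pre-trained on Korean). Device: local CPU.
Fine-tuning
Food / non-food: trained on data made up of a "text" column and a "label" column (1 = food, 0 = non-food), with 1,031 food and non-food words. During training, 1,000 items took about 2 min 10 s (epoch 5). Inference on 10,000 items took 6 min 30 s.
Test
Extraction period: 2024-06-01 00:00:00 ~ 2024-08-31 00:00:00
Extracted keywords: 59,973 (keywords that raised a Type error were removed)
Result: food 48,837, non-food 11,136
Season keywords
Inference results are written to four-season and non-season files:
0: "비시즌" (non-season) - nonseason_list.csv
1: "봄" (spring) - spring_list.csv
2: "여름" (summer) -..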
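As a rough back-of-the-envelope check on the figures quoted in this excerpt, the reported inference time can be turned into a per-keyword cost and scaled to the full keyword set. This sketch assumes inference time grows linearly with the number of keywords, which the post does not state. The constant and function names are made up for illustration.

```python
# Figures quoted in the excerpt.
INFERENCE_SECONDS = 6 * 60 + 30   # 6 min 30 s for the inference run
INFERENCE_SAMPLES = 10_000        # keywords in that run
TOTAL_KEYWORDS = 59_973           # keywords extracted for the test
FOOD_COUNT = 48_837
NON_FOOD_COUNT = 11_136


def seconds_per_keyword() -> float:
    """Average CPU inference cost per keyword."""
    return INFERENCE_SECONDS / INFERENCE_SAMPLES


def estimated_inference_seconds(n_keywords: int) -> float:
    """Assumption: cost scales linearly with the number of keywords."""
    return n_keywords * seconds_per_keyword()


def food_share() -> float:
    """Fraction of classified keywords labelled as food."""
    return FOOD_COUNT / (FOOD_COUNT + NON_FOOD_COUNT)


if __name__ == "__main__":
    total = estimated_inference_seconds(TOTAL_KEYWORDS)
    print(f"{seconds_per_keyword() * 1000:.0f} ms per keyword")
    print(f"about {total / 60:.0f} min for {TOTAL_KEYWORDS} keywords")
    print(f"food share: {food_share():.1%}")
```

Under that linear assumption the full set of 59,973 keywords would take roughly 39 minutes on the same CPU, and the food and non-food counts add up to the extracted total.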

ElasticStack8/NLP 2025. 7. 3. 08:25

[NLP] Separating food and non-food keywords

Ugh, this takes forever. I tried to use the model straight from Hugging Face, but the firewall blocked it and nothing worked, so I set it up locally. I downloaded these files from Hugging Face. And then, the trainer:
import re
import pandas as pd
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader  # added Dataset
from torch.optim import AdamW
from transformers import ( BertTokenizer, BertForSequenceClassification, DataCollatorWithPadding, get_linear_schedule_with_warmu..

ElasticStack8/NLP 2025. 6. 24. 15:21

[es] Shard information for Elasticsearch data nodes

I want to know each shard's size and document count. Kibana command:
GET /_cat/shards?v&h=index,shard,prirep,state,docs,store,node
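If the plain-text output of that command needs to be processed further, a small parser along these lines might help. It is only a sketch: the sample text is invented for illustration and is not output from the author's cluster. It assumes columns are whitespace-separated and that rows for unassigned shards simply lack the trailing docs, store, and node values.

```python
from collections import defaultdict

# Hypothetical sample in the column order requested by the h= parameter.
SAMPLE = """\
index          shard prirep state      docs store node
location-index 0     p      STARTED    100  1.2mb node-1
location-index 1     p      STARTED    250  3.4mb node-2
location-index 0     r      UNASSIGNED
"""


def parse_cat_shards(text: str) -> list[dict]:
    """Turn `_cat/shards?v` text into one dict per shard row."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        parts = line.split()
        # Assumption: missing values only occur in trailing columns.
        parts += [""] * (len(header) - len(parts))
        rows.append(dict(zip(header, parts)))
    return rows


def primary_docs_per_index(rows: list[dict]) -> dict:
    """Sum document counts over primary shards, grouped by index."""
    totals = defaultdict(int)
    for row in rows:
        if row["prirep"] == "p" and row["docs"].isdigit():
            totals[row["index"]] += int(row["docs"])
    return dict(totals)
```

Only primary shards are summed so that replicas do not double-count documents.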

ElasticStack8/Elasticsearch 2024. 7. 31. 11:00

[es] analyzer, token filter test

GET _analyze
{
  "text": "The quick brown fox jumps over the lazy dog",
  "analyzer": "snowball"
}

GET _analyze
{
  "text": "The quick brown fox jumps over the lazy dog",
  "tokenizer": "homeplus_tokenizer",
  "filter": [ "lowercase", "stop", "snowball" ]
}
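The two request bodies differ only in whether a named analyzer or a tokenizer plus filter chain is given. A hypothetical helper for building such bodies in a script could look like this. The function name and its validation rule (exactly one of analyzer or tokenizer) are choices made for this sketch, not constraints taken from the post.

```python
import json


def build_analyze_body(text, analyzer=None, tokenizer=None, filters=None):
    """Build a request body for the _analyze API as a plain dict."""
    # Sketch-level rule: callers pick either a named analyzer
    # or a tokenizer (optionally followed by token filters).
    if (analyzer is None) == (tokenizer is None):
        raise ValueError("specify exactly one of analyzer or tokenizer")
    if analyzer is not None and filters:
        raise ValueError("filters are only combined with a tokenizer here")

    body = {"text": text}
    if analyzer is not None:
        body["analyzer"] = analyzer
    else:
        body["tokenizer"] = tokenizer
        if filters:
            body["filter"] = list(filters)
    return body


if __name__ == "__main__":
    sentence = "The quick brown fox jumps over the lazy dog"
    print(json.dumps(build_analyze_body(sentence, analyzer="snowball")))
    print(json.dumps(build_analyze_body(
        sentence,
        tokenizer="homeplus_tokenizer",
        filters=["lowercase", "stop", "snowball"],
    )))
```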

ElasticStack8/Elasticsearch 2023. 8. 29. 11:24

[es8] nori analyzer

Testing the dictionary files of the nori morphological analyzer. Project path: /Users/doo/docker/es8.8.1 (I plan to reuse this project). Opening docker-compose.yml, it is packed with the containers used by 900gle.. If the PC were powerful I could just run them all, but mine is not, so I created a .yml file with es and kibana removed.
docker-compose.yml
version: '3.7'
services:
  # The 'setup' service runs a one-off script which initializes the
  # 'logstash_internal' and 'kibana_system' users inside Elasticsearch with the
  # values of the pa..

ElasticStack8/Elasticsearch 2023. 8. 19. 18:04

[es8] HighLevelClient, LowLevelClient

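Problems started as soon as I switched 900gle's development environment to es8. For the high level client, everything after 7.17 (the 8.x line) is in alpha, so I am not sure I should use it.. 900gle broke, and it is all because of the update... I also made a copy of es8.8.1, but it conflicted and the daemon would not even start. Anyway, I wanted to update 900gle with an ANN query, but that query only runs from es 8... 8.6 or later, was it.. In any case, updating 7.15 to 8.8.1.. and that was the end (not solved, just broken).
The query in question:
{ "query": { "match_all": {} }, "knn": { "field": "name_vector", "query_vector": ${query_vector}, "k": 5, ..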
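The `${query_vector}` placeholder suggests the body is filled in from a template. One way to build the same structure in code is sketched below. The query in the excerpt is cut off after `"k": 5`, so everything past that point is unknown. `num_candidates` is included here only as an optional, assumed parameter, and the function name is hypothetical.

```python
def build_knn_search_body(query_vector, field="name_vector", k=5,
                          num_candidates=None):
    """Build a search body with match_all plus a top-level knn clause."""
    vector = [float(x) for x in query_vector]
    if not vector:
        raise ValueError("query_vector must not be empty")

    knn = {"field": field, "query_vector": vector, "k": k}
    # Assumption: the truncated original may have set further options.
    # num_candidates is only added when the caller asks for it.
    if num_candidates is not None:
        if num_candidates < k:
            raise ValueError("num_candidates must be at least k")
        knn["num_candidates"] = num_candidates

    return {"query": {"match_all": {}}, "knn": knn}


if __name__ == "__main__":
    import json
    # A made-up 3-dimensional vector, purely for illustration.
    print(json.dumps(build_knn_search_body([0.1, 0.2, 0.3], num_candidates=50)))
```

Building the body as a dict and serializing it with `json.dumps` avoids the quoting problems that string templates can run into when the vector is substituted in.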

ElasticStack8/Elasticsearch 2023. 8. 6. 19:34

[es8] aggregation - Pipeline Aggregations

I am going to test Elasticsearch aggregations, specifically Pipeline Aggregations. First, move to my brand-new ES: /Users/doo/docker/es8.8.1
Start it:
(base) ➜ es8.8.1 docker compose up -d --build
Then open Kibana: http://localhost:5601/app/home#/
OK. As for the index, I created it yesterday or the day before, I forget which: location data with 8.2 million documents.
Mapping structure:
{ "location-index": { "mappings": { "dynamic": "true", "properties": { "addr1": { "type": "keyword" }, "city": { "type"..
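The excerpt ends before any aggregation is shown, so the following is only an illustration of what a pipeline aggregation request against this mapping might look like. It uses a `terms` aggregation on the `addr1` keyword field and a sibling `max_bucket` pipeline aggregation over the bucket counts. The choice of `max_bucket`, the aggregation names, and the bucket size are all assumptions made for this sketch.

```python
def build_max_bucket_body(field="addr1", size=10):
    """Body for a terms aggregation plus a sibling max_bucket pipeline agg."""
    return {
        "size": 0,  # only aggregation results are wanted, no hits
        "aggs": {
            # Parent aggregation: one bucket per distinct field value.
            "by_field": {"terms": {"field": field, "size": size}},
            # Pipeline aggregation: reads the document count of each bucket
            # above and reports the bucket with the highest count.
            "busiest_bucket": {
                "max_bucket": {"buckets_path": "by_field>_count"}
            },
        },
    }


if __name__ == "__main__":
    import json
    # Could be pasted into Kibana after: GET location-index/_search
    print(json.dumps(build_max_bucket_body(), indent=2))
```

The `buckets_path` value points at the sibling aggregation by name, so the two names must stay in sync if either is changed.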

ElasticStack8/Elasticsearch 2023. 6. 14. 21:33

[es8] elasticsearch stable-esplugin

I want to build a tokenizer to use in 900gle, so I am looking at https://www.elastic.co/guide/en/elasticsearch/plugins/current/example-text-analysis-plugin.html (Example text analysis plugin | Elasticsearch Plugins and Integrations [8.7] | Elastic). This example shows how to create a simple "Hello world" text analysis plugin using the stable plugin API. The plugin provides a custom Lucene token f..

ElasticStack8/Elasticsearch 2023. 5. 21. 02:10

[es8] Elasticsearch Plugin 8.6.2

The base is the project below; bump its version and develop from there. https://ldh-6019.tistory.com/394?category=1096525 ([es8] Elasticsearch Plugin 8.4.1: Elasticsearch 8.4 Plugin Build & Install TEST. Work summary: download the elasticsearch source from github.com (https://github.com/elastic/elasticsearch.git); copy plugin > example > rest-handler from the source; edit build.gradle and build the plugin; elastic ..) Let's raise the version to 8.6.2 in build.gradle and build. /* * Copyri..

ElasticStack8/Elasticsearch 2023. 5. 20. 22:03
