# Elasticsearch (2) - Quick Setup and Creating, Updating, and Deleting Documents

The [previous post](https://tienyulin.github.io/elasticsearch-concept/) introduced some basic Elasticsearch concepts. This post covers how to operate Elasticsearch, from setting it up to creating, updating, and deleting Documents. Queries involve more material, so they are covered in the [next post](https://tienyulin.github.io/elasticsearch-query-filter/).

## Setting Up Elasticsearch + Kibana

Kibana is the visualization tool for Elasticsearch, and this post works mainly through Kibana. If you prefer another API tool such as Postman, an example later on shows how to use it. First, we use [Docker-Compose](https://tienyulin.github.io/docker-compose/) to quickly spin up an Elasticsearch + Kibana server, as follows:

```yaml=
version: '3.7'
services:
  elasticsearch:
    image: elasticsearch:6.5.4
    container_name: elasticsearch
    environment:
      # Run as a one-node cluster (no discovery of other nodes)
      - discovery.type=single-node
    ports:
      - 9200:9200   # HTTP
      - 9300:9300   # Transport (node-to-node)
    networks:
      - esnet
  kibana:
    image: kibana:6.5.4
    container_name: kibana
    ports:
      - 5601:5601
    networks:
      - esnet
    # Start Kibana after the Elasticsearch container
    depends_on:
      - elasticsearch
networks:
  esnet:
    driver: bridge
    name: elasticsearch_esnet
```

If Elasticsearch starts successfully, the output contains the following line, whose `message` value is `started`.

```json=
{"type": "server", "timestamp": "2020-07-29T05:51:37,414Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "docker-cluster", "node.name": "2f019cbcdb2b", "message": "started", "cluster.uuid": "4PjAfq-HT32ZnH-FDgtnsA", "node.id": "0aqSpwcDSnixl4NfMVRpYA" }
```

The output also contains a Gateway line. The Gateway is the component that persists data so that it is not lost when a node fails. At startup the Gateway checks whether any data has been saved on disk and, if so, uses it to recover.

```json=
{"type": "server", "timestamp": "2020-07-29T05:51:37,505Z", "level": "INFO", "component": "o.e.g.GatewayService", "cluster.name": "docker-cluster", "node.name": "2f019cbcdb2b", "message": "recovered [0] indices into cluster_state", "cluster.uuid": "4PjAfq-HT32ZnH-FDgtnsA", "node.id": "0aqSpwcDSnixl4NfMVRpYA" }
```

Kibana finds Elasticsearch and connects to it automatically (the Kibana 6.x Docker image defaults to `http://elasticsearch:9200`, which resolves to the `elasticsearch` service on the shared network). Kibana therefore only needs to start after Elasticsearch, and no extra Kibana configuration is required.

### Port Settings

When starting the Elasticsearch container we specified two ports, 9300 and 9200.

* 9300 is used by default for communication between nodes, known as Transport.

```json=
{"type": "server", "timestamp": "2020-07-29T05:51:37,147Z", "level": "INFO", "component": "o.e.t.TransportService", "cluster.name": "docker-cluster", "node.name": "2f019cbcdb2b", "message": "publish_address {172.17.0.3:9300}, bound_addresses {0.0.0.0:9300}" }
```

* 9200 is used by default for HTTP communication.

```json=
{"type": "server", "timestamp": "2020-07-29T05:51:37,412Z", "level": "INFO", "component": "o.e.h.AbstractHttpServerTransport", "cluster.name": "docker-cluster", "node.name": "2f019cbcdb2b", "message": "publish_address {172.17.0.3:9200}, bound_addresses {0.0.0.0:9200}", "cluster.uuid": "4PjAfq-HT32ZnH-FDgtnsA", "node.id": "0aqSpwcDSnixl4NfMVRpYA" }
```

## Connecting to Elasticsearch

Once everything above has started, enter http://localhost:9200 in a browser to connect to Elasticsearch. If Elasticsearch is running normally, it returns a JSON response like this:

```json=
{
  "name" : "aheFnoR",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "q8glcZZRRe-I-ESRz-9ZCg",
  "version" : {
    "number" : "6.5.4",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "d2ef93d",
    "build_date" : "2018-12-17T21:17:40.758843Z",
    "build_snapshot" : false,
    "lucene_version" : "7.5.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}
```

## Connecting to Kibana

Enter http://localhost:5601 in a browser to connect to Kibana. When it opens successfully you will see the following screen.

![Kibana](https://i.imgur.com/fZPsbe0.png)

Next, switch to Dev Tools to start running commands. The left side of Dev Tools is the Console, where you enter a Request; after it runs, the result appears on the right.

![Kibana Dev Tools](https://i.imgur.com/SkIce0z.png)

Elasticsearch uses Query DSL to describe query conditions, and Query DSL is written in JSON.

## Common Commands

### Cluster Status

The status of the Cluster can be checked with the `_cat` commands.

#### health

Checks the health of the Cluster. `green` means healthy, while `yellow` or `red` means unhealthy. For what each status means in detail, see the [previous post](https://tienyulin.github.io/elasticsearch-concept/).

```json=
GET _cat/health
```

If you use Postman, choose the Request Method shown in the example. The URI is the Elasticsearch IP + port + the path from the example, as follows:

```console=
http://localhost:9200/_cat/health
```

The response looks like this:

```
1596805350 13:02:30 docker-cluster green 1 1 1 1 0 0 0 0 - 100.0%
```

![health](https://i.imgur.com/JVElGrI.png)

### Creating an Index

Specify the Index name directly to create an Index.

```json=
PUT <IndexName>
```

#### Example

Create the `sport`
Index.

```json=
PUT sport
```

If creation succeeds, the response looks like this:

```json=
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "sport"
}
```

### Creating a Document

Append the Type after the Index, then the Document ID, and supply the content to add to the Document. Note that, as mentioned in the previous post, Fields with the same name under the same Index must have the same Data Type.

```json=
POST /<Index>/<Type>/<Doc ID>/_create
{
  "<FieldName>": <Value>
}
```

If you use Postman, put `{ "<FieldName>": <Value> }` in the Body and set the format to JSON. This body is written in JSON, the same format used by the Query DSL mentioned above. Every operation below works the same way, so Postman and other API tools are not covered again.

![create document](https://i.imgur.com/2FY635l.png)

#### Example

Create a Document under the `basketball` Type, which belongs to the `sport` Index. The Document's ID is 1.

```json=
POST /sport/basketball/1/_create
{
  "team": "Lakers",
  "location": "Los Angeles",
  "assets": 100
}
```

On success the following is returned. Note that `_index`, `_type`, and `_id` match what was specified above.

```json=
{
  "_index" : "sport",
  "_type" : "basketball",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}
```

With the Document created, we use a simple search command to look at it.

```json=
POST /sport/basketball/_search
```

The output includes the Document that was just created. More advanced queries are covered in the next post.

```json=
"hits": [
  {
    "_index": "sport",
    "_type": "basketball",
    "_id": "1",
    "_score": 1.0,
    "_source": {
      "team": "Lakers",
      "location": "Los Angeles",
      "assets": 100
    }
  }
]
```

### Mapping

After creating a Document, you may wonder why data could be written directly without first defining a schema.
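In fact, a non-relational database does not require you to define in advance what each field will hold. In Elasticsearch this definition is called a Mapping, and when you create a Document directly, Elasticsearch builds the Mapping for you automatically. To view and set a Mapping, use the `_mapping` command.

#### Getting the Index Mapping

```json=
GET /<Index>/_mapping
```

**Example**

Get the Mapping of `sport`.

```json=
GET /sport/_mapping
```

The response lists the Data Type of every Field.

```json=
{
  "sport" : {
    "mappings" : {
      "basketball" : {
        "properties" : {
          "assets" : {
            "type" : "long"
          },
          "location" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "team" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      }
    }
  }
}
```

One thing deserves special attention here. The field type of `team` is `text`, that is, a string, which is as expected. Below it, however, there is also a `fields` entry named `keyword`, whose type is `keyword`. This is a Field that Elasticsearch creates automatically, called `team.keyword`. It exists because Elasticsearch analyzes `text` and breaks it into terms, whereas `keyword` keeps the original content intact, for example a complete sentence. When you query the `keyword` Field, the value must match the content exactly, because the `keyword` type is defined as not being broken apart.

#### Custom Mapping

Besides letting Elasticsearch build the Mapping itself, you can define your own. It has to be defined before any Document is created, just as a relational database requires the Table Schema to be defined first.

```json=
POST /<Index>/_mapping/<Type>
{
  "properties": {
    "<Field>": { "type": "<data type>" },
    "<Field>": { "type": "<data type>" },
    ...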
  }
}
```

Note that once a Mapping has been defined, whether automatically or by you, the existing Fields cannot be modified. The only thing you can do is add new Fields.

**Example**

Here we add a new Field. The syntax is the same as for a custom Mapping, so both can be thought of simply as adding Fields.

```json=
POST /sport/_mapping/basketball
{
  "properties": {
    "champion": { "type": "long" }
  }
}
```

After adding the Field, get the Mapping again. The newly added `champion` now appears.

```json=
{
  "sport" : {
    "mappings" : {
      "basketball" : {
        "properties" : {
          "assets" : {
            "type" : "long"
          },
          "champion" : {
            "type" : "long"
          },
          "location" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "team" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      }
    }
  }
}
```

### Updating a Document

Updates use the `_update` command, with the Fields and values to change wrapped in `doc`. The Document ID must follow the Type so that the system knows which Document to change.

```json=
POST /<Index>/<Type>/<Doc ID>/_update
{
  "doc": {
    "<Field>": <Value>,
    "<Field>": <Value>,
    ...
  }
}
```

#### Example

We added a new Field earlier, so now we use an update to fill in its value.

```json=
POST sport/basketball/1/_update
{
  "doc": {
    "champion": 16
  }
}
```

A successful update returns `result` as `updated`.

```json=
{
  "_index" : "sport",
  "_type" : "basketball",
  "_id" : "1",
  "_version" : 2,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 1,
  "_primary_term" : 1
}
```

Fetching the Document again shows that `champion` now has a value.

```json=
"hits" : [
  {
    "_index" : "sport",
    "_type" : "basketball",
    "_id" : "1",
    "_score" : 1.0,
    "_source" : {
      "team" : "Lakers",
      "location" : "Los Angeles",
      "assets" : 100,
      "champion" : 16
    }
  }
]
```

### Deleting a Document

To delete a Document, specify the Index, Type, and Document ID.

```json=
DELETE <Index>/<Type>/<Doc ID>
```

#### Example

So far we have created only one Document, so we delete that one.

```json=
DELETE sport/basketball/1
```

A successful delete returns `result` as `deleted`.

```json=
{
  "_index" : "sport",
  "_type" : "basketball",
  "_id" : "1",
  "_version" : 3,
  "result" : "deleted",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 2,
  "_primary_term" : 1
}
```

### Batch Processing

Elasticsearch can process large amounts of data in batches through the `_bulk` command. Each action goes on its own line, followed (except for `delete`) by a line holding its content.

#### Create

`create` creates a Document; if the Document
already exists, an error is returned.

```json=
POST <Index>/<Type>/_bulk
{ "create" : { "_id": <Doc ID> } }
{ "<Field>":<Value>, "<Field>":<Value>, ...}
{ "create" : { "_id": <Doc ID> } }
{ "<Field>":<Value>, "<Field>":<Value>, ...}
...
```

If the Index or Type differs from one action to the next, each can be specified individually, as follows:

```json=
POST /_bulk
{ "create" : { "_index": "<Index>", "_type": "<Type>", "_id": <Doc ID> } }
{ "<Field>":<Value>, "<Field>":<Value>, ...}
{ "create" : { "_index": "<Index>", "_type": "<Type>", "_id": <Doc ID> } }
{ "<Field>":<Value>, "<Field>":<Value>, ...}
...
```

**Example**

```json=
POST sport/basketball/_bulk
{ "create" : { "_id": 1 } }
{ "team":"Celtics", "location":"Boston", "assets": 100,"champion": 17}
{ "create" : { "_id": 2 } }
{ "team": "Lakers", "location":"Los Angeles","assets": 150,"champion": 16}
{ "create" : { "_id": 3 } }
{ "team": "Bulls", "location":"Chicago", "assets": 120, "champion": 6}
```

#### Index

`index` creates or updates a Document: if the Document does not exist it is created, and if it exists its content is replaced (reported as `updated`).

```json=
POST <Index>/<Type>/_bulk
{ "index" : { "_id": <Doc ID> } }
{ "<Field>":<Value>, "<Field>":<Value>, ...}
{ "index" : { "_id": <Doc ID> } }
{ "<Field>":<Value>, "<Field>":<Value>, ...}
...
```

**Example**

In this example we update the Document with ID 3 and add one new Document.

```json=
POST sport/basketball/_bulk
{ "index" : { "_id": 3 } }
{ "team": "Bulls", "location":"Chicago", "assets": 130, "champion": 6}
{ "index" : { "_id": 4 } }
{ "team": "Spurs", "location":"San Antonio", "assets": 160, "champion": 5}
```

In the result below, one item is `updated` and the other is `created`.

```json=
{
  "took" : 79,
  "errors" : false,
  "items" : [
    {
      "index" : {
        "_index" : "sport",
        "_type" : "basketball",
        "_id" : "3",
        "_version" : 2,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 3,
        "_primary_term" : 2,
        "status" : 200
      }
    },
    {
      "index" : {
        "_index" : "sport",
        "_type" : "basketball",
        "_id" : "4",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 1,
        "_primary_term" : 2,
        "status" : 201
      }
    }
  ]
}
```

#### Update

`update` updates a Document; if the Document does not exist, an error is returned.

```json=
POST <Index>/<Type>/_bulk
{ "update" : { "_id": <Doc ID> } }
{ "doc":{ "<Field>": <Value>, "<Field>": <Value>, ...}}
{ "update" : { "_id": <Doc ID> } }
{ "doc":{ "<Field>": <Value>, "<Field>": <Value>, ...}}
...
```

**Example**

```json=
POST sport/basketball/_bulk
{ "update" : { "_id": 3 } }
{ "doc":{ "assets": 180}}
{ "update" : { "_id": 4 } }
{ "doc":{ "assets": 100}}
```

#### Delete

`delete` deletes a Document. Only the Document ID is needed, and no content line follows the action. If the Document does not exist, that item comes back with a `not_found` result (status 404).

```json=
POST <Index>/<Type>/_bulk
{ "delete" : { "_id": <Doc ID> } }
{ "delete" : { "_id": <Doc ID> } }
...
```

**Example**

```json=
POST sport/basketball/_bulk
{ "delete" : { "_id": 3 } }
{ "delete" : { "_id": 4 } }
```

#### Combining Batch Actions

The batch actions introduced above can be combined and executed in a single request.

```json=
POST <Index>/<Type>/_bulk
{ "create" : { "_id": <Doc ID> } }
{ "<Field>":<Value>, "<Field>":<Value>, ...}
{ "index" : { "_id": <Doc ID> } }
{ "<Field>":<Value>, "<Field>":<Value>, ...}
{ "update" : { "_id": <Doc ID> } }
{ "doc":{ "<Field>": <Value>, "<Field>": <Value>, ...}}
{ "delete" : { "_id": <Doc ID> } }
```

**Example**

```json=
POST sport/basketball/_bulk
{ "create" : { "_id": 3 } }
{ "team": "Bulls", "location":"Chicago", "assets": 150, "champion": 6}
{ "index" : { "_id": 4 } }
{ "team": "Spurs", "location":"San Antonio", "assets": 160, "champion": 5}
{ "update" : { "_id": 3 } }
{ "doc":{ "assets": 180}}
{ "delete" : { "_id": 4 } }
```

The result is as follows:

```json=
{
  "took" : 24,
  "errors" : false,
  "items" : [
    {
      "create" : {
        "_index" : "sport",
        "_type" : "basketball",
        "_id" : "3",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 6,
        "_primary_term" : 2,
        "status" : 201
      }
    },
    {
      "index" : {
        "_index" : "sport",
        "_type" : "basketball",
        "_id" : "4",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 4,
        "_primary_term" : 2,
        "status" : 201
      }
    },
    {
      "update" : {
        "_index" : "sport",
        "_type" : "basketball",
        "_id" : "3",
        "_version" : 2,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 7,
        "_primary_term" : 2,
        "status" : 200
      }
    },
    {
      "delete" : {
        "_index" : "sport",
        "_type" : "basketball",
        "_id" : "4",
        "_version" : 2,
        "result" : "deleted",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 5,
        "_primary_term" : 2,
        "status" : 200
      }
    }
  ]
}
```

## Summary

This post showed how to quickly set up Elasticsearch and Kibana, and how to create, update, and delete Documents. The [next post](https://tienyulin.github.io/elasticsearch-query-filter/) covers how to query.

###### tags: `Elasticsearch` `NoSQL`
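
As a companion to the batch processing section above, here is a minimal Python sketch of how a `_bulk` request body could be assembled outside Kibana. It uses only the standard library and does not contact a server. The helper name `build_bulk_body` and the tuple layout are assumptions made for illustration and are not part of any Elasticsearch client.

```python
import json


def build_bulk_body(actions):
    """Serialize (action, metadata, source) tuples into a newline-delimited body.

    Hypothetical helper for illustration only.
    """
    lines = []
    for action, meta, source in actions:
        # Action line, e.g. {"create": {"_id": 3}}
        lines.append(json.dumps({action: meta}))
        # "delete" has no content line; the other actions are followed by one.
        if action != "delete":
            lines.append(json.dumps(source))
    # The bulk API expects the last line to end with a newline as well.
    return "\n".join(lines) + "\n"


# Same four actions as the combined example in this post.
body = build_bulk_body([
    ("create", {"_id": 3}, {"team": "Bulls", "location": "Chicago", "assets": 150, "champion": 6}),
    ("index", {"_id": 4}, {"team": "Spurs", "location": "San Antonio", "assets": 160, "champion": 5}),
    ("update", {"_id": 3}, {"doc": {"assets": 180}}),
    ("delete", {"_id": 4}, None),
])
print(body)
```

The printed text could then be sent as the body of `POST sport/basketball/_bulk` with whatever HTTP tool you prefer, typically with the `Content-Type` header set to `application/x-ndjson`.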