A Practical Guide to Handling Null Values in Elasticsearch
1. Introduction
DELETE my-index-000001
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "status_code": {
        "type": "keyword"
      },
      "title": {
        "type": "text"
      }
    }
  }
}
PUT my-index-000001/_bulk
{"index":{"_id":1}}
{"status_code":null,"title":"just test"}
{"index":{"_id":2}}
{"status_code":"","title":"just test"}
{"index":{"_id":3}}
{"status_code":[],"title":"just test"}
POST my-index-000001/_search
POST my-index-000001/_search
{
  "query": {
    "term": {
      "status_code": null
    }
  }
}
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "field name is null or empty"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "field name is null or empty"
  },
  "status": 400
}
2. What null_value Means
DELETE my-index-000001
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "status_code": {
        "type": "keyword",
        "null_value": "NULL"
      }
    }
  }
}
PUT my-index-000001/_bulk
{"index":{"_id":1}}
{"status_code":null}
{"index":{"_id":2}}
{"status_code":[]}
{"index":{"_id":3}}
{"status_code":"NULL"}
GET my-index-000001/_search
{
  "query": {
    "term": {
      "status_code": "NULL"
    }
  }
}
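This amounts to declaring a default for empty values at mapping time, with "NULL" standing in for null. The benefit: a document like _id = 1 above, whose field is an explicit null, can now be indexed and searched on that field. The query also no longer fails with "field name is null or empty", because it searches for the string "NULL" rather than a literal null. Note that the term query above matches documents 1 and 3 only; document 2 is not matched, because an empty array contains no explicit null to replace.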
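The substitution described above can be modeled in a few lines of Python. This is only a toy sketch of the behavior, not Elasticsearch code: the function name `indexed_terms` and the `docs` dictionary are hypothetical, and the model covers a keyword field only.

```python
from typing import Any, List, Optional


def indexed_terms(value: Any, null_value: Optional[str] = None) -> List[str]:
    """Toy model of the terms a keyword field would index for one value."""
    # Treat a single value as a one-element array.
    values = value if isinstance(value, list) else [value]
    terms = []
    for v in values:
        if v is None:
            # An explicit null is replaced by null_value when one is
            # configured; otherwise it contributes nothing to the index.
            if null_value is not None:
                terms.append(null_value)
        else:
            terms.append(str(v))
    return terms


# The three documents from the bulk request above, keyed by _id.
docs = {1: None, 2: [], 3: "NULL"}

# Emulate the term query on status_code = "NULL".
matches = [
    doc_id
    for doc_id, value in docs.items()
    if "NULL" in indexed_terms(value, null_value="NULL")
]
print(matches)  # [1, 3]
```

The empty array in document 2 yields no terms at all, which is why it stays out of the result even with null_value configured.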
3. Things to Watch When Using null_value
null_value must match the field's data type. For example, a long field cannot take a string null_value such as "NULL".
DELETE my-index-000001
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "status_code": {
        "type": "keyword"
      },
      "title": {
        "type": "long",
        "null_value": "NULL"
      }
    }
  }
}
{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Failed to parse mapping [_doc]: For input string: \"NULL\""
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Failed to parse mapping [_doc]: For input string: \"NULL\"",
    "caused_by": {
      "type": "number_format_exception",
      "reason": "For input string: \"NULL\""
    }
  },
  "status": 400
}
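The type mismatch behind this error can be caught on the client side before the mapping is sent. The following is a minimal sketch under stated assumptions: `check_null_value` is a hypothetical helper, it covers only the keyword and long types used in this article, and it approximates rather than reproduces the server-side parsing rules.

```python
def check_null_value(field_type: str, null_value) -> bool:
    """Return True when null_value looks parseable as the field's type."""
    if field_type == "keyword":
        # A keyword field expects a string marker.
        return isinstance(null_value, str)
    if field_type == "long":
        # Mirrors the number_format_exception above: "NULL" is not a number.
        try:
            int(null_value)
            return True
        except (TypeError, ValueError):
            return False
    raise ValueError(f"type not covered by this sketch: {field_type}")


print(check_null_value("long", "NULL"))     # False
print(check_null_value("long", 0))          # True
print(check_null_value("keyword", "NULL"))  # True
```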
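null_value only affects how the data is indexed; it does not modify the _source document.
4. Which field types support null_value, and which do not?
Field types that support null_value: Arrays, Boolean, Date, geo_point, IP, Keyword, Numeric, point.
4.1 Question 1: does the text type not support null_value?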
DELETE my-index-000001
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "status_code": {
        "type": "keyword"
      },
      "title": {
        "type": "text",
        "null_value": "NULL"
      }
    }
  }
}
{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Mapping definition for [title] has unsupported parameters: [null_value : NULL]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Failed to parse mapping [_doc]: Mapping definition for [title] has unsupported parameters: [null_value : NULL]",
    "caused_by": {
      "type": "mapper_parsing_exception",
      "reason": "Mapping definition for [title] has unsupported parameters: [null_value : NULL]"
    }
  },
  "status": 400
}
4.2 Question 2: what if a text field also needs a null value?
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "status_code": {
        "type": "keyword"
      },
      "title": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "null_value": "NULL"
          }
        }
      }
    }
  }
}
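With the multi-field mapping above, a search for the null marker has to target the keyword sub-field rather than the text field itself. Here is a minimal sketch of building that request body in Python; the helper name `null_title_query` is hypothetical, and it assumes the sub-field substitutes "NULL" for an explicit null title, as the mapping intends.

```python
import json


def null_title_query(null_marker: str = "NULL") -> dict:
    """Build a term query against the keyword sub-field of title."""
    # title.keyword is the sub-field that carries null_value in the mapping;
    # the parent text field cannot carry it.
    return {"query": {"term": {"title.keyword": null_marker}}}


# Body to send with GET my-index-000001/_search.
print(json.dumps(null_title_query(), indent=2))
```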
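5. A Production Question
"Folks, a question: my data has a content field, and I want to find the documents where this field is not an empty string. must_not did not work for me. Let me paste my SQL." (From the 死磕 Elasticsearch technical discussion group.)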
POST test_001/_search
{
  "query": {
    "bool": {
      "filter": {
        "bool": {
          "must": [
            {
              "exists": {
                "field": "content"
              }
            },
            {
              "term": {
                "content.keyword": {
                  "value": ""
                }
              }
            }
          ]
        }
      }
    }
  }
}
POST test_001/_search
{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "source": "doc['content.keyword'].length == 1",
            "lang": "painless"
          }
        }
      }
    }
  }
}
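For the condition in the question (the field exists and is not an empty string), one commonly used shape combines an exists filter with a must_not term clause on the keyword sub-field. The sketch below only builds the request body. `non_empty_query` is a hypothetical helper, it assumes the field has a `.keyword` sub-field as in the queries above, and it is not presented as the solution from the original discussion.

```python
import json


def non_empty_query(field: str = "content") -> dict:
    """Build a bool query: the field exists and is not the empty string."""
    keyword_field = f"{field}.keyword"
    return {
        "query": {
            "bool": {
                # Keep only documents that have a value for the field.
                "filter": [{"exists": {"field": field}}],
                # Drop documents whose keyword value is the empty string.
                "must_not": [{"term": {keyword_field: {"value": ""}}}],
            }
        }
    }


# Body to send with POST test_001/_search.
print(json.dumps(non_empty_query(), indent=2))
```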
6. Summary
7. Bonus: Discussion