Elasticsearch的查询语言(DSL)真是不好写,偏偏查询的功能千奇百怪,filter/query/match/agg/geo各种各样,不管你是通过封装JSON还是通过python/java的api进行封装,都非常不方便。
最近发现了一个插件,Elasticsearch-SQL可以用sql查询Elasticsearch,感觉这个轮子造的真是好。
Elasticsearch-sql的项目地址:https://github.com/NLPchina/elasticsearch-sql
1、简介
Elasticsearch-sql实现的功能:
1)插件式的安装
2)SQL查询
3)超越SQL之外的查询
4)对JDBC方式的支持
2、插件式的安装
安装方法和elasticsearch-head的安装方法类似:
我们使用的es版本是2.1.1,如果你用的是不同的版本,可以在https://github.com/NLPchina/elasticsearch-sql找到支持。
$ cd ~/elasticsearch-2.1.1$./bin/plugin install https://github.com/NLPchina/elasticsearch-sql/releases/download/2.1.1.1/elasticsearch-sql-2.1.1.1.zip
如果成功,命令行打印如下东东:
[bigdata-dw@bigdata-arch-client10 es2.1.1]$ ./bin/plugin install https://github.com/NLPchina/elasticsearch-sql/releases/download/2.1.1.1/elasticsearch-sql-2.1.1.1.zip -> Installing from https://github.com/NLPchina/elasticsearch-sql/releases/download/2.1.1.1/elasticsearch-sql-2.1.1.1.zip...Trying https://github.com/NLPchina/elasticsearch-sql/releases/download/2.1.1.1/elasticsearch-sql-2.1.1.1.zip ...Downloading .................................................................................................................................................................................................................................................................................................................................................................................................................................................................DONEVerifying https://github.com/NLPchina/elasticsearch-sql/releases/download/2.1.1.1/elasticsearch-sql-2.1.1.1.zip checksums if available ...NOTE: Unable to verify checksum for downloaded plugin (unable to find .sha1 or .md5 file to verify)Installed sql into /home/bigdata-dw/es2.1.1/plugins/sql
3、SQL查询
安装成功以后我们就可以通过sql查询ES了。
es-sql还提供了web页面,访问方式是http://10.93.18.34:9200/_plugin/sql/(如果你使用head,那么你的head访问应该是http://10.93.18.34:9200/_plugin/head/)
这里的ip和port是你安装es的主机和http端口。
访问到的页面是这样的
那么你现在有两种方式可以执行你的SQL:
1)在搜索框里直接输入你的sql了。(我的版本行尾不要写“;”否则会解析不了SQL)
2)通过http请求如
curl -XPOST http://10.93.18.34:8049/_sql -d 'SELECT * FROM audit where dDelay=-2053867461'
你会收到一个json格式的返回
{"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":12.549262,"hits":[{"_index":"audit","_type":"kafka","_id":"AVzzK-h_V9seINxbZ2Ox","_score":12.549262,"_source":{"timestamp":"1498726500000","dCount":680008,"dDelay":-2053867461,"cDelay":0,"clanName":"DJ_elk_common","checkTime":1498728360063,"cCount":0,"pCount":680008,"topicName":"DJ_elk_common_clean","pDelay":370356423}}]}}
下面我们简单说4种类型的sql的书写方式:
1)query
SELECT * FROM bank WHERE age >30 AND gender = 'm'
2)aggregation
select COUNT(*),SUM(age),MIN(age) as m, MAX(age),AVG(age) FROM bank GROUP BY gender ORDER BY SUM(age), m DESC
3)delete
DELETE FROM bank WHERE age >30 AND gender = 'm'
4)geo
SELECT * FROM locations WHERE GEO_BOUNDING_BOX(fieldname,100.0,1.0,101,0.0)
5)需要指定index+type
SELECT * FROM indexName/type
6)如何指定路由
select /*! ROUTINGS(salary) */ sum(count) from index where type="salary"
4、对JDBC的支持
上述查询方式不管是直接在web上输入sql还是通过http请求。elasticsearch-sql还支持通过jdbc进行编程。
这个还没有研究,抽空研究一下再回来。