Full Text Search
Although the standard
engine supports LIKE and REGEXP regular expression matching, these fuzzy matches have several key limitations. One is performance issues: wildcard and regular expression matching usually search all rows in the table, so these searches can be very time-consuming if the table has a large amount of data. Another limitation is explicit matching patterns: it is difficult to explicitly control what to match and what not to match using wildcards and regular expressions. The query results may not be intelligent enough. These limitations can be overcome with full text search.
To begin, it should be noted that not all engines enable full text search. Only the full_text
engine currently supports Full-Text Search, while the standard
engine does not. So, if you require Full-Text Search functionality, please create your table using the full_text
engine.
Although the standard
engine allows regular expression matching using LIKE
and REGEXP
, these fuzzy matches have numerous significant disadvantages. One issue is performance: wildcard and regular expression matching often search all rows in the table, which can be time-consuming if the table contains a big quantity of data. Another disadvantage is the difficulty in specifically controlling what to match and what not to match when utilizing wildcards and regular expressions. The query results may be insufficiently intelligent. Full-Text Search can circumvent these constraints.
The LIKE
keyword is supported by the full_text
engine, as is the g_match()
function with succinct query syntax and query_string()
with versatile and feature-rich query syntax.
[!TIP] The book table in this section needs to be created using the
full_text
engine.
LIKE
The most commonly used wildcard is %
, which represents any character appearing any number of times in the search string.
- cURL
- Python
- JavaScript
- Golang
curl --location 'http://api.xian.glitter.link/blockved/glitterchain/index/sql/query' \
--header 'Content-Type: application/json' \
--data '{
"sql":"SELECT * FROM bookshop.book WHERE name LIKE \"%Physics%\""
}'
sql = "SELECT * FROM bookshop.book WHERE name LIKE \"%Physics%\""
db = client.db(mk)
db.query(sql)
const sql = 'SELECT * FROM bookshop.book WHERE name LIKE "%Physics%"';
const db = client.db(mk);
await db.query(sql);
sql = "SELECT * FROM bookshop.book WHERE name LIKE \"%Physics%\""
resp, err := glitterClient.Query(ctx, sql)
Another useful wildcard is _
, which serves the same purpose as %
, but only matches a single character instead of multiple characters.
- cURL
- Python
- JavaScript
- Golang
curl --location 'http://api.xian.glitter.link/blockved/glitterchain/index/sql/query' \
--header 'Content-Type: application/json' \
--data '{
"sql":"SELECT * FROM bookshop.book WHERE name LIKE \"_uantum physics%\""
}'
sql = "SELECT * FROM bookshop.book WHERE name LIKE \"_uantum physics%\""
db = client.db(mk)
db.query(sql)
const sql = 'SELECT * FROM bookshop.book WHERE name LIKE "_uantum physics%"';
const db = client.db(mk);
await db.query(sql);
sql = "SELECT * FROM bookshop.book WHERE name LIKE \"_uantum physics%\""
resp, err := glitterClient.Query(ctx, sql)
G_MATCH
The g_match()
function supports field boosts and result scoring.
- Result Scoring
The _score
field is a dynamic field that automatically fills in the scoring results if selected.
-
Field Boost In
g_match()
, the first parameter is the column name, and you can set weights for the column using the formatcolumn_name
orcolumn_name
+^
+weight_value
. -
Example SQL
select *,_score from bookshop.book
where
(
g_match('name^5','Economy')
or
g_match('category^2','web3')
)
and (public_at > 1582120979 or public_at < 1602120979)
- cURL
- Python
- JavaScript
- Golang
curl --location 'http://api.xian.glitter.link/blockved/glitterchain/index/sql/query' \
--header 'Content-Type: application/json' \
--data '{
"sql":"SELECT *, _score FROM bookshop.book WHERE (g_match(\"name^5\", \"Economy\") or g_match(\"category^2\", \"web3\"))"
}'
sql = "SELECT *, _score FROM bookshop.book WHERE (g_match(\"name^5\", \"Economy\") or g_match(\"category^2\", \"web3\"))"
db = client.db(mk)
rst = db.query(sql)
const sql =
'SELECT *, _score FROM bookshop.book WHERE (g_match("name^5", "Economy") or g_match("category^2", "web3"))';
const db = client.db(mk);
await db.query(sql);
sql = "SELECT *, _score FROM bookshop.book WHERE (g_match(\"name^5\", \"Economy\") or g_match(\"category^2\", \"web3\"))"
resp, err := glitterClient.Query(ctx, sql)
QueryString
The full_text
engine's query_string()
function allows users to define sophisticated queries using a simple syntax.
Terms
Plain terms without any other syntax are interpreted as a match query for the term in the default field. The default field is _all
unless overridden in the index mapping.
To locate books with "web3" among many books, for example.
- cURL
- Python
- JavaScript
- Golang
curl --location 'http://api.xian.glitter.link/blockved/glitterchain/index/sql/query' \
--header 'Content-Type: application/json' \
--data '{
"sql":"SELECT *, _score FROM bookshop.book WHERE query_string(\"web3\")"
}'
sql = "SELECT * FROM bookshop.book WHERE (query_string(\"web3\"))"
db = client.db(mk)
rst = db.query(sql)
const sql = 'SELECT * FROM bookshop.book WHERE (query_string("web3"))';
const db = client.db(mk);
await db.query(sql);
sql = "SELECT * FROM bookshop.book WHERE (query_string(\"web3\"))"
resp, err := glitterClient.Query(ctx, sql)
Phrases
Phrase queries can be accomplished by placing the phrase in quotes.
Example: "Princeton University Press"
will perform a Match Phrase Query for the phrase Princeton University Press
.
- cURL
- Python
- JavaScript
- Golang
curl --location 'http://api.xian.glitter.link/blockved/glitterchain/index/sql/query' \
--header 'Content-Type: application/json' \
--data '{
"sql":"SELECT *, _score FROM bookshop.book WHERE query_string(\"Princeton University Press\")"
}'
sql = "SELECT * FROM bookshop.book WHERE (query_string(\"Princeton University Press\"))"
db = client.db(mk)
rst = db.query(sql)
const sql =
'SELECT * FROM bookshop.book WHERE (query_string("Princeton University Press"))';
const db = client.db(mk);
await db.query(sql);
sql = "SELECT * FROM bookshop.book WHERE (query_string(\"Princeton University Press\"))"
resp, err := glitterClient.Query(ctx, sql)
Field Scoping
You can qualify the field for these searches by prefixing them with the name of the field separated by a colon.
Example: category:web3
will perform a Match Query for the term web3
, in the category
field.
- cURL
- Python
- JavaScript
- Golang
curl --location 'http://api.xian.glitter.link/blockved/glitterchain/index/sql/query' \
--header 'Content-Type: application/json' \
--data '{
"sql":"SELECT *, _score FROM bookshop.book WHERE query_string('\''category: \"web3\"'\'')"
}'
sql = "SELECT * FROM bookshop.book WHERE (query_string('category: \"web3\"'))"
db = client.db(mk)
rst = db.query(sql)
const sql =
"SELECT * FROM bookshop.book WHERE (query_string('category: \"web3\"'))";
const db = client.db(mk);
await db.query(sql);
sql = "SELECT * FROM bookshop.book WHERE (query_string('category: \"web3\"'))"
resp, err := glitterClient.Query(ctx, sql)
Regular Expressions
You can use regular expressions in addition to using terms by wrapping the expression in forward slashes (/).
Example: /Alfred (Price|Renyi)/
will perform a regular expression against the terms in document.
The regular expressions can also be used with field scoping.
Example: author:/Alfred.*/
- cURL
- Python
- JavaScript
- Golang
curl --location 'http://api.xian.glitter.link/blockved/glitterchain/index/sql/query' \
--header 'Content-Type: application/json' \
--data '{
"sql":"SELECT * FROM bookshop.book WHERE query_string('\''author:/Alfred.*/'\'')"
}'
db = client.db(mk)
sql = "select * from bookshop.book where query_string('author:/Alfred.*/')"
rst = db.query(sql)
const sql =
"select * from bookshop.book where query_string('author:/Alfred.*/')";
const db = client.db(mk);
await db.query(sql);
sql = "select * from bookshop.book where query_string('author:/Alfred.*/')"
resp, err := glitterClient.Query(ctx, sql)
Required, Optional, and Exclusion
When your query string includes multiple items, by default these are placed into the SHOULD clause of a Boolean Query.
You can change this by prefixing your items with a + or ‘-’. _ ‘+’ Prefixing with plus places that item in the MUST portion of the boolean query. _ ‘-’ Prefixing with a minus places that item in the MUST NOT portion of the boolean query.
Example: +name:"Come Prima" -author:"Simona Vinci"
will perform a Boolean Query that MUST satisfy the Match Query for the term Come Prima
in the name
field, MUST NOT satisfy the Match Query for the term Simona Vinc
in the author
field. Result documents satisfying the SHOULD clause will score higher than those that do not.
- cURL
- Python
- JavaScript
- Golang
curl --location 'http://api.xian.glitter.link/blockved/glitterchain/index/sql/query' \
--header 'Content-Type: application/json' \
--data '{
"sql":"SELECT * FROM bookshop.book WHERE query_string('\''+name:\"Come Prima\" -author:\"Simona Vinci\"'\'')"
}'
db = client.db(mk)
sql = "select * from bookshop.book where query_string('+name:\"Come Prima\" -author:\"Simona Vinci\"')"
rst = db.query(sql)
const sql =
'select * from bookshop.book where query_string(\'+name:"Come Prima" -author:"Simona Vinci"\')';
const db = client.db(mk);
await db.query(sql);
sql = "select * from bookshop.book where query_string('+name:\"Come Prima\" -author:\"Simona Vinci\"')"
resp, err := glitterClient.Query(ctx, sql)
Boosting
You can influence the relative importance of the clauses by suffixing clauses with the ^ operator followed by a number.
Example: name:Token^1.0 author:Token^0.5
will perform Match queries for Token
in both the name
and author
fields, but documents having the term in the name
field will score higher.
- cURL
- Python
- JavaScript
- Golang
curl --location 'http://api.xian.glitter.link/blockved/glitterchain/index/sql/query' \
--header 'Content-Type: application/json' \
--data '{
"sql":"SELECT * FROM bookshop.book WHERE query_string('\''name:\"Token\"^1.0 author:\"Token\"^0.5'\'')"
}'
db = client.db(mk)
queries = []
word = "Token"
queries.append(MatchPhraseQuery("name", word, 1.0))
queries.append(MatchPhraseQuery("author", word, 0.5))
query_str = query_string_prepare(queries)
sql = "select ipfs_cid,name,author,extension,language,publisher,year,filesize from bookshop.book where query_string(%s)"
sql = prepare_sql(sql, [query_str])
rst = db.query(sql)
const sql =
'SELECT * FROM bookshop.book WHERE (query_string(\'name:"Token"^1.0 author:"Token"^0.5\'))';
const db = client.db(mk);
await db.query(sql);
qs := utils.NewCandyQueryString("bookshop", "book", "ipfs_cid,name,author,extension,language,publisher,year,filesize", 0, 0)
qs.AddMatchPhraseQuery("name", "Token", 1.0)
qs.AddMatchPhraseQuery("author", "Token", 0.5)
sql, err := qs.Build()
resp, err := glitterClient.Query(ctx, sql)
Numeric Ranges
You can perform numeric ranges by using the > , >= , < , and <= operators, followed by a numeric value.
Example: abv:>10
will perform an Numeric Range Query on the abv
field for values greater than ten.
Date Ranges
You can perform date range searches by using the > , >= , < , and <= operators, followed by a date value in quotes.
Example: created:>"2016-09-21"
will perform an Date Range Query on the created
field for values after September 21, 2016.
Escaping
The following quoted string enumerates the characters which may be escaped:
"+-=&|><!(){}[]^\"~*?:\\/ "
NOTE: this list contains the space character.
In order to escape these characters, they are prefixed with the \ (backslash) character. In all cases, using the escaped version produces the character itself and is not interpreted by the lexer.
Example: my\ name
will be interpreted as a single argument to a match query with the value “my name”.
Example: "contains a\" character"
will be interpreted as a single argument to a phrase query with the value contains a " character
.
Highlight
Result highlighting is achieved through SQL Hints, and the highlighted columns will automatically create columns with the prefix _highlight_
in the returned result set. For example, if you need to highlight the name column, in addition to the columns selected in the SELECT statement, a _highlight_name
column will also be returned in the result set.
Hint Syntax
You need to add a comment in the following format between the SELECT keyword and the fields to be selected, starting with /*+
and ending with */
. Call SET_VAR in between, and its parameter must be full_text_option='JSON Object'
. This JSON Object currently includes an option for highlight, with style as the highlighter, currently supporting html
and ansi
, and fields as the column names to be highlighted.
The following hint will highlight the name
and category
fields:
/*+ SET_VAR(full_text_option='{"highlight":{"style":"html","fields":["name","category"]}}') */
- Example SQL
select /*+ SET_VAR(full_text_option='{"highlight":{"style":"html","fields":["name","category"]}}') */
*,_score from bookshop.book
where
(
g_match('name^5','Economy')
or
g_match('category^2','web3')
)
and (public_at > 1582120979 or public_at < 1602120979)
- Python
- JavaScript
- Golang
db = client.db(mk)
sql = 'select /*+ SET_VAR(full_text_option=\'{"highlight":{"style":"html","fields":["name","category"]}}\') */ * from bookshop.book limit 1'
rst = db.query(sql)
const sql =
'select /*+ SET_VAR(full_text_option=\'{"highlight":{"style":"html","fields":["name","category"]}}\') */ * from bookshop.book limit 1';
const db = client.db(mk);
await db.query(sql);
sql = 'select /*+ SET_VAR(full_text_option=\'{"highlight":{"style":"html","fields":["name","category"]}}\') */ * from bookshop.book limit 1'
resp, err := glitterClient.Query(ctx, sql)
- Example Return
{
"result": [
{
"row": {
"_highlight_name": {
"value": "<mark>红楼梦</mark>",
"column_value_type": "StringColumn"
},
"_highlight_author": {
"value": "曹雪芹",
"column_value_type": "StringColumn"
},
"author": {
"value": "曹雪芹",
"column_value_type": "StringColumn"
},
"name": {
"value": "红楼梦",
"column_value_type": "StringColumn"
}
}
}
]
}
Explain
When we have concerns regarding the methodology used to produce the results, we might use explain to do an analysis.
We are aware that a full-text
engine sorts its results according to the quality of each content for each result. The dynamic field _explanation
, which carries the grading process for each document.
select *,_explanation from bookshop.book
where
(
g_match('name^5','Economy')
or
g_match('category^2','web3')
)
and (public_at > 1582120979 or public_at < 1602120979)
The format of _explanation
is JSON.