sql-statements
SQL Statements
Content
Mysql
The SQL queries for the Mysql Datasource are directly forwarded to the Mysql Server, so standard SQL syntax is supported. Mysql Select Statements
- The table name in the query must include the Database name.
- Only
SELECT
queries are supported. - Syntax is identical to standard Mysql
SELECT
statements. - Cross-database joins are not supported.
The following SQL examples use Mysql 5.7.31 as the Datasource.
Sample Data
There are two tables in Mysql.
Table: Books
id | title | author | year | create_time |
---|---|---|---|---|
1 | The Great Adventure | John Smith | 1999 | 1672531199 |
2 | Mystery of the Lost City | Jane Doe | 2000 | 1672617599 |
3 | Exploring the Universe | Albert Johnson | 1965 | 1672703999 |
4 | The Secrets of Time | Emily Davis | 1999 | 1672790399 |
5 | History of the Ancient World | Michael Brown | 2001 | 1672876799 |
6 | Journey Through Space | Sophia Wilson | 2002 | 1672963199 |
7 | Mastering the Art of Coding | James Miller | 2002 | 1673049599 |
8 | Understanding AI | Olivia Garcia | 1999 | 1673135999 |
9 | The Future of Technology | Lucas Martinez | 2005 | 1673222399 |
10 | Tales of the Forgotten Realm | Amelia Lee | 2006 | 1673308799 |
Talbe: Authors
id | name | gender |
---|---|---|
1 | John Smith | Male |
2 | Jane Doe | Female |
3 | Albert Johnson | Male |
4 | Emily Davis | Female |
5 | Michael Brown | Male |
6 | Sophia Wilson | Female |
7 | James Miller | Male |
8 | Olivia Garcia | Female |
9 | Lucas Martinez | Male |
10 | Amelia Lee | Female |
Simple Query
# simple query
curl --location 'http://localhost:8080/api/v1/query' \
--header 'Content-Type: application/json' \
--data '{"sql": "select * from eng_test.books limit 5;"}'
{
"results": [
{
"columns": [
{
"name": "id",
"type": "Int",
"description": ""
},
{
"name": "title",
"type": "String",
"description": ""
},
{
"name": "author",
"type": "String",
"description": ""
},
{
"name": "create_time",
"type": "Int",
"description": ""
}
],
"rows": [
{
"columnValues": [
"1",
"The Great Adventure",
"John Smith",
"1672531199"
]
},
{
"columnValues": [
"2",
"Mystery of the Lost City",
"Jane Doe",
"1672617599"
]
},
{
"columnValues": [
"3",
"Exploring the Universe",
"Albert Johnson",
"1672703999"
]
},
{
"columnValues": [
"4",
"The Secrets of Time",
"Emily Davis",
"1672790399"
]
},
{
"columnValues": [
"5",
"History of the Ancient World",
"Michael Brown",
"1672876799"
]
}
]
}
],
"tookTimes": 0.008407835
}
Conditional Query
# conditional query
curl --location 'http://localhost:8080/api/v1/query' \
--header 'Content-Type: application/json' \
--data '{"sql": "select * from eng_test.books where year > 2000 limit 5;"}'
{
"results": [
{
"columns": [
{
"name": "id",
"type": "Int",
"description": ""
},
{
"name": "title",
"type": "String",
"description": ""
},
{
"name": "author",
"type": "String",
"description": ""
},
{
"name": "create_time",
"type": "Int",
"description": ""
},
{
"name": "year",
"type": "Int",
"description": ""
}
],
"rows": [
{
"columnValues": [
"5",
"History of the Ancient World",
"Michael Brown",
"1672876799",
"2001"
]
},
{
"columnValues": [
"6",
"Journey Through Space",
"Sophia Wilson",
"1672963199",
"2002"
]
},
{
"columnValues": [
"7",
"Mastering the Art of Coding",
"James Miller",
"1673049599",
"2002"
]
},
{
"columnValues": [
"9",
"The Future of Technology",
"Lucas Martinez",
"1673222399",
"2005"
]
},
{
"columnValues": [
"10",
"Tales of the Forgotten Realm",
"Amelia Lee",
"1673308799",
"2006"
]
}
]
}
],
"tookTimes": 0.028270975
}
Function Query
# function query
curl --location 'http://localhost:8080/api/v1/query' \
--header 'Content-Type: application/json' \
--data '{"sql": "select count(*) as c from eng_test.books where year > 2000 limit 10;"}'
{
"results": [
{
"columns": [
{
"name": "c",
"type": "Int",
"description": ""
}
],
"rows": [
{
"columnValues": [
"5"
]
}
]
}
],
"tookTimes": 0.007943698
}
Join Query
# join query
curl --location 'http://localhost:8080/api/v1/query' \
--header 'Content-Type: application/json' \
--data '{"sql": "SELECT books.id, books.title, books.author, books.year, authors.gender FROM eng_test.books JOIN eng_test.authors ON books.author = authors.name WHERE books.year > 2000 limit 10;"}'
{
"results": [
{
"columns": [
{
"name": "id",
"type": "Int",
"description": ""
},
{
"name": "title",
"type": "String",
"description": ""
},
{
"name": "author",
"type": "String",
"description": ""
},
{
"name": "year",
"type": "Int",
"description": ""
},
{
"name": "gender",
"type": "String",
"description": ""
}
],
"rows": [
{
"columnValues": [
"5",
"History of the Ancient World",
"Michael Brown",
"2001",
"Male"
]
},
{
"columnValues": [
"6",
"Journey Through Space",
"Sophia Wilson",
"2002",
"Female"
]
},
{
"columnValues": [
"7",
"Mastering the Art of Coding",
"James Miller",
"2002",
"Male"
]
},
{
"columnValues": [
"9",
"The Future of Technology",
"Lucas Martinez",
"2005",
"Male"
]
},
{
"columnValues": [
"10",
"Tales of the Forgotten Realm",
"Amelia Lee",
"2006",
"Female"
]
}
]
}
],
"tookTimes": 0.00676448
}
Aggregate Query
# aggregate query
curl --location 'http://localhost:8080/api/v1/query' \
--header 'Content-Type: application/json' \
--data '{"sql": "SELECT gender, count(*) as c FROM eng_test.books JOIN eng_test.authors ON books.author = authors.name WHERE books.year > 2000 group by gender limit 10;"}'
{
"results": [
{
"columns": [
{
"name": "gender",
"type": "String",
"description": ""
},
{
"name": "c",
"type": "Int",
"description": ""
}
],
"rows": [
{
"columnValues": [
"Female",
"2"
]
},
{
"columnValues": [
"Male",
"3"
]
}
]
}
],
"tookTimes": 0.011582429
}
ElasticSearch
TODO
Milvus
The Milvus Datasource SQL queries are parsed and then transformed into a request sent to the Milvus Server via the Milvus Golang SDK.
The SQL examples below utilize the Milvus Docker Image
milvusdb/milvus:v2.4.13-hotfix
as the Datasource. For setup instructions, refer to Run Milvus Server.For embedding, a lightweight model,
paraphrase-albert-small-v2
, is used as the Embedding Provider.
Sample Data
There is a table with three fields in Milvus.
Table: demo_collection
id | text | vector | subject |
---|---|---|---|
1 | Artificial intelligence was founded as an academic discipline in 1956. | <vector index> | history |
2 | Alan Turing was the first person to conduct substantial research in AI. | <vector index> | history |
3 | Born in Maida Vale, London, Turing was raised in southern England. | <vector index> | history |
4 | Machine learning has been used for drug design. | <vector index> | biology |
5 | Computational synthesis with AI algorithms predicts molecular properties. | <vector index> | biology |
6 | DDR1 is involved in cancers and fibrosis. | <vector index> | biology |
Vector Search
Vector search requires the use of TEXT_TO_VEC
to perform embedding on the query.
The following distance algorithms are available:
- VECTOR_L2_DISTANCE: Euclidean distance (L2)
- VECTOR_IP_DISTANCE: Inner product (IP)
- VECTOR_COSINE_DISTANCE: Cosine similarity (COSINE)
The choice of distance algorithm in the search SQL depends on the distance algorithm used to build the index when writing data to the Milvus table. In the examples provided, VECTOR_COSINE_DISTANCE is used.
# vector search
curl --location 'http://127.0.0.1:8080/api/v1/query' \
--header 'Content-Type: application/json' \
--data '{"sql": "SELECT id,text,subject,VECTOR_COSINE_DISTANCE(vector,TEXT_TO_VEC(?)) AS distance FROM vec.demo_collection LIMIT 10", "arguments": [{"type": "STRING", "value": "Who is Alan Turing?"}]}'
{
"results": [
{
"columns": [
{
"name": "id",
"type": "Int",
"description": ""
},
{
"name": "text",
"type": "String",
"description": ""
},
{
"name": "subject",
"type": "String",
"description": ""
},
{
"name": "distance",
"type": "Float",
"description": ""
}
],
"rows": [
{
"columnValues": [
"2",
"Born in Maida Vale, London, Turing was raised in southern England.",
"history",
"0.5859944"
]
},
{
"columnValues": [
"1",
"Alan Turing was the first person to conduct substantial research in AI.",
"history",
"0.51182544"
]
},
{
"columnValues": [
"0",
"Artificial intelligence was founded as an academic discipline in 1956.",
"history",
"0.12895365"
]
},
{
"columnValues": [
"5",
"DDR1 is involved in cancers and fibrosis.",
"biology",
"0.058018543"
]
},
{
"columnValues": [
"4",
"Computational synthesis with AI algorithms predicts molecular properties.",
"biology",
"-0.018361788"
]
},
{
"columnValues": [
"3",
"Machine learning has been used for drug design.",
"biology",
"-0.04282199"
]
}
]
}
],
"tookTimes": 0.24089202
}
Vector Search with Metadata Filtering
In vector search, scalar fields (non-vector data) can also be filtered simultaneously.
When the query SQL includes the condition WHERE subject = 'history'
, it filters out only the corresponding results.
# vector search with metadata fitering
curl --location 'http://127.0.0.1:8080/api/v1/query' \
--header 'Content-Type: application/json' \
--data '{"sql": "SELECT id,text,subject,VECTOR_COSINE_DISTANCE(vector,TEXT_TO_VEC(?)) AS distance FROM vec.demo_collection where subject = '\''history'\'' LIMIT 10", "arguments": [{"type": "STRING", "value": "Who is Alan Turing?"}]}'
{
"results": [
{
"columns": [
{
"name": "id",
"type": "Int",
"description": ""
},
{
"name": "text",
"type": "String",
"description": ""
},
{
"name": "subject",
"type": "String",
"description": ""
},
{
"name": "distance",
"type": "Float",
"description": ""
}
],
"rows": [
{
"columnValues": [
"2",
"Born in Maida Vale, London, Turing was raised in southern England.",
"history",
"0.5859944"
]
},
{
"columnValues": [
"1",
"Alan Turing was the first person to conduct substantial research in AI.",
"history",
"0.51182544"
]
},
{
"columnValues": [
"0",
"Artificial intelligence was founded as an academic discipline in 1956.",
"history",
"0.12895365"
]
}
]
}
],
"tookTimes": 1.0737197
}