Commit c4f7d1a

DOC: Add a sql guide
1 parent 85f4b69 commit c4f7d1a

1 file changed

Lines changed: 74 additions & 0 deletions

README.md

@@ -41,6 +41,80 @@ FALSE_SERIES = Series(data=[False for _ in range(0, dataframe_size)])
NONE_SERIES = Series(data=[None for _ in range(0, dataframe_size)])
```

### SQL Syntax
The SQL syntax for dataframe_sql is as follows:

Select statement:

```SQL
SELECT [{ ALL | DISTINCT }]
{ [ <expression> ] | <expression> [ [ AS ] <alias> ] } [, ...]
[ FROM <from_item> [, ...] ]
[ WHERE <bool_expression> ]
[ GROUP BY { <expression> [, ...] } ]
[ HAVING <bool_expression> ]
```
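
As a rough illustration of the clause order above, the sketch below runs one statement of that shape. It uses Python's built-in `sqlite3` module purely as a stand-in SQL engine, not dataframe_sql itself, and the `orders` table and its columns are made up for the example.

```python
import sqlite3

# Stand-in engine only; this is not the dataframe_sql API.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 10), ("ann", 30), ("bob", 5), ("bob", 15), ("cat", 50)],
)

# SELECT ... FROM ... WHERE ... GROUP BY ... HAVING, in the order listed above.
query = """
SELECT customer, SUM(amount) AS total
FROM orders
WHERE amount > 5
GROUP BY customer
HAVING SUM(amount) > 20
"""
rows = sorted(conn.execute(query).fetchall())
print(rows)
```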

Set operations:

```SQL
<select_statement1>
{UNION [DISTINCT] | UNION ALL | INTERSECT [DISTINCT] | EXCEPT [DISTINCT] | EXCEPT ALL}
<select_statement2>
```
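
To make the difference between the set operators concrete, here is a small sketch that again uses Python's built-in `sqlite3` as a stand-in engine rather than dataframe_sql. SQLite does not accept the `UNION DISTINCT` spelling or `EXCEPT ALL`, so only the plain forms are exercised; the tables `a` and `b` and the helper `run` are hypothetical.

```python
import sqlite3

# Stand-in engine only; this is not the dataframe_sql API.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (x INTEGER)")
conn.execute("CREATE TABLE b (x INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (2,), (3,)])
conn.executemany("INSERT INTO b VALUES (?)", [(2,), (3,), (4,)])


def run(op):
    """Combine the two selects with the given set operator (hypothetical helper)."""
    sql = f"SELECT x FROM a {op} SELECT x FROM b"
    return sorted(row[0] for row in conn.execute(sql))


union = run("UNION")          # duplicates removed
union_all = run("UNION ALL")  # duplicates kept
intersect = run("INTERSECT")  # values present in both
except_ = run("EXCEPT")       # values in a that are not in b
print(union, union_all, intersect, except_)
```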

Joins:

```SQL
INNER, CROSS, FULL OUTER, LEFT OUTER, RIGHT OUTER, FULL, LEFT, RIGHT
```
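
The list above names the join types but not the full join syntax, so the following sketch assumes the standard SQL `JOIN ... ON` form. It shows how three of the listed types differ on the same data, using Python's built-in `sqlite3` as a stand-in engine (not dataframe_sql); the `users` and `pets` tables are invented for the example.

```python
import sqlite3

# Stand-in engine only; this is not the dataframe_sql API.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE pets (owner_id INTEGER, pet TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ann"), (2, "bob")])
conn.executemany("INSERT INTO pets VALUES (?, ?)", [(1, "cat"), (3, "dog")])

# INNER keeps only rows that match on both sides.
inner = conn.execute(
    "SELECT name, pet FROM users INNER JOIN pets ON users.id = pets.owner_id"
).fetchall()

# LEFT OUTER keeps every row from the left table, padding with NULL (None).
left = conn.execute(
    "SELECT name, pet FROM users LEFT OUTER JOIN pets "
    "ON users.id = pets.owner_id ORDER BY name"
).fetchall()

# CROSS pairs every left row with every right row.
cross = conn.execute("SELECT name, pet FROM users CROSS JOIN pets").fetchall()
print(inner, left, len(cross))
```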

Order by and limit:

```SQL
set1
[ORDER BY <expression>]
[LIMIT <number>]
```
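
A minimal sketch of the pattern above, assuming `set1` stands for either a single select or the result of a set operation. As before, `sqlite3` from the Python standard library is only a stand-in engine, and the `scores` table is hypothetical.

```python
import sqlite3

# Stand-in engine only; this is not the dataframe_sql API.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("ann", 70), ("bob", 90), ("cat", 80), ("dan", 60)],
)

# ORDER BY and LIMIT applied to a plain select: the two lowest scores.
lowest_two = conn.execute(
    "SELECT name, score FROM scores ORDER BY score LIMIT 2"
).fetchall()

# ORDER BY and LIMIT applied to the result of a set operation.
union_first_two = conn.execute(
    "SELECT name FROM scores WHERE score > 75 "
    "UNION "
    "SELECT name FROM scores WHERE score < 65 "
    "ORDER BY name LIMIT 2"
).fetchall()
print(lowest_two, union_first_two)
```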

Supported expressions and functions:
```SQL
+, -, *, /
```
```SQL
CASE WHEN <condition> THEN <result> [WHEN ...] ELSE <result> END
```
```SQL
SUM, AVG, MIN, MAX
```
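
The arithmetic operators, `CASE` expression, and aggregates listed so far can be combined in one statement. The sketch below shows one such combination, run on Python's built-in `sqlite3` as a stand-in engine rather than dataframe_sql; the `sales` table is invented for the example.

```python
import sqlite3

# Stand-in engine only; this is not the dataframe_sql API.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, price INTEGER, qty INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", 10, 2), ("east", 20, 1), ("west", 5, 4)],
)

# Arithmetic (price * qty), aggregates (SUM, MAX), and a CASE expression.
query = """
SELECT region,
       SUM(price * qty) AS revenue,
       MAX(price) AS top_price,
       CASE WHEN SUM(price * qty) >= 40 THEN 'high' ELSE 'low' END AS tier
FROM sales
GROUP BY region
ORDER BY region
"""
rows = conn.execute(query).fetchall()
print(rows)
```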
```SQL
(RANK | DENSE_RANK) OVER([PARTITION BY (<expression> [, <expression>...])])
```
```SQL
CAST <expression> AS <data_type>
```
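
To show how `RANK` and `DENSE_RANK` differ on ties, and what a cast does, here is a sketch in standard SQL spelling. Note the assumptions: it writes `RANK()` with an `ORDER BY` inside `OVER` and wraps the cast in parentheses, which is what SQLite (3.25 or newer for window functions) expects and may differ from the exact forms dataframe_sql accepts. The `players` table is hypothetical.

```python
import sqlite3

# Stand-in engine only; this is not the dataframe_sql API.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [("ann", "red", 90), ("bob", "red", 90), ("cat", "red", 70), ("dan", "blue", 50)],
)

# RANK leaves a gap after ties; DENSE_RANK does not.
query = """
SELECT name,
       RANK() OVER (PARTITION BY team ORDER BY score DESC) AS rnk,
       DENSE_RANK() OVER (PARTITION BY team ORDER BY score DESC) AS dense_rnk
FROM players
"""
ranks = {name: (rnk, dense) for name, rnk, dense in conn.execute(query)}

# CAST turns the text '42' into an integer so that arithmetic works on it.
cast_value = conn.execute("SELECT CAST('42' AS INT) + 1").fetchone()[0]
print(ranks, cast_value)
```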
*Anything in <> is meant to be some string
*Anything in [] is optional
*Anything in {} is grouped together

### Supported Data Types for cast expressions include:
- VARCHAR, STRING
- INT16, SMALLINT
- INT32, INT
- INT64, BIGINT
- FLOAT16
- FLOAT32
- FLOAT, FLOAT64
- BOOL
- DATETIME64, TIMESTAMP
- CATEGORY
- OBJECT

*dataframe_sql supports several different names for certain data types because
popular SQL data types are not implemented with common names in pandas and other
dataframe frameworks
**To make this less confusing, all data types that are the same size on the
backend are grouped together in this list
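
One way to read the grouping is as a lookup table from every accepted name to a single representative per group. The sketch below is only an illustration of that idea, not dataframe_sql's internals: `ALIAS_GROUPS`, `CANONICAL`, and `same_type` are hypothetical names, and treating the first name in each row as the representative is an arbitrary choice.

```python
# Each tuple mirrors one row of the list above (hypothetical structure).
ALIAS_GROUPS = [
    ("VARCHAR", "STRING"),
    ("INT16", "SMALLINT"),
    ("INT32", "INT"),
    ("INT64", "BIGINT"),
    ("FLOAT16",),
    ("FLOAT32",),
    ("FLOAT", "FLOAT64"),
    ("BOOL",),
    ("DATETIME64", "TIMESTAMP"),
    ("CATEGORY",),
    ("OBJECT",),
]

# Map every accepted name to the first name in its group.
CANONICAL = {name: group[0] for group in ALIAS_GROUPS for name in group}


def same_type(a, b):
    """Return True if two type names fall in the same group (case-insensitive)."""
    return CANONICAL[a.upper()] == CANONICAL[b.upper()]


print(same_type("int", "INT32"), same_type("FLOAT", "FLOAT32"))
```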

## Issues that come from Pandas