|
2 | 2 |
|
3 | 3 |  |
4 | 4 |
|
| 5 | + |
5 | 6 | ## Installation |
6 | 7 |
|
7 | 8 | ```bash |
@@ -41,6 +42,81 @@ FALSE_SERIES = Series(data=[False for _ in range(0, dataframe_size)])) |
41 | 42 | NONE_SERIES = Series(data=[None for _ in range(0, dataframe_size)])) |
42 | 43 | ``` |
43 | 44 |
|
| 45 | +### SQL Syntax |
| 46 | +The SQL syntax supported by dataframe_sql is as follows:
| 47 | + |
| 48 | +Select statement: |
| 49 | + |
| 50 | +```SQL |
| 51 | +SELECT [{ ALL | DISTINCT }] |
| 52 | + { [ <expression> ] | <expression> [ [ AS ] <alias> ] } [, ...] |
| 53 | +[ FROM <from_item> [, ...] ] |
| 54 | +[ WHERE <bool_expression> ] |
| 55 | +[ GROUP BY { <expression> [, ...] } ] |
| 56 | +[ HAVING <bool_expression> ] |
| 57 | +``` |
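
To make the select grammar above concrete, here is a minimal sketch of the pandas operations that a `WHERE` / `GROUP BY` / `HAVING` query roughly corresponds to. The `orders` table and its columns are hypothetical, and this is only an illustration of the semantics, not dataframe_sql's actual implementation.

```python
import pandas as pd

# Hypothetical sample data; table and column names are illustrative only.
orders = pd.DataFrame({
    "customer": ["a", "a", "b", "b", "c"],
    "amount": [10, 20, 5, 5, 40],
})

# SELECT customer, SUM(amount) AS total
# FROM orders
# WHERE amount > 5
# GROUP BY customer
# HAVING SUM(amount) > 30
filtered = orders[orders["amount"] > 5]                       # WHERE
grouped = filtered.groupby("customer", as_index=False).agg(   # GROUP BY + SUM
    total=("amount", "sum")
)
result = grouped[grouped["total"] > 30].reset_index(drop=True)  # HAVING
```

Note that `HAVING` filters the aggregated rows, while `WHERE` filters the input rows before grouping.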
| 58 | + |
| 59 | +Set operations: |
| 60 | + |
| 61 | +```SQL |
| 62 | +<select_statement1> |
| 63 | +{UNION [DISTINCT] | UNION ALL | INTERSECT [DISTINCT] | EXCEPT [DISTINCT] | EXCEPT ALL} |
| 64 | +<select_statement2>
| 65 | +``` |
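
The difference between the `ALL` and `DISTINCT` variants of the set operations can be sketched in plain pandas. The frames `left` and `right` are made up for this example, and the pandas calls are an assumed equivalent, not necessarily how dataframe_sql evaluates these operators.

```python
import pandas as pd

# Hypothetical inputs standing in for two select statements.
left = pd.DataFrame({"id": [1, 2, 2, 3]})
right = pd.DataFrame({"id": [2, 3, 4]})

# UNION ALL keeps every row from both inputs, duplicates included.
union_all = pd.concat([left, right], ignore_index=True)

# UNION [DISTINCT] removes duplicate rows from the combined result.
union_distinct = union_all.drop_duplicates().reset_index(drop=True)

# INTERSECT [DISTINCT] keeps rows present in both inputs, once each.
intersect_distinct = (
    left.merge(right, how="inner").drop_duplicates().reset_index(drop=True)
)

# EXCEPT [DISTINCT] keeps rows of the first input that are absent from the second.
marked = left.drop_duplicates().merge(
    right.drop_duplicates(), how="left", indicator=True
)
except_distinct = (
    marked[marked["_merge"] == "left_only"]
    .drop(columns="_merge")
    .reset_index(drop=True)
)
```

`EXCEPT ALL` is left out of the sketch because it needs per-row occurrence counting rather than a simple membership test.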
| 66 | + |
| 67 | +Joins: |
| 68 | + |
| 69 | +```SQL |
| 70 | +INNER, CROSS, FULL OUTER, LEFT OUTER, RIGHT OUTER, FULL, LEFT, RIGHT |
| 71 | +``` |
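
The join keywords listed above line up naturally with the `how` argument of pandas `DataFrame.merge`. The mapping and the `join` helper below are hypothetical names introduced for illustration; dataframe_sql may resolve joins differently internally. `how="cross"` requires pandas 1.2 or later.

```python
import pandas as pd

# Hypothetical sample tables.
employees = pd.DataFrame({"dept_id": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]})
departments = pd.DataFrame({"dept_id": [1, 2, 4], "dept": ["Eng", "Ops", "HR"]})

# Assumed mapping from SQL join keywords to pandas merge strategies.
JOIN_TO_HOW = {
    "INNER": "inner",
    "LEFT": "left", "LEFT OUTER": "left",
    "RIGHT": "right", "RIGHT OUTER": "right",
    "FULL": "outer", "FULL OUTER": "outer",
    "CROSS": "cross",
}

def join(left, right, join_type, on=None):
    """Join two frames using the pandas strategy for the given SQL keyword."""
    how = JOIN_TO_HOW[join_type.upper()]
    if how == "cross":
        # A cross join pairs every row with every row and takes no key.
        return left.merge(right, how="cross")
    return left.merge(right, how=how, on=on)

inner = join(employees, departments, "INNER", on="dept_id")
full = join(employees, departments, "FULL OUTER", on="dept_id")
cross = join(employees, departments, "CROSS")
```

`LEFT` and `LEFT OUTER` (and likewise for `RIGHT` and `FULL`) are treated as synonyms here, which matches standard SQL.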
| 72 | + |
| 73 | +Order by and limit: |
| 74 | + |
| 75 | +```SQL |
| 76 | +<set> |
| 77 | +[ORDER BY <expression>] |
| 78 | +[LIMIT <number>] |
| 79 | +``` |
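
As a rough illustration of `ORDER BY` followed by `LIMIT`, the sketch below sorts a hypothetical `scores` frame and keeps the first rows. It assumes ascending order, the SQL default, since the grammar above does not show a sort direction.

```python
import pandas as pd

# Hypothetical sample data.
scores = pd.DataFrame({"player": ["a", "b", "c", "d"], "score": [7, 9, 3, 8]})

# SELECT * FROM scores ORDER BY score LIMIT 2
top = (
    scores.sort_values("score")  # ORDER BY score (ascending assumed)
    .head(2)                     # LIMIT 2
    .reset_index(drop=True)
)
```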
| 80 | + |
| 81 | +Supported expressions and functions: |
| 82 | +```SQL |
| 83 | ++, -, *, / |
| 84 | +``` |
| 85 | +```SQL |
| 86 | +CASE WHEN <condition> THEN <result> [WHEN ...] ELSE <result> END |
| 87 | +``` |
| 88 | +```SQL |
| 89 | +SUM, AVG, MIN, MAX |
| 90 | +``` |
| 91 | +```SQL |
| 92 | +{RANK | DENSE_RANK} OVER([PARTITION BY (<expression> [, <expression>...])])
| 93 | +``` |
| 94 | +```SQL |
| 95 | +CAST (<expression> AS <data_type>) |
| 96 | +``` |
| 97 | +*Anything in <> is a placeholder for a user-supplied value <br>
| 98 | +*Anything in [] is optional <br>
| 99 | +*Anything in {} is grouped together; a vertical bar separates the alternatives
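
The difference between `RANK` and `DENSE_RANK`, along with `CASE WHEN` and `CAST`, can be sketched with pandas and NumPy. The `sales` frame is hypothetical, and the descending order on `amount` is an assumption made for the example (the grammar above shows no `ORDER BY` inside `OVER`). This illustrates the semantics only and is not dataframe_sql's code.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
sales = pd.DataFrame({
    "region": ["east", "east", "east", "west", "west"],
    "amount": [100, 100, 50, 70, 30],
})

by_region = sales.groupby("region")["amount"]  # PARTITION BY region

# RANK leaves gaps after ties; DENSE_RANK does not.
sales["rank"] = by_region.rank(method="min", ascending=False).astype(int)
sales["dense_rank"] = by_region.rank(method="dense", ascending=False).astype(int)

# CASE WHEN amount >= 100 THEN 'high' ELSE 'low' END
sales["size"] = np.where(sales["amount"] >= 100, "high", "low")

# CAST (amount AS FLOAT64)
sales["amount_float"] = sales["amount"].astype("float64")
```

In the `east` partition the two tied rows share rank 1, so the next row gets 3 under `RANK` and 2 under `DENSE_RANK`.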
| 100 | + |
| 101 | +### Supported data types for cast expressions
| 102 | +* VARCHAR, STRING |
| 103 | +* INT16, SMALLINT |
| 104 | +* INT32, INT |
| 105 | +* INT64, BIGINT |
| 106 | +* FLOAT16 |
| 107 | +* FLOAT32 |
| 108 | +* FLOAT, FLOAT64 |
| 109 | +* BOOL |
| 110 | +* DATETIME64, TIMESTAMP |
| 111 | +* CATEGORY |
| 112 | +* OBJECT |
| 113 | + |
| 114 | +*dataframe_sql accepts several names for some data types because
| 115 | +popular SQL data type names do not match the names used in pandas and other
| 116 | +dataframe frameworks.
| 117 | +<br>
| 118 | +**To make this less confusing, all data types that have the same size on the
| 119 | +backend are grouped together in this list.
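
One way to picture the grouping above is a lookup table in which each alias on a line resolves to the same pandas dtype. The dtype strings and the `cast` helper below are assumptions for illustration; the source does not state which pandas dtype each name maps to.

```python
import pandas as pd

# Assumed alias-to-dtype mapping; names on the same list line share a dtype.
SQL_TO_PANDAS_DTYPE = {
    "VARCHAR": "string", "STRING": "string",
    "INT16": "int16", "SMALLINT": "int16",
    "INT32": "int32", "INT": "int32",
    "INT64": "int64", "BIGINT": "int64",
    "FLOAT16": "float16",
    "FLOAT32": "float32",
    "FLOAT": "float64", "FLOAT64": "float64",
    "BOOL": "bool",
    "DATETIME64": "datetime64[ns]", "TIMESTAMP": "datetime64[ns]",
    "CATEGORY": "category",
    "OBJECT": "object",
}

def cast(series, sql_type):
    """Hypothetical helper: cast a Series using a SQL-style type name."""
    return series.astype(SQL_TO_PANDAS_DTYPE[sql_type.upper()])

small = cast(pd.Series([1, 2, 3]), "SMALLINT")   # same result as INT16
parsed = cast(pd.Series(["1", "2"]), "BIGINT")   # same result as INT64
```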
44 | 120 |
|
45 | 121 | ## Issues that come from Pandas |
46 | 122 |
|
|