Commit d0237fb
IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column
The Impala frontend cannot evaluate a BETWEEN/NOT BETWEEN predicate
directly. It must first transform the BetweenPredicate into a
CompoundPredicate consisting of lower-bound and upper-bound
BinaryPredicates through BetweenToCompoundRule.java. The
BinaryPredicates can then be pushed down or rewritten into other forms
by other expression rewrite rules. However, the selectivity of a
BetweenPredicate or its derivatives remains unassigned and often
collapses with other unknown-selectivity predicates into a collective
selectivity equal to Expr.DEFAULT_SELECTIVITY (0.1).
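The rewrite described above can be sketched as follows. This is a
minimal illustration that works on SQL text with a hypothetical helper
class (BetweenRewriteSketch); it is not the code in
BetweenToCompoundRule.java, which operates on Impala's expression
classes.

```java
import java.util.Objects;

/** Hypothetical, text-based stand-in for the BETWEEN rewrite; not Impala's classes. */
final class BetweenRewriteSketch {
  /** Rewrites "col [NOT] BETWEEN lo AND hi" into its compound-predicate form. */
  static String rewrite(String col, long lo, long hi, boolean isNot) {
    Objects.requireNonNull(col, "col");
    if (isNot) {
      // NOT BETWEEN: the value lies strictly outside the closed range [lo, hi].
      return "(" + col + " < " + lo + " OR " + col + " > " + hi + ")";
    }
    // BETWEEN: the value lies inside the closed range [lo, hi].
    return "(" + col + " >= " + lo + " AND " + col + " <= " + hi + ")";
  }

  public static void main(String[] args) {
    System.out.println(rewrite("c_custkey", 1234, 2345, false));
    System.out.println(rewrite("c_custkey", 1234, 2345, true));
  }
}
```

Each side of the AND/OR corresponds to one BinaryPredicate that later
rewrite rules may push down or transform independently.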
This patch adds a narrow optimization of BetweenPredicate selectivity
that applies only when all of the following criteria are met:
1. The BetweenPredicate is bound to a slot reference of a single column
   of a table.
2. The column type is discrete, such as INTEGER or DATE.
3. The column stats are available.
4. The column is sufficiently unique based on available stats.
5. The BETWEEN/NOT BETWEEN predicate is well-formed (lower bound value
   <= upper bound value).
6. The final calculated selectivity is less than or equal to
   Expr.DEFAULT_SELECTIVITY.
If any of these criteria is unmet, the Planner reverts to the old
behavior, which leaves the selectivity unassigned.
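The criteria above can be read as a chain of guards around a simple
estimate. The sketch below is an assumption-laden illustration, not the
patch's code: the class name, the 0.9 uniqueness cutoff, and the formula
(values in range divided by row count) are all hypothetical. Only the
0.1 default comes from this commit message. Criteria 1 and 2 are implied
by taking integer bounds for a single column.

```java
import java.util.OptionalDouble;

/** Illustrative only; names, threshold, and formula are assumptions, not Impala's code. */
final class BetweenSelectivitySketch {
  // Value of Expr.DEFAULT_SELECTIVITY as quoted in the commit message.
  static final double DEFAULT_SELECTIVITY = 0.1;
  // Hypothetical cutoff for "sufficiently unique": NDV must cover 90% of the rows.
  static final double MIN_UNIQUE_RATIO = 0.9;

  /** Returns an estimate, or empty to mean "leave the selectivity unassigned". */
  static OptionalDouble estimate(long lo, long hi, long numRows, long ndv, boolean isNot) {
    // Criterion 3: column stats must be available.
    if (numRows <= 0 || ndv <= 0) return OptionalDouble.empty();
    // Criterion 4: the column must be sufficiently unique.
    if ((double) ndv / numRows < MIN_UNIQUE_RATIO) return OptionalDouble.empty();
    // Criterion 5: lower bound must not exceed upper bound.
    if (lo > hi) return OptionalDouble.empty();
    // Assumed formula: each discrete value in [lo, hi] matches at most one row.
    double valuesInRange = (double) hi - (double) lo + 1.0;
    double sel = Math.min(1.0, valuesInRange / numRows);
    if (isNot) sel = 1.0 - sel;
    // Criterion 6: only keep estimates at or below the default.
    if (sel > DEFAULT_SELECTIVITY) return OptionalDouble.empty();
    return OptionalDouble.of(sel);
  }
}
```

Under these assumptions, a range of 1234 to 2345 over a hypothetical
150,000-row fully unique key covers 1,112 values, giving an estimate of
1112 / 150000, or about 0.0074.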
Since this patch only targets a BetweenPredicate over a unique column,
the following query will still have the default scan selectivity (0.1):
select count(*) from tpch.customer c
where c.c_custkey >= 1234 and c.c_custkey <= 2345;
In contrast, this equivalent query written with a BETWEEN predicate
will have a lower scan selectivity:
select count(*) from tpch.customer c
where c.c_custkey between 1234 and 2345;
This patch calculates the BetweenPredicate selectivity during the
transformation in BetweenToCompoundRule.java. The selectivity is
piggy-backed onto the resulting CompoundPredicate and BinaryPredicates
in a betweenSelectivity_ field, separate from the selectivity_ field.
Analyzer.getBoundPredicates() is modified to prioritize a derived
BinaryPredicate over an ordinary BinaryPredicate in its return value, to
prevent the derived BinaryPredicate from being eliminated by a matching
ordinary BinaryPredicate.
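Why ordering matters can be sketched as follows, assuming (this is not
stated in the commit message) that a later step keeps only the first of
several matching predicates. The Pred class and both helper methods are
hypothetical; a negative betweenSelectivity stands for "unset".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of ordering derived predicates first; not Impala's Analyzer. */
final class BoundPredicateOrderSketch {
  /** Simplified stand-in for a BinaryPredicate. */
  static final class Pred {
    final String sql;
    final double betweenSelectivity; // negative means "not derived from BETWEEN"

    Pred(String sql, double betweenSelectivity) {
      this.sql = sql;
      this.betweenSelectivity = betweenSelectivity;
    }

    boolean isBetweenDerived() { return betweenSelectivity >= 0; }
  }

  /** Stable reordering: BETWEEN-derived predicates first, ordinary ones after. */
  static List<Pred> prioritizeDerived(List<Pred> preds) {
    List<Pred> ordered = new ArrayList<>();
    for (Pred p : preds) if (p.isBetweenDerived()) ordered.add(p);
    for (Pred p : preds) if (!p.isBetweenDerived()) ordered.add(p);
    return ordered;
  }

  /** Assumed downstream step: among predicates with identical text, keep the first. */
  static List<Pred> dedupKeepFirst(List<Pred> preds) {
    Map<String, Pred> bySql = new LinkedHashMap<>();
    for (Pred p : preds) bySql.putIfAbsent(p.sql, p);
    return new ArrayList<>(bySql.values());
  }
}
```

With the derived predicate ordered first, the surviving copy is the one
that still carries the BETWEEN selectivity, so the estimate is not lost
when an identical ordinary predicate is also present.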
Testing:
- Add table functional_parquet.unique_with_nulls.
- Add FE tests in ExprCardinalityTest#testBetweenSelectivity,
ExprCardinalityTest#testNotBetweenSelectivity, and
PlannerTest#testScanCardinality.
- Pass core tests.
Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10
Reviewed-on: http://gerrit.cloudera.org:8080/21377
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
1 parent 8a6f282 commit d0237fb
27 files changed
Lines changed: 3726 additions & 3320 deletions
File tree
- fe/src
- main/java/org/apache/impala
- analysis
- catalog
- planner
- rewrite
- test/java/org/apache/impala/analysis
- testdata
- bin
- datasets/functional
- workloads/functional-planner/queries/PlannerTest
- tpcds_cpu_cost
Lines changed: 26 additions & 2 deletions
Lines changed: 18 additions & 0 deletions
Lines changed: 28 additions & 6 deletions