| 1 | +<?xml version="1.0" encoding="UTF-8"?> |
| 2 | +<!-- |
| 3 | +Licensed to the Apache Software Foundation (ASF) under one |
| 4 | +or more contributor license agreements. See the NOTICE file |
| 5 | +distributed with this work for additional information |
| 6 | +regarding copyright ownership. The ASF licenses this file |
| 7 | +to you under the Apache License, Version 2.0 (the |
| 8 | +"License"); you may not use this file except in compliance |
| 9 | +with the License. You may obtain a copy of the License at |
| 10 | + |
| 11 | + http://www.apache.org/licenses/LICENSE-2.0 |
| 12 | + |
| 13 | +Unless required by applicable law or agreed to in writing, |
| 14 | +software distributed under the License is distributed on an |
| 15 | +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| 16 | +KIND, either express or implied. See the License for the |
| 17 | +specific language governing permissions and limitations |
| 18 | +under the License. |
| 19 | +--> |
| 20 | +<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> |
| 21 | +<concept id="impala_queryingarrays" rev="4.1.0"> |
| 22 | + <title>Querying arrays (<keyword keyref="impala41"/> or higher only)</title> |
| 23 | + <titlealts audience="PDF"> |
| 24 | + <navtitle>Querying arrays</navtitle> |
| 25 | + </titlealts> |
| 26 | + <prolog> |
| 27 | + <metadata> |
| 28 | + <data name="Category" value="Impala"/> |
| 29 | + <data name="Category" value="Impala Data Types"/> |
| 30 | + <data name="Category" value="SQL"/> |
| 31 | + <data name="Category" value="Data Analysts"/> |
| 32 | + <data name="Category" value="Developers"/> |
| 33 | + <data name="Category" value="Schemas"/> |
| 34 | + </metadata> |
| 35 | + </prolog> |
| 36 | + <conbody> |
| 37 | + <p rev="4.1.0"> |
| 38 | + <indexterm audience="hidden">Querying arrays</indexterm> Describes how to use the UNNEST function |
| 39 | + to query arrays. The ARRAY data type represents a collection with an arbitrary number of elements, |
| 40 | + where every element has the same type.</p> |
| 41 | + <section id="section_yl4_2qb_3cc"> |
| 42 | + <title>Querying arrays using JOIN and UNNEST</title> |
| 43 | + <p>You can query arrays by joining the table with the array stored inside it. |
| 44 | + This approach is extended by the <codeph>UNNEST</codeph> function, which can be used in |
| 45 | + the <codeph>SELECT</codeph> list or in the <codeph>FROM</codeph> clause of a |
| 46 | + <codeph>SELECT</codeph> statement. When you use <codeph>UNNEST</codeph>, you can provide |
| 47 | + more than one array in the <codeph>SELECT</codeph> statement. Querying several arrays with |
| 48 | + JOINs yields a <term>joining unnest</term>, which returns every combination of the array items, |
| 49 | + whereas <codeph>UNNEST</codeph> yields a <term>zipping unnest</term>, which pairs the items by position.</p> |
| 50 | + </section> |
| 51 | + <section id="section_hmf_hqb_3cc"> |
| 52 | + <title>Example of querying arrays using JOIN</title> |
| 53 | + <p>Use <codeph>JOIN</codeph> when you need a joining unnest of multiple arrays. If you |
| 54 | + need a zipping unnest instead, use the <codeph>UNNEST</codeph> function.</p> |
| 55 | + <p>Here is an example of a <codeph>SELECT</codeph> statement that uses JOINs to query an |
| 56 | + array.</p> |
| 57 | + <codeblock id="codeblock_uqy_rqb_3cc">SELECT id, arr1.item, arr2.item FROM tbl_name tbl, tbl.arr1, tbl.arr2; |
| 58 | + |
| 59 | +ID, ARR1.ITEM, ARR2.ITEM |
| 60 | +[1, 1, 10] |
| 61 | +[1, 1, 11] |
| 62 | +[1, 2, 10] |
| 63 | +[1, 2, 11] |
| 64 | +[1, 3, 10] |
| 65 | +[1, 3, 11] |
| 66 | +</codeblock> |
| 67 | + <note id="note_tpb_wln_htb"> |
| 68 | + <p>The test data used in this example is ID: 1, arr1: {1, 2, 3}, arr2: {10, 11}</p> |
| 69 | + </note> |
| 70 | + </section> |
| 71 | + <section id="section_ipq_tqb_3cc"> |
| 72 | + <title>Examples of querying arrays using UNNEST</title> |
| 73 | + <p>You can use either of the two syntaxes shown here to unnest multiple arrays in one |
| 74 | + query. The items of the arrays are zipped together instead of being joined.</p> |
| 75 | + <ul id="ul_jpq_tqb_3cc"> |
| 76 | + <li>ISO SQL:2016 compliant syntax: |
| 77 | + <codeblock id="codeblock_kpq_tqb_3cc">SELECT a1.item, a2.item |
| 78 | +FROM complextypes_arrays t, UNNEST(t.arr1, t.arr2) AS (a1, a2); |
| 79 | +</codeblock></li> |
| 80 | + <li>PostgreSQL-compatible |
| 81 | + syntax:<codeblock id="codeblock_yl5_3mn_htb">SELECT UNNEST(arr1), UNNEST(arr2) FROM complextypes_arrays;</codeblock></li> |
| 82 | + </ul> |
| 83 | + <p><b>Unnest operator in SELECT list</b></p> |
| 84 | + <codeblock id="codeblock_lpq_tqb_3cc">SELECT id, unnest(arr1), unnest(arr2) FROM tbl_name;</codeblock> |
| 85 | + <p><b>Unnest operator in FROM clause</b></p> |
| 86 | + <codeblock id="codeblock_mpq_tqb_3cc">SELECT id, arr1.item, arr2.item FROM tbl_name tbl_alias, UNNEST(tbl_alias.arr1, tbl_alias.arr2);</codeblock> |
| 87 | + <p>Both forms zip the arrays next to each other, as shown here.</p> |
| 88 | + <codeblock id="codeblock_npq_tqb_3cc">ID, ARR1.ITEM, ARR2.ITEM |
| 89 | +[1, 1, 10] |
| 90 | +[1, 2, 11] |
| 91 | +[1, 3, NULL] |
| 92 | +</codeblock> |
| 93 | + <p>Note that arr2 is shorter than arr1, so the "missing" items in its column are filled |
| 94 | + with NULLs.</p> |
| 95 | + <note id="note_opq_tqb_3cc">The test data used in this example is ID: 1, arr1: {1, 2, 3}, |
| 96 | + arr2: {10, 11}</note> |
| 97 | + </section> |
| 98 | + <section id="section_i1g_wqb_3cc"> |
| 99 | + <title>Limitations in using UNNEST</title> |
| 100 | + <p> |
| 101 | + <ul id="ul_j1g_wqb_3cc"> |
| 102 | + <li>Only arrays from the same table can be combined in a zipping unnest.</li> |
| 103 | + <li>The old (joining) and the new (zipping) unnests cannot be used together.</li> |
| 104 | + <li>You can apply a <codeph>WHERE</codeph> filter to an unnested item only if you add a |
| 105 | + wrapper <codeph>SELECT</codeph> and filter in the outer query. |
| 106 | + <p>Example:</p><codeblock id="codeblock_k1g_wqb_3cc">SELECT id, arr1_unnest FROM (SELECT id, unnest(arr1) as arr1_unnest FROM tbl_name) WHERE arr1_unnest &lt; 10;</codeblock></li> |
| 107 | + </ul> |
| 108 | + </p> |
| 109 | + </section> |
| 110 | + <section id="section_ewb_yqb_3cc"> |
| 111 | + <title>Using ARRAY columns in the SELECT list</title> |
| 112 | + <!--Removing this since zipping unnest for arrays (IMPALA-10920) and allow array type in SELECT list (IMPALA-9498) are all added in the same upstream release (Impala 4.1) |
| 113 | + <p>Prior to this release to look into the content of an array you had to unnest the array |
| 114 | + either by the joining syntax or by using the zipping <codeph>UNNEST</codeph> operator as |
| 115 | + shown in the following example:</p> |
| 116 | + <codeblock id="codeblock_fwb_yqb_3cc">SELECT unnest(IDs), unnest(NAMES) FROM table_name;</codeblock> |
| 117 | + --> |
| 118 | + <p rev="4.1">Impala 4.1 adds support for returning <codeph>ARRAYs</codeph> as |
| 119 | + <codeph>STRINGs</codeph> (<term>JSON arrays</term>) in the <codeph>SELECT</codeph> list, |
| 120 | + for example:</p> |
| 121 | + <codeblock id="codeblock_gwb_yqb_3cc">select id, int_array from functional_parquet.complextypestbl where id = 1; |
| 122 | +returns: 1, "[1,2,3]" |
| 123 | +</codeblock> |
| 124 | + <p>Returning <codeph>ARRAYs</codeph> from inline or Hive Metastore views is also supported. |
| 125 | + These arrays can be used either in the select list or as relative table references.</p> |
| 126 | + <codeblock id="codeblock_hwb_yqb_3cc">select id, int_array from (select id, int_array from complextypestbl) s;</codeblock> |
| 127 | + <p>Although <codeph>STRUCTs</codeph> are already supported, <codeph>ARRAYs</codeph> and |
| 128 | + <codeph>STRUCTs</codeph> nested within each other are not yet supported. Using them as |
| 129 | + non-relative table references is also not yet supported.</p> |
| 130 | + </section> |
| 131 | + </conbody> |
| 132 | +</concept> |