Querying arrays ( Impala 4.1 or higher only)
Describes how to use UNNEST function to query arrays. ARRAY data types represent collections with arbitrary numbers of elements, where each element is the same type.
Querying arrays using JOIN and UNNEST
You can query arrays by making a join between the table and the array inside the table.
This approach is improved with the introduction of the
UNNEST
function in
the
SELECT
list or in the
FROM
clause in the
SELECT
statement. When you use
UNNEST
, you can provide
more than one array in the
SELECT
statement. If you use JOINs for querying
arrays it will yield a
joining unnest
however the latter will provide a
zipping unnest
.
Example of querying arrays using JOIN
Use
JOIN
in cases where you must join unnest of multiple arrays. However
if you must zip unnest then use the newly implemented
UNNEST
function.
Here is an example of a
SELECT
statement that uses JOINs to query an
array.
SELECT id, arr1.item, arr2.item FROM tbl_name tbl, tbl.arr1, tbl.arr2;
ID, ARR1.ITEM, ARR2.ITEM
[1, 1, 10]
[1, 1, 11]
[1, 2, 10]
[1, 2, 11]
[1, 3, 10]
[1, 3, 11]
Note:
The test data used in this example is ID: 1, arr1: {1, 2, 3}, arr2: {10, 11}
Examples of querying arrays using UNNEST
You can use one of the two different syntaxes shown here to unnest multiple arrays in one
query. This results in the items of the arrays being zipped together instead of joining.
ISO:SQL 2016 compliant syntax:
SELECT a1.item, a2.item
FROM complextypes_arrays t, UNNEST(t.arr1, t.arr2) AS (a1, a2);
Postgres compatible
syntax:SELECT UNNEST(arr1), UNNEST(arr2) FROM complextypes_arrays;
Unnest operator in SELECT list
SELECT id, unnest(arr1), unnest(arr2) FROM tbl_name;
Unnest operator in FROM clause
SELECT id, arr1.item, arr2.item FROM tbl_name tbl_alias, UNNEST(tbl_alias.arr1, tbl_alias.arr2);
This new functionality would zip the arrays next to each other as shown here.
ID, ARR1.ITEM, ARR2.ITEM
[1, 1, 10]
[1, 2, 11]
[1, 3, NULL]
Note, that arr2 is shorter than arr1 so the "missing" items in its column will be filled
with NULLs.
Note: The test data used in this example is ID: 1, arr1: {1, 2, 3},
arr2: {10, 11}
Limitations in Using UNNEST
Only arrays from the same table can be zipping unnested
The old (joining) and the new (zipping) unnests cannot be used together
You can add a WHERE filter on an unnested item only if you add a
wrapper SELECT and do the filtering
Example:
SELECT id, arr1_unnest FROM (SELECT id, unnest(arr1) as arr1_unnest FROM tbl_name) WHERE arr1_unnest < 10;
Using ARRAY columns in the SELECT list
Impala 4.1 adds support to return ARRAYs as
STRINGs (JSON arrays) in the SELECT list,
for example:
select id, int_array from functional_parquet.complextypestbl where id = 1;
returns: 1, “[1,2,3]”
Returning ARRAYs from inline or Hive Metastore views is also supported.
These arrays can be used both in the select list or as relative table references.
select id, int_array from (select id, int_array from complextypestbl) s;