Lateral View(行转列)
通过Lateral View与UDTF(表生成函数)结合,将单行数据拆成多行数据,即行转列。
命令格式
lateralView: lateral view [outer] () as (',' )
fromClause: from (lateralView) [(lateralView) ...]
- udtf_name:必填。将一行数据拆成多行数据的UDTF,请参见其他函数。
- expression:必填。待拆分行数据所属列名。
- table_alias:必填。UDTF结果集的别名。
- columnAlias:必填。拆分后得到的列的别名。
- baseTable:必填。数据源表。
此外,from后可以有多个Lateral View语句,后面的Lateral View语句能够引用它前面的所有表和列名,实现对不同列的行数据进行拆分。
示例数据
假设已有一张表pageAds,它有三列数据,第一列是pageid string,第二列是col1 array
| pageid | col1 | col2 |
| front_page | [1, 2, 3] | [“a”, “b”, “c”] |
| contact_page | [3, 4, 5] | [“d”, “e”, “f”] |
使用示例
单个Lateral View语句
示例1:拆分col1。命令示例如下:
select pageid, col1_new, col2 from pageAds lateral view explode(col1) adTable as col1_new;
返回结果如下:
+------------+------------+------------+
| pageid | col1_new | col2 |
+------------+------------+------------+
| front_page | 1 | ["a","b","c"] |
| front_page | 2 | ["a","b","c"] |
| front_page | 3 | ["a","b","c"] |
| contact_page | 3 | ["d","e","f"] |
| contact_page | 4 | ["d","e","f"] |
| contact_page | 5 | ["d","e","f"] |
+------------+------------+------------+
示例2:拆分col1并执行聚合统计。命令示例如下:
select col1_new, count(1) as count from pageAds lateral view explode(col1) adTable as col1_new group by col1_new;
返回结果如下:
+------------+------------+
| col1_new | count |
+------------+------------+
| 1 | 1 |
| 2 | 1 |
| 3 | 2 |
| 4 | 1 |
| 5 | 1 |
+------------+------------+
多个Lateral View语句
拆分col1和col2。命令示例如下:
select pageid,mycol1, mycol2 from pageAds lateral view explode(col1) myTable1 as mycol1 lateral view explode(col2) myTable2 as mycol2;
返回结果如下:
+------------+------------+------------+
| pageid | mycol1 | mycol2 |
+------------+------------+------------+
| front_page | 1 | a |
| front_page | 1 | b |
| front_page | 1 | c |
| front_page | 2 | a |
| front_page | 2 | b |
| front_page | 2 | c |
| front_page | 3 | a |
| front_page | 3 | b |
| front_page | 3 | c |
| contact_page | 3 | d |
| contact_page | 3 | e |
| contact_page | 3 | f |
| contact_page | 4 | d |
| contact_page | 4 | e |
| contact_page | 4 | f |
| contact_page | 5 | d |
| contact_page | 5 | e |
| contact_page | 5 | f |
+------------+------------+------------+
实际工作案例
原始表:
处理,将上述表中grid_ids字段按照#进行分割并转化成多行,程序如下:
SELECT zs_business_district_id,grid_id
FROM business_recognize
LATERAL VIEW explode(split(grid_ids,'#')) adTable AS grid_id
WHERE ds = '20210618'
结果输出:

本文来自互联网用户投稿,文章观点仅代表作者本人,不代表本站立场,不承担相关法律责任。如若转载,请注明出处。 如若内容造成侵权/违法违规/事实不符,请点击【内容举报】进行投诉反馈!
