Lateral View(行转列)

通过Lateral View与UDTF(表生成函数)结合,将单行数据拆成多行数据,即行转列。

命令格式

lateralView: lateral view [outer] ()  as  (',' ) 
fromClause: from  (lateralView) [(lateralView) ...]
  • udtf_name:必填。将一行数据拆成多行数据的UDTF,请参见其他函数。
  • expression:必填。待拆分行数据所属列名。
  • table_alias:必填。UDTF结果集的别名。
  • columnAlias:必填。拆分后得到的列的别名。
  • baseTable:必填。数据源表。

此外,from后可以有多个Lateral View语句,后面的Lateral View语句能够引用它前面的所有表和列名,实现对不同列的行数据进行拆分。

示例数据

假设已有一张表pageAds,它有三列数据,第一列是pageid string,第二列是col1 array,第三列是col2 array,详细数据如下。

pageidcol1col2
front_page[1, 2, 3][“a”, “b”, “c”]
contact_page[3, 4, 5][“d”, “e”, “f”]

使用示例

单个Lateral View语句

示例1:拆分col1。命令示例如下:

select pageid, col1_new, col2 from pageAds lateral view explode(col1) adTable as col1_new;

返回结果如下:

+------------+------------+------------+
| pageid     | col1_new   | col2       |
+------------+------------+------------+
| front_page | 1          | ["a","b","c"] |
| front_page | 2          | ["a","b","c"] |
| front_page | 3          | ["a","b","c"] |
| contact_page | 3          | ["d","e","f"] |
| contact_page | 4          | ["d","e","f"] |
| contact_page | 5          | ["d","e","f"] |
+------------+------------+------------+

示例2:拆分col1并执行聚合统计。命令示例如下:

select col1_new, count(1) as count from pageAds lateral view explode(col1) adTable as col1_new group by col1_new;

返回结果如下:

+------------+------------+
| col1_new   | count      |
+------------+------------+
| 1          | 1          |
| 2          | 1          |
| 3          | 2          |
| 4          | 1          |
| 5          | 1          |
+------------+------------+

多个Lateral View语句

拆分col1和col2。命令示例如下:

select pageid,mycol1, mycol2 from pageAds lateral view explode(col1) myTable1 as mycol1 lateral view explode(col2) myTable2 as mycol2;

返回结果如下:

+------------+------------+------------+
| pageid     | mycol1     | mycol2     |
+------------+------------+------------+
| front_page | 1          | a          |
| front_page | 1          | b          |
| front_page | 1          | c          |
| front_page | 2          | a          |
| front_page | 2          | b          |
| front_page | 2          | c          |
| front_page | 3          | a          |
| front_page | 3          | b          |
| front_page | 3          | c          |
| contact_page | 3          | d          |
| contact_page | 3          | e          |
| contact_page | 3          | f          |
| contact_page | 4          | d          |
| contact_page | 4          | e          |
| contact_page | 4          | f          |
| contact_page | 5          | d          |
| contact_page | 5          | e          |
| contact_page | 5          | f          |
+------------+------------+------------+

实际工作案例

原始表:

处理,将上述表中grid_ids字段按照#进行分割并转化成多行,程序如下:

SELECT  zs_business_district_id,grid_id
FROM    business_recognize
LATERAL VIEW explode(split(grid_ids,'#')) adTable AS grid_id
WHERE   ds = '20210618'

结果输出:


本文来自互联网用户投稿,文章观点仅代表作者本人,不代表本站立场,不承担相关法律责任。如若转载,请注明出处。 如若内容造成侵权/违法违规/事实不符,请点击【内容举报】进行投诉反馈!

相关文章

立即
投稿

微信公众账号

微信扫一扫加关注

返回
顶部