Hive explode operation
import pyspark
from pyspark.sql import SparkSession
# Create a SparkSession through the SparkSession.builder attribute.
# .appName(...) gives the application a name; .getOrCreate() creates a new SparkSession or returns the one that already exists.
spark = SparkSession.builder.appName("pyspark").getOrCreate()
import pyspark.sql.functions
df = spark.createDataFrame([(1, "A,B"),
(2, "C,D"),
(3, "E")],
["id", "split_str"])
df.show(20,truncate=False)
+---+---------+
|id |split_str|
+---+---------+
|1  |A,B      |
|2  |C,D      |
|3  |E        |
+---+---------+
df.createOrReplaceTempView("temp")
sql = """
select id,split_str,explode(split(split_str,',')) as letter
from temp
"""
spark.sql(sql).show()
+---+---------+------+
| id|split_str|letter|
+---+---------+------+
|  1|      A,B|     A|
|  1|      A,B|     B|
|  2|      C,D|     C|
|  2|      C,D|     D|
|  3|        E|     E|
+---+---------+------+
LATERAL VIEW posexplode(data) t2 as pos,j_column
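The LATERAL VIEW posexplode line above returns each array element together with its position, which starts at 0. The following is only a plain-Python sketch of that row-by-row behaviour, not Spark code. It reuses the sample rows from the temp view. The helper name posexplode_rows and the query shown in the comment are assumptions made for illustration; the original fragment's data and j_column names are left as they are.

```python
# Plain-Python sketch of what posexplode does to each row (no Spark needed).
# Assumed Spark SQL counterpart, written against the `temp` view from above:
#   SELECT id, pos, letter
#   FROM temp
#   LATERAL VIEW posexplode(split(split_str, ',')) t2 AS pos, letter

# Same sample rows as the DataFrame created earlier.
rows = [(1, "A,B"), (2, "C,D"), (3, "E")]


def posexplode_rows(rows, sep=","):
    """Yield one (id, pos, element) tuple per element of the split string.

    Hypothetical helper: positions are 0-based, as with posexplode.
    """
    for row_id, split_str in rows:
        for pos, element in enumerate(split_str.split(sep)):
            yield (row_id, pos, element)


result = list(posexplode_rows(rows))
for r in result:
    print(r)
```

Each input row produces as many output rows as it has elements, so id 1 and id 2 each appear twice and id 3 appears once, matching the row counts in the explode output above.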
2022-03-29, Jiulonghu, Jiangning District, Nanjing