赞
踩
select address,count(1) from people group by address; 等价的算子如下
- scala> peopleDF.show()
- +--------+---+--------+
- | name|age| address|
- +--------+---+--------+
- |zhangsan| 22| chengdu|
- | wangwu| 33| beijing|
- | lisi| 28|shanghai|
- |xiaoming| 28| beijing|
- | mm| 21| chengdu|
- |xiaoming| 18| beijing|
- | mm| 11| chengdu|
- +--------+---+--------+
-
-
- scala> peopleDF.groupBy("address").count().show
- +--------+-----+
- | address|count|
- +--------+-----+
- | beijing| 3|
- | chengdu| 3|
- |shanghai| 1|
- +--------+-----+
-
- //等价于select address,max(age) from people group by address;
- scala> peopleDF.groupBy("address").max("age").show
- +--------+--------+
- | address|max(age)|
- +--------+--------+
- | beijing| 33|
- | chengdu| 22|
- |shanghai| 28|
- +--------+--------+
-
-
- //等价于select address,avg(age) from people group by address;
- scala> peopleDF.groupBy("address").avg("age").show
- +--------+------------------+
- | address| avg(age)|
- +--------+------------------+
- | beijing|26.333333333333332|
- | chengdu| 18.0|
- |shanghai| 28.0|
- +--------+------------------+
-
- //等价于select address,min(age) from people group by address;
- scala> peopleDF.groupBy("address").min("age").show
- +--------+--------+
- | address|min(age)|
- +--------+--------+
- | beijing| 18|
- | chengdu| 11|
- |shanghai| 28|
- +--------+--------+
-
- //等价于select address,sum(age) from people group by address;
- scala> peopleDF.groupBy("address").sum("age").show
- +--------+--------+
- | address|sum(age)|
- +--------+--------+
- | beijing| 79|
- | chengdu| 54|
- |shanghai| 28|
- +--------+--------+
- //等价于select address,count(age),max(age),min(age),avg(age),sum(age) from people group by address;
- scala> peopleDF.groupBy("address").agg(count("age"),max("age"),min("age"),avg("age"),sum("age")).show
- +--------+----------+--------+--------+------------------+--------+
- | address|count(age)|max(age)|min(age)| avg(age)|sum(age)|
- +--------+----------+--------+--------+------------------+--------+
- | beijing| 3| 33| 18|26.333333333333332| 79|
- | chengdu| 3| 22| 11| 18.0| 54|
- |shanghai| 1| 28| 28| 28.0| 28|
- +--------+----------+--------+--------+------------------+--------+
- scala> peopleDF.groupBy("address").agg(count("age").as("cnt"),avg("age").as("avg")).show
- +--------+---+------------------+
- | address|cnt| avg|
- +--------+---+------------------+
- | beijing| 3|26.333333333333332|
- | chengdu| 3| 18.0|
- |shanghai| 1| 28.0|
- +--------+---+------------------+
- //求同名用户在同一个地址的平均年龄
- //把name的不同值作为列名
- scala> peopleDF.groupBy("address").pivot("name").avg("age").show
- +--------+----+----+------+--------+--------+
- | address|lisi| mm|wangwu|xiaoming|zhangsan|
- +--------+----+----+------+--------+--------+
- | beijing|null|null| 33.0| 23.0| null|
- | chengdu|null|16.0| null| null| 22.0|
- |shanghai|28.0|null| null| null| null|
- +--------+----+----+------+--------+--------+
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。