Asked by: 小点点

PySpark: evaluating specific columns together


My Spark DataFrame looks like this:

+----+----+----+---+
| a  | b  | c  | d |
+----+----+----+---+
| 13 | 43 | 67 | 3 |
+----+----+----+---+

Is it possible to select specific columns and evaluate them together to produce the following?

+----+----+----+---+-----+-----+-----------+
| a  | b  | c  | d | a+b | c-b | a+b / c-b |
+----+----+----+---+-----+-----+-----------+
| 13 | 43 | 67 | 3 |  56 |  24 |      2.33 |
+----+----+----+---+-----+-----+-----------+

1 answer

Anonymous user

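from pyspark.sql.functions import expr

# Each expr() string is parsed as a Spark SQL expression over the existing columns.
(
    df.withColumn("a+b", expr("a + b"))
    .withColumn("c-b", expr("c - b"))
    # Parentheses are required: "a + b / c - b" would divide b by c first.
    .withColumn("a+b / c-b", expr("(a + b) / (c - b)"))
    .show()
)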
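
As a quick sanity check on the expected numbers, the arithmetic for the single row in the question can be worked through in plain Python. This is only a sketch: the variable names are made up for illustration, and it assumes Spark SQL applies the usual operator precedence (division before addition and subtraction) and that `/` performs floating-point division, as Python 3 does.

```python
# Values taken from the single row in the question.
a, b, c, d = 13, 43, 67, 3

sum_ab = a + b             # the "a+b" column
diff_cb = c - b            # the "c-b" column
ratio = sum_ab / diff_cb   # the "a+b / c-b" column, i.e. (a + b) / (c - b)

# Same expression without parentheses: b / c is evaluated first.
unparenthesized = a + b / c - b

print(sum_ab, diff_cb, round(ratio, 2))   # 56 24 2.33
print(round(unparenthesized, 2))          # -29.36
```

The unrounded ratio is 2.333..., so the 2.33 shown in the question is a rounded value. If two decimals are wanted in the DataFrame itself, wrapping the expression as `round((a + b) / (c - b), 2)` inside `expr` would be one way to do it.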