I have a DataFrame with three columns: Id, First Name, and Last Name.
I want to group by Id and collect the First Name and Last Name columns as lists.
For example, I have a DataFrame like this:
+---+-------+--------+
|id |fName  |lName   |
+---+-------+--------+
|1  |Akash  |Sethi   |
|2  |Kunal  |Kapoor  |
|3  |Rishabh|Verma   |
|2  |Sonu   |Mehrotra|
+---+-------+--------+
and I want the output to look like this:
+---+-------------+------------------+
|id |fName        |lName             |
+---+-------------+------------------+
|1  |[Akash]      |[Sethi]           |
|2  |[Kunal, Sonu]|[Kapoor, Mehrotra]|
|3  |[Rishabh]    |[Verma]           |
+---+-------------+------------------+
Thanks in Advance
Solution
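You can aggregate multiple columns in a single agg call:
df.groupBy("id").agg(collect_list("fName").alias("fName"), collect_list("lName").alias("lName"))
collect_list comes from Spark's SQL functions (org.apache.spark.sql.functions in Scala, pyspark.sql.functions in Python), so import it first.
This returns one row per id, with the first and last names collected into lists. Without the alias calls, the output columns are named collect_list(fName) and collect_list(lName). Note that the order of elements within each list is not guaranteed after a shuffle.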
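To make it clearer what the grouped aggregation computes, here is a minimal plain-Python sketch of the same idea on the sample rows from the question. It does not use Spark at all; the helper name group_collect and the tuple layout are assumptions made for illustration only, and unlike Spark this version keeps the input order within each list.

```python
from collections import defaultdict

# Sample rows from the question: (id, fName, lName)
rows = [
    (1, "Akash", "Sethi"),
    (2, "Kunal", "Kapoor"),
    (3, "Rishabh", "Verma"),
    (2, "Sonu", "Mehrotra"),
]


def group_collect(rows):
    """Hypothetical helper: group rows by id and collect each name column into a list."""
    f_names = defaultdict(list)
    l_names = defaultdict(list)
    for row_id, f_name, l_name in rows:
        # One list per id and per column, filled in input order
        f_names[row_id].append(f_name)
        l_names[row_id].append(l_name)
    # One output row per distinct id, sorted by id for readability
    return [(row_id, f_names[row_id], l_names[row_id]) for row_id in sorted(f_names)]


result = group_collect(rows)
for row in result:
    print(row)
```

Each output tuple corresponds to one row of the desired DataFrame: the id, the list of first names, and the list of last names for that id.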