I need to implement a function similar to Redshift's listagg() within group (order by x_column), but it has to be in Spark SQL, using the functions listed here:
https://spark.apache.org/docs/2.4.0/api/sql/
There is this similar question, but the answer is not in SQL.
My query in Redshift SQL would be:
select KEY,
       listagg(CODE, '-') within group (order by DATE) as CODE
from demo_table
group by KEY
At this point the order by clause is not essential; aggregating all the values per group would be enough. I have tried concat_ws and it does not work as expected.
A PySpark-only solution won't work for me; it has to be Spark SQL.
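For illustration, here is a minimal sketch of the behaviour I need, with hypothetical data (the values and the date format are made up for the example):

    -- demo_table (hypothetical input)
    KEY   CODE   DATE
    1     a      1/1/2020
    1     b      2/1/2020
    2     c      1/1/2020

    -- expected output
    KEY   CODE
    1     a-b
    2     c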
select KEY,
       array_join(             -- concatenate the array into one string
         collect_list(CODE),   -- aggregate that collects CODE values into an array
         '-'                   -- delimiter
       ) as CODE
from demo_table
group by KEY
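Note that collect_list on its own makes no ordering guarantee. If the order by DATE ends up mattering, a minimal sketch of one way to get it, assuming Spark 2.4+ (for the higher-order functions array_sort and transform) and assuming a 'M/d/yyyy' string date format, is to collect (date, code) structs, sort them, and keep only the code:

select KEY,
       array_join(
         transform(
           -- sort by parsed date (first struct field); 'M/d/yyyy' is an assumed format
           array_sort(collect_list(named_struct('D', to_date(DATE, 'M/d/yyyy'), 'CODE', CODE))),
           x -> x.CODE          -- keep only CODE after sorting
         ),
         '-'
       ) as CODE
from demo_table
group by KEY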
The query below works and includes the order by as well. The windowed collect_list builds a growing list of codes per key (the default window frame runs up to the current row), and since each joined string is a prefix of the next, max(code) keeps the complete one. Note that the date format must be 'M/d/yyyy'; in Java date patterns, uppercase 'M' is month while lowercase 'm' means minutes. Please check.
spark.sql("""select key, max(code) as code
from (select key,
             -- running list of codes per key, ordered by parsed date
             array_join(collect_list(code) over (partition by key order by to_date(date, 'M/d/yyyy')), '-') as code
      from view) t
group by key""").show(100)