sql - How to get array/bag of elements from Hive group by operator?


I want to group by a given field and output the grouped fields. Below is an example of what I am trying to achieve:

Imagine a table named 'sample_table' with two columns, as below:

f1   f2
001  111
001  222
001  123
002  222
002  333
003  555

I want to write a Hive query that gives the output below:

001  [111, 222, 123]
002  [222, 333]
003  [555]

In Pig, this can be achieved like this:

grouped_relation = GROUP sample_table BY f1;

Can someone please suggest whether there is a simple way to do this in Hive? All I can think of is writing a user-defined function (UDF) for it, but that may be a time-consuming option.

The built-in aggregate function collect_set (documented here) gets you almost what you want. It would actually work on your example input:

SELECT f1, collect_set(f2) FROM sample_table GROUP BY f1;

Unfortunately, it also removes duplicate elements, and I imagine that isn't your desired behavior. I find it odd that collect_set exists but there is no version that keeps duplicates. Someone else apparently thought the same thing. It looks like the top and second answers there give you the UDAF you need.
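For what it is worth, later Hive releases (0.13.0 and newer) include a built-in collect_list aggregate that keeps duplicates, so a custom UDAF may not be needed there. A minimal sketch, assuming such a Hive version and the sample_table from the question; the alias f2_values is only an illustrative name:

```sql
-- Assumes Hive 0.13.0 or later, where collect_list is available as a built-in.
-- collect_list gathers every f2 value per group, duplicates included,
-- whereas collect_set would drop repeated values.
SELECT f1,
       collect_list(f2) AS f2_values  -- f2_values is a hypothetical alias
FROM sample_table
GROUP BY f1;
```

Note that the order of elements inside each array is not guaranteed, so the values may not come back in the same order as the rows appear in the table.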

