sql - How to get array/bag of elements from Hive group by operator?
I want to group by a given field and output the grouped fields. Below is an example of what I am trying to achieve:
Imagine a table named 'sample_table' with 2 columns as below:
f1   f2
001  111
001  222
001  123
002  222
002  333
003  555
I want to write a Hive query that will give the below output:
001  [111, 222, 123]
002  [222, 333]
003  [555]
In Pig, this can be achieved like this:
grouped_relation = GROUP sample_table BY f1;
Can someone please suggest whether there is a simple way to do this in Hive? All I can think of is writing a user-defined function (UDF) for this, but that may be a very time-consuming option.
The built-in aggregate function collect_set (documented here) gets you almost what you want. It would actually work on your example input:
SELECT f1, collect_set(f2) FROM sample_table GROUP BY f1;
Unfortunately, it also removes duplicate elements, and I imagine that isn't your desired behavior. I find it odd that collect_set exists but there is no version that keeps duplicates. Someone else apparently thought the same thing. It looks like the top and second answers there give you the UDAF you need.
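For what it's worth, later Hive releases (0.13 and up, as far as I know) ship a built-in collect_list aggregate that keeps duplicates, so a custom UDAF may not be needed there. A minimal sketch, assuming the sample_table from the question and a Hive version that has collect_list; the f2_list alias is just an illustrative name:

```sql
-- Assumes Hive 0.13 or later, where collect_list is available.
-- collect_list keeps duplicate values; collect_set would drop them.
SELECT f1,
       collect_list(f2) AS f2_list  -- f2_list is a made-up alias
FROM sample_table
GROUP BY f1;
```

Note that neither collect_set nor collect_list guarantees the order of elements inside each array, so the values for 001 may not come back as 111, 222, 123 in exactly that order.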