Wednesday, April 22, 2015

Apache Pig: Flatten column names after join

Say I do a join like this:

A = load 'inputa' as (f1,f2,f3,f4);  
B = load 'inputb' as (f1,f2,f5,f6);  

J = join A by (f1,f2) left outer, B by (f1,f2);  

What is the simplest way to arrive at J's field names being f1,f2,f3,f4,f5,f6? (as opposed to A::f1, A::f2, A::f3, A::f4, B::f1, B::f2, B::f5, B::f6)

I know I can do the following:

C = foreach J generate A::f1 as f1, A::f2 as f2, A::f3 as f3, A::f4 as f4, B::f5 as f5, B::f6 as f6;

But I run into situations where the join produces a lot of columns, so writing them all out is impractical. Is this something I should use a UDF for? I'm a Pig novice and could use some guidance.
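
For reference, one partial shortcut may be Pig's project-range operator (`..`, which I believe is available from Pig 0.9 onward). The sketch below assumes J has exactly the eight fields shown above, in load order. It renames the two join keys explicitly and drops B's copies of them, but as far as I can tell the ranged fields keep their A::/B:: prefixes in the schema, so it shortens the statement without fully flattening the names:

```pig
-- J's fields by position: $0..$3 come from A (f1..f4), $4..$7 from B (f1, f2, f5, f6).
-- Rename the ambiguous join keys by hand, then pull in the remaining
-- columns as ranges, skipping B's duplicate keys (B::f1 and B::f2).
C = foreach J generate A::f1 as f1, A::f2 as f2, A::f3 .. A::f4, B::f5 .. B::f6;
```

Since f3 through f6 each occur only once in C, my understanding is that later statements can refer to them by their short names even though describe still prints the prefixes.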

Thanks!



