Monday, August 31, 2015

Hive table to JSON and upload to S3

I have a Hive table in the following format:

col1 col2
1 {"test": "hi"}
2 {"test2": "hi2"}
3 {"test3": "hi3"}

I can run any query against it, such as select * from table. What would be the best way to transform that table into a text file in which each line is a JSON string that merges col1 (as "id") into the object from col2, like this:

{"id": 1, "test": "hi"}
{"id": 2, "test2": "hi2"}
{"id": 3, "test3": "hi3"}

Would this be a TRANSFORM call with a mapper script? Also, once I have the text files of JSON lines, I would like to upload them to an S3 bucket. In Python I could use boto for the upload, but is there equivalent functionality in a Hive environment?
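To make the mapper idea concrete, here is a minimal sketch of what such a script might look like. It assumes Hive streams each row to the script as tab-separated text on stdin (the default for TRANSFORM), that col1 is an integer, and that col2 always holds a valid JSON object. The function names and the script name used below are hypothetical.

```python
import json
import sys


def row_to_json_line(line):
    """Turn one tab-separated row (col1, col2) into a single JSON string."""
    # Split only on the first tab so the JSON text in col2 stays intact.
    col1, col2 = line.rstrip("\n").split("\t", 1)
    record = {"id": int(col1)}       # assumption: col1 is an integer id
    record.update(json.loads(col2))  # assumption: col2 is a JSON object
    return json.dumps(record)


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Emit one JSON line per non-empty input row.
    for line in stdin:
        if line.strip():
            stdout.write(row_to_json_line(line) + "\n")


# Only read stdin when input is actually piped in, as it is under TRANSFORM.
if __name__ == "__main__" and not sys.stdin.isatty():
    main()
```

If the script were saved as to_json_lines.py (a made-up name), it could presumably be registered with ADD FILE and called with something like SELECT TRANSFORM(col1, col2) USING 'python to_json_lines.py' AS (json_line) FROM table. Note that a key named "id" inside col2 would overwrite the value taken from col1. Whether the query output can be written straight to an S3 location depends on how the cluster's filesystem is configured, which I have not verified.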

Thanks



