Saturday, 28 March 2015

Processing different file types in Hadoop

I have installed Hadoop and Hive. I can process and query XLS and TSV files using Hive. I want to process other file types such as DOCX, PDF, and PPT. How can I do this? Is there a separate procedure for processing these files on AWS? Please help me.




