Monday, April 27, 2015

How to specify file path on AWS for Pig

I'm trying to run a very basic Pig script on AWS. All it does is load a file and store it again.

Wiki = LOAD 'http://ift.tt/1OZf8Eo' USING PigStorage(',') AS (from:int,namespace:int,title:chararray,from_namespace:int);

STORE Wiki into 'outputpig';

This script runs perfectly locally but fails on AWS, which leads me to believe that the formatting of the input path or output path is causing the error. For the input path I've tried the relative path 'part-00000', the absolute path 'http://ift.tt/1OZf8Eo', and the native path 'http://ift.tt/1ExaCLU', and none of them worked. I've also tried making the file public just in case, with no luck. What should the path format be, and how does it interact with the paths defined in the custom JAR command?

Currently, the custom JAR command I'm using is this:

s3://elasticmapreduce/libs/pig/pig-script --run-pig-script --pig-versions 0.12.0 --args -f s3://mayar/April_27/pagelinks_basic.pig -p INPUT=s3://mayar/April_27/OUTPUT/ -p OUTPUT=s3://mayar/April_27/O
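As far as I understand, Pig's parameter substitution makes each -p NAME=value pair available inside the script as $NAME, so the script could pick up the paths from the command above rather than hard-coding them. A minimal sketch of what I think that would look like, assuming INPUT points at the S3 location that holds the input file (I have not confirmed that this runs on EMR):

```pig
-- $INPUT and $OUTPUT are filled in from the -p INPUT=... and -p OUTPUT=... arguments.
-- Assumption: $INPUT is an s3:// file or directory; for a directory, LOAD reads every file in it.
Wiki = LOAD '$INPUT' USING PigStorage(',') AS (from:int,namespace:int,title:chararray,from_namespace:int);

-- Assumption: $OUTPUT is an s3:// directory that does not exist yet, since STORE fails if the target already exists.
STORE Wiki INTO '$OUTPUT';
```

If that is right, the script itself would contain no bucket names at all, and the paths would be controlled entirely by the -p arguments.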

Thank you!



