I am trying to run a MapReduce job on spot instances. I launch my instances using StarCluster and its Hadoop plugin. I have no problem uploading the data, putting it into HDFS, and then copying the result back out of HDFS. My question is: is there a way to load the data directly from S3 and push the result back to S3? I don't want to manually download the data from S3 into HDFS and then push the result from HDFS to S3. Is there a way to do this in the background?
I am using the standard MIT StarCluster AMI.
Thanks