Can someone tell me how to write the mapper and reducer functions for the following problem statement?
Perform Flight History Analysis using Hadoop systems and parallel data processing mechanisms. The input file contains information about all the flights in the USA and is a 15 GB database available at http://ift.tt/1NSFc89. A snippet of the input file is available for analysis as part of the exercise.
Generate the following analytical reports after cleansing the data:
- Average departure delay for all flights in minutes
- Departure delay standard deviation for all flights in minutes
- Delayed flights performance by state: Which states have had relatively more delayed flights during this time?
Note: A flight counts as delayed if its departure delay in minutes is greater than two times the average (the result of the first step).
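
One possible way to split the three reports into two map/reduce passes is sketched below in Python, in the style of Hadoop Streaming scripts. This is only a sketch under assumptions: the input is taken to be plain comma-separated text with no quoted commas, the column positions `delay_col` and `state_col` are hypothetical and have to be matched to the real file, and all function names and the key `"all"` are made up for illustration. The first pass emits partial sums (count, sum, sum of squares) so that the mean and the population standard deviation can be computed in a single reducer. The second pass needs the mean from the first pass and counts, per state, how many flights exceed twice that mean. On a real cluster the framework does the grouping by key between mapper and reducer; here the reducers simply consume every pair, which keeps the example runnable locally.

```python
import math
from collections import defaultdict


def parse(line, delay_col, state_col):
    """Cleansing step: return (state, delay) or None for unusable rows.

    Assumes simple comma-separated text; column indices are hypothetical.
    """
    fields = line.rstrip("\r\n").split(",")
    try:
        state = fields[state_col].strip()
        delay = float(fields[delay_col])
    except (IndexError, ValueError):
        return None  # header row, short row, or non-numeric delay such as "NA"
    if not state:
        return None  # missing state
    return state, delay


# Pass 1: average and standard deviation of the departure delay.
def stats_mapper(lines, delay_col, state_col):
    for line in lines:
        rec = parse(line, delay_col, state_col)
        if rec is not None:
            _, delay = rec
            # One shared key, so a single reducer sees every partial sum.
            yield "all", (1, delay, delay * delay)


def stats_reducer(pairs):
    n, total, total_sq = 0, 0.0, 0.0
    for _, (count, s, sq) in pairs:
        n += count
        total += s
        total_sq += sq
    if n == 0:
        raise ValueError("no valid records after cleansing")
    mean = total / n
    # Population variance: E[x^2] - (E[x])^2, clamped against rounding error.
    variance = max(total_sq / n - mean * mean, 0.0)
    return mean, math.sqrt(variance)


# Pass 2: share of delayed flights per state, given the mean from pass 1.
def delayed_mapper(lines, delay_col, state_col, mean):
    for line in lines:
        rec = parse(line, delay_col, state_col)
        if rec is not None:
            state, delay = rec
            yield state, (1, 1 if delay > 2 * mean else 0)


def delayed_reducer(pairs):
    totals = defaultdict(lambda: [0, 0])  # state -> [flights, delayed flights]
    for state, (count, late) in pairs:
        totals[state][0] += count
        totals[state][1] += late
    return {state: late / count for state, (count, late) in totals.items()}
```

To try it locally, feed the same list of lines to both passes: `mean, std = stats_reducer(stats_mapper(lines, delay_col, state_col))`, then `delayed_reducer(delayed_mapper(lines, delay_col, state_col, mean))`. Sorting the resulting dictionary by value gives the states with relatively more delayed flights. In an actual streaming job each function would read from standard input and print tab-separated key/value lines, and the mean would have to be handed to the second job, for example as a command-line argument.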