Monday, May 4, 2015

In an AWS Hadoop task log, what do "soft limit", "bufstart", "bufvoid", "kvstart" and "length" mean?

On AWS, the Hadoop task log can give users information about memory failures. My task is to provide further detail: how much additional memory was consumed before the crash, and which task caused it. I just ran a MapReduce job that I deliberately made run out of memory (OutOfMemoryError). The task log is below, and I hope to get the memory usage information from it:

2015-05-04 22:39:25,417 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2015-05-04 22:39:25,468 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2015-05-04 22:39:26,648 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
2015-05-04 22:39:27,638 INFO [main] org.apache.hadoop.mapred.Task: Using ResourceCalculatorProcessTree : [ ]
2015-05-04 22:39:27,974 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: hdfs://172.31.18.58:9000/input/input_1.txt:0+1130
2015-05-04 22:39:28,002 INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2015-05-04 22:39:28,321 INFO [main] org.apache.hadoop.mapred.MapTask: (EQUATOR) 0 kvi 52428796(209715184)
2015-05-04 22:39:28,322 INFO [main] org.apache.hadoop.mapred.MapTask: mapreduce.task.io.sort.mb: 200
2015-05-04 22:39:28,322 INFO [main] org.apache.hadoop.mapred.MapTask: soft limit at 167772160
2015-05-04 22:39:28,322 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid = 209715200
2015-05-04 22:39:28,322 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 52428796; length = 13107200
2015-05-04 22:39:28,382 INFO [main] com.hadoop.compression.lzo.GPLNativeCodeLoader: Loaded native gpl library from the embedded binaries
2015-05-04 22:39:28,386 INFO [main] com.hadoop.compression.lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev 77cfa96225d62546008ca339b7c2076a3da91578]
2015-05-04 22:39:30,000 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output
2015-05-04 22:39:30,018 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.snappy]
2015-05-04 22:39:30,034 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2367)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
    at java.lang.StringBuilder.append(StringBuilder.java:132)
    at WordCount$Map.map(WordCount.java:25)
    at WordCount$Map.map(WordCount.java:1)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:152)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:773)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170)

What do the following lines mean?

2015-05-04 22:39:28,322 INFO [main] org.apache.hadoop.mapred.MapTask: mapreduce.task.io.sort.mb: 200
2015-05-04 22:39:28,322 INFO [main] org.apache.hadoop.mapred.MapTask: soft limit at 167772160
2015-05-04 22:39:28,322 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid = 209715200
2015-05-04 22:39:28,322 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 52428796; length = 13107200

Is this information related to memory usage?
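As a rough sanity check, not an authoritative reading of Hadoop's internals, the figures in those lines appear to follow from mapreduce.task.io.sort.mb alone. The sketch below reproduces them under two assumptions that are not stated in the log: a spill threshold of 80% (the usual default for mapreduce.map.sort.spill.percent) and 16 bytes of bookkeeping metadata per record. The class and method names are made up for illustration.

```java
// Sketch only: derives the logged figures from mapreduce.task.io.sort.mb.
// ASSUMPTIONS (not taken from the log): an 80% spill threshold and
// 16 bytes (4 ints) of metadata per buffered record.
class SortBufferSketch {
    static final int META_INTS = 4;               // assumed ints of metadata per record
    static final int META_BYTES = META_INTS * 4;  // 16 bytes per record

    // Size of the map-side sort buffer in bytes; logged as "bufvoid".
    static int bufvoid(int sortMb) {
        int bytes = sortMb << 20;                 // MB -> bytes
        return bytes - (bytes % META_BYTES);      // align to whole metadata records
    }

    // Fill level in bytes at which a background spill would start; logged as "soft limit".
    static int softLimit(int bufvoid, int spillPercent) {
        return (int) ((long) bufvoid * spillPercent / 100);
    }

    // Int-sized index of the first metadata slot, just below the end of the
    // buffer when the equator is at 0; logged as "kvstart" (and "kvi").
    static int kvstart(int bufvoid) {
        return (bufvoid - META_BYTES) / 4;
    }

    // How many metadata records would fit if the whole buffer held metadata;
    // logged as "length".
    static int length(int bufvoid) {
        return bufvoid / META_BYTES;
    }

    public static void main(String[] args) {
        int size = bufvoid(200);
        System.out.println("bufvoid    = " + size);
        System.out.println("soft limit = " + softLimit(size, 80));
        System.out.println("kvstart    = " + kvstart(size));
        System.out.println("length     = " + length(size));
    }
}
```

If those assumptions hold, the four lines describe how the fixed 200 MB sort buffer is laid out when the map task starts, not how much heap the task was using when it failed. The stack trace points at StringBuilder.append called from WordCount$Map.map as the place where the heap ran out.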



