I submit jobs with spark-submit on an Amazon EMR cluster. I'd like all Spark logging to be sent to Redis/Logstash. What is the proper way to configure Spark on EMR to do this? Options I've considered:
- Keep log4j: Add a bootstrap action that modifies /home/hadoop/spark/conf/log4j.properties to add an appender? However, this file already contains a lot of configuration and is a symlink to a Hadoop conf file. I don't want to fiddle with it too much, as it already defines some rootLoggers. Which appender would work best: ryantenney/log4j-redis-appender combined with logstash/log4j-jsonevent-layout, or pavlobaron/log4j2redis?
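For the first option, I imagine the bootstrap action would append something like the following to log4j.properties (a rough sketch, assuming the ryantenney/log4j-redis-appender and logstash/log4j-jsonevent-layout jars are already on the classpath; the hostname and Redis key here are placeholders):

```properties
# Hypothetical fragment appended by a bootstrap action.
# The existing rootLogger line would also need to be edited
# (e.g. with sed) to reference the REDIS appender.
log4j.appender.REDIS=com.ryantenney.log4j.RedisAppender
log4j.appender.REDIS.host=redis.example.internal
log4j.appender.REDIS.port=6379
log4j.appender.REDIS.key=logstash
log4j.appender.REDIS.layout=net.logstash.log4j.JSONEventLayoutV1
```

Is appending like this, plus rewriting the rootLogger line, a sane approach, or is there a cleaner way on EMR?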
- Migrate to slf4j + logback: Exclude slf4j-log4j12 from spark-core, add log4j-over-slf4j ... and use a logback.xml with a com.cwbase.logback.RedisAppender? It looks like this will be problematic with dependencies. Will it hide the log4j rootLoggers already defined in log4j.properties?
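If I went the logback route, I'd expect the logback.xml to look roughly like this (a sketch only, assuming the cwbase/logback-redis-appender jar is on the classpath; the host and key values are placeholders):

```xml
<!-- Hypothetical logback.xml shipped alongside the job -->
<configuration>
  <appender name="REDIS" class="com.cwbase.logback.RedisAppender">
    <host>redis.example.internal</host>
    <port>6379</port>
    <key>logstash</key>
  </appender>
  <root level="INFO">
    <appender-ref ref="REDIS"/>
  </root>
</configuration>
```

But I'm unsure how this interacts with the log4j configuration that EMR already ships.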
- Anything else I missed?
What are your thoughts on this?