Python: a script for running the Hadoop example
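A combined script for this note. The job is launched through the relative path bin/hadoop, so run it from the Hadoop installation directory: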
rm -r /home/hduser/python/wordcount/out/pyoutput
hadoop fs -rm -R /user/hduser/pyoutput
bin/hadoop jar /usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.2.0.jar -file /home/hduser/python/wordcount/mapper.py -mapper "python /home/hduser/python/wordcount/mapper.py" -file /home/hduser/python/wordcount/reducer.py -reducer "python /home/hduser/python/wordcount/reducer.py" -input /user/hduser/myinput/* -output /user/hduser/pyoutput
hadoop fs -copyToLocal /user/hduser/pyoutput /home/hduser/python/wordcount/out
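The script above assumes that mapper.py and reducer.py already exist; this note does not show them. As a rough illustration only, here is a minimal sketch of what a word-count pair might look like, written as two functions in one file so it can be tried locally without Hadoop. The names `map_lines`, `reduce_lines` and `run_locally` are hypothetical. In a real streaming job each step would be its own script that reads sys.stdin and prints to stdout, and the `sorted()` call here only stands in for the sort that Hadoop performs between the map and reduce phases.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper step: emit one 'word<TAB>1' record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(sorted_records):
    """Reducer step: sum counts per word; input must be sorted by key."""
    parsed = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def run_locally(lines):
    """Simulate map -> sort -> reduce on a single machine (assumption:
    a plain sort is a good enough stand-in for Hadoop's shuffle)."""
    mapped = list(map_lines(lines))
    return list(reduce_lines(sorted(mapped)))


if __name__ == "__main__":
    for record in run_locally(["to be or not", "to be"]):
        print(record)
```

Trying the logic locally like this before submitting the job tends to be much faster than debugging it through Hadoop task logs.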
Setting the number of reducers
To set the number of reducers, add the -jobconf mapred.reduce.tasks option to the streaming command, for example:
-jobconf mapred.reduce.tasks=0
The full script with this option:
rm -r /home/hduser/python/wordcount/out/pyoutput
hadoop fs -rm -R /user/hduser/pyoutput
bin/hadoop jar /usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.2.0.jar -file /home/hduser/python/wordcount/mapper.py -mapper "python /home/hduser/python/wordcount/mapper.py" -file /home/hduser/python/wordcount/reducer.py -reducer "python /home/hduser/python/wordcount/reducer.py" -jobconf mapred.reduce.tasks=0 -input /user/hduser/myinput/* -output /user/hduser/pyoutput
hadoop fs -copyToLocal /user/hduser/pyoutput /home/hduser/python/wordcount/out
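With mapred.reduce.tasks=0 Hadoop runs a map-only job: the reducer is not invoked, and the mapper output is written to the output directory as the final result, without sorting or aggregation. The sketch below imitates that difference locally. It is not Hadoop code; `mapper`, `reducer` and `simulate_job` are hypothetical names chosen for illustration, and the word-count logic is an assumption about what the scripts in this note do.

```python
from collections import Counter


def mapper(lines):
    """Emit a (word, 1) pair for every word, as a word-count mapper would."""
    return [(word, 1) for line in lines for word in line.split()]


def reducer(pairs):
    """Sum the counts per word, as a word-count reducer would."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return sorted(totals.items())


def simulate_job(lines, reduce_tasks):
    """Hypothetical local stand-in for a streaming job.

    With reduce_tasks == 0 the reducer is skipped and the raw mapper
    output is returned, mirroring a map-only job.
    """
    mapped = mapper(lines)
    if reduce_tasks == 0:
        return mapped
    return reducer(mapped)


if __name__ == "__main__":
    data = ["to be or not", "to be"]
    print(simulate_job(data, reduce_tasks=0))  # raw, unaggregated pairs
    print(simulate_job(data, reduce_tasks=1))  # per-word totals
```

So after running the script with zero reducers, expect the files copied to the local out directory to contain one record per word occurrence rather than per-word totals.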
- vedro-compota's blog