[hadoop] 3003 - Submitting a MapReduce job
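I. Running locally from Eclipse
See the demo in the chapter "[hadoop] 3002 - MapReduce word count example".
II. Running on the cluster as a jar
1. Package the data-processing job into a jar and upload it to the Hadoop node (cloud01)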
[hadoop@cloud01 HDFSdemo]$ pwd
/home/hadoop/workspace/HDFSdemo
[hadoop@cloud01 HDFSdemo]$ ll
total 139844
drwxrwxr-x. 5 hadoop hadoop 4096 Feb 24 18:10 bin
-rw-rw-r--. 1 hadoop hadoop 440 Feb 20 06:56 core-site.xml
-rw-rw-r--. 1 hadoop hadoop 256 Feb 20 06:56 hdfs-site.xml
drwxrwxr-x. 2 hadoop hadoop 4096 Feb 20 06:34 lib
-rw-rw-r--. 1 hadoop hadoop 253 Feb 20 06:56 mapred-site.xml
drwxrwxr-x. 5 hadoop hadoop 4096 Feb 24 18:10 src
-rw-rw-r--. 1 hadoop hadoop 143167974 Feb 24 21:41 wc.jar
-rw-rw-r--. 1 hadoop hadoop 434 Feb 20 06:56 yarn-site.xml
2. Start YARN by running the start-yarn.sh command
[hadoop@cloud01 HDFSdemo]$ start-yarn.sh
[hadoop@cloud01 HDFSdemo]$ jps
22901 Jps
17507 DataNode
22510 NodeManager
17414 NameNode
2721
22413 ResourceManager
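The check above can be scripted instead of read by eye. The following is a minimal sketch, assuming a single-node setup like the one in the listing, where all four daemons run on cloud01; the function name missing_daemons is hypothetical and not part of Hadoop.

```shell
# Print every required daemon name that does not appear in `jps` output
# read from stdin. The default list mirrors the daemons in the listing above.
missing_daemons() {
    local required="${1:-NameNode DataNode ResourceManager NodeManager}"
    local listing name
    listing="$(cat)"
    for name in $required; do
        # jps prints "<pid> <name>"; compare the second field exactly.
        if ! printf '%s\n' "$listing" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            printf '%s\n' "$name"
        fi
    done
}
```

Typical use would be `jps | missing_daemons`. Empty output means all four daemons are up; if ResourceManager or NodeManager is printed, start-yarn.sh has not taken effect yet.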
3. Run wc.jar on the cluster
[hadoop@cloud01 ~]$ hadoop jar workspace/HDFSdemo/wc.jar mapreduce.WordCount
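The second argument of `hadoop jar` is the fully qualified main class, so a typo there only fails at launch time. As a small sketch, the jar can be checked for that class before submitting. It assumes the JDK's `jar` tool is on the PATH; the function names class_to_entry and jar_has_class are hypothetical helpers.

```shell
# Convert a fully qualified class name to its path inside a jar,
# e.g. mapreduce.WordCount -> mapreduce/WordCount.class
class_to_entry() {
    printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Succeed if the jar ($1) contains the class ($2).
# `jar tf` lists the entries of the archive, one per line.
jar_has_class() {
    jar tf "$1" | grep -qxF "$(class_to_entry "$2")"
}
```

For example: `jar_has_class workspace/HDFSdemo/wc.jar mapreduce.WordCount && hadoop jar workspace/HDFSdemo/wc.jar mapreduce.WordCount`.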
3.1 Log output during execution
-- Connect to the ResourceManager: client.RMProxy: Connecting to ResourceManager
-- Compute the input splits; each split is handled by one map task: input.FileInputFormat: Total input paths to process : 1
-- Generate the job ID for this run: mapreduce.JobSubmitter: Submitting tokens for job: job_1424843731958_0002
-- Run the submitted job: mapreduce.Job: Running job: job_1424843731958_0002
-- Report map and reduce progress
15/02/24 22:09:30 INFO mapreduce.Job: map 0% reduce 0%
15/02/24 22:09:39 INFO mapreduce.Job: map 100% reduce 0%
15/02/24 22:09:52 INFO mapreduce.Job: map 100% reduce 100%
15/02/24 22:09:53 INFO mapreduce.Job: Job job_1424843731958_0002 completed successfully
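The log lines above have a regular shape, so the job ID and the latest progress can be pulled out with standard text tools. This is a sketch that only assumes the message formats shown in this section; job_id and job_progress are hypothetical helper names.

```shell
# Print the first job ID of the form job_<digits>_<digits> found on stdin.
job_id() {
    grep -o 'job_[0-9][0-9]*_[0-9][0-9]*' | head -n 1
}

# Print "<map%> <reduce%>" taken from the last progress line on stdin.
job_progress() {
    sed -n 's/.*mapreduce\.Job: *map \([0-9][0-9]*\)% reduce \([0-9][0-9]*\)%.*/\1 \2/p' |
        tail -n 1
}
```

One way to use them is to save the client output first, for example `hadoop jar workspace/HDFSdemo/wc.jar mapreduce.WordCount 2>&1 | tee wc.log`, and then run `job_id < wc.log` and `job_progress < wc.log`. The `2>&1` assumes the client logs to stderr, which is the usual default console configuration.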
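3.2 How the processes change over the course of an MR job
ResourceManager, NodeManager -> RunJar -> MRAppMaster -> YarnChild
As the MR program progresses, the corresponding processes exit in turn, in this order:
YarnChild -> MRAppMaster -> RunJar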
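The start and exit order can be observed by sampling `jps` while the job runs. The sketch below counts the three job-related JVMs in one `jps` snapshot; mr_process_counts is a hypothetical name. It assumes a single-node cluster like cloud01, where the client, the application master, and the task containers all run on the same host. On a multi-node cluster, MRAppMaster and YarnChild would appear on the NodeManager hosts instead.

```shell
# Count the MapReduce-related JVMs in `jps` output read from stdin.
# Prints "RunJar=<n> MRAppMaster=<n> YarnChild=<n>".
mr_process_counts() {
    awk '
        $2 == "RunJar" || $2 == "MRAppMaster" || $2 == "YarnChild" { count[$2]++ }
        END {
            printf "RunJar=%d MRAppMaster=%d YarnChild=%d\n",
                   count["RunJar"], count["MRAppMaster"], count["YarnChild"]
        }'
}
```

Running `while sleep 1; do jps | mr_process_counts; done` in a second terminal during the job should show the counts rise in the start order and fall back to zero in the exit order described above.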
3.3 The processing flow shown as diagrams
Figure 1
Figure 2
file:/tmp/hadoop-hadoop/mapred/staging/hadoop1721666591/.staging/job_local1721666591_0001
file:/tmp/hadoop-hadoop/mapred/staging/hadoop1721666591/.staging/job_local1721666591_0001/job.xml
Common issues
1. INFO ipc.Client: Retrying connect to server: cloud01/192.168.2.31:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
This happens because YARN has not been started; run start-yarn.sh.
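To confirm which service the client could not reach, the endpoint can be extracted from the retry message. This is a sketch based only on the message format shown above; retry_endpoint is a hypothetical helper name. Port 8032 is the default port of yarn.resourcemanager.address, which is why this message points at the ResourceManager.

```shell
# Extract "<host> <ip> <port>" from an ipc.Client retry message on stdin.
retry_endpoint() {
    sed -n 's|.*Retrying connect to server: \([^/]*\)/\([0-9.]*\):\([0-9][0-9]*\)\..*|\1 \2 \3|p' |
        head -n 1
}
```

If the extracted port is 8032 and `jps` on that host shows no ResourceManager, running start-yarn.sh and resubmitting the job is the fix described above.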