hadoop: Is there any way to limit the concurrent running mappers per job?

After upgraded to Hadoop 2 (yarn), I found that mapred.jobtracker.taskScheduler.maxRunningTasksPerJob no longer worked, right?

One workaround is to use queue to limit it, but its not easy to control it from job submitter.

Is there any way to limit the concurrent running mappers per job? Any documents or pointer?

1 Answer

In Hadoop-2.x it is as:
mapreduce.jobtracker.taskscheduler.maxrunningtasks.perjob
The maximum number of running tasks for a job before it gets preempted. No limits if undefined.

You can see it from here:
https://hadoop.apache.org/docs/r2.2.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

answer Apr 22, 2015 by Sudhakar Singh

Similar Questions

+1 vote

How to stop a mapreduce job from terminal running on Hadoop Cluster?

To run a job we use the command
$ hadoop jar example.jar inputpath outputpath
If job is so time taken and we want to stop it in middle then which command is used? Or is there any other way to do that?

+1 vote

Can we run mapreduce job from eclipse IDE on fully distributed mode hadoop cluster?

A mapreduce job can be run as jar file from terminal or directly from eclipse IDE. When a job run as jar file from terminal it uses multiple jvm and all resources of cluster. Does the same thing happen when we run from IDE. I have run a job on both and it takes less time on IDE than jar file on terminal.

+2 votes

Hadoop/Yarn: Running multiple commands on container

I am using containerLaunchContext.setCommands() to add different commands that I wanted to run on container. But only first command is getting execute.Is there is something else I need to do?

List commands = new ArrayList();commands.add(cmd1);commands.add(cmd2);

I can see only cmd1 is getting executed.

+1 vote

How to set mapreduce.input.fileinputformat.split.maxsize for a specific job ?

In xmls configuration file of Hadoop-2.x, "mapreduce.input.fileinputformat.split.minsize" is given which can be set but how to set "mapreduce.input.fileinputformat.split.maxsize" in xml file. I need to set it in my mapreduce code.

+3 votes

How to find execution time of a MapReduce job?

Date date; long start, end; // for recording start and end time of job
date = new Date(); start = date.getTime(); // starting timer

job.waitForCompletion(true)

date = new Date(); end = date.getTime(); //end timer
log.info("Total Time (in milliseconds) = "+ (end-start));
log.info("Total Time (in seconds) = "+ (end-start)*0.001F);

I am not sure this is the correct way to find. Is there any other method or API to find the execution time of a MapReduce job?

hadoop: Is there any way to limit the concurrent running mappers per job?

Your comment on this post:

1 Answer

Your comment on this answer:

Your answer

Preview