Hadoop-2.2.0 "mapred.child.java.opts"

1 Answer

Yes, but the old property has not yet been removed entirely (configs are retired gracefully).

These properties were introduced to provide a more fine-grained way to configure each type of task separately, but the older value continues to be accepted if present; the current behaviour is that if the MR runtime finds mapred.child.java.opts configured, it will override the values of the mapreduce.map|reduce.java.opts configs. Therefore, to configure mapreduce.map|reduce.java.opts, you should make sure you aren't passing mapred.child.java.opts (which has also been intentionally dropped from mapred-default.xml).

answer Dec 4, 2013 by Dewang Chaudhary

Actually, it's the other way around. The presence of mapreduce.map|reduce.java.opts overrides mapred.child.java.opts, not the reverse as I had stated earlier (above).

commented Dec 4, 2013 by anonymous
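
To make the corrected precedence concrete, here is a minimal sketch of the lookup order described in the comment above: the task-specific property wins, and mapred.child.java.opts is only a fallback. It does not use Hadoop's classes; it models the configuration with java.util.Properties. The class name JavaOptsPrecedence, the method resolve, and the fallback value -Xmx200m are assumptions made for illustration, so check the defaults of your own release.

```java
import java.util.Properties;

public class JavaOptsPrecedence {
    // Assumed fallback when neither property is set (illustrative only;
    // verify against the defaults shipped with your Hadoop release).
    static final String ASSUMED_DEFAULT_OPTS = "-Xmx200m";

    /**
     * Returns the JVM options a task would receive under the rule in the
     * comment above: mapreduce.map|reduce.java.opts takes precedence, and
     * mapred.child.java.opts is consulted only when the specific key is absent.
     */
    public static String resolve(Properties conf, boolean isMapTask) {
        String specificKey = isMapTask
                ? "mapreduce.map.java.opts"
                : "mapreduce.reduce.java.opts";
        // Legacy value, or the assumed default if the legacy key is unset.
        String legacy = conf.getProperty("mapred.child.java.opts", ASSUMED_DEFAULT_OPTS);
        // The specific key, if present, overrides the legacy value.
        return conf.getProperty(specificKey, legacy);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("mapred.child.java.opts", "-Xmx512m");
        conf.setProperty("mapreduce.map.java.opts", "-Xmx1024m");

        System.out.println("map:    " + resolve(conf, true));   // map:    -Xmx1024m
        System.out.println("reduce: " + resolve(conf, false));  // reduce: -Xmx512m
    }
}
```

In this sketch the map task picks up its own setting, while the reduce task, which has no specific setting, still falls back to the old property.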

Similar Questions

+2 votes

Hadoop logs and mapred.local.dir

I have the following questions about Hadoop; please help.
1. The size of mapred.local.dir is large (30 GB). What are the correct ways to clean it up?
2. Are the logs of the NameNode/DataNode/JobTracker/TaskTracker all rolling logs? What is their maximum size? I cannot find specific settings for them in log4j.properties.
3. The sizes of dfs.name.dir and dfs.data.dir are very large now. Are there any files under them that can actually be removed, or must all files under the two folders be kept?

+1 vote

File Permission Issue using Distributed Cache of Hadoop-2.2.0

The original local file has execute permission. It was distributed to multiple NodeManager nodes with the Distributed Cache feature of Hadoop-2.2.0, but the distributed copy has lost the execute permission.

However, I did not encounter this issue in Hadoop-1.1.1.

Why did this happen? Was there a change to the dfs.umask option or something related?

+1 vote

How to upgrade hadoop from 2.0 to hadoop 2.4.0?

I currently have a Hadoop 2.0 cluster in production, and I want to upgrade to the latest release.
Current version (from hadoop version): Hadoop 2.0.0-cdh4.6.0

Cluster has the following services:
hbase hive hue impala mapreduce oozie sqoop zookeeper

Can someone point me to how to upgrade hadoop from 2.0 to hadoop 2.4.0?

+1 vote

Best practice of migrating hadoop 1.0.1 to hadoop 2.2.3

We plan to migrate a 30-node hadoop 1.0.1 cluster to version 2.3.0. We don't have extra machines to set up a separate new cluster, so we hope to do an in-place migration by replacing the components on the existing computers. So the questions are:

1) Is it possible to do an in-place migration, while keeping all data in HDFS safely?
2) If so, is there any doc/guidance on how to do this?
3) Is the 2.0.3 MR API binary compatible with the one of 1.0.1?

+1 vote

Hadoop YARN 2.2.0 Streaming Memory Limitation?

We are currently facing a frustrating hadoop streaming memory problem. Our setup:

our compute nodes have about 7 GB OF RAM
hadoop streaming starts a bash script which uses about 4 GB OF RAM
therefore it is only possible to start one and only ONE TASK PER NODE

Out of the box, each hadoop instance starts about 7 hadoop containers with default hadoop settings. Each hadoop task forks a bash script that needs about 4 GB of RAM; the first fork works, and all following ones fail because THEY RUN OUT OF MEMORY. So what we are looking for is a way to LIMIT the number of containers TO ONLY ONE. Here is what we found on the internet:

yarn.scheduler.maximum-allocation-mb and mapreduce.map.memory.mb are set to values such that there is at most one container. This means mapreduce.map.memory.mb must be MORE THAN HALF of the maximum memory (otherwise there will be multiple containers).

Done right, this gives us one container per node. But it produces a new problem: since our java process is now using at least half of the max memory, the child (bash) process we fork will INHERIT THE PARENT MEMORY FOOTPRINT, and since the memory used by our parent was more than half of total memory, WE RUN OUT OF MEMORY AGAIN. If we lower the map memory, hadoop will allocate 2 containers per node, which will run out of memory too.

Since this problem is a blocker in our current project, we are evaluating adapting the source code to solve this issue as a last resort. Any ideas on this are very much welcome.
