Setting the HADOOP_HOME


I am working on a home-grown Hadoop installation. I've managed to get new nodes rolled out, though it takes time because we have other dependencies. One item I've not been able to figure out is where to set the HADOOP_HOME_DIR variable, so that I can store the actual configuration for each node separately from the binary tree.

Can anyone point me to where this gets set properly? We have an init.d script that starts the services on the master, which calls out to the slaves (as user "hadoop"). I'm guessing the variable can be set there, exported and inherited, but perhaps it is more proper to set it in ~hadoop/conf/hadoop-env.sh.

The idea is to enable me to more easily roll out slaves, perhaps using Puppet, so that the CONF and LOGS directories are separate -- it's easier to manage that way.

posted Dec 8, 2013 by Anderson


1 Answer

I am not sure I understood your issue correctly. Would you like to specify where the configuration directory for your Hadoop cluster is located (e.g. /etc/hadoop/conf)?

If you use the init scripts from CDH, they assume that the config directory is CONF_DIR="/etc/hadoop/conf".
AFAIK, when you use HDP or the Apache distribution, you can specify where your configuration directory is when you run the start script by passing the directory after --config, e.g. "sudo -u hdfs /usr/lib/hadoop/sbin/hadoop-daemon.sh --config <conf_dir> start datanode"

PS: I grepped my configuration directory and my installation directory (/usr/lib/hadoop), but I cannot see a variable called HADOOP_HOME_DIR anywhere. I do see that /usr/lib/hadoop/libexec/hadoop-layout.sh contains the line HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop/conf"}.
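
To illustrate what that line in hadoop-layout.sh does, here is a minimal shell sketch of the ${VAR:-default} pattern. It does not call any Hadoop script; the function name resolve_conf_dir and the path /srv/hadoop/node01/conf are made up for the example. The point is that a value exported before the scripts run (for instance from an init script) should win over the built-in default, assuming nothing else overrides it later.

```shell
#!/bin/sh
# Hypothetical helper: mimics the default-if-unset expansion seen in
# hadoop-layout.sh. It prints the caller's HADOOP_CONF_DIR when that
# variable is set and non-empty, otherwise the fallback path.
resolve_conf_dir() {
    echo "${HADOOP_CONF_DIR:-/etc/hadoop/conf}"
}

# Case 1: nothing set in the environment, so the default is used.
unset HADOOP_CONF_DIR
default_dir=$(resolve_conf_dir)

# Case 2: a per-node directory is exported first (path is an assumption),
# so child processes and sourced scripts inherit it and the default is skipped.
HADOOP_CONF_DIR=/srv/hadoop/node01/conf
export HADOOP_CONF_DIR
custom_dir=$(resolve_conf_dir)

echo "default: $default_dir"
echo "custom:  $custom_dir"
```

So exporting HADOOP_CONF_DIR in the init script, or in whatever environment the daemons are launched from as user "hadoop", is one plausible way to keep per-node configuration outside the binary tree.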

answer Dec 8, 2013 by Satish Mishra

My apologies, you are correct. I meant to refer to HADOOP_CONF_DIR.

commented Dec 8, 2013 by anonymous

Hope that my previous post answers your question ;)

commented Dec 8, 2013 by anonymous
