Uneven load balance across the nodes of a Hadoop cluster

+1 vote
356 views

I am experiencing a very uneven load balance across the machines that make up my small cluster.

I have a cluster of four machines, and only one is actually used...

This happens when I run terasort. When I run teragen, however, everything works fine, with the load evenly distributed.

Do you have any hints? In which direction should I look?

posted Jan 18, 2016 by anonymous

Could you describe the configuration of the four machines in your cluster, as well as the configuration of the Hadoop cluster itself?

Similar Questions
+3 votes

From what I have studied, data distribution, load balancing, and fault tolerance are implicit in Hadoop. But I need to customize them; can we do that?

0 votes

Please let me know if it's feasible to have a Hadoop cluster with data nodes running on multiple operating systems, for instance a few data nodes running on Windows Server and others on a Linux-based OS (RHEL, CentOS).

If the above scenario is feasible, please provide the configuration settings required in the various XML files (hdfs-site.xml, core-site.xml, mapred-site.xml, yarn-site.xml) and environment files (hadoop-env.sh/hadoop-cmd.sh) for the Windows and Linux data nodes and the namenode.

+2 votes

Suppose we change the default block size to 32 MB and the replication factor to 1, the Hadoop cluster consists of 4 DNs, and the input data size is 192 MB. Now I want to place the data on the DNs as follows: DN1 and DN2 contain 2 blocks (32+32 = 64 MB) each, and DN3 and DN4 contain 1 block (32 MB) each. Is this possible? How can it be accomplished?
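
The arithmetic in this question can be checked with a short sketch. It only verifies that the requested layout accounts for every block of the input; it does not model how HDFS actually chooses where to place blocks. The variable names are illustrative assumptions, not Hadoop settings.

```python
import math

# Figures taken from the question above.
BLOCK_SIZE_MB = 32
INPUT_SIZE_MB = 192

# Number of blocks the input is split into (replication factor 1,
# so each block is stored exactly once).
num_blocks = math.ceil(INPUT_SIZE_MB / BLOCK_SIZE_MB)

# Requested number of blocks per data node (hypothetical labels DN1..DN4).
desired_blocks = {"DN1": 2, "DN2": 2, "DN3": 1, "DN4": 1}

# Data volume each node would hold under the requested layout.
desired_mb = {dn: n * BLOCK_SIZE_MB for dn, n in desired_blocks.items()}

# The layout is consistent only if it accounts for every block exactly once.
layout_is_consistent = sum(desired_blocks.values()) == num_blocks

print(num_blocks, desired_mb, layout_is_consistent)
```

The requested 2 + 2 + 1 + 1 layout covers all six blocks, so the question is about controlling placement, not about whether the sizes add up.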

+3 votes

I have set up an HDP 2.3 cluster on Linux (CentOS). Now I am trying to use my ETL programs to access this cluster from a Windows environment.
Should I set up Apache Hadoop on a local Windows machine or server? What setup should I do? What goes into core-site.xml (should I mention my remote HDFS URL?)
Any pointers would be helpful.

+1 vote

I have a test cluster of two machines, with Hadoop installed on both. I have configured the Hadoop cluster, but on the admin UI (as in the picture below) I see that two nodes are running on the same master machine and that the other machine has no Hadoop node.

On the master machine, the following services are running:

~$ jps
26310 ResourceManager
27593 Jps
26216 DataNode
26135 NameNode
26557 NodeManager
26701 JobHistoryServer

On the slave machine:

~$ jps
2614 DataNode
2920 Jps
2707 NodeManager

I don't know why the slave is not joining the cluster (it was before). I tried shutting down all servers on both machines, formatting HDFS, and then restarting everything, but that did not help. Any help figuring out what is causing this behavior is appreciated.

...