Hadoop / HBase hotspotting / overloading specific nodes

I am having a problem with Hadoop maxing out drive space on a select few nodes when I am running an HBase job. The scenario is this:

The job is a data import using Map/Reduce and HBase
The data is being imported into a single table
The table only has a couple of regions
As the job runs, HBase (or Hadoop; I am not sure which) places the data in HDFS on the datanodes / regionservers that host those regions
As the job progresses and more data is imported, the two datanodes hosting the regions fill up until their drive space hits 100% utilization, while the other nodes in the cluster sit at 40% drive space utilization or less
The Hadoop job then begins to hang with multiple "out of space" errors and eventually fails

I have tried running hadoop balancer while the job was running. This helped, but it only delayed the eventual job failure.

How can I get Hadoop / HBase to distribute the data to HDFS more evenly when it is favoring the nodes that the regions are on?

Am I missing something here?

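import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Tracks the keys with the largest and smallest value sums, plus the mean and count.
public class MaxMinReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    int max_sum = 0;
    int min_sum = Integer.MAX_VALUE;
    int total = 0; // running sum of every value, used for the mean
    int count = 0; // number of values seen across all keys
    Text max_occured_key = new Text();
    Text min_occured_key = new Text();
    Text mean_key = new Text("Mean : ");
    Text count_key = new Text("Count : ");

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
            count++;
        }
        total += sum;
        if (sum < min_sum) {
            min_sum = sum;
            min_occured_key.set(key);
        }
        if (sum > max_sum) {
            max_sum = sum;
            max_occured_key.set(key);
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // Assumption: the mean of all values was intended (was max_sum+min_sum/count).
        int mean = count == 0 ? 0 : total / count;
        context.write(max_occured_key, new IntWritable(max_sum));
        context.write(min_occured_key, new IntWritable(min_sum));
        context.write(mean_key, new IntWritable(mean));
        context.write(count_key, new IntWritable(count));
    }
}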
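
The aggregation in the reducer above can be checked without a cluster. The following is only a minimal plain-Java sketch of that logic, not the Hadoop code itself. The class `MaxMinSketch`, its `Stats` holder, and the `summarize` method are hypothetical names introduced for illustration, and the sketch assumes the intended mean is the integer average of all values (total divided by count).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MaxMinSketch {

    /** Hypothetical holder for the four results the reducer writes in cleanup(). */
    public static final class Stats {
        public String maxKey;
        public int maxSum = 0;
        public String minKey;
        public int minSum = Integer.MAX_VALUE;
        public int total = 0; // sum of every value across all keys
        public int count = 0; // number of values across all keys

        /** Assumed meaning of "mean": integer average of all values. */
        public int mean() {
            return count == 0 ? 0 : total / count;
        }
    }

    /** Mirrors one reduce() call per map entry: sum the values, then update max/min. */
    public static Stats summarize(Map<String, List<Integer>> grouped) {
        Stats s = new Stats();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            int sum = 0;
            for (int v : e.getValue()) {
                sum += v;
                s.count++;
            }
            s.total += sum;
            if (sum < s.minSum) {
                s.minSum = sum;
                s.minKey = e.getKey();
            }
            if (sum > s.maxSum) {
                s.maxSum = sum;
                s.maxKey = e.getKey();
            }
        }
        return s;
    }

    public static void main(String[] args) {
        // Made-up sample input standing in for the grouped reducer input.
        Map<String, List<Integer>> grouped = new LinkedHashMap<>();
        grouped.put("a", List.of(1, 2, 3));
        grouped.put("b", List.of(10));
        grouped.put("c", List.of(1, 1));

        Stats s = summarize(grouped);
        System.out.println(s.maxKey + " " + s.maxSum);
        System.out.println(s.minKey + " " + s.minSum);
        System.out.println("Mean : " + s.mean());
        System.out.println("Count : " + s.count);
    }
}
```

Keeping a running total separate from the max and min sums means the mean no longer depends on operator precedence in a single expression, and the division happens once at the end rather than on every key.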
