HDFS: How quickly can I increase the number of replicas?

0 votes
824 views

My cluster runs HDFS 2.2 stable (2 HA NameNodes, 10 DataNodes). I ran bin/hdfs dfs -setrep -R 2 / to raise the replication factor from 1 to 2.

I found that HDFS is indeed replicating the under-replicated blocks, but it works very slowly: about 1 block per second.

I have about 400,000 under-replicated blocks, so at this rate it will take more than 4 days. Is there any way to speed it up?
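
For reference, the estimate above can be checked with a short calculation. This is only a sketch: the function name `replication_eta_days` is made up for illustration, and it assumes the rate stays constant at the observed 1 block per second.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400

def replication_eta_days(under_replicated_blocks: int, blocks_per_second: float) -> float:
    """Estimate how many days re-replication takes at a constant block rate."""
    seconds = under_replicated_blocks / blocks_per_second
    return seconds / SECONDS_PER_DAY

# Figures from the question: 400000 blocks at about 1 block per second.
eta = replication_eta_days(400_000, 1.0)
print(round(eta, 2))  # 4.63
```

So "about 4 days" is slightly optimistic; at 1 block per second the backlog needs closer to four and a half days.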

posted Jun 30, 2014 by Bob Wise

1 Answer

+1 vote

You can set the properties below (in hdfs-site.xml) to speed up replication:

dfs.balance.bandwidthPerSec 131072000
dfs.max-repl-streams        50 
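
As a rough sketch, the two settings can be rendered as hdfs-site.xml property entries with a few lines of Python. The helper name `render_properties` is hypothetical, and the keys are copied exactly as written in this answer. As far as I know, Hadoop 2.x treats both as deprecated aliases (of dfs.datanode.balance.bandwidthPerSec and dfs.namenode.replication.max-streams), so check the hdfs-default.xml for your release before relying on them.

```python
import xml.etree.ElementTree as ET

def render_properties(settings: dict) -> str:
    """Render name/value pairs as a Hadoop-style <configuration> XML string."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")

# Keys and values exactly as given in the answer (assumed, not verified
# against any particular Hadoop release).
SETTINGS = {
    "dfs.balance.bandwidthPerSec": 131072000,  # bytes per second (125 MiB/s)
    "dfs.max-repl-streams": 50,
}

xml_text = render_properties(SETTINGS)
print(xml_text)
```

The resulting property elements would go inside the existing configuration element of hdfs-site.xml on the relevant nodes; the daemons normally need a restart to pick up changes made in the file.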
answered Jun 30, 2014 by Jai Prakash

I have set these properties on all the nodes.
However, the nodes now show a speed of about 500M/s.