Client and NN communication in Hadoop?

+1 vote
317 views

I have a Hadoop cluster running with 3 slaves and 1 master. The slaves are DataNodes and run TaskTrackers. The NameNode machine also runs the JobTracker and the secondary NameNode. I am running a sample MapReduce job after loading a file into HDFS.

According to the Hadoop architecture, before writing a file into HDFS, the client contacts the NameNode to get the locations of the DataNodes, and then writes the file directly to those DataNodes.

What is this client? Is it an application running on the NameNode? Are the user and the client two different things? How can I see the messages between the client and the DataNodes?

posted Feb 27, 2014 by Mandeep Sehgal


1 Answer

+1 vote

The client is whatever application tries to read or write the file. For example, the "hadoop" binary found in the bin directory of the Hadoop installation is a Hadoop client. Anything that reads from or writes to Hadoop is a client. You can also write your own client using the Java APIs that Hadoop provides.
The following is an example of a Hadoop client that reads a file from HDFS:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class Cat {
  public static void main(String[] args) throws Exception {
    // Replace the xxxx placeholders with your NameNode host and HDFS path.
    Path pt = new Path("hdfs://xxxx:9000/home/xxxx/fileonhdfstoread.txt");
    // Resolve the file system from the path itself so the hdfs:// scheme is used
    // even if the default file system in the configuration is the local one.
    FileSystem filesysObject = pt.getFileSystem(new Configuration());
    // try-with-resources closes the reader (and the HDFS stream) when done.
    try (BufferedReader bufferReaderObject =
             new BufferedReader(new InputStreamReader(filesysObject.open(pt)))) {
      String line;
      // Print the file line by line until the end of the stream.
      while ((line = bufferReaderObject.readLine()) != null) {
        System.out.println(line);
      }
    }
    // Errors are not swallowed: main declares "throws Exception".
  }
}
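
To picture the write path described in the question (the client asks the NameNode where to put each block, then sends the bytes straight to the DataNodes), here is a toy model in plain Java. It is only a sketch and does not use any Hadoop classes: ToyNameNode, ToyDataNode, ToyClient and the path /user/demo/file.txt are made-up names, the round-robin placement is an assumption for illustration, and replication, pipelining and the real RPC layer are left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a DataNode: it only stores block bytes.
class ToyDataNode {
    final String name;
    final Map<String, byte[]> blocks = new LinkedHashMap<>();

    ToyDataNode(String name) { this.name = name; }

    void storeBlock(String blockId, byte[] data) { blocks.put(blockId, data); }
}

// Hypothetical stand-in for the NameNode: it keeps metadata, never file data.
class ToyNameNode {
    private final List<ToyDataNode> dataNodes;
    final Map<String, List<String>> fileToLocations = new LinkedHashMap<>();
    private int next = 0;

    ToyNameNode(List<ToyDataNode> dataNodes) { this.dataNodes = dataNodes; }

    // Choose a DataNode for the next block (simple round-robin, an assumption)
    // and record which node holds it.
    ToyDataNode allocateBlock(String path) {
        ToyDataNode dn = dataNodes.get(next++ % dataNodes.size());
        fileToLocations.computeIfAbsent(path, p -> new ArrayList<>()).add(dn.name);
        return dn;
    }
}

// Hypothetical client: any program on any machine that can reach the cluster.
class ToyClient {
    static void write(ToyNameNode nn, String path, byte[] data, int blockSize) {
        int blockIndex = 0;
        for (int off = 0; off < data.length; off += blockSize) {
            int end = Math.min(off + blockSize, data.length);
            // Step 1: ask the NameNode where this block should go.
            ToyDataNode target = nn.allocateBlock(path);
            // Step 2: send the block bytes directly to that DataNode.
            target.storeBlock(path + "#" + blockIndex++, Arrays.copyOfRange(data, off, end));
        }
    }
}

class ToyHdfsWriteDemo {
    public static void main(String[] args) {
        List<ToyDataNode> dns = Arrays.asList(
            new ToyDataNode("dn1"), new ToyDataNode("dn2"), new ToyDataNode("dn3"));
        ToyNameNode nn = new ToyNameNode(dns);
        byte[] data = "hello hdfs".getBytes(StandardCharsets.UTF_8);
        ToyClient.write(nn, "/user/demo/file.txt", data, 4);
        // Show which DataNode received each block of the file.
        System.out.println(nn.fileToLocations);
    }
}
```

The point of the sketch is that the NameNode object never sees the file bytes. It only hands out locations, and the client pushes the data to the DataNodes itself, which is why the client does not have to run on the NameNode.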

Hope this clears up some of your doubts.

answer Feb 27, 2014 by Deepak Dasgupta
Similar Questions
+1 vote

We are trying to compare the performance of HTTP and HTTPS on Hadoop DFS, MapReduce and other related modules.

So far we have tested several metrics with Hadoop in HTTP mode, and we are now trying to test the same metrics over HTTPS. Our test cluster consists of one master node and two slave nodes.

We have configured the HTTPS connection and now need to verify that the nodes are actually communicating over HTTPS. We tried checking the logs, the cluster's WebHDFS UI, the health check information and the dfsadmin report, but none of them helped. Since only limited documentation is available on HTTPS, we are unable to verify whether the nodes are communicating through HTTPS.

Can any experts here shed some light on how to confirm the HTTPS communication status between nodes (possibly with MapReduce/DFS)?

+3 votes

I have set up an HDP 2.3 cluster on Linux (CentOS). Now I am trying to use my ETL programs to access this cluster from a Windows environment.
Should I set up Apache Hadoop on a local Windows machine or server? What setup should I do? What goes into core-site.xml (should I put my remote HDFS URL there)?
Any pointers would be helpful.

...