Is there a way to run MapReduce with MongoDB as input and output to HDFS?

+3 votes
414 views
posted Oct 1, 2015 by anonymous
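
For context: the usual approach is the mongo-hadoop connector, which supplies an InputFormat that reads documents straight from MongoDB while the job writes its output to HDFS like any other MapReduce job. A minimal sketch, assuming the connector (com.mongodb.hadoop) is on the classpath; the URI, output path, and ExportMapper are illustrative placeholders:

    import java.io.IOException;

    import com.mongodb.hadoop.MongoInputFormat;
    import com.mongodb.hadoop.util.MongoConfigUtil;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
    import org.bson.BSONObject;

    public class MongoToHdfsJob {

        // MongoInputFormat hands each mapper a document _id and the BSON document.
        public static class ExportMapper extends Mapper<Object, BSONObject, Text, Text> {
            @Override
            protected void map(Object id, BSONObject doc, Context ctx)
                    throws IOException, InterruptedException {
                ctx.write(new Text(id.toString()), new Text(doc.toString()));
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Input comes from MongoDB (URI is a placeholder).
            MongoConfigUtil.setInputURI(conf, "mongodb://localhost:27017/mydb.mycollection");

            Job job = Job.getInstance(conf, "mongo-to-hdfs");
            job.setJarByClass(MongoToHdfsJob.class);
            job.setInputFormatClass(MongoInputFormat.class);
            job.setMapperClass(ExportMapper.class);
            job.setNumReduceTasks(0); // map-only export
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);

            // Output goes to HDFS, not back to MongoDB.
            job.setOutputFormatClass(TextOutputFormat.class);
            FileOutputFormat.setOutputPath(job, new Path("hdfs:///user/me/mongo-export"));

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }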


1 Answer

+1 vote
answer Nov 16, 2015 by anonymous
Similar Questions
0 votes

Can you please clarify whether the Mongo-Hadoop connector supports taking aggregation-query output as the input to MapReduce jobs? I know there is support for find queries through the mongo.input.query configuration (see the sketch below).
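
For reference, the documented find-query path looks like this; whether aggregation output can feed a job the same way is exactly the open question above. A sketch (the "status" filter is illustrative):

    import org.apache.hadoop.conf.Configuration;

    public class MongoInputQueryExample {
        public static Configuration withFindFilter() {
            Configuration conf = new Configuration();
            // mongo.input.query takes a JSON find() filter; only matching
            // documents are fed to the mappers.
            conf.set("mongo.input.query", "{\"status\": \"active\"}");
            return conf;
        }
    }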

+1 vote

I would like to know if you have any examples or tutorials for learning Hadoop MapReduce on MongoDB in Java.

0 votes

I was trying to implement a Hadoop/Spark audit tool, but I ran into a problem: I can't get the input file location and file name. I can get the username, IP address, time, and user command from hdfs-audit.log, but when I submit a MapReduce job I can't see the input file location in either the Hadoop logs or the ResourceManager.

Does Hadoop have an API or log that exposes this information through some configuration? If so, what should I configure?
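
One possible workaround, sketched under the assumption that the jobs use FileInputFormat: log the configured input paths yourself at submission time. The same values also live in the job configuration under mapreduce.input.fileinputformat.inputdir.

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

    public class InputPathLogger {
        // Print the input paths configured on a job before it is submitted.
        public static void logInputPaths(Job job) {
            for (Path p : FileInputFormat.getInputPaths(job)) {
                System.out.println("MR input path: " + p);
            }
        }
    }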

+1 vote

In the XML configuration files of Hadoop 2.x, "mapreduce.input.fileinputformat.split.minsize" is listed and can be set, but how do I set "mapreduce.input.fileinputformat.split.maxsize" in the XML file? I also need to set it in my MapReduce code, as sketched below.
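
In XML it is just another property in mapred-site.xml, set the same way as split.minsize. Programmatically, either route below works; the 256 MB value is arbitrary:

    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

    public class SplitSizeConfig {
        public static void configure(Job job) {
            // Programmatic equivalent of mapreduce.input.fileinputformat.split.maxsize:
            FileInputFormat.setMaxInputSplitSize(job, 256L * 1024 * 1024);
            // Or set the property directly, exactly as you would in mapred-site.xml:
            job.getConfiguration().setLong(
                "mapreduce.input.fileinputformat.split.maxsize", 256L * 1024 * 1024);
        }
    }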

+2 votes

Is there a way to configure or increase the 100 MB memory limit on the aggregation pipeline? What if I have a lot of RAM that MongoDB could use?
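
The 100 MB cap applies per pipeline stage and isn't exposed as an ordinary server setting; the documented relief is allowDiskUse, which lets memory-hungry stages spill to temporary files. A sketch with the MongoDB Java driver (the database, collection, and $sort stage are illustrative):

    import java.util.Arrays;

    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import org.bson.Document;

    public class AggregateWithDiskUse {
        public static void main(String[] args) {
            try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
                MongoCollection<Document> coll =
                        client.getDatabase("mydb").getCollection("mycollection");
                // allowDiskUse(true) lets stages that exceed the 100 MB
                // per-stage memory limit spill to temporary files on disk.
                for (Document doc : coll.aggregate(Arrays.asList(
                            new Document("$sort", new Document("field", 1))))
                        .allowDiskUse(true)) {
                    System.out.println(doc.toJson());
                }
            }
        }
    }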

...