Can blocks be broken down in the HDFS file system?

1 Answer

As far as I know, blocks cannot be broken down in the HDFS file system. The master node (the NameNode) is responsible for working out the actual amount of space needed before blocks are copied from one machine to another. It also keeps track of how many blocks are in use and how much space is available.
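To make the notion of a block concrete, here is a minimal sketch of how a file's length maps onto fixed-size blocks. It is plain arithmetic, not a call into the HDFS API: the function name `block_layout` is hypothetical, and the 128 MB block size is an assumption standing in for whatever `dfs.blocksize` is set to on your cluster.

```python
# Illustrative sketch only: plain arithmetic, not the HDFS API.
# Assumption: 128 MB block size; the real value comes from dfs.blocksize.
DEFAULT_BLOCK_SIZE = 128 * 1024 * 1024


def block_layout(file_size, block_size=DEFAULT_BLOCK_SIZE):
    """Return a list of (offset, length) pairs, one per block of the file."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    layout = []
    offset = 0
    while offset < file_size:
        # Every block is full-sized except possibly the last one.
        length = min(block_size, file_size - offset)
        layout.append((offset, length))
        offset += length
    return layout


if __name__ == "__main__":
    mb = 1024 * 1024
    for offset, length in block_layout(300 * mb):
        print(f"offset={offset // mb} MB, length={length // mb} MB")
```

Under these assumptions a 300 MB file occupies two full blocks plus one shorter final block. The boundaries depend only on byte offsets, not on what the file contains.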

answer Nov 25, 2015 by Aastha Joshi

Similar Questions

0 votes

How to find out which data in HDFS or the file system a MapReduce job accesses?

I am trying to implement a Hadoop/Spark audit tool, but I ran into a problem: I can't get the input file location and file name. I can get the username, IP address, time, and user command from hdfs-audit.log. But when I submit a MapReduce job, I can't see the input file location in either the Hadoop logs or the Hadoop ResourceManager.

Does Hadoop have an API or a log that can provide this information through some configuration? If it does, what should I configure?

0 votes

What happens to a read operation when the file is moved to trash in HDFS?

I have a basic question regarding reading files in HDFS. I want to know what happens when the following steps occur:

1. A client opens the file for reading and starts reading it.
2. In the meantime, someone deletes the file and it moves to the trash folder.

Will step 1 succeed? My feeling is that, since the client has already opened the file and the file still exists in .trash, the client should be able to continue reading it.

+1 vote

Partition a file based on its content in HDFS

When a user uploads a file from the local disk to HDFS, can I make HDFS partition the file into blocks based on its content?

Meaning, if I have a file with one integer column, can I say that I want a given HDFS block to hold only the even numbers?

+4 votes

Add a few records to a Hive table or an HDFS file on a daily basis

My requirement is a typical data warehouse and ETL requirement. I need to accomplish the following:

1) Insert transaction records daily into a Hive table or an HDFS file. This table or file is not big (approximately 10 records per day). I don't want to partition the table or file.

A few articles mention that we need to load into a staging table in Hive and then insert as shown below:

insert overwrite table finaltable select * from staging;

I don't understand this logic. How should I populate the staging table each day?

+1 vote

Can I open multiple files on HDFS, write data to them in parallel, and then close them at the end?
