My requirement is a typical data warehouse and ETL requirement. I need to accomplish the following:
1) Insert transaction records daily into a Hive table or an HDFS file. This table or file is not big (approximately 10 records per day). I don't want to partition the table / file.
A few articles mention that we need to load the data into a staging table in Hive first, and then insert it into the final table like below:
insert overwrite table finaltable select * from staging;
I don't understand this logic. How should I populate the staging table every day?
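
To make the question concrete, here is a minimal sketch of how I picture the flow those articles seem to describe. It is only my assumption, not something taken from the articles: the column names, the file format and the HDFS path '/data/incoming/transactions.csv' are all hypothetical, and only the table names staging and finaltable come from the statement above.

```sql
-- Hypothetical schema; the real columns are not part of this question.
CREATE TABLE IF NOT EXISTS staging (
  txn_id   STRING,
  txn_date STRING,
  amount   DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

CREATE TABLE IF NOT EXISTS finaltable (
  txn_id   STRING,
  txn_date STRING,
  amount   DOUBLE
)
STORED AS ORC;

-- Daily step 1: replace the staging contents with that day's file.
-- The path is an assumed HDFS location; LOAD DATA INPATH moves the file
-- into the table's directory.
LOAD DATA INPATH '/data/incoming/transactions.csv'
OVERWRITE INTO TABLE staging;

-- Daily step 2: copy the staged rows into the final table.
-- INSERT INTO appends to the existing rows, whereas INSERT OVERWRITE
-- (as quoted from the articles) replaces the final table's contents.
INSERT INTO TABLE finaltable
SELECT * FROM staging;
```

If this reading is right, the part I am unsure about is the last step: with insert overwrite, as far as I can tell, finaltable would only ever hold the latest day's records, whereas insert into would keep the history.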