HIVE versus SQL DB

+1 vote
229 views

I am working on a project that has three databases stored as flat files. Our plan is to normalize these databases into a single one. We will need to follow the data warehouse approach (ETL: Extract, Transform, Load).

We are thinking of using Hadoop for the Transform step, because we need to relate data from the three databases. Do you think this is a good option? Is there a tutorial or article about it?

We are also thinking of using Hive to extract the files, load them into Hadoop, and then query the data with Hive. At this step we are going to strip blank spaces, remove duplicate records, and replace each registered name with an ID.
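To make the cleaning step concrete, here is a minimal sketch in plain Python of the three operations we have in mind. It is not Hive or Hadoop code, only an illustration of the logic. The field names (`name`, `city`), the sample rows, and the helper functions are all hypothetical and are not taken from our real data.

```python
# Hypothetical sample rows standing in for records read from the flat files.
raw_rows = [
    {"name": "  Alice ", "city": "Pune "},
    {"name": "Alice", "city": "Pune"},
    {"name": "Bob", "city": " Delhi"},
]


def strip_fields(row):
    """Remove leading and trailing blank spaces from every field."""
    return {key: value.strip() for key, value in row.items()}


def drop_duplicates(rows):
    """Keep the first occurrence of each distinct row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        fingerprint = tuple(sorted(row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique


def assign_name_ids(rows):
    """Replace each name with an integer ID; return the rows and the lookup."""
    name_to_id = {}
    converted = []
    for row in rows:
        # Reuse the existing ID for a known name, otherwise hand out the next one.
        name_id = name_to_id.setdefault(row["name"], len(name_to_id) + 1)
        new_row = {k: v for k, v in row.items() if k != "name"}
        new_row["name_id"] = name_id
        converted.append(new_row)
    return converted, name_to_id


# Strip first, so rows that differ only by blank spaces count as duplicates.
cleaned = drop_duplicates([strip_fields(r) for r in raw_rows])
records, name_lookup = assign_name_ids(cleaned)
```

The order matters here: stripping before deduplicating means that "  Alice " and "Alice" collapse into one record. The name lookup would presumably become its own table in the normalized database.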

What are your suggestions about this?

posted Jan 25, 2014 by Sheetal Chauhan

Similar Questions
+2 votes

As of version 0.14, Hive supports UPDATE. I have tried updating an internal table, and it works just like the RDBMS UPDATE command (though it takes more time to update).

Please let me know whether it is possible to UPDATE external tables in Hive.

+2 votes

I am trying to load JSON data into Hive using the HCatalog JsonSerDe. I have created the table and used the LOAD DATA INPATH command to load 8 records into it. However, SELECT * shows 16 records in the table, with each record duplicated. Why is this happening?

+2 votes

I'm new to the Hadoop world. After some struggle, I've successfully got Hadoop 2.6 running on my Windows 7 laptop.

However, when I tried to run Hive 1.0.0 on my Windows 7 system, I found that there is no command-line script like the one provided for Linux. It's also hard to find any useful information on Google.

Can anyone give me a clue on how to run Hive on Windows 7?

+1 vote

I am running Hive queries on structured RCFile data. Can someone please tell me the key performance parameters that I should tune for better query performance (Hadoop 2.3/YARN and Hive 0.13)?

...