A table with a primary key doesn’t contain duplicates. But if due to some reason, the keys have to be disabled or when importing data from other sources, duplicates come up in the table data, it is often needed to get rid of such duplicates.
This can be achieved in tow ways :-
(a) Using a temporary table.
(b) Without using a temporary table.
(a) Using a temporary or staging table
Let the table employee_test1 contain some duplicate data like:-
CREATE TABLE Employee_Test1
(
Emp_ID INT,
Emp_name Varchar(100),
Emp_Sal Decimal (10,2)
)
INSERT INTO Employee_Test1 VALUES (1,'Anees',1000);
INSERT INTO Employee_Test1 VALUES (2,'Rick',1200);
INSERT INTO Employee_Test1 VALUES (3,'John',1100);
INSERT INTO Employee_Test1 VALUES (4,'Stephen',1300);
INSERT INTO Employee_Test1 VALUES (5,'Maria',1400);
INSERT INTO Employee_Test1 VALUES (6,'Tim',1150);
INSERT INTO Employee_Test1 VALUES (6,'Tim',1150);
Step 1: Create a temporary table from the main table as:-
select top 0* into employee_test1_temp from employee_test1
Step2 : Insert the result of the GROUP BY query into the temporary table as:-
insert into employee_test1_temp
select Emp_ID,Emp_name,Emp_Sal
from employee_test1
group by Emp_ID,Emp_name,Emp_Sal
Step3: Truncate the original table as:-
truncate table employee_test1
Step4: Fill the original table with the rows of the temporary table as:-
insert into employee_test1
select * from employee_test1_temp
Now, the duplicate rows from the main table have been removed.
select * from employee_test1
gives the result as:-
Emp_ID Emp_name Emp_Sal
1 Anees 1000
2 Rick 1200
3 John 1100
4 Stephen 1300
5 Maria 1400
6 Tim 1150
(b) Without using a temporary table
;with T as
(
select * , row_number() over (partition by Emp_ID order by Emp_ID) as rank
from employee_test1
)
delete
from T
where rank > 1
The result is as:-
Emp_ID Emp_name Emp_Sal
1 Anees 1000
2 Rick 1200
3 John 1100
4 Stephen 1300
5 Maria 1400
6 Tim 1150