How to find all duplicate files in a folder in Linux environment?

+2 votes
525 views
posted May 9, 2016 by anonymous


1 Answer

+1 vote

There are many tools for this, but I suggest fdupes. The command fdupes -r /home/username recursively searches all subdirectories of /home/username for duplicate files and lists them.
fdupes is written in C (not to be confused with the Perl script fdupe listed below) and does its job quickly and efficiently. Here is a list of similar tools, taken from Ask Ubuntu:
  1. dupedit - Compares many files at once without checksumming. Avoids comparing files against themselves when multiple paths point to the same file.
  2. dupmerge - Runs on various platforms (Win32/64 with Cygwin, *nix, Linux, etc.).
  3. dupseek - Perl, with an algorithm optimized to reduce reads.
  4. fdf - Perl/C based and runs across most platforms (Win32, *nix and probably others). Uses MD5, SHA1 and other checksum algorithms.
  5. freedups - Shell script that searches through the directories you specify. When it finds two identical files, it hard links them together. The two or more files still exist in their respective directories, but only one copy of the data is stored on disk; all the directory entries point to the same data blocks.
  6. fslint - Has a command line interface and a GUI.
  7. liten - Pure Python deduplication command line tool and library, using MD5 checksums and a novel byte comparison algorithm. (Linux, Mac OS X, *nix, Windows)
  8. liten2 - A rewrite of the original Liten, still a command line tool but with a faster interactive mode using SHA-1 checksums. (Linux, Mac OS X, *nix)
  9. rdfind - One of the few that rank duplicates based on the order of the input parameters (directories to scan), so that it does not delete from "original/well known" sources when multiple directories are given. Uses MD5 or SHA1.
  10. rmlint - Fast finder with a command line interface and many options to find other lint too. (Uses MD5)
  11. ua - Unix/Linux command line tool, designed to work with find (and the like).
  12. findrepe - Free Java-based command line tool designed for an efficient search of duplicate files; it can search within zips and jars. (GNU/Linux, Mac OS X, *nix, Windows)
  13. fdupe - A small script written in Perl. Does its job fast and efficiently.
  14. ssdeep - Identifies almost identical files using Context Triggered Piecewise Hashing.

Reference: http://askubuntu.com/questions/3865/how-to-find-and-delete-duplicate-files
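
Several of the tools above find duplicates by comparing checksums. To make that idea concrete, here is a minimal Python sketch of it. This is only an illustration, not the actual implementation of fdupes or any other tool in the list. The names file_digest and find_duplicates are made up for this example, and using SHA-256 is my own choice (the tools above are described as using MD5 or SHA1). It first groups files by size, because files of different sizes cannot be identical, and then hashes only the files that share a size.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return a sorted list of groups; each group is a sorted list of
    paths under root whose contents have the same SHA-256 digest."""
    # Pass 1: group regular files by size. Symbolic links are skipped
    # so that a link to a file is not reported as a copy of it.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the files that share their size with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

Calling find_duplicates("/home/username") walks the same tree as the fdupes -r command above and returns one list of paths per set of identical files. Note that the sketch only reports duplicates and never deletes or links anything, and that hard links to the same file will show up as duplicates of each other.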
answer May 23, 2016 by Shivam Kumar Pandey