Find All Duplicate Files in Linux

fdupes – A Command Line Tool to Find and Delete Duplicate Files in Linux

Finding and removing duplicate files is a common requirement for most computer users, but it is a tiresome job that demands time and patience. Fortunately, it can be very easy if your machine is powered by GNU/Linux, thanks to the ‘fdupes‘ utility.


What is fdupes?

Fdupes is a Linux utility written by Adrian Lopez in the C programming language and released under the MIT License. The application can find duplicate files in a given set of directories and sub-directories. Fdupes recognizes duplicates by comparing the MD5 signatures of files, followed by a byte-to-byte comparison. Many options can be passed to fdupes to list or delete duplicate files, or to replace duplicates with hardlinks.

The comparison proceeds in the following order:

size comparison > partial MD5 signature comparison > full MD5 signature comparison > byte-to-byte comparison

Install fdupes on Linux

Installing the latest version of fdupes (version 1.51) is as easy as running the following command on Debian-based systems such as Ubuntu and Linux Mint.
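
The package lives in the default Debian/Ubuntu repositories:

```
$ sudo apt-get install fdupes
```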

On CentOS/RHEL and Fedora based systems, you need to turn on epel repository to install fdupes package.
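
On CentOS, for example, that means installing epel-release first:

```
# yum install epel-release
# yum install fdupes
```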

Note: The default package manager yum is replaced by dnf from Fedora 22 onwards…
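
So on Fedora 22 and later:

```
# dnf install fdupes
```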

How to use the fdupes command?

1. For demonstration purposes, let’s create a few duplicate files under a directory (say tecmint), simply as:
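
A sketch that matches the description below; the text written to the files is a placeholder:

```
$ mkdir tecmint
# create 15 files with identical content
$ for i in {1..15}; do echo "I love GNU/Linux." > tecmint/tecmint${i}.txt; done
```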

After running the above command, let’s verify that the duplicate files were created, using the ls command.
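
For example:

```
$ ls -l tecmint/
```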

The above script creates 15 files, namely tecmint1.txt, tecmint2.txt … tecmint15.txt, and every file contains the same data, i.e.,

2. Now search for duplicate files within the folder tecmint.
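
Like so:

```
$ fdupes tecmint
```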

3. Search for duplicates recursively under every directory, including its sub-directories, using the -r option.
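
For example, scanning a whole home directory (the path is a placeholder):

```
$ fdupes -r /home
```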

It searches across all files and folders recursively; depending on the number of files and folders, it will take some time to scan for duplicates. In the meantime, you will be presented with the total progress in the terminal, something like this.

4. See the size of duplicates found within a folder using the -S option.
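
For example:

```
$ fdupes -S tecmint
```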

5. You can see the size of duplicate files for every directory and its subdirectories by using the -S and -r options at the same time, as:
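
For example:

```
$ fdupes -Sr /home
```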

6. Other than searching one folder or all folders recursively, you may choose to search two or three folders as required. Needless to say, you can also use the -S and/or -r options if needed.
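
A sketch with placeholder directory names:

```
$ fdupes tecmint /home/user/Downloads
```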

7. To delete the duplicate files while preserving a copy of each, you can use the ‘-d’ option. Extra care should be taken while using this option, else you might end up losing necessary files/data; mind that the process is unrecoverable.
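
For example:

```
$ fdupes -d tecmint
```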

You may notice that all the duplicates are listed and you are prompted to delete them, either one by one, as a range, or all in one go. You may enter a range like the one below to delete files within a specific range.

8. From a safety point of view, you may like to print the output of ‘fdupes’ to a file and then check that text file to decide which files to delete. This decreases the chance of your files getting deleted accidentally. You may do:
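
For example (the output file name is arbitrary):

```
$ fdupes -Sr /home > /home/fdupes.txt
```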


Note: You may replace ‘/home’ with your desired folder. Also use the ‘-r’ and ‘-S’ options if you want to search recursively and print sizes, respectively.

9. You may omit the first file from each set of matches by using the ‘-f’ option.

First, list the files of the directory.
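
For example:

```
$ ls -l tecmint/
```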

and then omit the first file from each set of matches.
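
Like so:

```
$ fdupes -f tecmint
```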

10. Check the installed version of fdupes.
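
For example:

```
$ fdupes --version
```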

11. If you need any help with fdupes, you may use the ‘-h’ switch.
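
Like so:

```
$ fdupes -h
```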

That’s all for now. Let me know how you have been finding and deleting duplicate files in Linux till now, and also tell me your opinion about this utility. Put your valuable feedback in the comments section below.

I am working on a post about another utility called fslint for removing duplicate files, which I will publish soon, and you will love reading it.


How to Find and Remove Duplicate Files on Linux

Hi all, today we’re going to learn how to find and remove duplicate files on your Linux PC or server. Here are some tools; you may use any one of them according to your needs and comfort.

Whether you’re using Linux on your desktop or a server, there are good tools that will scan your system for duplicate files and help you remove them to free up space. Solid graphical and command-line interfaces are both available. Duplicate files are an unnecessary waste of disk space. After all, if you really need the same file in two different locations you could always set up a symbolic link or hard link, storing the data in only one location on disk.

1) FSlint

FSlint is available in the binary repositories of various Linux distributions, including Ubuntu, Debian, Fedora, and Red Hat. Just fire up your package manager and install the “fslint” package. This utility provides a convenient graphical interface by default, and it also includes command-line versions of its various functions.

Don’t let that scare you away from using FSlint’s convenient graphical interface, though.

Installation

To install fslint on Ubuntu, which I am running, here is the default command:
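
fslint ships in the repositories of older Ubuntu releases (it was dropped from more recent ones):

```
$ sudo apt-get install fslint
```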

Here are installation commands for other Linux distributions:
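
A sketch; availability varies by release (fslint came from the Fedora repositories and from EPEL on CentOS/RHEL):

```
# dnf install fslint      # Fedora
# yum install fslint      # CentOS/RHEL with EPEL enabled
```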

Run fslint

To run the fslint GUI on Ubuntu, launch fslint-gui from the run dialog (Alt+F2) or a terminal:
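
Like so:

```
$ fslint-gui
```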

By default, it opens with the Duplicates pane selected and your home directory as the default search path. All you have to do is click the Find button and FSlint will find a list of duplicate files in directories under your home folder.

Use the buttons to delete any files you want to remove, and double-click them to preview them.

Finally, you are done. Hurray, you have successfully removed duplicate files from your system.

Note that the command-line utilities aren’t in your path by default, so you can’t run them like typical commands. On Ubuntu, you’ll find them under /usr/share/fslint/fslint. So, if you wanted to run the entire fslint scan on a single directory, here are the commands you’d run on Ubuntu:
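
A sketch, with the target directory as a placeholder:

```
$ cd /usr/share/fslint/fslint
$ ./fslint /path/to/directory
```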

This command won’t actually delete anything. It will just print a list of duplicate files — you’re on your own for the rest.


2) Fdupes

FDUPES is a program for identifying or deleting duplicate files residing within specified directories, written by Adrian Lopez. You can take a look at the GitHub project.

Install fdupes

To install fdupes, do as below:

On CentOS 7:
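
fdupes comes from the EPEL repository on CentOS 7:

```
# yum install epel-release
# yum install fdupes
```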

On Ubuntu 16.04:
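
For example:

```
$ sudo apt install fdupes
```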

Search for duplicate files

The fdupes command searches for duplicates in the indicated folder. The syntax is as below:
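
In sketch form:

```
$ fdupes [options] <directory>
```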

Let us create some duplicate files. We will create a folder and 10 files with the same content:
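
A sketch consistent with the output discussed below; the labor directory, its package sub-directory, and the dragoN/packN file names are inferred from later references, and the file contents are placeholders:

```
$ mkdir -p labor/package
$ for i in {1..10}; do echo "duplicate content" > labor/drago${i}; done
$ for i in {1..10}; do echo "package content" > labor/package/pack${i}; done
```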

Let’s check the result:
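
For example:

```
$ ls -R labor
```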

We see that all our files exist. Now we can search for duplicate files as below:
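
Like so:

```
$ fdupes labor
```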

You can see all 10 duplicate files listed above.

Search duplicate files recursively and display their size

You may have noticed that the result above doesn’t show the duplicated files created earlier in the labor/package directory. To search for duplicated files in a directory and its sub-directories, we use the -r option, and you can see the size of each duplicate file with the -S parameter, as below:
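
For example:

```
$ fdupes -rS labor
```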

From the result you can see that the duplicate files have the same size, and therefore the same content.

It is possible to omit the first file of each set when searching for duplicated files:
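
For example:

```
$ fdupes -rf labor
```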

Delete the duplicated files

To delete duplicated files, we use the -d parameter. fdupes will ask which files to preserve:
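
For example:

```
$ fdupes -rd labor
```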

We can check the result as below.
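
For example:

```
$ ls -R labor
```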

You can see that we have preserved the drago8 and pack2 files.

Conclusion

We have seen how to delete duplicated files on Linux, both graphically and from the command line. You can use one of the tools depending on your needs. It is important to check for duplicated files in order to save space on your server.


4 Useful Tools to Find and Delete Duplicate Files in Linux

Organizing your home directory or even system can be particularly hard if you have the habit of downloading all kinds of stuff from the internet.

Often you may find you have downloaded the same mp3, pdf, or epub (or files of all kinds of other extensions) and copied it to different directories. This may cause your directories to become cluttered with all kinds of useless duplicated stuff.

In this tutorial, you are going to learn how to find and delete duplicate files in Linux using rdfind and fdupes command-line tools, as well as using GUI tools called DupeGuru and FSlint.

A note of caution – always be careful what you delete on your system as this may lead to unwanted data loss. If you are using a new tool, first try it in a test directory where deleting files will not be a problem.

1. Rdfind – Finds Duplicate Files in Linux

Rdfind comes from “redundant data find”. It is a free tool used to find duplicate files across or within multiple directories. It uses checksums and finds duplicates based on file contents, not only names.

Rdfind uses an algorithm to rank the files, detecting which copy is the original and treating the rest as duplicates. The rules of ranking are:

  • If A was found while scanning an input argument earlier than B, A is higher ranked.
  • If A was found at a depth lower than B, A is higher ranked.
  • If A was found earlier than B, A is higher ranked.

The last rule is used particularly when two files are found in the same directory.

To install rdfind in Linux, use the following command as per your Linux distribution.
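
A sketch; the package is named rdfind in the major distributions’ repositories:

```
$ sudo apt install rdfind       # Debian/Ubuntu
$ sudo dnf install rdfind       # Fedora
$ sudo pacman -S rdfind         # Arch Linux
```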

To run rdfind on a directory simply type rdfind and the target directory. Here is an example:
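
The target path is a placeholder:

```
$ rdfind /home/user
```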


As you can see rdfind will save the results in a file called results.txt located in the same directory from where you ran the program. The file contains all the duplicate files that rdfind has found. You can review the file and remove the duplicate files manually if you want to.

Another thing you can do is use the -dryrun option, which will provide a list of duplicates without taking any action:
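
For example:

```
$ rdfind -dryrun true /home/user
```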


When you find the duplicates, you can choose to replace them with hard links.
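
This is done with the -makehardlinks option (the path is a placeholder):

```
$ rdfind -makehardlinks true /home/user
```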

And if you wish to delete the duplicates, you can run:
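
For example:

```
$ rdfind -deleteduplicates true /home/user
```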

To check other useful options of rdfind, you can consult the rdfind manual with:
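
Like so:

```
$ man rdfind
```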

2. Fdupes – Scan for Duplicate Files in Linux

Fdupes is another program that allows you to identify duplicate files on your system. It is free and open-source and written in C. It uses the following methods to determine duplicate files:

  • Comparing partial md5sum signatures
  • Comparing full md5sum signatures
  • Byte-by-byte comparison verification

Just like rdfind it has similar options:

  • Search recursively
  • Exclude empty files
  • Show the size of duplicate files
  • Delete duplicates immediately
  • Exclude files with a different owner

To install fdupes in Linux, use the following command as per your Linux distribution.
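
A sketch; the package is named fdupes across the major distributions:

```
$ sudo apt install fdupes       # Debian/Ubuntu
$ sudo dnf install fdupes       # Fedora
$ sudo pacman -S fdupes         # Arch Linux
```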

Fdupes syntax is similar to rdfind. Simply type the command followed by the directory you wish to scan.
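
For example (the path is a placeholder):

```
$ fdupes /home/user
```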

To search files recursively, you will have to specify the -r option like this.
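
For example:

```
$ fdupes -r /home/user
```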

You can also specify multiple directories and specify a dir to be searched recursively.
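
A sketch with placeholder directories; the -R form applies recursion only to the directories listed after it:

```
$ fdupes /home/user/dir1 -R /home/user/dir2
```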

To have fdupes calculate the size of the duplicate files use the -S option.
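
For example:

```
$ fdupes -S /home/user
```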

To gather summarized information about the found files use the -m option.
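
For example:

```
$ fdupes -m /home/user
```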


Finally, if you want to delete all duplicates, use the -d option like this.
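
For example:

```
$ fdupes -d /home/user
```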

Fdupes will ask which of the found files to delete. You will need to enter the file number:


A solution that is definitely not recommended is to use the -N option together with -d, which will preserve only the first file of each set and delete the rest without prompting.
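
For example:

```
$ fdupes -dN /home/user
```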

To get a list of available options to use with fdupes, review the help page by running.
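
Like so:

```
$ fdupes --help
```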

3. dupeGuru – Find Duplicate Files in Linux

dupeGuru is an open-source and cross-platform tool that can be used to find duplicate files in a Linux system. The tool can either scan filenames or content in one or more folders. It also allows you to find the filename that is similar to the files you are searching for.

dupeGuru comes in different versions for Windows, Mac, and Linux platforms. Its quick fuzzy matching algorithm helps you find duplicate files within minutes. It is customizable: you can pull out the exact duplicate files you want and wipe unwanted files out of the system.

To install dupeGuru in Linux, use the following command as per your Linux distribution.
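
On Ubuntu, installation historically went through the project’s PPA; the PPA name below dates from when this was written and may no longer be maintained, so check the dupeGuru site for current instructions:

```
$ sudo add-apt-repository ppa:hsoft/ppa
$ sudo apt-get update
$ sudo apt-get install dupeguru
```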


4. FSlint – Duplicate File Finder for Linux

FSlint is a free utility that is used to find and clean various forms of lint on a filesystem. It reports duplicate files, empty directories, temporary files, duplicate/conflicting (binary) names, bad symbolic links, and more. It has both command-line and GUI modes.

To install FSlint in Linux, use the following command as per your Linux distribution.
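
A sketch; fslint is Python 2-based and has been dropped from recent distribution repositories, so these commands apply to older releases:

```
$ sudo apt install fslint       # Debian/Ubuntu (older releases)
$ sudo dnf install fslint       # Fedora (older releases)
```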


Conclusion

These are very useful tools for finding duplicated files on your Linux system, but you should be very careful when deleting such files.

If you are unsure if you need a file or not, it would be better to create a backup of that file and remember its directory prior to deleting it. If you have any questions or comments, please submit them in the comment section below.
