Duplicate Files
Guest
Thread Starter
Join Date: May 2008
Location: Somewhere between E17487 and F75775
Age: 80
Posts: 725
I'm sure I'm not the only one with duplicate files - mostly .jpg - stored here and there on my XP PC. Can anyone recommend any good software to search* for such duplicates?
* and ask DELETE THIS DUPLICATE FILE ?
Last edited by OFSO; 6th May 2013 at 18:15.
Join Date: Aug 2002
Location: Earth
Posts: 3,663
OFSO,
Depends on how you define searching for duplicates.
Personally, I would look for software that does a crypto hash on files and presents you duplicates based on that.
Software that just goes by file names is asking for trouble.
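To make the hash-based approach concrete, here is a minimal sketch, not taken from any of the tools named in this thread. It assumes SHA-256 rather than the SHA1 mentioned later (either works the same way here), and the function names `file_digest` and `find_duplicates` are made up for illustration. It only reports groups of identical files and deletes nothing.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return {digest: [paths]} for every digest shared by two or more files."""
    # First pass: group by size, since files of different size cannot match.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # unreadable entry: skip it

    # Second pass: hash only the files that share a size with another file.
    by_digest = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            try:
                by_digest[file_digest(path)].append(path)
            except OSError:
                continue
    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}
```

File names play no part in the match, which is the point being made above: two files count as duplicates only if their contents hash to the same value.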
Spoon PPRuNerist & Mad Inistrator
Agree with mixture re: filename only.
Also you need to see date & time of files and file sizes to make any sort of meaningful comparison. Even then you will often have to open the different versions to decide which to keep - or possibly keep both.
And don't touch system files, there's duplicates for a reason!
SD
Plastic PPRuNer
VisiPics is good (and free) - VisiPics
"VisiPics does more than just look for identical files, it goes beyond checksums to look for similar pictures and does it all with a simple user interface. First, you select the root folder or folders to find and catalogue all of your pictures. It then applies five image comparison filters in order to measure how close pairs of images on the hard drive are."
"Visipics....will detect two different resolution files of the same picture as a duplicate, or the same picture saved in different formats, or duplicates where only minor cosmetic changes have taken place"
"All detected duplicates are shown side by side with pertinent information such as file name, type and size being displayed. Its auto-select mode let you choose if you want to keep the higher resolution picture, space-saving filetype, smaller filesize or all of the above."
Another good one is dupeGuru Picture Edition - dupeGuru Picture Edition - JPG, PNG, TIFF, GIF, BMP duplicate scanner
Again, it compares the actual images rather than CRCs - depending on how "hard" you set the filter it will find anything from vague similarities to exact matches (independent of file format).
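How VisiPics and dupeGuru compare images internally is not described in this thread, so the following is only an illustration of the general idea of matching on picture content rather than on file bytes. It is a sketch of a simple "average hash", assuming the image has already been decoded and shrunk to a small grayscale grid by some imaging library (that step is not shown). The names `average_hash` and `hamming` are hypothetical.

```python
def average_hash(pixels):
    """Build a bit fingerprint from a small 2D grid of grayscale values (0-255).

    Each bit records whether a pixel is brighter than the grid's mean,
    so uniform brightness changes leave the fingerprint unchanged.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a, b):
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


# Tiny 2x2 example grids (real use would be something like 8x8).
original = [[10, 200], [220, 30]]
brighter = [[v + 20 for v in row] for row in original]
different = [[200, 10], [30, 220]]

same_picture = hamming(average_hash(original), average_hash(brighter))
other_picture = hamming(average_hash(original), average_hash(different))
```

A small Hamming distance suggests similar pictures and a large one suggests different pictures. Where to draw the line is a tuning choice, much like the "hard" versus loose filter setting described above.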
If you find them useful then donate a few $ (I did) to keep free/open software going.
Mac
Last edited by Mac the Knife; 7th May 2013 at 18:10.
Join Date: Aug 2002
Location: Earth
Posts: 3,663
Mac the Knife
As the various image search engines have demonstrated, anyone trying to tell you their software will find similar images is just talking through their backside.
The only thing you can do accurately is detect identical images, and you do that by using proper crypto hashes with sufficiently low collision rate such as SHA1. Not CRC as you are advocating.
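For a rough sense of why hash width matters, here is a back-of-envelope sketch using the standard birthday approximation. It assumes an idealised hash whose outputs are uniformly random, and it covers accidental collisions only. The figure of 20,000 files is purely illustrative, and the function name is made up.

```python
import math


def collision_probability(n_files, hash_bits):
    """Approximate chance that at least two of n_files distinct files
    share a hash value, for an idealised hash of the given width."""
    pairs = n_files * (n_files - 1) / 2
    # 1 - exp(-pairs / 2**bits), computed stably for tiny values.
    return -math.expm1(-pairs / 2.0 ** hash_bits)


p_32_bit = collision_probability(20_000, 32)    # width of a CRC-32 checksum
p_160_bit = collision_probability(20_000, 160)  # width of a SHA1 digest
```

Under these assumptions the 32-bit case comes out at a few percent, while the 160-bit case is vanishingly small.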
Psychophysiological entity
Gosh, while looking at this link I reached this:
Download.com wraps downloads in bloatware, lies about motivations | ExtremeTech
Plastic PPRuNer
Gee mixture, I'm confused.
First of all I've never advocated using CRC for finding either identical or similar images.
Secondly, despite your cries of incredulity, the two apps I suggested certainly DO find similar images, certainly on my PCs and in the case of dupeGuru, on my Macs as well.
I, and the other users of these apps, are evidently insane to suggest such a thing (and suffering from aniloquy).
Why not confound yourself and try 'em?
Mac
Last edited by Mac the Knife; 14th May 2013 at 14:57.
Seems to have a fairly good writeup:
Auslogics Duplicate File Finder 2.5.0.0 free download - Downloads - freeware, shareware, software trials, evaluations - PC & Tech Authority Downloads
I have used it here occasionally with success.
F
O
R
Plastic PPRuNer
I was a bit taken aback by ol' mixture's rebuttal so I loaded up an old image directory that I know has a lot of dupes and semi-dupes (images that have been resized, edited, or saved in a different format).
Some of my dupes are deliberate, in that they're copied to more than one folder for reference - yes, I know it's wasteful, but I'm not short of disk space.
On the Basic setting (as opposed to Strict or Loose) VisiPics looked at 34953 images in 31 minutes and found 11834 "duplicates" - on review, all are visually similar across a range of file formats - either real exact dupes or edits.
The results are easy to see as they are compared visually and you can choose which ones to move, ignore or discard.
"Autoselect" could have picked for me Uncompressed filetype a/o Lower resolution a/o Smaller filesize for deletion to the Recycle Bin - careful there! There is no option for creating links.
On the same folder dupeGuru took more than twice as long but found rather more, and more accurately - it seems to be using a different technique for matching images and it verifies results. Results are more fine-grained and you can see the delta better - dupeGuru is more "tweakable", can use regexes, and can create symlinks or hardlinks as well as copy or move dupes to a new directory.
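For anyone wondering what replacing a dupe with a hardlink amounts to, here is a minimal sketch. It is not dupeGuru's code. It assumes both files sit on the same volume and that the filesystem supports hard links, and the function name and the `.linktmp` suffix are made up for illustration.

```python
import filecmp
import os


def replace_with_hardlink(keep, duplicate):
    """Replace `duplicate` with a hard link to `keep`.

    Returns True if the link was made, False if the two paths already
    point at the same file or their contents differ.
    """
    if os.path.samefile(keep, duplicate):
        return False  # already linked, nothing to do
    if not filecmp.cmp(keep, duplicate, shallow=False):
        return False  # byte-for-byte comparison failed: not a true duplicate
    tmp = duplicate + ".linktmp"  # hypothetical temporary name
    os.link(keep, tmp)            # create the new link first...
    os.replace(tmp, duplicate)    # ...then swap it over the duplicate
    return True
```

After this both names refer to one copy of the data on disk, so editing the picture under one name changes it under the other as well.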
Both allow you to find similar images across a single or multiple folders and operate on the results visually in a reasonable GUI.
These are sharp tools - don't cut yourself!
What mix says.
"As the various image search engines have demonstrated, anyone trying to tell you their software will find similar images is just talking through their backside.
The only thing you can do accurately is detect identical images, and you do that by using proper crypto hashes with sufficiently low collision rate such as SHA1."
is wrong.
Mac
Join Date: Jul 2007
Location: Nr Salisbury UK
Posts: 97
Have a look here - take your pick: Best Free Duplicate File Detector | Gizmo's Freeware Reviews
Join Date: Aug 2002
Location: Earth
Posts: 3,663
What mix says.
"As the various image search engines have demonstrated, anyone trying to tell you their software will find similar images is just talking through their backside.
The only thing you can do accurately is detect identical images, and you do that by using proper crypto hashes with sufficiently low collision rate such as SHA1."
is wrong.
I don't have much spare time, so I can't test at the moment, but will try to find a few minutes this weekend.
In the interim, pending that, I will temporarily withdraw the first statement above. But my second statement remains accurate.
Crypto hashes with low collision rates are a proven guaranteed way to detect identical files. If you want the belt and braces approach you can also build a hash tree and compare that too.
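As a rough illustration of the hash tree idea, assuming it means a Merkle tree built over fixed-size chunks of a file, here is a simplified sketch. It uses SHA-256, duplicates the last node on odd-sized levels, and is not hardened in any way. The function name and chunk size are arbitrary choices.

```python
import hashlib


def merkle_root(data, chunk_size=4096):
    """Return the hex Merkle root of `data` split into fixed-size chunks."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    if not chunks:
        chunks = [b""]  # treat an empty file as a single empty chunk
    level = [hashlib.sha256(chunk).digest() for chunk in chunks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # pad odd levels by repeating the last node
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()
```

Two identical files give the same root. If the intermediate levels are kept as well, comparing them shows which chunk of two near-identical files differs.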
Last edited by mixture; 9th May 2013 at 06:59.
Plastic PPRuNer
"Crypto hashes with low collision rates are a proven guaranteed way to detect identical files. If you want the belt and braces approach you can also build a hash tree and compare that too."
Absolutely. Won't argue with that.
Mac
Join Date: Jul 2007
Location: Stockport
Age: 72
Posts: 43
I found Awesome Duplicate File Finder (Find it on Google, Free) to be useful. My wife kept making copies of Pics to send for printing. This Prog will show duplicate images side by side so you can decide which to delete. Also shows the percentage similarity so you can also delete unwanted pics when, say, you have taken two shots.
AS
Plastic PPRuNer
Good find Albert!
ADFF (in spite of its silly name) is quick, with a simple interface.
On my fairly average system it scanned 3334 mixed image-files in 00:02:15 and found 237 similar files. 217 were pretty similar (>20% similarity) and 20 were not (<20% similarity).
Cons:
It's not very tweakable and there is little ability to automate moves or deletes.
But still a useful addition to the toolkit.
mixture is very evidently mistaken but he's keeping schtum....
Mac
My duplicate JPGs are truly duplicate files (similar name but identical size).
I would just search on size and cull by date. Of course, I have about 20,000 JPGs, so it might take some time. My solution is to just use an external hard drive.
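The search-on-size, cull-by-date idea could be sketched roughly as follows. The function name and the extension list are assumptions, and nothing is deleted: the function only proposes which file in each same-size group is oldest. Bear in mind that identical size does not prove identical content, so the groups still need checking by eye or by hash.

```python
import os
from collections import defaultdict


def cull_candidates(root, extensions=(".jpg", ".jpeg")):
    """Group image files by size; for each group of two or more,
    return (oldest_path, [newer_paths]) ordered by modification time."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(extensions):
                path = os.path.join(dirpath, name)
                info = os.stat(path)
                by_size[info.st_size].append((info.st_mtime, path))

    result = []
    for entries in by_size.values():
        if len(entries) > 1:
            entries.sort()  # oldest modification time first
            oldest, *newer = [path for _mtime, path in entries]
            result.append((oldest, newer))
    return result
```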
Join Date: Aug 2002
Location: Earth
Posts: 3,663
Some of us are able to acknowledge having been mistaken and some of us can't.
I HAVE NOT HAD TIME TO TEST ANYTHING
Got it ?
Join Date: Sep 1999
Location: Deepest Dark Afrika
Posts: 175
I'm sure as hell not getting in between mac and mixture - that could easily become a gory and/or explosive experience!
For my money: Digital Volcano : Duplicate Cleaner Version 2.1.0
Works for me - and it's free!
And you can compare on same content, same file name, same create date, and/or same modified date ...
Last edited by Feline; 20th May 2013 at 08:16.