Duplicate Files

Old 6th May 2013, 18:14
  #1 (permalink)  
Guest
Thread Starter
 
Join Date: May 2008
Location: Somewhere between E17487 and F75775
Age: 80
Posts: 725
Likes: 0
Received 0 Likes on 0 Posts
Duplicate Files

I'm sure I'm not the only one with duplicate files - mostly .jpg - stored here and there on my XP PC. Can anyone recommend any good software to search* for such duplicates?

* and ask "DELETE THIS DUPLICATE FILE?"

Last edited by OFSO; 6th May 2013 at 18:15.
OFSO is offline  
Old 6th May 2013, 20:30
  #2 (permalink)  
 
Join Date: Aug 2002
Location: Earth
Posts: 3,663
Likes: 0
Received 0 Likes on 0 Posts
OFSO,

Depends on how you define searching for duplicates.

Personally, I would look for software that does a crypto hash on files and presents you duplicates based on that.

Software that just goes by file names is asking for trouble.
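For anyone curious what "a crypto hash on files" amounts to in practice, here is a minimal Python sketch. It is only an illustration, not how any of the tools named in this thread work. The function names `file_digest` and `find_duplicates` are made up for the example, and SHA-1 is used simply because it is the hash mentioned in this thread.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, read in chunks to limit memory use."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group files under root by content digest; return only groups of 2 or more."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_digest[file_digest(path)].append(path)
    # File names play no part in the grouping, only the file contents do.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Each returned group is a list of paths whose contents hash to the same value. The sketch only reports the groups and leaves the decision about what to delete to the user.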
mixture is offline  
Old 6th May 2013, 22:48
  #3 (permalink)  
Spoon PPRuNerist & Mad Inistrator
 
Join Date: Sep 2003
Location: Twickenham, home of rugby
Posts: 7,396
Received 261 Likes on 171 Posts
Agree with mixture re: filename only.

Also you need to see date & time of files and file sizes to make any sort of meaningful comparison. Even then you will often have to open the different versions to decide which to keep - or possibly keep both.

And don't touch system files, there are duplicates for a reason!

SD
Saab Dastard is offline  
Old 7th May 2013, 18:07
  #4 (permalink)  

Plastic PPRuNer
 
Join Date: Sep 2000
Location: Cape Town
Posts: 1,898
Received 0 Likes on 0 Posts
VisiPics is good (and free) - VisiPics

"VisiPics does more than just look for identical files, it goes beyond checksums to look for similar pictures and does it all with a simple user interface. First, you select the root folder or folders to find and catalogue all of your pictures. It then applies five image comparison filters in order to measure how close pairs of images on the hard drive are."

"Visipics....will detect two different resolution files of the same picture as a duplicate, or the same picture saved in different formats, or duplicates where only minor cosmetic changes have taken place"

"All detected duplicates are shown side by side with pertinent information such as file name, type and size being displayed. Its auto-select mode let you choose if you want to keep the higher resolution picture, space-saving filetype, smaller filesize or all of the above."

Another good one is dupeGuru Picture Edition - dupeGuru Picture Edition - JPG, PNG, TIFF, GIF, BMP duplicate scanner

Again, it compares the actual images rather than CRCs - depending on how "hard" you set the filter it will find anything from vague similarities to exact matches (independent of file format).
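To give a rough idea of how comparing "the actual images" can work, below is a small Python sketch of one well-known technique, the average hash. This is an assumption for illustration only: nothing in this thread says VisiPics or dupeGuru use this method. The input is assumed to be a tiny grayscale thumbnail supplied as a list of rows of 0-255 values; a real tool would first decode and shrink the image with an imaging library. The names `average_hash`, `hamming_distance`, `looks_similar` and the `max_distance` threshold are all hypothetical.

```python
def average_hash(pixels):
    """Return a tuple of bits: 1 where a pixel is brighter than the image mean.

    pixels is a small grayscale thumbnail given as a list of rows of 0-255 ints.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def looks_similar(pixels_a, pixels_b, max_distance=2):
    """Treat two thumbnails as similar if their hashes differ in few bits."""
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance
```

Raising or lowering `max_distance` plays the same role as setting the filter "harder" or "softer": a threshold of 0 accepts only thumbnails with identical hashes, while larger values accept looser resemblances.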

If you find them useful then donate a few $ (I did) to keep free/open software going.

Mac


Last edited by Mac the Knife; 7th May 2013 at 18:10.
Mac the Knife is offline  
Old 7th May 2013, 19:28
  #5 (permalink)  
 
Join Date: Aug 2002
Location: Earth
Posts: 3,663
Likes: 0
Received 0 Likes on 0 Posts
Mac the Knife

As the various image search engines have demonstrated, anyone trying to tell you their software will find similar images is just talking through their backside.

The only thing you can do accurately is detect identical images, and you do that by using proper crypto hashes with sufficiently low collision rate such as SHA1. Not CRC as you are advocating.
mixture is offline  
Old 7th May 2013, 19:35
  #6 (permalink)  
Psychophysiological entity
 
Join Date: Jun 2001
Location: Tweet Rob_Benham Famous author. Well, slightly famous.
Age: 84
Posts: 3,270
Received 37 Likes on 18 Posts
Gosh, while looking at this link I reached this:

Download.com wraps downloads in bloatware, lies about motivations | ExtremeTech
Loose rivets is offline  
Old 7th May 2013, 20:03
  #7 (permalink)  

Plastic PPRuNer
 
Join Date: Sep 2000
Location: Cape Town
Posts: 1,898
Received 0 Likes on 0 Posts
Gee mixture, I'm confused.

First of all I've never advocated using CRC for finding either identical or similar images.

Secondly, despite your cries of incredulity, the two apps I suggested certainly DO find similar images, certainly on my PCs and in the case of dupeGuru, on my Macs as well.

I, and the other users of these apps, are evidently insane to suggest such a thing (and suffering from aniloquy).

Why not confound yourself and try 'em?

Mac


Last edited by Mac the Knife; 14th May 2013 at 14:57.
Mac the Knife is offline  
Old 8th May 2013, 00:07
  #8 (permalink)  
 
Join Date: May 2009
Location: Down Under somewhere not all that far from YPAD
Age: 79
Posts: 570
Received 14 Likes on 7 Posts
Seems to have a fairly good writeup:

Auslogics Duplicate File Finder 2.5.0.0 free download - Downloads - freeware, shareware, software trials, evaluations - PC & Tech Authority Downloads

I have used it here occasionally with success.

F
O
R
FullOppositeRudder is offline  
Old 8th May 2013, 22:41
  #9 (permalink)  

Plastic PPRuNer
 
Join Date: Sep 2000
Location: Cape Town
Posts: 1,898
Received 0 Likes on 0 Posts
I was a bit taken aback by ol' mixture's rebuttal so I loaded up an old image directory that I know has a lot of dupes and semi-dupes (images that have been resized, edited, or saved in a different format).

Some of my dupes are deliberate, in that they're copied to more than one folder for reference - yes, I know it's wasteful, but I'm not short of disk space.

On the Basic setting (as opposed to Strict or Loose) VisiPics looked at 34953 images in 31 minutes and found 11834 "duplicates" - on review, all are visually similar across a range of file formats - either real exact dupes or edits.

The results are easy to see as they are compared visually and you can choose which ones to move, ignore or discard.

"Autoselect" could have picked out for me the Uncompressed filetype and/or Lower resolution and/or Smaller filesize copies for deletion to the Recycle Bin - careful there! There is no option for creating links.


On the same folder dupeGuru took more than twice as long but found rather more matches, and more accurately - it seems to be using a different technique for matching images and it verifies results. Results are more fine-grained and you can see the delta better - dupeGuru is more "tweakable", can use regexes, and can create symlinks or hardlinks as well as copy or move dupes to a new directory.
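As a rough illustration of what replacing a dupe with a hardlink involves, here is a hedged Python sketch. It is not dupeGuru's code; the function name `replace_with_hardlink` and the `.linktmp` suffix are invented for the example, and it assumes both files sit on the same volume, since hard links cannot cross volumes.

```python
import filecmp
import os


def replace_with_hardlink(keep_path, duplicate_path):
    """Replace duplicate_path with a hard link to keep_path if contents match.

    Returns True if the duplicate was replaced, False if nothing was changed.
    """
    if os.path.samefile(keep_path, duplicate_path):
        return False  # already one and the same file on disk
    # shallow=False forces a byte-by-byte comparison, not just size and date.
    if not filecmp.cmp(keep_path, duplicate_path, shallow=False):
        return False
    temp_path = duplicate_path + ".linktmp"  # hypothetical temporary name
    os.link(keep_path, temp_path)            # create the new hard link first
    os.replace(temp_path, duplicate_path)    # then swap it in over the dupe
    return True
```

Creating the link under a temporary name and swapping it in afterwards means the duplicate is never deleted unless the link already exists. Bear in mind that after linking, editing one "copy" edits the other as well.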

Both allow you to find similar images across a single or multiple folders and operate on the results visually in a reasonable GUI.

These are sharp tools - don't cut yourself!

What mix says.

"As the various image search engines have demonstrated, anyone trying to tell you their software will find similar images is just talking through their backside.

The only thing you can do accurately is detect identical images, and you do that by using proper crypto hashes with sufficiently low collision rate such as SHA1."


is wrong.

Mac



Mac the Knife is offline  
Old 8th May 2013, 23:31
  #10 (permalink)  
 
Join Date: Jul 2007
Location: Nr Salisbury UK
Posts: 97
Likes: 0
Received 0 Likes on 0 Posts
Have a look here - take your pick: Best Free Duplicate File Detector | Gizmo's Freeware Reviews
seanbean is offline  
Old 9th May 2013, 06:59
  #11 (permalink)  
 
Join Date: Aug 2002
Location: Earth
Posts: 3,663
Likes: 0
Received 0 Likes on 0 Posts
What mix says.

"As the various image search engines have demonstrated, anyone trying to tell you their software will find similar images is just talking through their backside.

The only thing you can do accurately is detect identical images, and you do that by using proper crypto hashes with sufficiently low collision rate such as SHA1."

is wrong.

I don't have much spare time, so I can't test at the moment, but will try to find a few minutes this weekend.

In the interim, pending that, I will temporarily withdraw the first statement above. But my second statement remains accurate.

Crypto hashes with low collision rates are a proven guaranteed way to detect identical files. If you want the belt and braces approach you can also build a hash tree and compare that too.
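For readers unfamiliar with the term, a hash tree (Merkle tree) hashes a file in chunks and then hashes pairs of hashes until a single root remains. The Python below is a minimal sketch of one simple variant, not a standard-conformant implementation: the name `merkle_root`, the 4096-byte chunk size, the one-byte prefixes and the handling of an unpaired node are all choices made for this example.

```python
import hashlib


def merkle_root(data, chunk_size=4096):
    """Return the hex root of a simple hash tree built over fixed-size chunks."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    # Leaves and inner nodes get different prefixes so one cannot pass for the other.
    level = [hashlib.sha1(b"\x00" + chunk).digest() for chunk in chunks]
    while len(level) > 1:
        paired = [hashlib.sha1(b"\x01" + level[i] + level[i + 1]).digest()
                  for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            paired.append(level[-1])  # carry the unpaired node up unchanged
        level = paired
    return level[0].hex()
```

Two files with the same root are treated as identical. If the per-chunk hashes are kept as well, they also show which chunk differs when the roots do not match.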

Last edited by mixture; 9th May 2013 at 06:59.
mixture is offline  
Old 9th May 2013, 21:33
  #12 (permalink)  

Plastic PPRuNer
 
Join Date: Sep 2000
Location: Cape Town
Posts: 1,898
Received 0 Likes on 0 Posts
"Crypto hashes with low collision rates are a proven guaranteed way to detect identical files. If you want the belt and braces approach you can also build a hash tree and compare that too."

Absolutely. Won't argue with that.

Mac

Mac the Knife is offline  
Old 19th May 2013, 15:44
  #13 (permalink)  
 
Join Date: Jul 2007
Location: Stockport
Age: 72
Posts: 43
Likes: 0
Received 0 Likes on 0 Posts
I found Awesome Duplicate File Finder (find it on Google, free) to be useful. My wife kept making copies of pics to send for printing. This program shows duplicate images side by side so you can decide which to delete. It also shows the percentage similarity, so you can delete unwanted pics when, say, you have taken two shots.
AS
Albert Square is offline  
Old 19th May 2013, 17:07
  #14 (permalink)  

Plastic PPRuNer
 
Join Date: Sep 2000
Location: Cape Town
Posts: 1,898
Received 0 Likes on 0 Posts
Good find Albert!

ADFF (in spite of its silly name) is quick, with a simple interface.

On my fairly average system it scanned 3334 mixed image-files in 00:02:15 and found 237 similar files. 217 were pretty similar (>20% similarity) and 20 were not (<20% similarity).

Cons:

It's not very tweakable and there is little ability to automate moves or deletes.
But still a useful addition to the toolkit.

mixture is very evidently mistaken but he's keeping schtum....

Mac

Mac the Knife is offline  
Old 19th May 2013, 21:24
  #15 (permalink)  
 
Join Date: Aug 2002
Location: Earth
Posts: 3,663
Likes: 0
Received 0 Likes on 0 Posts
mixture is very evidently mistaken but he's keeping schtum....
Some of us lead lives in the real world outside of PPRuNe. End of.
mixture is offline  
Old 19th May 2013, 22:19
  #16 (permalink)  

Plastic PPRuNer
 
Join Date: Sep 2000
Location: Cape Town
Posts: 1,898
Received 0 Likes on 0 Posts
We all do.

Some of us are able to acknowledge having been mistaken and some of us can't.

Hey ho!

Last edited by Mac the Knife; 19th May 2013 at 22:19.
Mac the Knife is offline  
Old 20th May 2013, 00:03
  #17 (permalink)  
 
Join Date: Mar 2002
Location: Florida
Posts: 4,569
Likes: 0
Received 1 Like on 1 Post
My duplicate JPGs are truly duplicate files (similar name but identical size).

I would just search on size and cull by date. Of course, I have about 20,000 JPGs, so it might take some time. My solution is to just use an external hard drive.
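The "search on size and cull by date" idea can be sketched in a few lines of Python. This is only an illustration with a made-up function name, `candidates_by_size`. Identical size does not guarantee identical content, so the groups it returns are candidates to check, not confirmed duplicates.

```python
import os
from collections import defaultdict


def candidates_by_size(root):
    """Map each shared file size to its paths, oldest modification time first."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_size[os.path.getsize(path)].append(path)
    # Keep only sizes shared by 2 or more files, sorted so the oldest comes first.
    return {size: sorted(paths, key=os.path.getmtime)
            for size, paths in by_size.items() if len(paths) > 1}
```

With the oldest file first in each group, one option is to keep the first path and review the rest by hand before deleting anything.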
lomapaseo is offline  
Old 20th May 2013, 06:47
  #18 (permalink)  
 
Join Date: Aug 2002
Location: Earth
Posts: 3,663
Likes: 0
Received 0 Likes on 0 Posts
Some of us are able to acknowledge having been mistaken and some of us can't.
Jeez Mac.... do I have to spell it out to you ....

I HAVE NOT HAD TIME TO TEST ANYTHING


Got it ?
mixture is offline  
Old 20th May 2013, 08:15
  #19 (permalink)  
 
Join Date: Sep 1999
Location: Deepest Dark Afrika
Posts: 175
Likes: 0
Received 0 Likes on 0 Posts
I'm sure as hell not getting in between mac and mixture - that could easily become a gory and/or explosive experience!

For my money: Digital Volcano : Duplicate Cleaner Version 2.1.0

Works for me - and it's free!

And you can compare on same content, same file name, same create date, and/or same modified date ...

Last edited by Feline; 20th May 2013 at 08:16.
Feline is offline  
