
Image and file management advice.


Old 8th Sep 2014, 19:38
  #1 (permalink)  
Thread Starter
 
Join Date: Jan 2006
Location: Dorking
Posts: 491
Received 0 Likes on 0 Posts

Having read previous threads, I'm hoping that both Mixture and Captain Bloggs will chip in here!

I have some large tasks coming my way.

Firstly, I have several thousand images strewn through three or four hard drives and 'folder structures'. Here's how it happened. On day one, a pc had the 'my docs' and subfolder 'my pics'. In general terms, all of my images are in 'my pics'.

As time has gone by, I have (sloppily) copied these 'my docs' folders into backup folders in various places as hard drives were replaced with larger ones. Because I have, at different times, had a go at sorting within these multiple copies of 'my docs/my pics', I'm now in a bit of a mess, with many copies of the same images.

I need to sort this out, so that I have one 'in use' set and some clones as backups elsewhere. I need something to find all the copies and put them into one place, and that gives me confidence in deleting all the previous folders automatically, if possible. Cap'n B referred to Amok Exif sorter, and I've had a look, but it doesn't seem to do the tidy up. I've used Picasa for years and it does find them all, but as you know, doesn't really offer management.

Obviously, I could just point Lightroom at the drives and let it build a dbase, but I really think that a tidy up first would be a better idea. Perhaps a non image-based file sorter, but I haven't found anything that strikes me as a)clever enough, or b)trustworthy.

Next hurdle is that I've just bought a decent film scanner for a fair few negs and slides. All seem to agree that Lightroom is excellent for tagging date/place/time/who/what. So that's what I'll use, but out of interest, will it run a slideshow built from several tags? I expect so but thought I should ask.

Lastly, as a glutton for punishment, I'm transferring vhs-c and vhs video to pc. Can I use Lightwave to integrate stills and video? Video is generally family stuff in clips of only a few minutes (once edited). I realise that I'll have to 'chop up' the 30 or 45 minute contents of each tape, but that's fine.

Lots of hard work ahead, but if anybody can think of a way to use tech to make life easier, I'd really appreciate it.
boguing is offline  
Old 8th Sep 2014, 20:48
  #2 (permalink)  
 
Join Date: Aug 2002
Location: Earth
Posts: 3,663
Likes: 0
Received 0 Likes on 0 Posts
> I'm hoping that both Mixture and Captain Bloggs will chip in here!
You rang, Sir ?

Aaah... so, if I get to the nutshell, your question is about de-duplication of your infinitesimal collection of fine photographs ?

If that is the case, the technical answer to your question is that you want to hash the files and trash duplicate hashes.

Hashing being a term for cryptographic magic that will take a file, any file... and produce a string, e.g. "a92701812cff2cf954b22a1adb5c38f166ba409e" ..... that string is, for all practical purposes, unique to the data in that file ... thus if you had two files with the same hash, you can treat them as exact duplicates, because spotting identical data is one of the main things hashes are used for. There is a little issue of "hash collisions" (two different files producing the same hash), but as long as you choose a sound hash algorithm in the first place, an accidental collision is so improbable that identifying duplicate files this way still holds in practice.

Because it's cryptography, and thus mathematics, and modern computers handle mathematics with aplomb.... even if you feed a massive file into a hash algorithm, the computer will typically spit out a result in a matter of seconds.
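
For anyone curious what that looks like in practice, here is a minimal sketch in Python using only the standard library. The function name `file_sha256` and the choice of SHA-256 are assumptions for illustration (the example string above is a different length, so it came from a different algorithm). The file is read in chunks so that a large scan or video never has to fit in memory.

```python
import hashlib


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files never need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files with the same bytes give the same string whatever they are called or wherever they live, which is exactly the property the de-duplication relies on.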

But I digress.....

Assuming you are not familiar with the dark arts of scripting, what platform do you find yourself on ? Linux, Mac, Windows ? I shall then have a sniff around the internet and see if I can mention a few bits of software that may guide you on your quest....

Whilst you could do this in Lightroom (or another photo manager), it would not be an effective use of computing resources because importing stuff to Lightroom (or another photo manager) involves multiple things behind the scenes. If you've got a large volume to deal with, you really want something that does hashes and nothing else.

Moving swiftly on to your extra questions.....

> will Lightroom run a slideshow built from several tags? I expect so but thought I should ask.

To be honest, slideshows are one of the features I use least in Lightroom (just because I've little need for them, not that I've any problems with the way Lightroom does slideshows).

I have fired up my copy of Lightroom, and it looks like what you'll do is create a Smart Collection and run a slideshow on that. You'll probably be making a lot of use of Smart Collections anyway, because it's a seriously awesome feature, which means the answer to your question will effectively be yes, since you're likely to already have in place the Smart Collection that you want to run a slideshow on.

> Can I use Lightwave to integrate stills and video? Video is generally family stuff in clips of only a few minutes (once edited). I realise that I'll have to 'chop up' the 30 or 45 minute contents of each tape, but that's fine.

Lightwave (?) or Lightroom ?

I think Lightroom has some form of video support. But whether this goes further than just cataloguing I don't know.

I don't do a huge amount of video, and what video I do do, I use Adobe Prelude and Premiere.

But by the looks of things (http://tv.adobe.com/watch/getting-st...h-dslr-video-/), Lightroom has had video support since version 4.

Last edited by mixture; 8th Sep 2014 at 22:53.
mixture is offline  
Old 9th Sep 2014, 02:31
  #3 (permalink)  

Official PPRuNe Chaplain
 
Join Date: Apr 2001
Location: Witnesham, Suffolk
Age: 80
Posts: 3,498
Likes: 0
Received 0 Likes on 0 Posts
mixture has the elegant solution, using hashes.

There's a quick and dirty method (which I have used more than once):
If you are using Windows, and
IF (big IF) your files have unique names (eg IMG00000001.JPG), then tipping the whole lot into a single folder will prompt Windows to ask what it should do with duplicates. Tell it to keep one only, and it will dump the rest.

The catch is that if you have several different pix with the same filename, you'll lose all but one of them.
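
A rough way to check for that catch before tipping everything into one folder is sketched below in Python. It is only an illustration under my own assumptions: `find_name_clashes` is a made-up name, and each file is read whole, which is fine for photos but not for huge videos. It lists every file name that appears with more than one distinct content.

```python
import hashlib
import os
from collections import defaultdict


def find_name_clashes(roots):
    """Return {file name: set of content hashes} for names that are unsafe.

    A name is unsafe when it occurs with more than one distinct content,
    because flattening into a single folder would keep only one of them.
    """
    hashes_by_name = defaultdict(set)
    for root in roots:
        for folder, _subdirs, files in os.walk(root):
            for name in files:
                with open(os.path.join(folder, name), "rb") as handle:
                    digest = hashlib.sha256(handle.read()).hexdigest()
                # Windows treats IMG1.JPG and img1.jpg as the same name.
                hashes_by_name[name.lower()].add(digest)
    return {name: found for name, found in hashes_by_name.items() if len(found) > 1}
```

If the result is empty, the single-folder trick should lose nothing; any name it does report needs renaming first.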
Keef is offline  
Old 9th Sep 2014, 10:00
  #4 (permalink)  
Thread Starter
 
Join Date: Jan 2006
Location: Dorking
Posts: 491
Received 0 Likes on 0 Posts
Excellent. A project team is assembled.

I've read, and I think digested, your replies.

I love Keef's idea, but can see one definite bottleneck and one which may be imaginary. The definite one is that I don't 'know' where all these files are. To expand on how the problem has arisen I'll add that it wasn't only growing hard drive capacities. The history starts in 2000 with the first scanner and digital camera. Every time that I've built a new 'main' pc I have copied the entire 'my docs' (and equivalents from custom partitioned drives) to backup folders elsewhere. Because installing a fresh OS is almost always a reaction to something having failed and needing urgent replacement, yet too infrequent for me to devise and remember a regular plan, these copies are buried under ludicrously generic folder branches, trees and whole forests in different countries and planets. In other words, I'll need to search for all files by type (jpg, gif, raw etc) and then do a copy/paste or cut/paste of enormous size. Will Windows cope, or will it crash with a message such as 'file name length exceeded, **** off'?
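
The "search for all files by type" step on its own can be sketched in a few lines of Python. This is an illustration only: the extension list and the name `find_images` are assumptions, and it merely lists what it finds without copying or moving anything, so it is safe to run against the originals.

```python
import os

# Assumed list of image extensions; adjust to match the cameras and scanners used.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".gif", ".png", ".tif", ".tiff", ".bmp", ".raw"}


def find_images(roots, extensions=IMAGE_EXTENSIONS):
    """Yield the full path of every file under `roots` with an image extension."""
    for root in roots:
        for folder, _subdirs, files in os.walk(root):
            for name in files:
                # Compare lower-cased extensions so .JPG and .jpg both match.
                if os.path.splitext(name)[1].lower() in extensions:
                    yield os.path.join(folder, name)
```

Calling `list(find_images(["D:\\", "E:\\backups"]))` (hypothetical drive letters) would give one flat list of every picture, however deeply it is buried.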

The other, possibly imaginary problem is that the pictures have been taken by me and the kids with about twenty cameras (and some 'phones). Although it's a mess, under the current system I can search for all of my son's pics, put them on a memory thing and give them to him. This is because all of his pics are in sub-folders of the copies of his 'my docs' or 'sons' stuff'. If they are all in one enormous folder I can't do that in Windows.

Again, I assume that Lightroom (and as a side note to mixture, I'm on beta blockers and one side effect is that I sometimes switch a word for another that makes perfect sense, albeit completely wrongly. Not Lightwave then.) will let me search by the camera used? I've had a quick look, and the only camera that doesn't appear to have any 'exif-type' data is my very first one, a 1996 Kodak. All the others do have distinguishing data buried in their files.

And to answer Mixture directly. You're right, I don't do scripts. Or rather, it's been a long time since I did a few. 'Tis indeed Windows. If it was Apple I'm sure that it would have sorted it all out for me, and if Linux I would, by inference, not have been so stupid as to blithely continue digging. Video support would be nice, but it's not important. As long as I put the data somewhere the future is going to bring software along that will support a mixed presentation simply. Actually that's an assumption based on a hunch that someone somewhere will actually want to watch media that they've captured on their 'phones instead of watching events with their own eyes. I can digress with the best of you.

Some might wonder why I don't just crack on and download trial software and find out for myself. 0.24mbit/sec 'broadband' is my answer to that one.
boguing is offline  
Old 9th Sep 2014, 12:10
  #5 (permalink)  
 
Join Date: Aug 2002
Location: Earth
Posts: 3,663
Likes: 0
Received 0 Likes on 0 Posts
> and then do a copy/paste or cut/paste of enormous size. Will Windows cope, or will it crash with a message such as 'file name length exceeded, **** off'?
Ouch, I wouldn't do that.

For future reference, if you ever find yourself needing to do large file movements in Windows, look up a free Microsoft tool called Robocopy, that's the only way to go.

> I'm on beta blockers and one side effect is that I sometimes switch a word for another that makes perfect sense, albeit completely wrongly. Not Lightwave then
No worries, I was actually checking because I thought there might have been some software actually called Lightwave that I knew nothing about.

> Again, I assume that Lightroom will let me search by the camera used?
Absolutely, as per screen shot below you can filter/search your library by metadata, text and attributes. You can also apply various filters on Smart Collections too.





> I've had a quick look, and the only camera that doesn't appear to have any 'exif-type' data is my very first one, a 1996 Kodak. All the others do have distinguishing data buried in their files.
Sounds like you should be just fine then. As for the Kodak, you can always look at either keywording or applying labels (two different things in Lightroom) to your Kodak images in Lightroom to make them easy to find.

> You're right, I don't do scripts. Or rather, it's been a long time since I did a few.
In which case I'd definitely suggest third party software.... this is a scenario where you don't want to be risking bugs in scripts caused by rusty skills !

> 'Tis indeed Windows.
Ok, might take me a day or so (although I'll try to get some initial ideas back quicker) but I'll dig around the nether regions of the internet and let you know.

First suggestion of the day....

TreeSize Pro from Jam Software

A quick browse through their site says you can tell it to compare files based on MD5 or SHA256 hashes (https://www.jam-software.com/treesiz...e_search.shtml) They also offer a 30 day trial.

Last edited by mixture; 9th Sep 2014 at 12:32.
mixture is offline  
Old 9th Sep 2014, 16:26
  #6 (permalink)  
Thread Starter
 
Join Date: Jan 2006
Location: Dorking
Posts: 491
Received 0 Likes on 0 Posts
Truly appreciate your guidance. I'll download Lightroom overnight.

Scanner should arrive tomorrow, so plenty to get on with in learning how to create an efficient workflow around it. Postie should also be bringing another 3TB external drive which will give me yet another place to store the scan output!
boguing is offline  
Old 9th Sep 2014, 17:55
  #7 (permalink)  
 
Join Date: Aug 2006
Location: Lemonia. Best Greek in the world
Posts: 1,759
Received 6 Likes on 3 Posts
It is nice being thick.

As a techie thicko I sometimes have techie problems. Reading this exchange, I am now glad that I am a techie thicko. My 13,000 or thereabouts piccies may have a few duplicates.
I have a simple system in Win filing. Named, dated and "clean" files are in "family photos". Less clean, unsorted, maybe duplicated files are in "family photos spares".
I might well be wrong, but I do not trust software to do the sorting. So every now and then, swmbo and I sit down to clean a few files, and transfer them as we can. F knows why we have so many files called Moiras or some other mis-spelt name. As bogu says, back ups and disorganised changes over time create issues.
Then, one good thing about BT. The second, off site, back up to the sorted photos is a free BT back up. BUT......f-ing BT make keeping the nice clean files a real pain......So one day I'll be back at the beginning. Or the kids will be.
Ancient Observer is offline  
Old 9th Sep 2014, 18:11
  #8 (permalink)  
Thread Starter
 
Join Date: Jan 2006
Location: Dorking
Posts: 491
Received 0 Likes on 0 Posts
AO, until a few years ago I would have agreed with you about filenames and managed directories being king. I would also not have trusted software to sort them.

Times change, and file systems have come a long way. I regularly use Picasa, which can see all of this info, but does nothing with it, and music filing software that sees it and uses it impeccably. I trust that, and was fairly sure that there would be something that would do the same with images.

I'm actually doing this for the kids really, bit of a health scare prompted it. Hosted online storage doesn't work for me because a) it's moderately expensive for the amount of data I have (although this will reduce a lot after I've sorted the problem this thread addresses), and b) it's not permanent. If I stop paying it would disappear. Who knows whether the kids would want to pick up the tab. After this sort out, everything will be going onto our own 'cloud' hosted on a server here.
boguing is offline  
Old 9th Sep 2014, 19:41
  #9 (permalink)  
 
Join Date: Jul 2008
Location: uk
Posts: 894
Likes: 0
Received 0 Likes on 0 Posts
While I have no personal experience of such things I have heard mention of software that will go through mountains of jpg and other files and point out which are duplicates.
vulcanised is offline  
Old 9th Sep 2014, 19:47
  #10 (permalink)  
Thread Starter
 
Join Date: Jan 2006
Location: Dorking
Posts: 491
Received 0 Likes on 0 Posts
I've played with a few of them vulcanised, and those were very helpful in confirming the enormity of my problem, but didn't seem to offer any way to solve it, other than manually. Getting on for fifteen years of compounding the problem at an exponentially increasing rate means that the manual option is overwhelming!

I can't be accused of having inadequate backups though.
boguing is offline  
Old 9th Sep 2014, 21:37
  #11 (permalink)  
 
Join Date: Aug 2002
Location: Earth
Posts: 3,663
Likes: 0
Received 0 Likes on 0 Posts
> I've played with a few of them vulcanised, and those were very helpful in confirming the enormity of my problem, but didn't seem to offer any way to solve it, other than manually.
The hash method most certainly can be fully automated.
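
As a sketch of what "fully automated" could mean, the Python below hashes every file under one folder, keeps the first copy of each distinct content in place, and moves the extra copies into a separate quarantine folder rather than deleting them. The names `sha256_of` and `quarantine_duplicates` are made up for illustration, and it assumes the quarantine folder is outside the folder being scanned and starts out empty.

```python
import hashlib
import os
import shutil
from collections import defaultdict


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def quarantine_duplicates(root, quarantine):
    """Move byte-identical extra copies under `root` into `quarantine`.

    For each distinct content, the first path in sorted order stays put.
    Returns a list of (original_path, new_path) pairs for the files moved.
    """
    paths_by_hash = defaultdict(list)
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            paths_by_hash[sha256_of(path)].append(path)

    os.makedirs(quarantine, exist_ok=True)
    moves = []
    for paths in paths_by_hash.values():
        _keep, *extras = sorted(paths)
        for extra in extras:
            # Prefix a running number so same-named duplicates cannot collide.
            new_name = f"{len(moves):06d}_{os.path.basename(extra)}"
            target = os.path.join(quarantine, new_name)
            shutil.move(extra, target)
            moves.append((extra, target))
    return moves
```

Moving rather than deleting means the returned list doubles as an undo log; the quarantine folder can be emptied once the surviving set has been checked.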
mixture is offline  
Old 10th Sep 2014, 13:29
  #12 (permalink)  
 
Join Date: Apr 2008
Location: Uk
Age: 67
Posts: 218
Likes: 0
Received 0 Likes on 0 Posts
Interesting thread as I'm in much the same position as boguing. I never anticipated I would end up with so much stuff over several machines (Mac and Windows). Files need much tidying up, removing duplicates and then cataloguing.

Mixture, you mentioned 'hashing', never heard of it before now. Sounds more like a cookery term but never mind. I would be grateful for any pointers as to where to start.

Cheers!
Pelikal is offline  
Old 10th Sep 2014, 14:37
  #13 (permalink)  
 
Join Date: Aug 2002
Location: Earth
Posts: 3,663
Likes: 0
Received 0 Likes on 0 Posts
> Mixture, you mentioned 'hashing', never heard of it before now. Sounds more like a cookery term but never mind. I would be grateful for any pointers as to where to start.
Well I've never heard of it as a cookery term until now (well, excluding Corned Beef Hash, but that's a dish, not a technique).

But I digress. IT hashes are a very good thing to know about.... they are also file and platform agnostic, i.e. the same file on windows and mac will generate the same hash.

Am I correct in assuming you're in the same league as boguing when it comes to writing scripts and so need some nice pretty buttons to click on instead ?

If so, for Windows, perhaps try the TreeSize Pro from Jam Software I mentioned above.... I used it in the past, but that was in the days before they introduced the hashing function... so download the demo, turn on hashing and see how you get on.

For mac, I'll take a look around....
mixture is offline  
Old 10th Sep 2014, 15:37
  #14 (permalink)  
Thread Starter
 
Join Date: Jan 2006
Location: Dorking
Posts: 491
Received 0 Likes on 0 Posts
Right. It's underway now.

Not as daunting as I thought! I'll come back and write a little guide as to how I did it once all the mistakes have been made.

Thanks to all.
boguing is offline  
Old 10th Sep 2014, 16:24
  #15 (permalink)  
Spoon PPRuNerist & Mad Inistrator
 
Join Date: Sep 2003
Location: Twickenham, home of rugby
Posts: 7,374
Received 234 Likes on 152 Posts
I've found the following hashing software to be very good:

checksum for Windows: BLAKE2, SHA1 or MD5 hash a file, a folder, or a whole drive/volume, with an Explorer right-click.

Not free, but nearly - at least for personal use!

SD
Saab Dastard is offline  
Old 10th Sep 2014, 16:32
  #16 (permalink)  
 
Join Date: Aug 2002
Location: Earth
Posts: 3,663
Likes: 0
Received 0 Likes on 0 Posts
Pelikal,

For something non-script on the Mac, a brief perusal of the internet has come up with "Duplicate Detective" .... never used it myself, but currently on special offer on the App Store for a princely 1.99 .... so worth a punt I would say !
mixture is offline  
Old 10th Sep 2014, 17:30
  #17 (permalink)  
 
Join Date: Apr 2008
Location: Uk
Age: 67
Posts: 218
Likes: 0
Received 0 Likes on 0 Posts
Mixture and SD, thanks very much for pointers. boguing, hope it works out for you! Sorry if I butted in on this thread of yours and offered you no practical advice however I detected similarities to my own situation.

Pelikal is offline  
Old 19th Dec 2014, 11:43
  #18 (permalink)  
 
Join Date: Apr 2008
Location: Uk
Age: 67
Posts: 218
Likes: 0
Received 0 Likes on 0 Posts
boguing,

How did you get on, or is your project still a work in progress?
Pelikal is offline  
Old 19th Dec 2014, 22:04
  #19 (permalink)  
Thread Starter
 
Join Date: Jan 2006
Location: Dorking
Posts: 491
Received 0 Likes on 0 Posts
Damn. I knew somebody would ask eventually....

I got distracted by the arrival of a lovely Nikon Coolscan V film scanner, and got stuck into scanning all of my parents' slides. The aim is to hand the files over to rellies for Christmas, but that's looking optimistic. I've done about 1,200, so getting good at it. Naturally I'm filing these very carefully.

I did get hold of Lightroom, (Mixture's suggestion) and it does everything that I want for the future very well.

As far as the unjumbling of nigh on infinite copies of directories is concerned, I was only really comfortable with Windows Explorer in multiple windows, and just used that and some search tricks that I learned. I've spent far too long with conventional directory structures to give full control to something like Lightroom, although as I probably said earlier, I have been a huge fan of the way that Picasa does the searching and sorting (without moving files) since it first launched years ago.

I haven't finished the job and, as usual with things computer, can't remember quite how I was doing it. I'll return to it after Christmas and when my memory is refreshed I'll pass on the tips.

I hope that gives you a valid excuse for not doing any of yours until next year? If you want to get yourself an extra festive gift I would say that having several 3TB HDDs is very handy for this job. I've used two which are exclusively pictures until the whole thing is finished. (I'm doing all the sorting on one and then copying the work in progress to the other, whilst leaving the originals where they were scattered around the pc and other drives for now).
boguing is offline  
Old 20th Dec 2014, 07:05
  #20 (permalink)  

Plastic PPRuNer
 
Join Date: Sep 2000
Location: Cape Town
Posts: 1,898
Received 0 Likes on 0 Posts
VisiPics (free, Windows) will sort out your image dupes for you. It works (I presume) by hashing and you can set the sensitivity of the comparison. Many other adjustable parameters in Autoselect mode. Excellent utility.

For combining many disparate folders with sub-folders and sub-sub-folders efficiently you can do it manually, write a script (debug debug debug) or turn to a Mac utility called the Big Mean Folder Machine (not free, but worth it). Temporarily fully Share "My Pictures" on your Windows machine on your network and point BMFM on the Mac at the folders you want to combine. RTFM. UNTICK all the "Subfolders" boxes on the Select Source Folders page (unintuitive behaviour) and make sure you do a Copy rather than a Move in case of accidents (you can always remove the original master folders manually later). BMFM will combine all the original multilevel folders into one big multilevel folder, taking care of any name or folder conflicts that it finds. There'll be a bit of tidying up to do afterwards in the final master folder (Aircraft 1, Aircraft 2, Aircraft 3 etc.) but nothing too onerous.
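
For the "write a script" route, a minimal Python sketch of the same idea follows. It is not how BMFM works internally, just an illustration under my own assumptions: `merge_folders` is a made-up name, files are copied rather than moved, identical files already present are skipped, and a different file with the same name is copied under a numbered name instead of overwriting.

```python
import filecmp
import os
import shutil


def merge_folders(sources, destination):
    """Copy every file from `sources` into one tree under `destination`.

    Sub-folder layout is preserved. On a name conflict, identical content is
    skipped and different content is saved as "name (2).ext", "name (3).ext"...
    Returns the list of destination paths that were actually written.
    """
    copied = []
    for source in sources:
        for folder, _subdirs, files in os.walk(source):
            relative = os.path.relpath(folder, source)
            target_dir = os.path.normpath(os.path.join(destination, relative))
            os.makedirs(target_dir, exist_ok=True)
            for name in files:
                origin = os.path.join(folder, name)
                target = os.path.join(target_dir, name)
                stem, ext = os.path.splitext(name)
                counter = 2
                duplicate = False
                while os.path.exists(target):
                    # shallow=False forces a byte-by-byte comparison.
                    if filecmp.cmp(origin, target, shallow=False):
                        duplicate = True
                        break
                    target = os.path.join(target_dir, f"{stem} ({counter}){ext}")
                    counter += 1
                if not duplicate:
                    shutil.copy2(origin, target)  # copy2 keeps timestamps
                    copied.append(target)
    return copied
```

Because it only ever copies, the scattered originals stay where they are until the merged folder has been checked by eye.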

You may want to run VisiPics again on your final master folder 'cause sure as hell some dupes will have crept in.

Sharp knives, don't cut fingers. RTFM. Sure beats doing it by hand.

Mac

Don't forget to re-restrict your Shares afterwards....

[BMFM is a great tool that can perform many other neat tricks - there are many ways of doing this in Windows (apps, scripting) but none as neat and easy as BMFM]

Last edited by Mac the Knife; 20th Dec 2014 at 07:23.
Mac the Knife is offline  
