Absolutely!! Don't mess with the data!!!
I've been working on and off with some guys who are developing software that can go through a zillion scanned documents and images and come up with a flexible tagged index - but it's a very specialised field of endeavour, the tags are limited, AND it requires a lot of "training".
They're getting there, but it's a slow process.