The Cellar


footfootfoot 01-17-2011 08:44 PM

Tin Eye for your hard drive. Sort of...
 
I'm trying out this new duplicate image finder software that appears to work like TinEye-- it looks at the pixels rather than file name or hash. It is scanning my drive now. It's been about an hour and it has scanned 14,000 out of 24,000 images and found 3700 similars. I am guessing it doesn't "see" RAW files, but it does "see" DNG so there's a case for DNG conversion as a matter of course in asset management.

I'll report back when the thing is done and let you know how well it works and how useful it is. Having something like this working with Picasa or Lightroom would be hella awesome.

awesome duplicate photo finder
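
A minimal sketch of why comparing pixels can catch duplicates that a file hash misses. It is not how this particular program works (that isn't documented here); it uses a simple "average hash" on tiny grayscale grids, and the names `average_hash` and `hamming` plus the sample data are made up for illustration.

```python
import hashlib

def average_hash(pixels):
    """Return a tuple of bits: 1 where a pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)

def hamming(a, b):
    """Count the positions where two equal-length fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical 4x4 grayscale image, and the same image slightly brightened.
original = [[10, 20, 200, 210],
            [15, 25, 205, 215],
            [12, 22, 202, 212],
            [11, 21, 201, 211]]
brightened = [[min(255, p + 5) for p in row] for row in original]

# A byte-level hash differs after any edit to the pixel data...
sha_a = hashlib.sha256(bytes(p for row in original for p in row)).hexdigest()
sha_b = hashlib.sha256(bytes(p for row in brightened for p in row)).hexdigest()

# ...while the pixel fingerprint only records the light/dark pattern.
fp_a = average_hash(original)
fp_b = average_hash(brightened)
```

With this data the two SHA-256 digests differ, while `hamming(fp_a, fp_b)` is 0, so the edited copy would still be flagged as a match.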

footfootfoot 01-17-2011 09:48 PM

OK
notes so far:
Took 1:45 to scan 24,354 JPGs and BMPs. It didn't scan GIF, RAW, or PSD files. It found 5,626 duplicates and rated their similarity from 1% to 100%. Pretty good, but it gave a 14% rating to the same image where one version was full-sized and the other was 750x500. Conversely, it gave a 100% rating to two adjacent frames where the images were clearly different. So it seems to weight size more heavily than image information.
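
One way a comparison can avoid penalizing a size difference is to shrink both images to the same tiny grid before comparing them. The following is a rough sketch under that assumption, not the tool's actual scoring; `shrink`, `fingerprint`, and `similarity_percent` are hypothetical names, and images are plain grids of grayscale values whose dimensions divide evenly by the grid size.

```python
def shrink(pixels, size):
    """Block-average a grayscale grid down to size x size.
    Assumes width and height are multiples of size."""
    h, w = len(pixels), len(pixels[0])
    bh, bw = h // size, w // size
    out = []
    for r in range(size):
        row = []
        for c in range(size):
            block = [pixels[y][x]
                     for y in range(r * bh, (r + 1) * bh)
                     for x in range(c * bw, (c + 1) * bw)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

def fingerprint(pixels, size=2):
    """Bits marking which cells of the shrunken grid are brighter than its mean."""
    small = [p for row in shrink(pixels, size) for p in row]
    mean = sum(small) / len(small)
    return [1 if p > mean else 0 for p in small]

def similarity_percent(a, b, size=2):
    """Percentage of fingerprint bits that agree, regardless of original size."""
    fa, fb = fingerprint(a, size), fingerprint(b, size)
    same = sum(x == y for x, y in zip(fa, fb))
    return 100.0 * same / len(fa)

# Hypothetical data: a 4x4 "full size" image and a 2x2 downsized copy of it.
full = [[10, 10, 200, 200],
        [10, 10, 200, 200],
        [200, 200, 10, 10],
        [200, 200, 10, 10]]
small = shrink(full, 2)
inverted = [[255 - p for p in row] for row in full]
```

Here `similarity_percent(full, small)` comes out at 100.0 even though the two grids differ in size, and the inverted image scores 0.0. A fingerprint this coarse can also rate genuinely different frames as identical, so a real tool would likely need a finer grid or a second check.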

There is a confirm-delete button and a delete-acknowledgement button, and no provision for multiple selections, so deleting a lot of files is a chore.

It only compares two images at a time; I'm not sure what happens when you have five versions of the same image.
You can't save a search, so the next time I do this I have to allow for nearly two hours to scan my drive.
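
A sketch of what "saving a search" could look like: store each file's fingerprint along with its size and modification time, and only recompute for files that are new or changed. This is an assumption about how one might build it, not a feature of the program above; the function names and the JSON layout are invented, and `fingerprint_fn` stands in for whatever routine reads an image.

```python
import json
import os

def load_cache(cache_path):
    """Return the saved scan results, or an empty dict if none exist yet."""
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            return json.load(f)
    return {}

def scan(paths, cache, fingerprint_fn):
    """Fingerprint only files that are new or changed; return how many were recomputed."""
    recomputed = 0
    for path in paths:
        st = os.stat(path)
        entry = cache.get(path)
        if entry and entry["mtime"] == st.st_mtime and entry["size"] == st.st_size:
            continue  # unchanged since last scan: reuse the stored fingerprint
        cache[path] = {"mtime": st.st_mtime, "size": st.st_size,
                       "fingerprint": fingerprint_fn(path)}
        recomputed += 1
    return recomputed

def save_cache(cache_path, cache):
    """Write the scan results to disk as JSON."""
    with open(cache_path, "w") as f:
        json.dump(cache, f)
```

On a second run over an unchanged drive, `scan` would skip every file, so the cost drops to one `os.stat` call per image instead of a full read.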

xoxoxoBruce 01-18-2011 01:52 PM

Sounds a little cumbersome for 24k images, but might be ok for a couple k.

Flint 01-18-2011 10:50 PM

You gonna organize your files better from now on?

footfootfoot 01-19-2011 08:34 AM

I promise I will name all my files and put them in the proper folder.
I promise I will name all my files and put them in the proper folder.
I promise I will name all my files and put them in the proper folder.
I promise I will name all my files and put them in the proper folder.
I promise I will name all my files and put them in the proper folder.
I promise I will name all my files and put them in the proper folder.
I promise I will name all my files and put them in the proper folder.
I promise I will name all my files and put them in the proper folder.

Pete Zicato 01-19-2011 09:14 AM

I can tell you from experience that filing only helps. It does not eliminate the problem.

The major issue is the fallacy of categorization. There isn't anything that fits in only one category.

Shawnee123 01-19-2011 09:26 AM

Tin Eye? Tin Eye?

Braceface!

Clodfobble 01-19-2011 01:10 PM

Quote:

Originally Posted by Pete Zicato
The major issue is the fallacy of categorization. There isn't anything that fits in only one category.

Y'all are weird. I organize by the date the photo was taken, which is part of the filename. Each month gets a folder, and that folder is named by a descriptive list of what's inside, like "01-2011 - Snowballs, Grandma's house, D's Birthday, Birds."

glatt 01-19-2011 01:15 PM

Date is easiest by far. It's how we live our lives. We steadily march through time. Organizing your pictures this way tells a story.

Pete Zicato 01-19-2011 02:54 PM

Quote:

Originally Posted by Clodfobble (Post 706684)
Y'all are weird. I organize by the date the photo was taken, which is part of the filename. Each month gets a folder, and that folder is named by a descriptive list of what's inside, like "01-2011 - Snowballs, Grandma's house, D's Birthday, Birds."

Right. That works. Until you have a photo of a snowball fight at Grandma's House on D's Birthday. And there were birds in the background. Now what?

I realize that this is an uncommon occurrence. But it is well to remember that the uncommon does occur from time to time.

Shawnee123 01-19-2011 03:06 PM

Were the birds falling to the ground, dead?

xoxoxoBruce 01-19-2011 03:13 PM

Quote:

Originally Posted by Clodfobble (Post 706684)
Y'all are weird. I organize by the date the photo was taken, which is part of the filename. Each month gets a folder, and that folder is named by a descriptive list of what's inside, like "01-2011 - Snowballs, Grandma's house, D's Birthday, Birds."

Quote:

Originally Posted by glatt (Post 706685)
Date is easiest by far. It's how we live our lives. We steadily march through time. Organizing your pictures this way tells a story.

You're dealing with family snapshots, 3foot is a photographer. He's got all sorts of photos taken by him, and others, for reference, fun, and occasional profit. That's why he's dealing with 24k+.

Clodfobble 01-19-2011 05:16 PM

Quote:

Originally Posted by Pete Zicato
Until you have a photo of a snowball fight at Grandma's House on D's Birthday. And there were birds in the background. Now what?

Nothing changes. It still goes in the January of 2011 folder, and it's extra easy because the name of the folder doesn't even need anything appended to it.

Perry Winkle 01-20-2011 10:17 AM

My wife uses Lightroom (I think) to organize her photos. It lets her tag (like a category/folder but many-to-many) and create sets of photos.

She's using it to put together a photo album DVD of our wedding pictures. It looks pretty slick to me.
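
The many-to-many idea can be sketched as a tag index, where one photo sits under as many tags as it needs. This is only an illustration of the data structure, not Lightroom's implementation; the filenames, tags, and the helpers `tag` and `with_all` are made up.

```python
from collections import defaultdict

# Maps each tag to the set of photos carrying it.
photos_by_tag = defaultdict(set)

def tag(photo, *tags):
    """Attach any number of tags to one photo."""
    for t in tags:
        photos_by_tag[t].add(photo)

def with_all(*tags):
    """Return the photos that carry every one of the given tags."""
    sets = [photos_by_tag[t] for t in tags]
    return set.intersection(*sets) if sets else set()

# One photo can live under four "categories" at once.
tag("2011-01-15_001.jpg", "snowballs", "grandma's house", "birthday", "birds")
tag("2011-01-20_007.jpg", "birds")
```

With this data, `with_all("birds")` returns both photos and `with_all("birds", "snowballs")` returns only the first, with no need to pick a single folder for it.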

Pete Zicato 01-20-2011 10:17 AM

Quote:

Originally Posted by Clodfobble (Post 706775)
Nothing changes. It still goes in the January of 2011 folder, and it's extra easy because the name of the folder doesn't even need anything appended to it.

I see. I read it too quick and thought you had subfolders.

Still falls into the fallacy of categorization, though. That's how I have ours set up as well, 'cause I think that's about the best you can do in practical terms. But when I went to make a slideshow of Zing1 for her graduation, I had to go through all the folders looking for pictures of her.

Shawnee123 01-20-2011 10:22 AM

Subfodders.

footfootfoot 01-20-2011 10:30 AM

The problem that I have isn't about finding vacation pictures from June 2010 or 2010_06_15. The problem is when I have an image that is 750x500 of a pepper and it has been sized for the web, but I need to find the original image that this was downsized from. In this case, it was a scan that was probably left as "Untitled_01.psd" but maybe not. And where was it left? on one of 4 drives I have. Did it get accidentally moved to a completely unrelated folder?

So I have the low res image and I ask the program to search for all instances of that image. If it works as I'd like it to, it would scan all my drives and say
"You'll never guess where I found that file!"

What Pete and Clod are talking about is not really a concern of most pro photographers. What should happen with proper DAM is that when I bring images into the database they all get keywords, later on when I need to search for photos I can, in theory, search by keywords, or exif data (lens, focal length, f:stop, date, camera make and model, etc) but again, this only works if I remembered to do this. And it only works for folders that are in the database, it doesn't work if an image got dragged into some wacky folder like a C:Hobos/Ohio/notdeadyet/stillstinky.
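
The search-by-example described above, where a low-res image is the query and the program reports every place a matching image lives, could be sketched as a lookup against an index of pixel fingerprints. Everything here is hypothetical: the paths, the 8-bit fingerprints, and the `find_matches` helper. It also assumes some decoder can read every format, including PSD, which the tool in this thread could not.

```python
def find_matches(query_fp, index, max_distance=1):
    """Return (distance, path) pairs for indexed images close to the query, nearest first."""
    hits = []
    for path, fp in index.items():
        # Hamming distance: number of fingerprint bits that differ.
        distance = sum(a != b for a, b in zip(query_fp, fp))
        if distance <= max_distance:
            hits.append((distance, path))
    return sorted(hits)

# Hypothetical index built by an earlier scan of several drives.
index = {
    "D:/scans/Untitled_01.psd": (0, 1, 1, 0, 1, 0, 0, 1),
    "E:/web/pepper_750x500.jpg": (0, 1, 1, 0, 1, 0, 0, 1),
    "F:/misc/sunset.jpg": (1, 1, 0, 0, 0, 1, 1, 0),
}
pepper_lowres = (0, 1, 1, 0, 1, 0, 0, 1)
matches = find_matches(pepper_lowres, index)
```

Because the lookup keys on pixels rather than names or folders, it would not matter that the original was left as "Untitled_01.psd" or dragged somewhere unrelated.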

Griff 01-22-2011 10:55 AM

Quote:

Originally Posted by Clodfobble (Post 706684)
Y'all are weird. I organize by the date the photo was taken, which is part of the filename. Each month gets a folder, and that folder is named by a descriptive list of what's inside, like "01-2011 - Snowballs, Grandma's house, D's Birthday, Birds."

You and Pete would so hit it off. Maybe in another twenty years, I'll be rational.:)

Clodfobble 01-24-2011 12:01 AM

I'd hit it off with any woman who makes her own goat yogurt!

BigV 01-26-2011 04:51 PM

Quote:

Originally Posted by footfootfoot (Post 706977)
The problem that I have isn't about finding vacation pictures from June 2010 or 2010_06_15. The problem is when I have an image that is 750x500 of a pepper and it has been sized for the web, but I need to find the original image that this was downsized from. In this case, it was a scan that was probably left as "Untitled_01.psd" but maybe not. And where was it left? on one of 4 drives I have. Did it get accidentally moved to a completely unrelated folder?

So I have the low res image and I ask the program to search for all instances of that image. If it works as I'd like it to, it would scan all my drives and say
"You'll never guess where I found that file!"

What Pete and Clod are talking about is not really a concern of most pro photographers. What should happen with proper DAM is that when I bring images into the database they all get keywords, later on when I need to search for photos I can, in theory, search by keywords, or exif data (lens, focal length, f:stop, date, camera make and model, etc) but again, this only works if I remembered to do this. And it only works for folders that are in the database, it doesn't work if an image got dragged into some wacky folder like a C:Hobos/Ohio/notdeadyet/stillstinky.

Hello my friend.

PZ and CF are talking about how to organize pictures for the way I take pictures. I get that. But some of what we all are doing here *does* apply to you. We've discussed this before; you've acknowledged this before. The key: Picasa is your friend.

Did you know you can search your Picasa database for "iso: 80" and it will return only those images? Or camera model? NOT ALL of the EXIF data is searchable. Lots of it is though. Since you're principally concerned with searching for your photos, I think Google/Picasa is good company to keep.

Not only is much of the EXIF data available to you, but any TAGS you associate with a given image, or folder of images, or album of images... those tags are searchable too. Here is a good starter discussion on the search capabilities of Picasa. Forgive me, but you'll have to manually scan for the Search section (I know--ironic).
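
To make the idea concrete, here is a toy version of searching a photo catalog by EXIF fields and tags. It is not Picasa's query syntax or storage format; the records, field names, and the `search` helper are assumptions for illustration only.

```python
# Hypothetical catalog records: path, tags, and a few EXIF fields.
catalog = [
    {"path": "2010/06/IMG_0412.dng", "tags": {"pepper", "studio"},
     "exif": {"iso": 80, "model": "CameraA"}},
    {"path": "2010/06/IMG_0413.dng", "tags": {"vacation"},
     "exif": {"iso": 400, "model": "CameraB"}},
]

def search(catalog, tag=None, **exif):
    """Return paths whose record has the tag (if given) and all given EXIF values."""
    results = []
    for record in catalog:
        if tag is not None and tag not in record["tags"]:
            continue
        if all(record["exif"].get(k) == v for k, v in exif.items()):
            results.append(record["path"])
    return results
```

For example, `search(catalog, iso=80)` returns only the first record, and `search(catalog, tag="vacation", model="CameraB")` returns only the second. Anything never imported into the catalog simply is not there to be found.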

Now, to get good results from Picasa, you should set it up nicely. Like, not having your images spread across four different drives. Space is pretty cheap, I'm *sure* you can find a single drive to host your library. Don't forget to buy two so you can do the backups, of course.

Another idea I had was to find a naming convention for resized images. I only have one resize size: I make my images 800x600 for posting here, keep the same image name, and append "sm" to the end of the filename to indicate "small". You might well tag the original import as part of your workflow with the string "orig"; then you just scan for originals. And some discipline on a folder structure helps too.
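
A small sketch of how the "sm" convention lets a script map a resized file back to its original name. The helper `original_name` and the sample filenames are hypothetical, and it assumes the suffix is appended directly before the extension.

```python
import os

def original_name(filename, suffix="sm"):
    """Map 'IMG_0412sm.jpg' back to 'IMG_0412.jpg'; return None if the suffix is absent."""
    stem, ext = os.path.splitext(filename)
    if stem.endswith(suffix) and len(stem) > len(suffix):
        return stem[:-len(suffix)] + ext
    return None
```

A bare "sm" can misfire on names that happen to end that way: `original_name("prism.jpg")` returns "pri.jpg". A more distinctive suffix such as "_sm" (an assumption, not part of the convention described above) would avoid that.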

I haven't even gotten to "experimental" options! I love Picasa. It manages my large library (85k images out of 142k files) nicely.

footfootfoot 01-26-2011 04:54 PM

Yeah, I have all that in Picasa and Lightroom. The real thing I occasionally need to do is find another version of an image whose EXIF data doesn't exist. Like I show you a photo and say go find every version of this image, all the edited iterations.
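
If a tool only reports matches two images at a time, "every version of this image" can still be recovered by merging the pairs into groups. A minimal sketch using union-find, assuming the pairwise matches are already known; `group_versions` and the filenames are invented for the example.

```python
def group_versions(pairs):
    """Merge pairwise matches into groups of related files using union-find."""
    parent = {}

    def find(x):
        # Follow parent links to the group's root, shortening the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra  # join the two groups

    groups = {}
    for name in parent:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical pairwise matches reported by a duplicate finder.
pairs = [("pepper.psd", "pepper_crop.tif"),
         ("pepper_crop.tif", "pepper_web.jpg"),
         ("sunset.dng", "sunset_bw.jpg")]
versions = group_versions(pairs)
```

Here the three pepper files end up in one group even though "pepper.psd" and "pepper_web.jpg" were never compared directly.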

BigV 01-26-2011 05:10 PM

oohhhhhhh....

For a mountain of images you already have on your system.....


Well, sounds like tin eye is a better tool for this process. If you have the originals, ... can't you make your "albums" of originals? then just make your smaller ones on demand? And when you do make that smaller one, maintain enough similarity of filename so you can search on it?

For going forward, I have some ideas. For retrospective access to your already stored collections... I got nuffin.

Yet.


All times are GMT -5.