The Cellar


footfootfoot 02-23-2011 11:40 AM

file wrangling help requested
 
I've got a problem with one of my backup drives due to poor computer hygiene.

There are about 50,000 files on it, of which a huge number are duplicates residing in a series of nested folders.

For example,
L:\music\elvis costello\my aim is true has tracks 1,2,4,5,7,8,9

and
L:\music\itunes music\~E\elvis costello\my aim is true has tracks 1,2,3,6,10,11

and
L:\music\MISC Vinyl\elvis costello\my aim is true has a live version of track 3 with the same file name as track 3 above, but a different bit rate and time.

How can I get all these organized, deleting the dupes without deleting the similarly named songs?

I've tried using Digital Volcano's Duplicate Cleaner with some success, but it still misses dupes when using MD5 or byte-by-byte comparison, and it comes up with false positives when there are multiple versions of songs, which is especially a problem with greatest hits and compilation discs.

Any suggestions?

glatt 02-23-2011 11:49 AM

You need an intern. Post an opening at your local college. Spring semester is internship time.

Perry Winkle 02-23-2011 12:00 PM

I don't know of any out-of-the-box software.

If you search around for a Python or Ruby script, I'm sure you could find one.

Basically you want something that will create an index of all of your music based on something like an md5 checksum of the file (the filename doesn't matter). Then it should remove duplicates, and maybe move them all to a consistent location.
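
The indexing idea described above could be sketched roughly as follows. This is only an illustration, not the script linked in the thread: the function names `file_md5` and `find_duplicates` are made up for this example, and `L:\music` is simply the path from the first post. It groups files by the MD5 checksum of their contents and ignores filenames, so it only reports files that are byte-for-byte identical. It lists duplicates and deletes nothing.

```python
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Index every file under root by checksum; keep only groups of two or more."""
    index = defaultdict(list)
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            index[file_md5(path)].append(path)
    return {digest: sorted(paths) for digest, paths in index.items() if len(paths) > 1}


if __name__ == "__main__":
    # Assumed location, taken from the example paths in the first post.
    for digest, paths in find_duplicates(r"L:\music").items():
        print(digest)
        for path in paths:
            print("   ", path)
```

Because only file contents are compared, two different recordings that share a filename end up in different groups. The flip side is that two copies of the same track whose embedded tags differ have different checksums and will not be reported as duplicates.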

Perry Winkle 02-23-2011 12:02 PM

This will give you a list of all of the duplicates.

footfootfoot 02-23-2011 12:34 PM

OK, I will get an intern to run that script for me.

Gravdigr 03-01-2011 01:38 AM

Hah!


All times are GMT -5.
