Deleted
Deleted Member
Posts: 0
|
Post by Deleted on Nov 30, 2017 21:43:26 GMT
I have a directory tree which has around 15,000 photographs in it. They are inconsistently named, the directory structure makes no sense and there are a shitload of duplicates. Using Easy Duplicate Finder, CCleaner Duplicates and Auslogics Duplicate Finder I have got through around 2,000 duplicates. But I reckon I still have 3,000 more, or at least something of that magnitude, where the duplicate could be somewhere entirely different in the directory tree, and in fact there could be 3 or 4 duplicates of one picture.
Here is an example of a duplicate that I cannot get a program to find automatically. These *are* the same picture. As you can see, they are virtually identical, but not quite. For some reason I cannot fathom, they differ in size by 40 bytes, though the size on disk is the same. The dates are also slightly different, with the exception of the Created / taken dates, which are identical.
Any idea how I could automatically find these duplicates? Remember, they will be named differently and in different directories. The thought of having to do it manually fills me with fear, if it is even possible.
If I have to do it manually, then the only way I can see is to search for the files within File Explorer, so that I see them without a directory structure, sort them by creation date, and then scroll through in "Extra Large Icon" view, visually looking for duplicates and deleting them. It is so unlikely that two different files could have identical creation dates to the second that I am prepared to discount that possibility.
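The manual plan in this post (ignore the directory structure, sort by creation date, treat identical timestamps as duplicates) can be roughed out in a few lines of Python. This is only a sketch under assumptions: `group_by_timestamp`, `created_second` and `IMAGE_SUFFIXES` are names made up for illustration, and it reads the file system's creation time rather than the EXIF "taken" date, which would need an image library.

```python
import os
from collections import defaultdict
from pathlib import Path

# Hypothetical list of extensions to treat as photographs.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp", ".gif"}


def created_second(path):
    """Creation time truncated to the whole second.

    Assumption: st_birthtime where the platform provides it, otherwise
    st_ctime (creation time on Windows, but the last metadata change on
    Linux, so check what your system reports).
    """
    info = os.stat(path)
    try:
        stamp = info.st_birthtime
    except AttributeError:
        stamp = info.st_ctime
    return int(stamp)


def group_by_timestamp(root, timestamp_of=created_second):
    """Return lists of image paths under root that share a timestamp."""
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            groups[timestamp_of(path)].append(path)
    # Only timestamps shared by two or more files are duplicate candidates.
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Each returned group is a set of candidates to eyeball, not proof of duplication: files copied in one batch can pick up the same creation second without being the same picture.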
|
|
|
Post by Hofmeister on Nov 30, 2017 21:57:28 GMT
As you say, they have the same created date AND time, so you know they are duplicates. I use Clone Spy, which creates two pools of same or similar files (based on lots of criteria), and then you can delete files from one of the pools using other criteria.
|
|
Post by Deleted on Nov 30, 2017 22:01:07 GMT
I don't know that one, I'll go and have a play. Thanks.
|
|
|
Post by tyrednexited on Nov 30, 2017 22:10:44 GMT
|
|
Post by Deleted on Nov 30, 2017 22:18:32 GMT
In Clone Spy I can't see how to compare files using creation date. Do you know how?
I'll have a look at the other one from TnE.
|
|
Post by Deleted on Nov 30, 2017 22:25:50 GMT
Early signs from Visipics are good, it's identifying duplicates that the others missed. I have high hopes.
Though I am not sure it's going to be easy to work out which file is the best to keep. But step by step...
|
|
|
Post by Hofmeister on Nov 30, 2017 22:26:57 GMT
Clone Spy builds its own checksums, and creation time and date is one of the things it uses. I'm sure there is an obvious pattern to where these files end up that you can use to create the two pools for comparison; there usually is. It's a bugger, I know. Been there, done that, and had to use multiple tools and some manual graft to get it sorted. There is no magic bullet.
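For anyone wondering what a checksum comparison looks like in practice, here is a minimal Python sketch of plain content hashing. It is not Clone Spy's actual algorithm, which this thread does not document; `file_digest` and `exact_duplicates` are illustrative names, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of the file's bytes, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def exact_duplicates(root):
    """Group files under root whose contents are byte-for-byte identical."""
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    by_digest = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have an exact duplicate
        for path in paths:
            by_digest[file_digest(path)].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

A content checksum like this finds renamed copies wherever they sit in the tree, but it would miss the pair from the first post: a 40-byte difference gives a different size and a different digest, even if the picture is the same.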
|
|
Post by Deleted on Nov 30, 2017 22:36:47 GMT
I'll have another look at it, and pay proper grown-up attention this time.
I think Visipics will get at least some of it done. However, it is bastard slow, and I just rechecked: there are actually 47,000 pictures, and so far it has checked 392. Also, I think it will take multiple runs, loosening the criteria a bit each go. [sigh]
|
|
|
Post by Hofmeister on Nov 30, 2017 22:38:45 GMT
Downloaded it and giving it a whirl. Be careful, it's identifying "similar" pictures as dupes (set on loose). Edit: some of them are not very similar at all. It appears to be using tonal regions as comparators, similar to forensic porn finders (they search mostly for skin tones).
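To illustrate why a "loose" setting pulls in pictures that are not really the same, here is a small Python sketch of one common similarity technique, an average hash. This is not a description of how Visipics works internally; it assumes each image has already been decoded and shrunk to a small grayscale grid (say 8x8) by an image library, which is not shown, and all names are hypothetical.

```python
def average_hash(pixels):
    """Return one bit per cell: 1 where the cell is brighter than the grid mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(hash_a, hash_b):
    """Count the positions where two equal-length hashes disagree."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must come from grids of the same size")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def looks_similar(pixels_a, pixels_b, max_differing_bits=0):
    """True when the two grids' hashes differ in at most max_differing_bits."""
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_differing_bits
```

With `max_differing_bits=0` only pictures with the same light/dark layout match; raising it is the "loose" end of the dial, and the higher it goes, the more unrelated pictures with broadly similar tonal regions get reported as dupes.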
|
|
|
Post by Hofmeister on Nov 30, 2017 22:41:54 GMT
Seems to compare at 4 mins per thousand images on my computer.
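As a rough guide, that rate can be scaled up to the 47,000 pictures mentioned earlier in the thread. The sketch below assumes the rate stays constant as the collection grows, which may well not hold for a similarity comparison; `estimated_minutes` is just an illustrative helper.

```python
def estimated_minutes(image_count, minutes_per_thousand=4):
    """Scale a per-thousand-images rate linearly to a whole collection."""
    return image_count / 1000 * minutes_per_thousand


minutes = estimated_minutes(47_000)
hours, remainder = divmod(minutes, 60)
print(f"about {int(hours)} h {int(remainder)} min")  # about 3 h 8 min
```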
|
|
Post by Deleted on Nov 30, 2017 22:42:41 GMT
Having re-examined, I think Clone Spy will either not do it, or it's too complex for me. However, in the Clone Spy FAQ was this...
Why doesn't CloneSpy detect my "duplicate" images/songs? Please be aware that images might look the same although the files are different. If you want to find "similar" images then you need an other program. Maybe one of these might help:
d'peg − www.somewareonthe.net
Doublepics − www.doublepics.de
Now that second one is in German, but the first looks promising if Visipics can't hack it.
|
|
Post by Deleted on Nov 30, 2017 22:44:56 GMT
>seems to compare at 4 mins per thousand images on my computer.
It is moving faster now, though not that fast, I don't think. Perhaps looking across multiple disks is slowing it up some. I'll think about what I might do about that.
|
|
|
Post by tyrednexited on Nov 30, 2017 22:55:10 GMT
...if it were me, I wouldn't delete anything. I'd simply build a tree free of duplicates using 'best guess', and store the rejected 'duplicates' elsewhere...
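The "store the rejects elsewhere" approach is easy to script once a tool has produced a list of files to reject. A minimal Python sketch, assuming a hypothetical `quarantine` helper and a quarantine folder that sits outside the photo tree:

```python
import shutil
from pathlib import Path


def quarantine(duplicate, tree_root, quarantine_root):
    """Move one rejected file out of the tree, keeping its relative path."""
    duplicate = Path(duplicate)
    relative = duplicate.relative_to(tree_root)
    target = Path(quarantine_root) / relative
    target.parent.mkdir(parents=True, exist_ok=True)
    if target.exists():
        # Never overwrite: two rejects with the same path need a human look.
        raise FileExistsError(f"refusing to overwrite {target}")
    shutil.move(str(duplicate), str(target))
    return target
```

Because the relative path is preserved, any file can be put back where it came from if the 'best guess' turns out to be wrong, and nothing is deleted until the quarantine folder has been checked.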
|
|
Post by Deleted on Nov 30, 2017 23:04:27 GMT
Visipics generates a list of duplicates. Each file shows a thumbnail, so you can pretty quickly see whether they are the same picture (or not). It also shows which directory each is in, its filename, file size, resolution and type.
I shall move them elsewhere as a first stage, and then do some checking. But I'm pretty sure I shall be confident enough to delete them. It has identified 1063 duplicated files, some of them duplicated more than once, resulting so far in 2436 'unwanted' files.
|
|
Post by Deleted on Nov 30, 2017 23:05:28 GMT
But it has already made so much of a difference that you can claim a couple of virtual pints whenever you're ready.
Thank you.
I shall keep working at Clone Spy though, I think it's pretty powerful once you get the hang of it.
|
|