SongKong and Jaikoz Music Tagger Community Forum

Why has SongKong split the album, and why am I still getting duplicates?

Regarding Delete Duplicates, until I have a Fix Songs report that shows details of the songs you are trying to find duplicates for I cannot help you.

I have just sent you the Fix Songs and Delete Duplicates reports by email. :wink:

You are right, indeed. So what should I do now to get rid of the dupes? As I said before, I don’t mind having 11 tracks here and 5 there. The only important thing is to have a single, best occurrence of each album when I use my music player.

So, aside from renaming the albums so they no longer include the format (which I did for this Bogdan folder, by the way, as I guessed by myself that it was a problem), how do I make sure the CD version gets removed in favour of the digital version?

Thanks for the report, okay, found the problem with Delete Duplicates. With the Same MusicBrainz song and same album (any version) option it is looking for songs with the same MBRecordingId (same song) and the same MBReleaseGroup (same album, any version).
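
As a rough illustration of that criterion (a sketch only, not SongKong's actual code; the `Song` class, its field names and `find_duplicates` are hypothetical), songs can be grouped on the pair of MusicBrainz ids, and any group with more than one member holds duplicates:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Song:
    path: str
    mb_recording_id: str      # MusicBrainz recording id (same song)
    mb_release_group_id: str  # MusicBrainz release group id (same album, any version)


def find_duplicates(songs):
    """Group songs that share both the recording id and the release group id."""
    groups = defaultdict(list)
    for song in songs:
        groups[(song.mb_recording_id, song.mb_release_group_id)].append(song)
    # Only groups with more than one song contain duplicates.
    return [group for group in groups.values() if len(group) > 1]


songs = [
    Song("cd/01.flac", "rec-a", "rg-1"),
    Song("digital/01.flac", "rec-b", "rg-1"),  # same album, different recording id
    Song("cd/02.flac", "rec-c", "rg-1"),
    Song("digital/02.flac", "rec-c", "rg-1"),  # same album, same recording id
]
dupes = find_duplicates(songs)
```

With this made-up data only the two `02` files end up in a duplicate group; the two `01` files share a release group but not a recording id, so they are not paired.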

However, in this confusing case (with all songs having the same name) it turns out that all songs have different MBRecordingIds. This means that either they are all different versions of the song or, more probably, that they are the same versions, but when the second version of the song was added to MusicBrainz it was not merged with the existing one but added as a new song.

So this will happen sometimes. What you could do as a one-off is try the Sounds the same only option. It will detect duplicates of songs that are sonically the same, but it will not prevent songs from different albums (compilation, original album) being deleted, so you should run it in preview first and check the results.

Actually, only last week I raised the Add Sounds the Same to Delete Duplicates by Metadata Only issue, as it occurred to me that it might be useful to use a combination of Acoustids and textual metadata as a safer alternative to Same song and same album (metadata only) for cases where there is no MusicBrainz or Discogs match. This would also work in your case, where there is a MusicBrainz match but the MBRecordingIds are different.

Just occurred to me I can see the acoustids in the report

So we see both 01 tracks have same acoustid so they should be matched as duplicates

But many do not have Acoustids, and in Fix Songs you don’t have Force Acoustid Fingerprints even if already matched enabled (I thought you did now?)

So you need to rerun Fix Songs with this enabled and then run Delete Duplicates

Ah… a never-ending story, here we are! :smiley:

We have had this discussion before, and you did say you had enabled this.

You can run Delete Duplicates and use Sounds the Same without doing this; however, when a song doesn’t have an Acoustid fingerprint it generates one (this can be done locally) and uses that as a key. So it would identify two songs that were sonically the same if neither had a fingerprint before running Delete Duplicates.

However, in your case some songs already have an Acoustid fingerprint and an Acoustid. We have to go to the Acoustid server to get Acoustids, so we don’t do this in Delete Duplicates because of the slowdown it would cause, since we don’t store the fingerprints/Acoustids that are created on the fly in Delete Duplicates. So it would not find the duplicate, because it would be comparing an Acoustid fingerprint to an Acoustid.
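
The key mismatch can be pictured with a small sketch. This is an assumption about the behaviour described above, not SongKong's real code: the `Track` class, the `sounds_the_same_key` function and all values are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Track:
    path: str
    acoustid: Optional[str]  # stored Acoustid, if Fix Songs already looked one up
    fingerprint: str         # fingerprint; can always be generated locally


def sounds_the_same_key(track):
    """Assumed key choice: the stored Acoustid if present, else the fingerprint."""
    return track.acoustid if track.acoustid is not None else track.fingerprint


cd = Track("cd/02.flac", acoustid="acoustid-1", fingerprint="fp-aaa1")
digital = Track("digital/02.flac", acoustid=None, fingerprint="fp-aaa1")

# One key is an Acoustid and the other is a fingerprint, so they never compare
# equal, even though both files carry the same audio.
keys_match = sounds_the_same_key(cd) == sounds_the_same_key(digital)
```

Rerunning Fix Songs with forced fingerprinting gives both files a stored Acoustid, so both keys are then of the same kind.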

Now you might ask, why not just compare fingerprints then? The trouble is there is not a 1:1 correspondence. There can be multiple Acoustid fingerprints for one Acoustid: when a fingerprint is only infinitesimally different from another fingerprint, they are mapped to the same Acoustid. So if we only matched on fingerprints, that would also prevent us finding some duplicates (in cases where the fingerprints were different but the Acoustid was the same).
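
To make the many-to-one relationship concrete, here is a minimal sketch. The fingerprint strings, the ids and the `ACOUSTID_LOOKUP` table are made up; in reality the mapping from fingerprint to Acoustid is held by the Acoustid server, and the function names are hypothetical.

```python
# Hypothetical lookup: several slightly different fingerprints map to one Acoustid.
ACOUSTID_LOOKUP = {
    "fp-aaa1": "acoustid-1",
    "fp-aaa2": "acoustid-1",  # near-identical fingerprint, same Acoustid
    "fp-bbb1": "acoustid-2",
}


def same_by_fingerprint(fp_a, fp_b):
    """Exact fingerprint comparison; needs no server lookup."""
    return fp_a == fp_b


def same_by_acoustid(fp_a, fp_b, lookup=ACOUSTID_LOOKUP):
    """Compare the Acoustids that the two fingerprints resolve to."""
    id_a = lookup.get(fp_a)
    id_b = lookup.get(fp_b)
    return id_a is not None and id_a == id_b
```

Here `fp-aaa1` and `fp-aaa2` are not equal as fingerprints, yet they resolve to the same Acoustid, which is the case a fingerprint-only comparison would miss.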

I’ve run a new fix on the Bogdan folder, with these options:

Basic
Ignore songs previously checked that could not be matched: No
For songs already fully matched : Update Metadata only
Update Artwork: Yes
Update Genres: Yes
Update Mood and other acoustic attributes such as BPM: No
Only allow match if all songs in grouping match to one album: Yes
Only allow match if all tracks in album were matched: No
Match
Force Acoustic fingerprints even if already matched: Yes
Search for a MusicBrainz match: Yes
Update from Discogs: Yes
Search for a Discogs match: Yes
Ignore metadata derived from filename when matching individual songs: No
All existing folders represent a single album: No
Preferred media formats: Digital Media,
Preferred Release Date: Latest Release Date
Preferred Release Countries:
Album Artwork
Ignore artwork smaller than this (pixels): 200
Resize artwork if dimensions larger than (pixels): 1200
Find Front Cover Artwork: Yes
Save front cover art embedded within song file: Replace if empty
Save front cover art to filesystem: Yes, and overwrite existing artwork files in folder
Saved front cover art filename: folder
Find Back Cover Artwork: No
Other Artwork
Other Artwork Options: No
Ignore artwork smaller than this (pixels): 200
Resize artwork if dimensions larger than (pixels): 800
Genres
Genre: Always replace values From Discogs Style , Max no 1
Grouping: Never Replace From Discogs Style , Max no 1
Format
Never modify or add these fields:
Only modify these fields if empty: BPM, Key, Mood
Allow changes to songs existing metadata fields if Song Only match: Yes
Romanize non-Latin script artist names wherever possible: Yes
Use standard Artist name instead of name displayed on cover: Yes
Use Recording Artist instead of Track Artist: No
Use standard song title instead of title displayed on his release: No
When tracks contains featured artists: Only use main artist in the artist field and discard other artists
When albums contains featured album artists: Only use main artist in the album artist field and discard other artists
Album Format
Use standard Release title instead of title displayed on this release: Yes
Use Original Release Date: No
Use Year instead of full dates for Date fields: Yes
Add EP, Single, Compilation, Live and Remix release types to release title: No
Add Audio Format to release title: No
Add [HD] to album title for High Definition albums: No
Add RoonAlbumTag to albums identified as box sets: No
Multi Disc Releases : Never add Disc No information to the release title
Classical
Identify Classical releases: Classical albums identified by SongKong
Apply these options to releases identified as Classical: Yes
Add Composers to start of Album Title : Yes
Remove Composer from Album Artist : No
Add Composer to start of Overall Work, this is used by MinimServer for indexing Classical Works: No
Add Composer to start of MinimServer Group, this is used by MinimServer for playing Classical Works: No
Use only Artist Type to categorise groups as ensembles, choirs or orchestras: No
Shorten Song Title to the Movement: No
Copy Work to Grouping field: MP3 and AIF (iTunes)
Opera Work format: Use MinimServer format (Work/Overall Work)
Track Artist: Composer
Never modify or add these fields:
Only modify these fields if empty:
Save
MP3 Metatag Version: Same as or v24
Disc / Track number padding: Pad with up to one zero
Save Vorbis/Flac AlbumArtist as: ALBUMARTIST
After SongKong has finished processing songs: Always open Report in browser
Add technical roles with own field to InvolvedPeople field:
Save multiple values as separate fields:
Save songs so they work best with iTunes: No

Then I ran a Delete Duplicates task. Still, no tracks were deleted.

I then ran it again using “rematch” instead of “update metadata only”:

It’s better, but boku mo wakaran still has dupes that are not identified as duplicates. ^^

It should do it for songs already matched even with Update Metadata only; I just checked the code and it appears OK.

I can see from the previous report that most songs share duplicate Acoustids; are you sure they were not already deleted in the first run?

Best to just send me both the Fix Songs and Delete Duplicates reports.

I’ve sent the reports for the last two tasks.

Hi, you only sent me the last Delete Duplicates report, and in that one you have Song is a duplicate if has same: set to Same MusicBrainz song and same album (any version), so this would not make use of Acoustids.

I meant for you to send me the last two Delete Duplicates reports and the last Fix Songs report, but I can see my message wasn’t that clear.

OK, I’ve sent you the last duplicates run as well, which was run using “Song is a duplicate if has same: Same MusicBrainz song and same album (any version) and sounds the same”.

You will get it by email.

Right, but that is not what I asked you to run, and it is not going to work because all the songs on that particular album have different MBRecordingIds. The songs you consider the same therefore have different MBRecordingIds and will not be considered duplicates under the Same MusicBrainz Song part of the criteria. In this case you can only find duplicates using the Sounds the Same criteria.

This is what I said in an earlier post:

However, in this confusing case (with all songs having the same name) it turns out that all songs have different MBRecordingIds. This means that either they are all different versions of the song or, more probably, that they are the same versions, but when the second version of the song was added to MusicBrainz it was not merged with the existing one but added as a new song.

So this will happen sometimes. What you could do as a one-off is try the Sounds the same only option. It will detect duplicates of songs that are sonically the same, but it will not prevent songs from different albums (compilation, original album) being deleted, so you should run it in preview first and check the results.

But it will not work for the first track because, at only 5 seconds long, it is too short to identify safely by Acoustid.

Yeah, well, why it got merged like this remains a mystery, and I understand I could only rely on sounds the same, but damn, running this on a large set of tracks sounds scary as hell.

Do you mean why are both copies of the same album in the same folder?

The short answer is because this is how your rename mask has said to rename the files. Now, there is some logic in the code (which I should revisit) to rename files if there is a clash. But one version has been matched to this version of the release, https://musicbrainz.org/release/f20fda28-957f-4bc5-964a-77d1bf5197d3, in which all songs have the title untitled, and the other has not, so there is no filename clash to prevent all the songs being put in one folder.

I was only recommending running it as a one-off on this artist, and in preview only first, so I agree you should not run it on all your files; I was just trying to explain to you why the duplicates were not found.

I ran it on a single artist. It removed all the duplicates, plus some tracks that appeared on both EPs and albums. So it does the job, but does it “too well”.

I’ll resnatch the impacted albums.

Is there something we can do in SongKong to allow deletion of tracks located in the SAME folder only? This would make sure we can use this Acoustid setting while NOT deleting tracks that are located in separate folders. It would then only remove the sounds-the-same “duplicates” that are located in the same directory.

Also, was “MusicBrainz album / any version / sounds the same” not supposed to delete these duplicate tracks based on their Acoustid as well?

I did warn you earlier:

I usually run Delete Duplicates in preview mode first.

I think in the general case the solution is the combined Same song and same album (metadata only) and sounds the same option that I mentioned earlier, which provides a safer method; however, that would not work for this particular case because the song titles are different. So maybe, as you suggest, a new option that restricted finding duplicates to within a folder would work, similar to the Find Duplicates within Format option; we could call it the Find Duplicates within same Folder option.

Yes, but because their MusicBrainz recording ids were different it failed. Although all songs had the same MusicBrainz Release Group Id, the duplicates did not have the same MBRecordingId, so they could not pass the Same MusicBrainz Song check first.

Now, instead of using MBRecordingId we could look at the title field, but then we are relying on textual metadata, which is less safe and wouldn’t work in this case anyway. It would also be confusing for the user: since we have these clearly defined MusicBrainz Ids we should use them, whilst providing less robust options for finding those hard-to-find duplicates if the user wants to accept the greater risk.

Ok raised https://jthink.atlassian.net/browse/SONGKONG-2517

Cool! I assume this should be doable in this way:

  1. You add a general folder (let’s say the letter A folder).
  2. SongKong processes the subfolders and only takes care of duplicates found within the same folder.

nice one ! :slight_smile:

Yes, that is correct.
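
The folder-restricted behaviour discussed above (raised as SONGKONG-2517) could be sketched roughly as follows. This is only an illustration of the idea under stated assumptions, not how SongKong implements or will implement it: the function name, the `(path, acoustid)` input shape and the sample paths are all hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def duplicates_within_folders(tracks):
    """tracks: iterable of (path, acoustid) pairs.

    Returns lists of paths that share both an Acoustid and a parent folder.
    """
    groups = defaultdict(list)
    for path, acoustid in tracks:
        if acoustid is None:
            continue  # nothing to compare on
        folder = PurePosixPath(path).parent
        groups[(folder, acoustid)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]


tracks = [
    ("A/Artist/Album/01 - untitled.flac", "acoustid-1"),
    ("A/Artist/Album/01 - track one.flac", "acoustid-1"),  # same folder: duplicate
    ("A/Artist/EP/03 - track one.flac", "acoustid-1"),     # other folder: kept
]
result = duplicates_within_folders(tracks)
```

Because the folder is part of the grouping key, the EP copy is never compared against the album copies, which is the safeguard asked for in this thread.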