SongKong and Jaikoz Music Tagger Community Forum

Why has SongKong split an album, and why am I still getting duplicates?

OK, here is the full story; the support files will follow. Let's take the artist Bogdan Raczynski as an example. His albums are located in their respective “Bogdan Raczynski” folder, which is the one I am processing for these tests.

First, let's open this folder to analyse its content using MusicBrainz Picard:

You can see we have two copies of the Boku Mo Wakaran album here, two of Alright!, and two of Samurai Math Beats, meaning Picard is actually identifying these albums correctly.

Now, I run the following tasks on this specific artist folder, in this order:

  1. fix tracks
  2. delete duplicates
  3. rename

  1. Fix tracks:


We can clearly see that SongKong merged Boku Mo Wakaran, merged Samurai Math Beats, and broke Alright! by putting 11 tracks identified as one album in one folder and 5 in another (while for the same artist folder Picard split it correctly as 8/8 tracks).

  2. Delete duplicates:

For this run, I eventually selected “same MB album, ANY version”, and here are the results:

  3. Rename files:

At the end of the process, files are put in the same folder. EG: /Bogdan Raczynski/Boku Mo Wakaran/ contains both the 26 files named untitled and the ones named boku mo wakaran #, …

We can clearly see that the MusicBrainz ID is different for each version of the album inside SongKong itself:

I am pretty confused: why is Picard identifying these albums as separate albums, and why did SongKong merge them into a single folder? Why isn’t it getting rid of these extra files, keeping the best possible quality, or the most recently created folder/file (if they are exactly the same quality)?

I must be missing something, but the goal now is obviously to get rid of the duplicates, and to avoid having the same tracks play several times in my music player.

Reports are on their way ! I’m sure you’ll know which setting I missed. I must be missing something easy. But I can’t figure what ! :slight_smile:

Just quickly from the Fix Songs screenshots: I’m not sure if that is from Browse By Album or Browse By Folder, but it seems it has split the Alright albums into 11/5 rather than 8/8 simply because 11 of the tracks are in FLAC format and 5 are in MP3 format. It is rather unusual to have an 8-track album mixed between FLAC and MP3, but if that’s the case, using the audio format in the release name is going to be an issue.

If it is Browse By Album, realize that we are just grouping by the value of the album field, because that is how your music player is going to group the songs; it’s not going to care about the MusicBrainz IDs.

If you look at the Matched to MusicBrainz section instead, you’ll probably see 8/8 matched to each of the two album versions.
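To illustrate the difference between the two views, here is a minimal Python sketch. It is not SongKong's code: the field names, the release IDs, the "[FLAC]"/"[MP3]" suffix style and the choice of which release holds the mixed files are all assumptions made up for the example. It only shows how the same 16 files can count as 11/5 when grouped by the album field and 8/8 when grouped by release ID.

```python
from collections import Counter

# Hypothetical tag data for the 16 files: 8 matched to each release.
# Field names, IDs and the "[FLAC]"/"[MP3]" suffix style are assumptions.
songs = (
    [{"album": "Alright! [FLAC]", "mb_release_id": "release-cd"}] * 8
    + [{"album": "Alright! [FLAC]", "mb_release_id": "release-digital"}] * 3
    + [{"album": "Alright! [MP3]", "mb_release_id": "release-digital"}] * 5
)


def count_by(records, field):
    """Count songs per distinct value of one tag field."""
    return dict(Counter(record[field] for record in records))


by_album = count_by(songs, "album")            # roughly what Browse By Album groups on
by_release = count_by(songs, "mb_release_id")  # roughly what Matched to MusicBrainz groups on
print(by_album)
print(by_release)
```

Grouping on the album field gives two groups of 11 and 5, while grouping on the release ID gives two groups of 8.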

I was browsing by album in SongKong, indeed. Now a closer look using Picard for Alright:

We can clearly see one version is identified as the digital version of the album while the other one is identified as the CD version.

Here is the Browse By Folder view in SongKong. You can still see SK merged Boku Mo Wakaran into a single folder, and Alright into two separate folders (11 and 5 tracks):

In the end, folders are not the biggest deal to me. As soon as my music player is able to sort it out based on MusicBrainz IDs (there is a PR for this, based on a request I made to the Navidrome devs), this should all get sorted.

What concerns me more is the fact that I keep having dupes. While it’s “Allright” (sorry) to have albums split between FLAC and MP3 (e.g. the AFX Analord series has extra tracks that were only released on the web, in MP3 320 kbps format), I would still like to avoid having the same tracks again and again.

I checked, and all the albums I currently have in double are present in both Digital and CD format. So it brings me to this:

Could it be that a tag saved by the Fix task leads the Delete Duplicates task, later, to keep two versions of the album? Would it simply be this?

EDIT: So, I ran a Fix task again, keeping ONLY “Digital media” checked. Then I ran another Delete Duplicates task, and nope:

Any great idea welcome @paultaylor :wink:

Okay, I have received your report, but I cannot seem to find the reports which match these screenshots. Which reports are they?

In the meantime, let's go back to why Fix Songs has split the Alright albums into 11/5 rather than 8/8, because I don’t think you understood what I was saying. I don’t think it has split them: if you use the Matched to MusicBrainz section of the report, I think you will find Alright listed twice, fully matched both times. As I said before, Browse By Album groups songs by the value of the album field stored in your songs. Most collections will have a mixture of songs matched to MusicBrainz/Acoustid/Discogs and songs not matched at all, but when you play these songs in your music player it will group them by the value of the album field, so Browse By Album is the best approximation of what you will see in the music player.

Now, the reason they are split 11/5 when you view Browse By Album is that you have the Add Audio Format to Release Title option enabled and, unusually, for one of the albums your files are not in a consistent audio format. Remember, this is the audio format the files are in, not the format of the MusicBrainz release matched to (Digital, Vinyl etc.). So I think that clears that up. How can you avoid this issue?

  1. Disable the Add Audio Format to Release Title option, so that the release name is consistent for all files in the release.
  2. Re-encode the files so they are all in the same format.

Now, we did have a similar issue with albums that contained a mix of HD/non-HD songs, which we resolved by using MIXED for all songs if there was a mix. So we could possibly do the same thing for mixed audio formats, but this doesn’t exist yet.
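As a rough illustration of that idea, here is a small Python sketch. It is purely hypothetical: the function names and the bracketed suffix style are invented for the example, and the MIXED behaviour for audio formats does not exist in SongKong as stated above. It contrasts a per-file format suffix, which splits a mixed release into two album names, with a per-release suffix that falls back to MIXED.

```python
def title_per_file(base_title, file_format):
    """Suffix derived from each file's own format (splits mixed releases)."""
    return f"{base_title} [{file_format.upper()}]"


def title_per_release(base_title, formats):
    """Suffix derived from all files in the release; MIXED if they differ.

    Hypothetical behaviour, sketched from the HD/non-HD analogy.
    """
    unique = {f.upper() for f in formats}
    label = unique.pop() if len(unique) == 1 else "MIXED"
    return f"{base_title} [{label}]"


# Illustrative release with 3 FLAC files and 5 MP3 files.
formats = ["flac"] * 3 + ["mp3"] * 5
per_file_titles = {title_per_file("Alright!", f) for f in formats}
per_release_title = title_per_release("Alright!", formats)
print(sorted(per_file_titles))
print(per_release_title)
```

With the per-file rule the release ends up under two album names; with the per-release rule every file would carry the same name.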

Regarding Delete Duplicates, until I have a Fix Songs report that shows details of the songs you are trying to find duplicates for I cannot help you.

I just sent you the Fix Songs and Delete Duplicates reports by email. :wink:

You are right, indeed. So what should I do now to get rid of the dupes? As I said before, I don’t mind having 11 tracks here and 5 there. The only important thing is to have a single, best occurrence of each album when I use my music player.

So, aside from renaming the albums so they no longer include the format (which I did for this Bogdan folder, by the way, as I had guessed by myself that this was a problem), how do I make sure the CD version gets removed in favour of the digital version?

Thanks for the report; okay, I found the problem with Delete Duplicates. With the Same MusicBrainz song and same album (any version) option, it is looking for songs with the same MBRecordingId (same song) and the same MBReleaseGroup (same album, any version).

However, in this confusing case (with all songs having the same name) it turns out that all songs have different MBRecordingIds. This means that either they are all different versions of the song or, more probably, that they are the same versions of the songs, but when the second version of a song was added to MusicBrainz it was not merged with the existing one but added as a new song.

So this will happen sometimes. What you could do as a one-off is try using the Sounds the same only option. This will not prevent songs from different albums (compilation, original album) being deleted, so you should run it in preview first and check the results, but it will detect duplicates of songs that are sonically the same.
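The effect of the two criteria can be sketched in a few lines of Python. This is only an illustration under assumptions, not SongKong's implementation: the record layout, paths and ID values are invented, and "sounds the same" is reduced here to comparing a stored Acoustid.

```python
from collections import defaultdict


def find_duplicates(songs, key_fields):
    """Group songs by the given fields and return groups with more than one file."""
    groups = defaultdict(list)
    for song in songs:
        groups[tuple(song[field] for field in key_fields)].append(song["path"])
    return [paths for paths in groups.values() if len(paths) > 1]


# Hypothetical data: same audio, same release group, different recording IDs.
songs = [
    {"path": "cd/01.flac", "mb_recording_id": "rec-1",
     "mb_release_group_id": "rg-1", "acoustid": "ac-1"},
    {"path": "digital/01.flac", "mb_recording_id": "rec-2",
     "mb_release_group_id": "rg-1", "acoustid": "ac-1"},
]

same_mb_song_any_version = find_duplicates(
    songs, ["mb_recording_id", "mb_release_group_id"])
sounds_the_same_only = find_duplicates(songs, ["acoustid"])
print(same_mb_song_any_version)
print(sounds_the_same_only)
```

Because the recording IDs differ, the first criterion finds nothing, while the acoustic key groups both files. The acoustic key ignores which album a file belongs to, which is why a preview run is advisable.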

Actually, only last week I added the issue Add Sounds the Same to Delete Duplicates by Metadata Only, as it occurred to me that it might be useful to use a combination of Acoustids and textual metadata as a safer alternative to Same song and same album (metadata only) for cases where there is no MusicBrainz or Discogs match. This would also work in your case, where there is a MusicBrainz match but the MBRecordingIds are different.

Just occurred to me I can see the acoustids in the report

So we see both 01 tracks have same acoustid so they should be matched as duplicates

But many do not have acoustids, and in Fix Songs you don’t have Force Acoustid Fingerprints even if already matched enabled (I thought you did by now?)

So you need to rerun Fix Songs with this enabled and then run Delete Duplicates

Ah… a never-ending story, here we are! :smiley:

We have had this discussion before, and you did say you had enabled this.

You can run Delete Duplicates and use Sounds the Same without doing this; when a song doesn’t have an Acoustid fingerprint, it generates one (this can be done locally) and uses that as the key. So it would identify two songs that were sonically the same if neither had a fingerprint before running Delete Duplicates.

However, in your case some songs already have an Acoustid fingerprint and an Acoustid. We have to go to the Acoustid server to get Acoustids, so we don’t do this in Delete Duplicates because of the slowdown it would cause, since we don’t store the fingerprints/Acoustids created on the fly in Delete Duplicates. So it would not find the duplicate, because it would be comparing an Acoustid fingerprint to an Acoustid ID.

Now you might ask, why not just compare fingerprints then? The trouble is that there is not a 1:1 correspondence. There can be multiple Acoustid fingerprints for one Acoustid ID: when a fingerprint is only very slightly different from another fingerprint, they are mapped to the same Acoustid. So if we only matched on fingerprints, that would also prevent us from finding some duplicates (in cases where the fingerprint was different but the Acoustid was the same).
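The three situations described above can be sketched as follows. This is one reading of the explanation, not SongKong's code: the key function, the field names and the fingerprint and Acoustid values are assumptions, and the dictionary merely stands in for the many-to-one mapping held by the Acoustid server.

```python
# Stand-in for the server-side mapping: several fingerprints, one Acoustid.
FINGERPRINT_TO_ACOUSTID = {
    "fp-a1": "acoustid-A",
    "fp-a2": "acoustid-A",  # slightly different fingerprint, same Acoustid
}


def duplicate_key(song):
    """Stored Acoustid if the song has one, else the locally generated fingerprint."""
    return song.get("acoustid") or song["fingerprint"]


# Case 1: neither song has a stored Acoustid, fingerprints identical.
untagged_1 = {"fingerprint": "fp-a1"}
untagged_2 = {"fingerprint": "fp-a1"}
case_both_untagged = duplicate_key(untagged_1) == duplicate_key(untagged_2)

# Case 2: one song already has an Acoustid, the other only a fingerprint.
tagged = {"fingerprint": "fp-a1", "acoustid": "acoustid-A"}
case_mixed = duplicate_key(tagged) == duplicate_key(untagged_1)

# Case 3: compare fingerprints only; near-identical audio, different fingerprints.
other = {"fingerprint": "fp-a2"}
case_fingerprint_only = tagged["fingerprint"] == other["fingerprint"]
same_acoustid = (FINGERPRINT_TO_ACOUSTID[tagged["fingerprint"]]
                 == FINGERPRINT_TO_ACOUSTID[other["fingerprint"]])
print(case_both_untagged, case_mixed, case_fingerprint_only, same_acoustid)
```

Only the first case is detected. The second compares an Acoustid to a fingerprint, and the third misses a pair that the server would map to the same Acoustid, which is why rerunning Fix Songs with forced fingerprinting first gives both copies comparable keys.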

I’ve run a new fix on the Bogdan folder, with these options:

Basic
Ignore songs previously checked that could not be matched: No
For songs already fully matched : Update Metadata only
Update Artwork: Yes
Update Genres: Yes
Update Mood and other acoustic attributes such as BPM: No
Only allow match if all songs in grouping match to one album: Yes
Only allow match if all tracks in album were matched: No
Match
Force Acoustic fingerprints even if already matched: Yes
Search for a MusicBrainz match: Yes
Update from Discogs: Yes
Search for a Discogs match: Yes
Ignore metadata derived from filename when matching individual songs: No
All existing folders represent a single album: No
Preferred media formats: Digital Media,
Preferred Release Date: Latest Release Date
Preferred Release Countries:
Album Artwork
Ignore artwork smaller than this (pixels): 200
Resize artwork if dimensions larger than (pixels): 1200
Find Front Cover Artwork: Yes
Save front cover art embedded within song file: Replace if empty
Save front cover art to filesystem: Yes, and overwrite existing artwork files in folder
Saved front cover art filename: folder
Find Back Cover Artwork: No
Other Artwork
Other Artwork Options: No
Ignore artwork smaller than this (pixels): 200
Resize artwork if dimensions larger than (pixels): 800
Genres
Genre: Always replace values From Discogs Style , Max no 1
Grouping: Never Replace From Discogs Style , Max no 1
Format
Never modify or add these fields:
Only modify these fields if empty: BPM, Key, Mood
Allow changes to songs existing metadata fields if Song Only match: Yes
Romanize non-Latin script artist names wherever possible: Yes
Use standard Artist name instead of name displayed on cover: Yes
Use Recording Artist instead of Track Artist: No
Use standard song title instead of title displayed on his release: No
When tracks contains featured artists: Only use main artist in the artist field and discard other artists
When albums contains featured album artists: Only use main artist in the album artist field and discard other artists
Album Format
Use standard Release title instead of title displayed on this release: Yes
Use Original Release Date: No
Use Year instead of full dates for Date fields: Yes
Add EP, Single, Compilation, Live and Remix release types to release title: No
Add Audio Format to release title: No
Add [HD] to album title for High Definition albums: No
Add RoonAlbumTag to albums identified as box sets: No
Multi Disc Releases : Never add Disc No information to the release title
Classical
Identify Classical releases: Classical albums identified by SongKong
Apply these options to releases identified as Classical: Yes
Add Composers to start of Album Title : Yes
Remove Composer from Album Artist : No
Add Composer to start of Overall Work, this is used by MinimServer for indexing Classical Works: No
Add Composer to start of MinimServer Group, this is used by MinimServer for playing Classical Works: No
Use only Artist Type to categorise groups as ensembles, choirs or orchestras: No
Shorten Song Title to the Movement: No
Copy Work to Grouping field: MP3 and AIF (iTunes)
Opera Work format: Use MinimServer format (Work/Overall Work)
Track Artist: Composer
Never modify or add these fields:
Only modify these fields if empty:
Save
MP3 Metatag Version: Same as or v24
Disc / Track number padding: Pad with up to one zero
Save Vorbis/Flac AlbumArtist as: ALBUMARTIST
After SongKong has finished processing songs: Always open Report in browser
Add technical roles with own field to InvolvedPeople field:
Save multiple values as separate fields:
Save songs so they work best with iTunes: No

Then I ran a Delete Duplicates task. Still, no tracks were deleted.

I then ran it again using “rematch” instead of “update metadata only”:

It’s better, but Boku Mo Wakaran still has dupes that are not identified as duplicates. ^^

It should do it for songs already matched even with Update Metadata only; I just checked the code and it appears OK.

I can see from the previous report that most songs share duplicate acoustids; are you sure they were not already deleted in the first run?

Best just send me both FixSongs and DeleteDuplicate reports.

I’ve sent the last two tasks reports

Hi, you only sent me the last Delete Duplicates report, and in that one you have Song is a duplicate if has same: set to Same MusicBrainz song and same album (any version), so this would not make use of Acoustids.

I meant for you to send me the last two DeleteDuplicates and the last FixSongs reports but I can see my message wasn’t that clear.

OK, I am sending you the last duplicates run as well, which was run using “Song is a duplicate if has same: Same MusicBrainz song and same album (any version) and sounds the same”.

You’ll get it by email.

Right, but that is not what I asked you to run, and it is not going to work, because all the songs on that particular album have different MBRecordingIds. The songs you consider the same therefore have different MBRecordingIds and will not be considered duplicates under the Same MusicBrainz Song part of the criteria; in this case you can only find duplicates using the Sounds the Same criteria.

This is what I said in earlier post

However, in this confusing case (with all songs having the same name) it turns out that all songs have different MBRecordingIds. This means that either they are all different versions of the song or, more probably, that they are the same versions of the songs, but when the second version of a song was added to MusicBrainz it was not merged with the existing one but added as a new song.

So this will happen sometimes. What you could do as a one-off is try using the Sounds the same only option. This will not prevent songs from different albums (compilation, original album) being deleted, so you should run it in preview first and check the results, but it will detect duplicates of songs that are sonically the same.

But it will not work for the first track because, at only 5 seconds long, it is too short to safely identify by Acoustid.

Yeah, well, why it got merged like this remains a mystery, and I understand I can only rely on Sounds the Same, but damn, running this on a large set of tracks sounds scary as hell.

Do you mean why are both copies of the same album in same folder?

The short answer is that this is how your rename mask has said to rename the files. Now, there is some logic in the code (which I should revisit) to rename files if there is a clash, but because one version has been matched to this version of the release https://musicbrainz.org/release/f20fda28-957f-4bc5-964a-77d1bf5197d3, which has all songs titled untitled, and the other has not, there is no filename clash to prevent all songs being put in one folder.
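A small Python sketch of why no clash occurs. The mask used here (album artist / album / track number and title) is a hypothetical example, since the actual rename mask is not shown in this thread, and the tag values are illustrative.

```python
from pathlib import PurePosixPath


def target_path(song):
    """Hypothetical rename mask: <album artist>/<album>/<track no> - <title>.<ext>"""
    filename = f'{song["track"]:02d} - {song["title"]}.{song["ext"]}'
    return PurePosixPath(song["album_artist"], song["album"], filename)


# Same track from the two copies, matched to different releases of the album.
copy_untitled = {"album_artist": "Bogdan Raczynski", "album": "Boku Mo Wakaran",
                 "track": 1, "title": "untitled", "ext": "flac"}
copy_titled = {"album_artist": "Bogdan Raczynski", "album": "Boku Mo Wakaran",
               "track": 1, "title": "Boku Mo Wakaran 1", "ext": "flac"}

path_a = target_path(copy_untitled)
path_b = target_path(copy_titled)
print(path_a)
print(path_b)
```

Both paths share the same folder but have different filenames, so a clash check based on the full path would never trigger and both copies land side by side.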

I was only recommending running it as a one-off on this artist, and in preview only first, so I agree you should not run it on all your files; I was just trying to explain why the duplicates were not found.

I ran it on a single artist. It removed all the duplicates. + some tracks that appeared on both EP and albums. So, it does the job, but does it “too well”.

I’ll resnatch the impacted albums.

Is there something we can do in SongKong to allow deletion of tracks located in the SAME folder only? This would make sure we can use this Acoustid setting while NOT deleting tracks that are located in separate folders. It would then only remove the “duplicate” sounds-the-same tracks that are located in the same directory.

Also, wasn’t “MusicBrainz album / any version / sounds the same” supposed to delete these duplicate tracks based on their Acoustid as well?