SongKong and Jaikoz Music Tagger Community Forum

Bilateral Match Only? (And a few other first impressions)

I have a library in pretty good condition: about 40k files, of which 8k do not match to MusicBrainz (with whole-folder matching only).

My problem now is that “all songs in a folder match the same album” is not a sufficient condition.

In front of me I have a 3-CD box set that has been matched to a 9-CD mega box set, and a 1-CD album that has been matched to a 2-CD set. I thought about giving specific examples, but really it’s the case of a local copy matching a subset of a larger release.

What I would like is an option “only match to complete release” so that, together with “whole folder”, it would only match on a 1:1 correspondence. (In the match report’s “xx songs of yy”, only match when xx == yy.)

The reason I propose a separate check box is that checking this option, but not the “whole folder” option, would allow the use cases of “many complete albums in a folder” and “many source folders to one complete album”.


The undo operation is a bit scary. Most buttons follow the “Main > Settings > Execute” workflow, but this button goes Main > Execute directly without telling the user what it will do. I have no idea whether it undoes only the last change or all changes, or whether it keeps fingerprints.

Anyway, after I spend 8 hours processing, I most certainly do not want to undo everything. I believe a great way to handle this could be to register the “SongKong:” protocol and have links like “SongKong:undo?b83ea2bb-fa1c-45af-b66d-030b6c3940b5” a bit everywhere (once per folder and once per song?).

At worst, make granular SongKong undo part of Jaikoz if you believe that’s where per-item manipulation belongs.


BTW, monitor folder doesn’t work very well when Windows is busy copying files to that folder (think 10+ minutes of copying, 40 MB files). Maybe SongKong is trying to process the files when they have not been fully copied.


When using the default renaming mask, SongKong can generate file paths too long for the file system (or so explorer.exe complains). They are a pain to deal with because Explorer refuses to move, delete or rename them.

One reason this happens is that some releases list all artists separated by semicolons, commas or slashes. Those could be truncated at the first separator.
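
A rough sketch of the “truncate at first separator” idea, in Python for illustration only (this is not SongKong code). The function names and the `max_len` threshold are made up for the example. Truncating only when the artist field is long avoids mangling short names that legitimately contain a separator, such as “AC/DC”.

```python
import re

# Separators mentioned above: semicolon, comma, slash.
SEPARATORS = re.compile(r"\s*[;,/]\s*")

def first_artist(artist_field):
    """Return the text before the first separator."""
    return SEPARATORS.split(artist_field, maxsplit=1)[0].strip()

def artist_for_path(artist_field, max_len=60):
    """Use the full artist field unless it is too long for a path component.

    max_len is an arbitrary assumption for this sketch, not a SongKong setting.
    """
    if len(artist_field) <= max_len:
        return artist_field
    return first_artist(artist_field)

long_credit = "Anne-Sophie Mutter, Berliner Philharmoniker, Herbert von Karajan"
print(artist_for_path(long_credit, max_len=40))
print(artist_for_path("AC/DC", max_len=40))
```

Note that credits written as “Last, First” would be cut after the last name, so a real implementation would probably need more care than this.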


Also, the “replace non ASCII character” option does not work very well, especially for Asian / Arabic characters. I’d love an option to reject a MusicBrainz title if it’s made mostly of non-Latin Unicode characters.

[quote=Pho3NiX]I have a library in pretty good condition: about 40k files, of which 8k do not match to MusicBrainz (with whole-folder matching only).

My problem now is that “all songs in a folder match the same album” is not a sufficient condition.

In front of me I have a 3-CD box set that has been matched to a 9-CD mega box set, and a 1-CD album that has been matched to a 2-CD set. I thought about giving specific examples, but really it’s the case of a local copy matching a subset of a larger release.
[/quote]
Okay, so I understand what you are saying, but I’m not quite sure why it is doing that, as it should prefer the release that matches all the songs and has the fewest songs left unmatched (i.e. the 3-CD set instead of the 9-CD set). I guess it is because the 3-CD box set is not in MusicBrainz but the 9-CD set is; could you check this for me?

This would limit matches a little more, but I can see it being useful and have raised http://jthink.net:8081/browse/SONGKONG-545

I do have one question though:

This is easy enough for a one-CD release, but if you have a multi-CD release, do you store each disc in a separate folder or not? If they are in separate folders it can be awkward to realize that multiple folders refer to one album, but SongKong does try to do this.

Interesting, I saw it as a further refinement/tightening of the “whole folder” option, but I can see what you mean, because at the moment if you have a load of songs the premise is that the “whole folder” option should be disabled and songs would be matched on a song-by-song basis. Having said that, it is quite difficult to group songs into potential albums when a folder contains multiple albums. I can see a potential problem though: say you have a folder containing songs from multiple albums, and some songs from that folder match a single (completely matching all the songs on that single), but that is not the album you wanted them to match.

It goes back to the first change made by SongKong. I see your point, it needs options or at least confirmation. It’s worth noting you can undo just the songs you want to undo, so say you fixed a folder containing three subfolders, you could then just select one of your subfolders and undo. I’ve raised http://jthink.net:8081/browse/SONGKONG-546

As I say above, you can simply Undo the particular folders (or even songs) you want to Undo, and Undo still works even after restarting SongKong because all changes are stored in the db. I don’t really understand what you mean by SongKong protocol.

In what way does it ‘not work very well’?

Right, are they just one out? It shouldn’t do that. I assume you are actually running SongKong on Windows, not on Linux and then looking at the songs on Windows?

Do you have an example?

You mean don’t rename? Can you give an example of what you would like the option to do in this case?

Exactly. MusicBrainz is a good database, perhaps one of the best, but still incomplete. The whole point of my intervention is heuristics to avoid catastrophic failures. The first time I launched SK, I forgot to set the “whole folder” option and about 20% of the songs in the library got remapped from compilations unknown to MusicBrainz to various very incomplete CDs.

That’s another reason I propose Complete Album as a separate option. If the user’s collection is such a mess that the folder structure cannot be trusted (or a new user doesn’t really know the effect of running the match on the whole library), then at least don’t introduce more mess. Only match when you are pretty sure a match is good.

Or scratch that Complete Album idea and make it a configurable minimum trust level: “Only match album when at least xx% of tracks are matched”. The complete album would be the special case xx=100%. 25% or 50% could be a sensible default. I’m pretty sure most users would consider fewer than 3 tracks out of 12 a false positive.

I’m a bit inconsistent on this. If I feel each disc has a theme, each will have its own folder. If I feel it’s a question of storage, then all go in the same folder.

But I do see your point. I have a 3-CD box that has been remapped to three 1-CD releases. I don’t care that much because track order and everything is the same.
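
To make the “minimum trust level” idea concrete, here is a small Python sketch (illustration only, not how SongKong works internally). `CandidateMatch` and `accept_match` are hypothetical names; “xx songs of yy” from the match report maps to `matched_tracks` and `release_tracks`.

```python
from dataclasses import dataclass

@dataclass
class CandidateMatch:
    release_title: str
    matched_tracks: int   # the "xx" in "xx songs of yy"
    release_tracks: int   # the "yy": total tracks on the candidate release

def accept_match(candidate, min_percent=100):
    """Accept a release only if enough of its tracks were matched.

    min_percent=100 is the "only match to complete release" special case.
    """
    if candidate.release_tracks <= 0:
        return False
    # Compare without dividing, to avoid floating point rounding at the boundary.
    return candidate.matched_tracks * 100 >= min_percent * candidate.release_tracks

three_cd_vs_nine = CandidateMatch("9-CD mega box", matched_tracks=36, release_tracks=108)
print(accept_match(three_cd_vs_nine))                  # strict 1:1 mode
print(accept_match(three_cd_vs_nine, min_percent=25))  # relaxed threshold
```

With the strict setting the 3-CD box would not be mapped onto the 9-CD set, while a 25% threshold would still let it through, which is the trade-off being discussed.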

Then “whole folder” match really should be “whole subtree” match. I suppose a depth of 1 could cover most cases (i.e. when some conditions are met, go up one parent folder, then match all subfolders).

Examples of conditions that would initiate a multi-disc subtree check:

  • Metadata indicates disc > 1, with no disc "1 or " in the directory
  • Heuristic on folder name: contains “CD nn”, “C.D. nn”, “Disc nn”, “Vol nn”, “Volume nn”

Then check that the parent folder is indeed a multi-disc release:

  • All music in the subfolders of the parent has the same album metadata, or
  • All music subfolders match the previous heuristic with a proper sequence of nn

accept “/release volume 1, /release volume 2, /release volume 3”
accept “/foo (CD 1 of 4), /bar (CD 2 of 4), /gnu CD 3”

In the case of “/disc 1, /disc 2, /disk 3, /extra”, maybe cluster the 3 “disc” folders together and treat “extra” as a separate release.

refuse “/foo CD 1, /bar CD 1” (Cluster as two releases)
refuse “/foo CD 1, /bar CD 3” (Cluster as two releases)(??)
accept “/foo CD 1, /foo CD 3” (Cluster as one with missing CD)(??)
refuse "/disc 1, /Volume 2, /cd 3, " (Cluster as three)

In case of doubt, refuse (do not aggregate); that would imitate the current “whole folder” behavior.
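
The accept/refuse cases above can be sketched roughly like this in Python (illustration only; the regex, the helper names and the exact rules are my own assumptions, not SongKong's current heuristics). Sibling folders are treated as one release only if they all use the same kind of disc keyword, have distinct numbers, and either form a sequence starting at 1 or share the same base name.

```python
import re

# Folder-name keywords from the list above; "disk" is treated as "disc".
DISC_RE = re.compile(r"\b(cd|c\.d\.|disc|disk|vol|volume)\s*(\d+)", re.IGNORECASE)
KIND = {"cd": "cd", "disc": "disc", "disk": "disc", "vol": "volume", "volume": "volume"}

def parse_disc_folder(name):
    """Return (base name, keyword kind, disc number), or None if no disc marker."""
    m = DISC_RE.search(name)
    if m is None:
        return None
    base = name[:m.start()].strip(" (-_/").lower()
    kind = KIND[m.group(1).lower().replace(".", "")]
    return base, kind, int(m.group(2))

def looks_like_one_release(folder_names):
    """Decide whether sibling folders should be aggregated; refuse when in doubt."""
    parsed = [parse_disc_folder(n) for n in folder_names]
    if len(parsed) < 2 or any(p is None for p in parsed):
        return False
    bases = {p[0] for p in parsed}
    kinds = {p[1] for p in parsed}
    numbers = sorted(p[2] for p in parsed)
    if len(kinds) != 1 or len(set(numbers)) != len(numbers):
        return False
    proper_sequence = numbers == list(range(1, len(numbers) + 1))
    return proper_sequence or len(bases) == 1

print(looks_like_one_release(["foo (CD 1 of 4)", "bar (CD 2 of 4)", "gnu CD 3"]))
print(looks_like_one_release(["disc 1", "Volume 2", "cd 3"]))
```

This sketch does not handle the “/extra” case (it simply refuses if any folder lacks a disc marker), and it does not check the album metadata, which a real check would also want to do.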

Does it work backward too? The worst thing SK can do to me is to split an album into 10 others and spread them all over the place because of different artists. Can I undo on the source folder, or do I have to process each destination?

SongKong communicates the changes to the user by means of an HTML report, so I thought a natural place to trigger undo is from inside that report. Now, how to launch SK from an HTML page?

Enter protocols: “http:”, “file:”, “ftp:”, “mailto:”, “irc:”.
The first one (http) tells the browser to open a web page. However, “mailto:” will open an email client with the specified address to contact, “irc:” will open a chat client with the specified server, and “ftp:” can be handled by the browser or an external FTP client.

You can register your own protocol “SongKong” and register yourself as a protocol handler. I believe protocol handlers are cross-browser and cross-platform, if only because of how old the needs for “mailto:” and “ftp:” are. If you already support per-folder undo, then it would only be a matter of encoding the folder path in a base64, URL-compatible form inside a link in the report.
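
As a sketch of what such a link could look like, here is the encoding and decoding side in Python (illustration only). The `songkong:undo?path=...` layout, the `path` parameter and both function names are hypothetical; registering the handler with the operating system is a separate, platform-specific step not shown here.

```python
import base64
from urllib.parse import parse_qs, urlencode, urlsplit

def make_undo_link(folder_path):
    """Build a report link that carries the folder path as URL-safe base64."""
    token = base64.urlsafe_b64encode(folder_path.encode("utf-8")).decode("ascii")
    return "songkong:undo?" + urlencode({"path": token})

def parse_undo_link(link):
    """Recover the folder path from a link made by make_undo_link."""
    parts = urlsplit(link)
    if parts.scheme != "songkong" or parts.path != "undo":
        raise ValueError("not an undo link: " + link)
    token = parse_qs(parts.query)["path"][0]
    return base64.urlsafe_b64decode(token.encode("ascii")).decode("utf-8")

link = make_undo_link(r"D:\Music\Cuba: I Am Time\CD 1")
print(link)
print(parse_undo_link(link))
```

The application started by the protocol handler would receive the whole link as an argument and could then run the existing per-folder undo on the decoded path.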

Copying 100 tracks from a different drive to the monitored folder resulted in about 15 tracks being caught by the monitor. The tracks were large (FLAC) and the copy was slow (across drives), which is why I suspect a race condition. They were then successfully tagged by a full-library “Fix Song”.

Yeah, SongKong on Windows only. I discovered I can “unlock” a file with a too-long path by renaming the parent directory (but Explorer complains if I try to rename the file rather than the directory). So maybe they were created in the first place by renaming a directory.

Mostly classical music, mostly with comma separators:
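
If the cause really is files being picked up while they are still being copied, one common workaround is to wait until a file's size stops changing before processing it. A minimal Python sketch of that idea (illustration only; I don't know how SongKong's monitor is implemented, and `wait_until_stable` with its parameters is a made-up helper):

```python
import os
import time

def wait_until_stable(path, checks=3, interval=1.0, timeout=600.0):
    """Return True once the file size was unchanged for `checks` polls in a row.

    Returns False if the file never settles (or never appears) before `timeout`.
    """
    deadline = time.monotonic() + timeout
    last_size = -1
    stable = 0
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            size = -1  # file missing or not accessible yet
        if size >= 0 and size == last_size:
            stable += 1
            if stable >= checks:
                return True
        else:
            stable = 0
            last_size = size
        time.sleep(interval)
    return False
```

A monitor could call this for each newly seen file and only queue it for matching when it returns True. Size polling is only a heuristic: a copy tool that pre-allocates the full file size up front would defeat it, so trying to open the file exclusively might be a better test on Windows.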

http://musicbrainz.org/release/bc77952a-614c-4c7a-9379-9e2c5825101f
http://musicbrainz.org/release/4ea6aae0-372c-419b-87d5-75d54d631cf8
http://musicbrainz.org/release/ec444da4-aa21-4b26-bcb5-13ac18fc8a2b
http://musicbrainz.org/release/ed3a20f0-4cb9-495f-9899-baa110488714

OK, an example:

That album (not on MusicBrainz, I believe):

first track
http://acoustid.org/track/2d57739d-aa27-44bc-9ace-f41f6bee8f2a
http://musicbrainz.org/recording/7a460dd8-de8c-4204-b4ff-ab1cc3ff0cca

Title (MBz): ??
Title (own tags): Playing with clouds
Title (Allmusic): Playing with Clouds

Now the original title is mostly correct, and the MusicBrainz data is worse than no match for someone who does not read Chinese.

What I would like should probably happen before the file naming stage.
Maybe allow the user to select from a mix of “Latin / Russian / Greek / Arabic / Asian / ???” alphabets, and if most characters in the MBz answer are not in an accepted alphabet, refuse the MBz match (and if possible try to get an alternative name).

For example, if Latin is the only alphabet selected, then
“Playing with clouds ??” is acceptable but " ??" is not.
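
A rough Python sketch of the “mostly in an accepted alphabet” test (illustration only). It guesses the script from the first word of each character's Unicode name, which is a shortcut rather than a proper script lookup; the function names and the 50% threshold are assumptions. The CJK characters in the example are placeholders, not the real track title.

```python
import unicodedata

def script_of(ch):
    """Approximate the script from the Unicode name, e.g. "LATIN", "CJK", "CYRILLIC"."""
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else ""

def mostly_in_scripts(text, allowed=("LATIN",), threshold=0.5):
    """True if more than `threshold` of the letters belong to an allowed script."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    ok = sum(1 for ch in letters if script_of(ch) in allowed)
    return ok / len(letters) > threshold

cjk = "\u4e91\u96f2"  # two placeholder CJK ideographs
print(mostly_in_scripts("Playing with clouds " + cjk))  # mixed title
print(mostly_in_scripts(cjk))                            # CJK-only title
```

Digits, spaces and punctuation are ignored, so only letters count towards the ratio.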

[quote=Pho3NiX]Exactly. MusicBrainz is a good database, perhaps one of the best, but still incomplete. The whole point of my intervention is heuristics to avoid catastrophic failures. The first time I launched SK, I forgot to set the “whole folder” option and about 20% of the songs in the library got remapped from compilations unknown to MusicBrainz to various very incomplete CDs.
[/quote]
That option is enabled in the last few releases, and I’m hoping that if I do http://jthink.net:8081/browse/SONGKONG-519 then there will be less need to ever turn that option off (currently you need to turn it off to fix folders containing random songs with no structure if you want to impose a structure).

I’m not against the idea at all, but I don’t think it causes much mess, as you are not breaking up your folders, just simply matching the three folders to the wrong 9-CD album instead of the right 3-CD album. Many users might actually want that, because the songs have been correctly identified and now they get extra information from MusicBrainz that is correct for both the 3-CD album and the 9-CD album, such as album artist.

The other use case also applies to single-CD albums, i.e. you have 10 tracks that should match the 10-track release, but MusicBrainz only has the 12-track release, so it matches to that instead. So it hasn’t really matched the wrong album, just the wrong version of the album (in MusicBrainz parlance, the wrong Release but the correct Release Group). This option would prevent that happening, but it is more likely that the user is actually missing tracks, i.e. they only have 9 of the songs, so they would be happy to match either the 10- or 12-track version rather than match nothing at all, which would be the case with this option even if MusicBrainz contained every single release in the land. So it does not work for it to be the default.

I don’t like that idea, it’s too complex. I think either you only want a complete match or you don’t mind a partial match; in your example, if the user only has three songs of the release it would not be a false positive. It may also be that their music is a bit messed up and they have one album split up over multiple folders, so independently matching folder 1 (3 songs), folder 2 (5 songs) and folder 3 (2 songs) to album A could end up with one complete 10-track album.

Thanks I’ll compare your heuristics with my current heuristics for detecting multi folder release.

[quote=Pho3NiX]Does it work backward too? The worst thing SK can do to me is to split an album into 10 others and spread them all over the place because of different artists. Can I undo on the source folder, or do I have to process each destination?
[/quote]
As I said, currently you have to select the destination folder(s) or files, but that will change.

That requires having to open the report. A key thing about Undo is that it still works if you close SongKong and restart it. The report is just a read-only report, and trying to encompass logic like you suggest is just too hard.

Yep, network + flac there is an issue

I’ll check

So okay, I think I get it: you have a Chinese album but you know the tracks by the English translation of the titles, but MusicBrainz only has a version of the release containing the Chinese titles. MusicBrainz does usually provide an English version of the artist name, but not of the release or track names.

So we have a couple of options:
Let the user set a list of main character sets as you suggest; then, if the release doesn’t match these, either:

  1. Refuse the MusicBrainz match unless song titles match valid charsets, but I do find that a bit weird because really it has found the correct match.
  2. Allow the match but don’t actually change song names/release names, just add the other stuff that is always in English (English sort artist, catalogue no., country of release, MusicBrainz IDs, etcetera).
  3. Allow the match but do not rename unless the filename matches the listed charsets.

What do you think?

That would be great !

If you want to get further towards always enabling “whole folder”, maybe some kind of heuristic can be used when you have an edition with more tracks than MB. Say you have tracks 1-12 matched to tracks 1-12 of a 12-track MB album, but in your local folder you also have tracks 13-15 that have the same album name as tracks 1-12. Then what?

  • Maybe keep the original album title (it was determined to be consistent and may include a Special/Limited/Deluxe edition mark)
  • But match the rest as an album match for the 15 titles?
  • If 15 on your side and 12 on the MB side is acceptable, is 20/12 acceptable, 30/12?

In theory I do see a difference between matching 10 files out of 12 and matching 10 files out of 40. In practice, I’ll grant you, it doesn’t happen that often. But there are sinkholes waiting to happen, like that 240-CD Japanese monster: http://musicbrainz.org/release-group/0eb70f80-7d09-4554-8b54-ebc157098ab6

A key thing about the “Report” menu is that you can get any recent report, and it still works if you close SongKong and restart it :)

I believe the thing is, if I catch something fishy, I’ll assume I need manual correction, and then I want to see the before/after to decide which one is easier to edit into correct information.

Most of the time I might want to revert only album name, album artist, track # and disc #. This kind of relates to your first link about which fields are track-related and which are album-related.

As for the feasibility, can you accept a working directory as a command-line argument in a future build? Maybe also support paths that are URL-escaped (like %20 for space).

Maybe upon a match, conditionally merge Artist, Title and Album Name if they are mostly in the user-approved alphabets. Other fields can be merged as is.
The sort-name can be used instead of the name where it would make the result compatible with the desired charset.


In Format > Multi Disc Release
Add disc number to the title if disc has title

I check this option in case each disc has a particular theme, like

http://musicbrainz.org/release/e691b04f-64cf-4512-bc47-3f0098e964d3
Cuba: I Am Time
    CD 1: Cuban Invocations
    CD 2: Cantar En Cuba
    CD 3: Bailar Con Cuba
    CD 4: Cubano Jazz

However, if the disc title is basically the disc #, I’d prefer to ignore it.
http://musicbrainz.org/release/742a47f5-09d0-44ea-aede-9731c101d515

A point can be made that with that many discs, it’s useful to keep them as separate entries. However, (disc 1: Volume 1) (disc 2: Volume 2) is just ugly. Maybe it’s the “disc n:” prefix that needs to conditionally go.
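
To illustrate both variants, here is a small Python sketch (illustration only; the “(disc n: title)” layout is taken from the example above, everything else, including the function names and the list of generic keywords, is an assumption). A disc title is treated as “basically the disc #” when it is just an optional keyword followed by a number.

```python
import re

# Titles such as "Volume 1", "CD 2", "Disc 3" or plain "4" carry no extra information.
GENERIC_TITLE = re.compile(r"^(cd|disc|disk|vol\.?|volume|part)?\s*#?(\d+)$", re.IGNORECASE)

def is_generic_disc_title(title):
    return bool(GENERIC_TITLE.match(title.strip()))

def album_with_disc(album, disc_no, disc_title, keep_generic=False):
    """Append the disc title to the album name, toning down generic titles.

    keep_generic=False ignores a generic disc title entirely;
    keep_generic=True keeps it but drops the "disc n:" prefix.
    """
    if not disc_title:
        return album
    if is_generic_disc_title(disc_title):
        return f"{album} ({disc_title})" if keep_generic else album
    return f"{album} (disc {disc_no}: {disc_title})"

print(album_with_disc("Cuba: I Am Time", 1, "Cuban Invocations"))
print(album_with_disc("Some Box Set", 2, "Volume 2"))
print(album_with_disc("Some Box Set", 2, "Volume 2", keep_generic=True))
```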


Small UI suggestions:

Do not set the focus on the “STOP” button in the fix song progress window.
Sometimes I wake the screen from the screen saver (auto turn-off) using the spacebar, and at least once it has accidentally stopped the matching process.

Artwork > Save artwork to filesystem
That text box is strange: it auto-pads the text with spaces to the right,
so the saved artwork is named accordingly:
“folder .jpg”

It always puzzles me when the “# completed” track count is greater than “# loaded”.

Done for next release.

I’m looking at doing the simple ‘all songs on album must be matched’ option now for the next release. So once I’ve done this you can try the new release and see if it works for you.

Heh, yes, but if you actually want to do something clever rather than just look at information, then it will require interaction between the report and SongKong as well.

Yes so that should no longer be a problem

I don’t quite follow, you mean accept ‘.’ ?

“Mostly” is a bit vague. When you use these vague heuristics they seem to work okay for a while, then you start coming across more and more cases where the heuristics work very badly. So I’m still looking more at “change only if there are no non-charset characters”, if that option is available.

Yes. possibly

Ok, good point.

Someone else has reported this, will look into it.

Hmm, it shouldn’t do that.

Both fixed in SongKong 1.23

I think pretty much everything on this list is now fixed in SongKong 1.25, I would be interested in your comments on the improvements.