SongKong and Jaikoz Music Tagger Community Forum

Detection of opera works

There is a remaining issue with false positive detection of some “opera” works. For example, Beethoven’s sixth symphony “Pastoral” and Stravinsky’s “Rite of Spring” both set OVERALL_WORK=“Fantasia”. This is a consequence of a bad entry in MusicBrainz, which erroneously listed both as “parts of” a work for the “Fantasia” soundtrack:

https://musicbrainz.org/work/74e65a9e-608d-4097-b238-c1bb6536ce7c .

This particular instance has been fixed in MusicBrainz, and should disappear with the next rebuild of Albunack. However, it is far from the only instance of classical works being listed as “parts of” soundtracks. MB recently created a new relationship type “is included in” to replace “is part of”, which should slowly alleviate the problem as the MusicBrainz database gets corrected. However, I fear we should expect ongoing misidentifications as opera-type works, as film soundtracks are added to MusicBrainz by naive editors. Recursing up a “part of” chain in MusicBrainz will always be an unreliable means of detecting opera-type works until the stopping point is correctly identified.

However, there is a simple algorithmic fix to SongKong which would sort this out once and for all: testing the MusicBrainz “Work Type” field. I propose that only a work with a type of

  • Ballet
  • Mass
  • Opera
  • Operetta
  • Oratorio
  • Song-cycle

should be considered for an opera type work. The list could even be customizable, making this a fairly low-maintenance solution.

This would also solve other outstanding misidentification issues in my library. For example, the WORK_TYPE of both The Anna Magdalena Notebook and Two Piano Sonatas, op. 27 is null. These are really “Collections”, not works (in the more general musical sense), and need to be excluded too.
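As a minimal sketch of the proposed test (not SongKong's actual code), the check could look like the following. The function and constant names are hypothetical; the type names are the ones listed above, and a null Work Type is treated as "not an opera-type work".

```python
# Hypothetical default allow-list, using the Work Types proposed above.
DEFAULT_OPERA_TYPES = frozenset(
    {"Ballet", "Mass", "Opera", "Operetta", "Oratorio", "Song-cycle"}
)


def is_opera_type_work(work_type, allowed=DEFAULT_OPERA_TYPES):
    """Return True only if the Work Type is set and is in the allow-list."""
    return work_type is not None and work_type in allowed


print(is_opera_type_work("Opera"))    # a listed type
print(is_opera_type_work("Musical"))  # a type that is not listed
print(is_opera_type_work(None))       # null type, as for a collection
```

Passing a different `allowed` set is how the list could be made customizable, for example `is_opera_type_work("Cantata", DEFAULT_OPERA_TYPES | {"Cantata"})`.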

Isn’t this fixed by this issue ?

No, that only addresses a subset of the cases where there is a “part of” relationship that goes beyond the “overall work”. A positive identification based on the type of the work would subsume that and more.

I haven’t counted, but there are still a not insignificant number of such misdetections in my library of 500-ish albums. The ones I mentioned were just those which I had seen most recently.

Can you give an example where my solution would fail and yours would work?

Why don’t I put together a test set for this? It will take a day or two.

Yes that sounds good

I realise belatedly that I proposed a solution above, but failed to fully explain the problem.

So I have (with the aid of the scripter and command-line tools) run SongKong over my entire (mainly) classical library to quantify the accuracy of the “Opera Works” detection algorithm. The library consists of just over 550 albums and 8974 tracks. Using the GROUPING (Lyrion profile) as a diagnostic, SongKong found

  • No. of Opera works tracks detected = 1580
  • No. of Opera works tracks correctly detected = 706
  • No. of Opera works tracks false positive = 874
  • No. of Opera works tracks not detected ~ 80 (###)

(### I don’t have an accurate count of the number of tracks which ought to have been selected but weren’t - around half a dozen albums so this is a rough figure)
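To put these counts in perspective, here is a small worked calculation. The terms "precision" and "recall" are my framing rather than anything SongKong reports, and the recall figure is only as good as the rough estimate of 80 missed tracks.

```python
detected = 1580         # tracks flagged as Opera works
true_positives = 706    # flagged correctly
false_positives = 874   # flagged wrongly
missed_estimate = 80    # rough estimate of tracks that should have been flagged

# Sanity check: the two categories account for every flagged track.
assert true_positives + false_positives == detected

precision = true_positives / detected
recall_estimate = true_positives / (true_positives + missed_estimate)

print(f"precision: {precision:.1%}")
print(f"recall (approx.): {recall_estimate:.1%}")
```

So fewer than half of the flagged tracks are correct, while most of the tracks that should be flagged are found. False positives are the dominant problem.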

So clearly there are problems here. After some analysis, fortunately almost all of these false positives seem to be the consequence of just two problems

  1. The “Moonlight Sonata” issue, i.e. the algorithm goes too far along the chain and records a collection of works as the WORK/OVERALL_WORK to use. Unfortunately this includes quite a large number of the best known and most common works in the classical repertoire: not just the “Moonlight” sonata, but the Brandenburg Concertos, The “Four Seasons”, Chopin’s Polonaises and more. But luckily this is a single issue to fix.
  2. The “Musicals” issue. This is a consequence of bad MusicBrainz data which lists works as “part” of some much later work by a different artist, especially musicals, film soundtracks and the like. Thus in the test set I put together, Beethoven’s sixth symphony and Stravinsky’s “Rite of Spring” are listed as part of an opera named “Fantasia”, and George Gershwin’s piano concerto as part of the musical “Crazy for You”.

These are the two major issues with false positives. The false negatives tend to be Requiem Masses, Cantatas and the like. Orff’s famous “Carmina Burana” is a cantata, and contains six parts, which it would be nice to recognise as being equivalent to operatic Acts. However, it is itself part of a collection of three cantatas named “Trionfi”, which is not the commonly used, programmed or recorded “work” name. In the test set I have included two albums with different recordings of Carmina Burana. It would be great to get these both correct!

Hopefully the test set, which I will supply by a PM link, can be used to help diagnose and improve the accuracy of the algorithm. To keep the size manageable I transcoded everything to a lossy format, and used Ogg Vorbis to avoid any potential FLAC->MP3 tag translation issues.

I realise that I omitted “Cantata” from the list of “Work Types” I gave in the first post of this thread… Some Cantatas (e.g. Carmina Burana) have subdivisions/parts, while others may not.

Thanks, I won’t have time to look at this until the end of next week.

Started looking at this. Actually, although we add MB Work information (if it exists) to any song, we only add Movement/Work/Overall Work for albums identified as Classical, so my initial concerns about non-classical multi-level works don’t apply.

Looking at your examples, your idea of not tracking up to the parent work if the parent does not have a valid work type seems to work. Importantly, I already have Work Type information in the Albunack database, but not attributes such as Part of Collection, so if I take your work type idea I can implement this within SongKong without requiring any changes to Albunack, which makes things easier.

Symphony No. 6 “Pastoral” would now stop at Symphony no. 6 in F major, op. 68 “Pastorale” because the parent, Medley of Fantasia, has no Work Type.

Piano Sonatas: “Moonlight” / “Pathétique” / “Appassionata” would now stop at Sonata for Piano no. 14 in C‐sharp minor, op. 27 no. 2 “Moonlight” because its parent, Two Piano Sonatas, op. 27, has no Work Type.

Piano concerto tracks on Rhapsody in Blue / An American in Paris / Piano Concerto in F would stop at Concerto for Piano and Orchestra in F major because its parent Crazy for You (1992 musical) has work type of Musical, and this is not in your list of valid work types.

Carmina Burana would now stop at Carmina Burana: Cantiones profanæ cantoribus et choris cantandæ comitantibus instrumentis atque imaginibus magicis instead of going up to Trionfi, because Trionfi has no Work Type.
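The stopping rule in these examples can be sketched as below. This is one reading of the rule, not the actual SongKong implementation: the `Work` class and `climb` function are hypothetical, the allow-list is the one proposed earlier plus Cantata, and the parent types are those stated in the examples above. Child types are never consulted, so they are left unset except where the thread states them.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed allow-list: the types proposed earlier in the thread, plus Cantata.
ALLOWED_PARENT_TYPES = frozenset(
    {"Ballet", "Cantata", "Mass", "Opera", "Operetta", "Oratorio", "Song-cycle"}
)


@dataclass
class Work:
    """Hypothetical stand-in for a MusicBrainz work and its 'part of' parent."""
    title: str
    work_type: Optional[str] = None
    parent: Optional["Work"] = None


def climb(work, allowed=ALLOWED_PARENT_TYPES):
    """Follow 'part of' links upward, but only onto parents with an allowed type."""
    current = work
    while current.parent is not None and current.parent.work_type in allowed:
        current = current.parent
    return current


fantasia = Work("Medley of Fantasia")                      # no Work Type
pastoral = Work("Symphony no. 6 in F major, op. 68", parent=fantasia)
crazy = Work("Crazy for You (1992 musical)", "Musical")    # type not allowed
concerto = Work("Concerto for Piano and Orchestra in F major", parent=crazy)
trionfi = Work("Trionfi")                                  # no Work Type
carmina = Work("Carmina Burana", "Cantata", trionfi)
opera = Work("Hypothetical Opera", "Opera")                # made-up example
act = Work("Hypothetical Opera: Act I", parent=opera)

for start in (pastoral, concerto, carmina, act):
    print(f"{start.title} -> {climb(start).title}")
```

The first three works stay where they are because their parents fail the type test, while the made-up Act climbs to its Opera parent.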

Can you confirm your Work Type list, please? This is the full list of MusicBrainz work types. I was wondering whether we should also include things like Beijing opera or Operetta, since whilst they may be rarer, a top level with one of these work types would seem to be valid?

  • Aria
  • Audio Drama
  • Ballet
  • Beijing opera
  • Cantata
  • Concerto
  • Étude
  • Incidental music
  • Madrigal
  • Mass
  • Motet
  • Musical
  • Opera
  • Operetta
  • Oratorio
  • Overture
  • Partita
  • Play
  • Poem
  • Prose
  • Quartet
  • Sonata
  • Song
  • Song-cycle
  • Soundtrack
  • Suite
  • Symphonic poem
  • Symphony
  • Zarzuela

La Création (extraits) was not extracting the Opera Work just because SongKong was having to get it directly from MusicBrainz, and this code was only getting Works for two levels (Movement and Work) rather than three (Opera). This is now working.

Already fixed the Overall Work issue. However, whilst this one works okay, the other one is incorrectly setting Section/Grouping to Carmina Burana for all tracks.

This is because it is trying to derive it from the title. For the other one, SongKong notes there are multiple colons in its title and so uses the MusicBrainz Works instead, but for this one it does not.

I think the solution is that if we decide it is a track with three levels or more (Movement, Work, Overall Work), we never try to derive just from the title, as that is too dangerous. We were sort of using the presence of multiple colons to determine this, but that is unreliable.
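A rough illustration of why colon counting is fragile, using two track titles quoted later in this thread. Both functions are hypothetical sketches: the "more than one colon" test is only my guess at the old heuristic, and the level count is assumed to come from the MusicBrainz work chain.

```python
def looks_multi_level_by_colons(title: str) -> bool:
    """Guess at the old heuristic: more than one colon means multi-level."""
    return title.count(":") > 1


def may_derive_from_title(level_count: int) -> bool:
    """Rule described above: derive from the title only below three levels."""
    return level_count < 3


titles = [
    "Porgy and Bess: Act I, Scene I. Introduction: Jasbo Brown Blues",
    "Porgy and Bess: Act I, Scene I. “Summertime” (Clara)",
]

for title in titles:
    print(looks_multi_level_by_colons(title), title)
```

Both tracks sit at the same depth in the work hierarchy, yet the colon test gives different answers for them. Basing the decision on the number of levels avoids that.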

OK, now fixed, plus this issue and this one, for the next release.

Just back from my hols, and it’s great to see all this promising development news.

I was wondering if should also include things like Beijing opera or Operetta since whilst may be rarer but if the top level is of these work types would seem to be valid?

I don’t know much about either Beijing operas or zarzuelas, but some superficial research suggests that both can be divided into Acts. “Symphony” almost always has a two-level work/movement structure, but there are rare exceptions such as Mahler’s no. 8, “Symphony of a Thousand”, which has Part I and Part II, each subdivided into movements.

Thanks, added those additional types.

The Porgy and Bess release is interesting because there are four levels of valid works

e.g

Track 1
Porgy and Bess: Act I, Scene I. Introduction: Jasbo Brown Blues
Porgy and Bess: Act I, Scene I
Porgy and Bess: Act I
Porgy and Bess

Track 2
Porgy and Bess: Act I, Scene I. “Summertime” (Clara)
Porgy and Bess: Act I, Scene I
Porgy and Bess: Act I
Porgy and Bess

So scenes are further split, as can be seen here, rather than having a one-to-one correspondence between track and scene, which is more usual.

We store all four levels in the MusicBrainz hierarchy as:

MB Recording Work:Porgy and Bess: Act I, Scene I. Introduction: Jasbo Brown Blues
MB Work Level 1:Porgy and Bess: Act I, Scene I
MB Work Level 2:Porgy and Bess: Act I
MB Work:Porgy and Bess

However, Roon/Lyrion/MinimServer only support three levels, so it is impossible to store all four levels in those fields. Currently we do as follows:

Lyrion
Movement:Porgy and Bess: Act I, Scene I. Introduction: Jasbo Brown Blues
Grouping:Porgy and Bess: Act I, Scene I
Work:Porgy and Bess

Roon
Movement:Porgy and Bess: Act I, Scene I. Introduction: Jasbo Brown Blues
Section:Porgy and Bess: Act I, Scene I
Work:Porgy and Bess

MinimServer
Movement:Porgy and Bess: Act I, Scene I. Introduction: Jasbo Brown Blues
Work:Porgy and Bess: Act I, Scene I
Overall Work:Porgy and Bess
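The mapping shown in these listings amounts to keeping the first, second and last entries of the chain. A minimal sketch of that, with hypothetical names (`PROFILE_FIELDS`, `map_chain`) and the field names taken from the listings above:

```python
# Field names per profile, from the listings above: (lowest, middle, top).
PROFILE_FIELDS = {
    "Lyrion": ("Movement", "Grouping", "Work"),
    "Roon": ("Movement", "Section", "Work"),
    "MinimServer": ("Movement", "Work", "Overall Work"),
}


def map_chain(chain, profile):
    """Map a work chain (lowest level first) onto a profile's three fields.

    Keeps the first, second and last entries, so with four levels the
    third entry (MB Work Level 2) is dropped.
    """
    if len(chain) < 3:
        raise ValueError("expected at least three levels")
    low, mid, top = PROFILE_FIELDS[profile]
    return {low: chain[0], mid: chain[1], top: chain[-1]}


chain = [
    "Porgy and Bess: Act I, Scene I. Introduction: Jasbo Brown Blues",
    "Porgy and Bess: Act I, Scene I",
    "Porgy and Bess: Act I",
    "Porgy and Bess",
]

for profile in PROFILE_FIELDS:
    print(profile, map_chain(chain, profile))
```

Using `chain[-2]` instead of `chain[1]` for the middle field would keep the Act and drop the Scene, which is the alternative choice.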

But I’m not sure this is optimal, any thoughts?

BTW, I have also implemented the fixes for the next release of Jaikoz.

This is something new to me too. Obviously the top and bottom levels have to be the movement and the work, and either Level 1 or Level 2 has to be chosen as the intermediate heading… I have been thinking about this since you posted the question, and am so far unable to come up with a clear preference which is better to leave out.

Most of these have also been implemented in Jaikoz and released in the latest version today.