SongKong and Jaikoz Music Tagger Community Forum

Bug? Language (TLAN) appears blank unless written-to by Jaikoz (then hex data looks odd)

It seems that the ID3v2.3 Language/TLAN field (at least) is not being written or read by Jaikoz properly (or according to de facto standards). Details below.

I started with a “clean” Mp3 file (having no tags whatsoever). I had deleted any existing tags and checked they were “really gone” by using a hex editor. Then, using another app (not Jaikoz), I added an ID3v2.3 tag with a Language field (ID3v2.3::TLAN == Jaikoz-ID3v2.3::Language), populated with an appropriate three-letter code (“fra”, meaning Francais/Français i.e. French-language). Actually I tried this same routine - producing same result as below - from three different apps.

When I then opened this file in various apps (tag editors, players, mediainfo, a hex editor), the expected TLAN/Language value was shown (either as “fra” or “French”, depending on the app).

However, when I loaded the same file into Jaikoz (11.6.1 Doves, for macOS), the TLAN/Language field was displayed on-screen as blank, despite the tag’s other fields being shown (the main ones at least; I haven’t checked every one in detail).

On the other hand, when I entered “fra” in Jaikoz (overwriting the field), then saved, closed and re-opened that file in Jaikoz, the language was displayed (as “French”).

I next looked at this (Jaikoz-saved) file in a hex editor, to see whether Jaikoz saves TLAN data-field in a different format. I noticed that, in the hex, the preceding data-field (TPE2/AlbumArtist) had a string value that was not terminated with a Nul (Hex 00) character, instead the first hex value after the final letter of the TPE2 data-field was for the first letter (“T”) of the next data-field (“TLAN”). It looked as if, in the hex view of the data, the first of these fields had run up against the start of the other field.

Whatever, I get the impression Jaikoz is doing/expecting something (at the binary/hex level?) non standard (de facto at least) with this field.

Is there a bug here, affecting TLAN or possibly something more generic?

Hi, for ID3 most fields do not end with a null character, and this is not a de facto standard either. So it sounds to me like there is a problem with your initial app rather than Jaikoz. What app are you using? You say you repeated this with two other apps, but I wonder whether the field had already been entered by the first app, so that the other apps didn’t have any effect.

You say Jaikoz didn’t load the language you had entered from another app, but you don’t say whether, after editing with Jaikoz, the Jaikoz-modified file is read correctly by other apps?

Thank you Paul. Yes I see now (having done some more hex analysis) as you say: most fields do not end with a null character.

I did some more tests, to confirm answers to some of your questions. I found that:

  • TLAN written by App:Jaikoz was indeed readable by other apps (e.g. Foobar2000, MediaInfo)
  • TLAN written by App:Mp3Tag - even as the sole field in a fresh (previous one deleted first) ID3v2.3 tag, was readable by other apps, apart from Jaikoz, which showed it as blank.

At the same time, in the hex editor I noticed that the value-strings (e.g. “fra” for language) from Mp3Tag (at least, as I have configured it) differ (from those of Jaikoz) in the following respects, which may offer a clue(?):

  • They have [FF FE] at their beginning, which I see is a unicode string convention for establishing the byte order.
    (Also, that thread seems very informative at this level of detail, as it relates to ID3v2.3 and ID3v2.4.)
    …whereas those from Jaikoz do not have this data
  • The Mp3Tag value-strings’ characters are each represented by four hex digits, two of them being in this case hex[0] i.e. hex[00] as a hex digits pair.
    …whereas those from Jaikoz are each represented by two hex digits (no hex[00] data)

Does this imply Jaikoz writes and expects UTF-8 whereas Mp3Tag writes UTF-16? In any case, it seems the other apps can read both UTF-8 and UTF-16. I am in uncertain territory here, just making my best attempt to understand the coding.
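
For illustration, the two byte patterns described above can be reproduced with a short sketch. Python is used here purely as an assumption for demonstration; it is not what any of the apps in this thread are known to use. Note that for plain letters such as “fra”, ISO-8859-1, ASCII and UTF-8 all produce identical bytes, so the hex view alone cannot tell them apart.

```python
# Encode the same three-letter language code in the two ways seen in the hex editor.
text = "fra"

# One byte per character (ISO-8859-1; identical to ASCII/UTF-8 for these letters).
latin1_bytes = text.encode("latin-1")

# Byte-order mark FF FE (little-endian), then two bytes per character.
utf16_bytes = b"\xff\xfe" + text.encode("utf-16-le")

print(latin1_bytes.hex(" "))  # 66 72 61
print(utf16_bytes.hex(" "))   # ff fe 66 00 72 00 61 00
```

The second line of output matches the “four hex digits per character, two of them 00” pattern, preceded by the byte-order mark.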

I wonder if it could be that the leading [FF FE] and/or the use of UTF-16 in ID3v2.3 is unacceptable to Jaikoz. I see from the StackOverflow thread that “UTF-8 is an ID3v2.4 feature not present in 2.3”.

But I am no expert in this.

Hi, yes you are right about ID3v23 only supporting ISO-8859-1 and UTF-16 with BOM; I had forgotten the details. But Jaikoz does understand UTF-16 with BOM and can also write UTF-16 with BOM when writing as ID3v23. When writing as ID3v24 it can write UTF-16LE, UTF-16BE (and these don’t need BOM) or UTF-8.
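
As a rough sketch of how a reader could act on the text-encoding byte, the mapping below follows the encoding values defined in the ID3v2.3 and ID3v2.4 specs (0 = ISO-8859-1, 1 = UTF-16 with BOM, and in v2.4 only, 2 = UTF-16BE without BOM and 3 = UTF-8). The names `ID3_CODECS` and `decode_text_body` are hypothetical and are not taken from Jaikoz.

```python
# Text-encoding byte -> Python codec name (mapping per the ID3v2 specs).
ID3_CODECS = {
    0: "latin-1",    # ISO-8859-1 (v2.3 and v2.4)
    1: "utf-16",     # UTF-16 with BOM (v2.3 and v2.4)
    2: "utf-16-be",  # UTF-16BE without BOM (v2.4 only)
    3: "utf-8",      # UTF-8 (v2.4 only)
}

def decode_text_body(body: bytes, version: int) -> str:
    """Decode a text frame body: one encoding byte followed by the text."""
    encoding_byte, payload = body[0], body[1:]
    if encoding_byte not in ID3_CODECS:
        raise ValueError(f"unknown text encoding byte {encoding_byte}")
    if version == 3 and encoding_byte > 1:
        raise ValueError("ID3v2.3 defines only encodings 0 and 1")
    # The "utf-16" codec consumes the BOM and uses it to pick the byte order.
    return payload.decode(ID3_CODECS[encoding_byte])

print(decode_text_body(bytes.fromhex("01 ff fe 66 00 72 00 61 00"), 3))  # fra
```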

Now it would be worth checking the value of the Version field when you load the file into Jaikoz. By enabling View/Show View Pane you can see which version it has loaded as in the top panel, then check what it is going to save as in the bottom panel, because depending on the Save:Write Tag Version option it may be automatically converting to ID3v24.

Now why is it not reading the file from Mp3Tag correctly?

The best thing to do would be to email such a file to support@jthink.net before it has been modified by Jaikoz, so I can see the behaviour for myself; then I can investigate further.

Thank you Paul, I emailed an example mp3 file just now.

The View > Show View pane, with that file loaded, shows the Version as ID3v23
(as does the main central pane)

The Preferences > Save > ID3Tag V2 has: [Always write tag], [v23]
(The Preferences > Save > ID3Tag V1 has option: [Write tag if exists] )

As possibly further diagnostic information:

I examined the Mp3/ID3 in the smart hex editor SynalyzeIt, with its automatically detected grammar-template for (parsing) Mp3, which here also includes ID3 (template).

In fact I gave it five variants: one written from Mp3Tag, one from Jaikoz, one from MusicBrainz Picard, one from Foobar2000, one from Kid3 (macOS version).

SynalyzeIt’s MP3/ID3 grammar-template (whether or not it is perfectly defined!) “preferred” the Jaikoz variant. But how could it be that three apps could be “wrong” in the same way? Or is the grammar-template based on some assumption also made by Jaikoz?

For Jaikoz (mac) and Kid3 (latter with its default text encoding) it reported:

  • (No errors)
  • TLAN value-string: Encoding: [ISO-8859-1: 0]

For Mp3Tag and Picard and Foobar2000 and for Kid3 (latter with text encoding set to UTF-16), it reported:

  • error: “String size set to zero” - skip element ‘Text’
  • TLAN value-string: Encoding: UNICODE-1

Hi, okay so the ID3v23 spec is as follows: each field is called a frame, and each frame has a 10-byte frame header consisting of:

4 Bytes:Frame Identifier
4 Bytes:Length of Frame (not including length of header)
2 Bytes:Flags 

Then you have the frame body, which for a Language frame consists of:

1 Byte:Text Encoding
3 Chars:3 Character Language Code

So in your example you have:

4 Bytes:TLAN
4 Bytes:0xb (11)
2 Bytes:0

1 Byte: 1 (means UTF-16)
8 bytes:ff fe 66 00 72 00 61 00

So the problem is that the frame header says its contents are 11 bytes long, but its frame body actually only seems to be 9 bytes. Because this is the only frame in the tag (there is no following frame whose start would show where this one really ends), I’m not clear whether the frame body is only 9 bytes and the frame header is wrong, or whether the frame body is actually 11 bytes and contains two unnecessary null bytes at the end. But either way it is wrong, because those two additional bytes (one char once converted from bytes to chars) mean that we don’t get a match to a language, because we are looking for fra\0 rather than fra.
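
The layout above can be walked through in a small sketch. It rebuilds the frame from the values listed and assumes the second interpretation (an 11-byte body ending in two extra null bytes); that assumption is marked in the code. Python and the variable names are illustrative only, not Jaikoz’s actual code.

```python
import struct

frame = (
    b"TLAN"                                     # 4 bytes: frame identifier
    + struct.pack(">I", 11)                     # 4 bytes: body length, big-endian
    + b"\x00\x00"                               # 2 bytes: flags
    + b"\x01"                                   # 1 byte: text encoding 1 (UTF-16 with BOM)
    + b"\xff\xfe" + "fra".encode("utf-16-le")   # 8 bytes: BOM + "fra"
    + b"\x00\x00"                               # ASSUMPTION: two trailing null bytes
)

# Split the 10-byte header into identifier, length and flags.
frame_id, size, flags = struct.unpack(">4sIH", frame[:10])
body = frame[10:10 + size]

# Skip the encoding byte; the "utf-16" codec consumes the BOM.
text = body[1:].decode("utf-16")

print(frame_id, size, len(body))  # b'TLAN' 11 11
print(repr(text))                 # 'fra\x00'
```

The decoded value carries one trailing null character, which is why an exact match against “fra” fails.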

If I edit the mp3 file so that the length of the frame is set to 0x09 rather than 0x0b, it works fine.

If I modify my Jaikoz code to search for ‘\0’ chars and remove them before looking up the language, it also works, but I shouldn’t have to do this; this is a bug in Mp3tag.
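
A minimal sketch of that kind of tolerant lookup, assuming a simple dictionary of codes: the table `LANGUAGES` and the function `lookup_language` are hypothetical, since Jaikoz’s real language list and code are not shown in this thread.

```python
# Hypothetical lookup table, for illustration only.
LANGUAGES = {"fra": "French", "eng": "English"}

def lookup_language(raw: str):
    """Strip trailing null characters, then look the code up."""
    code = raw.rstrip("\x00")
    return LANGUAGES.get(code)

print(LANGUAGES.get("fra\x00"))      # None (exact match fails)
print(lookup_language("fra\x00"))    # French
print(lookup_language("fra"))        # French
```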

The specification has always been at https://id3.org/ but unfortunately this does not appear to be working at the moment.

It would not be such a problem for general text fields, because there the value is not matched against a list, and since null chars are not visible you would not see the problem. So it is more problematic for the Language field.

Do you get the same problem if you use ISO-8859-1 (text encoding 0)? There is no need for UTF-16 for this field, since all the values can be encoded with ISO-8859-1.

I found this: https://community.mp3tag.de/t/x-trailing-nulls-in-id3v2-comments/19227. It seems he misread the spec, then realized he had misread it but decided to leave it in anyway!

Done a workaround to strip trailing nulls for the TLAN frame when encountered, so it can map to a language okay.
https://jthink.atlassian.net/browse/JAIKOZ-1439 will be in the next release later today.

Wonderful!

I will take time to absorb what you have explained and I will delve further into examples written out by the various ID3 tag-editing apps I mentioned previously.

And of course I look forward to checking out your new data-error-tolerant version!

Fixed in Jaikoz 11.6.2 Pulp, released January 12th 2023.

Yes, Pulp’s a beaut! Now I can carry on tagging those French songs with no worries.

Amazingly quick and ideal response, thank you Paul.

Maybe not over yet? I may have identified a knock-on problem, as below.

One concern just occurred to me: In Jaikoz, multi-value fields use null as a delimiter. So stopping reading on a null would return only the first value. I guess the way to fix this would be to terminate string-reading only on (at least) two nulls in succession (not merely on “null encountered before end of string according to length-value”). Or - more robustly (?) - maybe the “continue reading” rule should be “null followed by printable character” ?

This potential problem occurred to me just now, after doing an experiment to add two artist names in the Artist field (for a single file/song/track) then check what that had done, as follows:

  • When I closed and reopened it in Jaikoz the data (the two names) was preserved. Great!
  • When I looked at it in Mp3Tag, it was correctly displayed (in that app’s own manner), as the first name followed by “\” followed by the second name. Also great!
  • And foobar2000 and VLC (player apps) displayed both names. Wonderful!
  • However when I opened in SynalyzeIt (smart hex editor) with an MP3/ID3 grammar template, it only recognised the first name, relegating the second name to being part of “padding bytes”. The length value (in the 4 bytes following the frame identifier, “TPE1”) was correct (accommodating both names and the null character). I wonder therefore if the grammar template’s string-reading rule, like your string-value reader in Jaikoz, included a defensive “stop on length reached or null character encountered” element. If so then I guess both Jaikoz and the grammar template need to be modified, say as suggested in the first paragraph of this post.

Hi, no the issue is that ID3v23 does not officially support multiple strings except for a few special case fields whereas ID3v24 does support multiple strings using the null character to separate them. But there is no technical reason that prevents using null character on ID3v23 as well, and many applications support this as shown by your testing, but some do not.

The best solution is simply to use ID3v24 instead of ID3v23.
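
To illustrate why stripping only trailing nulls need not lose a second value, here is a sketch of null-separated splitting. It assumes the text has already been decoded to a string; `split_values` is a hypothetical helper, not Jaikoz’s code.

```python
def split_values(text: str) -> list:
    """Drop trailing nulls only, then split on the remaining null separators."""
    return text.rstrip("\x00").split("\x00")

# Two artists separated by a null: both survive.
print(split_values("Artist One\x00Artist Two"))  # ['Artist One', 'Artist Two']

# A single value with a stray trailing null: the null is discarded.
print(split_values("fra\x00"))                   # ['fra']
```

A null in the middle of the string still acts as a separator; only nulls at the very end are treated as padding.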
