SongKong and Jaikoz Music Tagger Community Forum

Character Sets and Tagging Classical Albums

I have my music data saved on a NAS. The collection has been built up over the years; it started out as .mp3, but for a good few years now I have ripped CDs to .flac. I have not had too many issues with playback. Tagging has been a problem, but that is being solved as SongKong develops and, more importantly, as the quality of the tag data improves. However, I have recently run up against a new problem: when I use rsync to copy my data to another machine, the copying process stops abruptly. A similar problem occurs with rclone, where the copy stops when a file being copied has non-standard characters. My whole system now uses en_GB.UTF-8, but much of my data was created using other “non-standard” character sets.
This could have come from data originating in Europe etc.
The purpose of this message is to ask whether others have come across this or a related problem, and whether it would be possible to have SongKong edit data so that file names and tags across a data set are unified to UTF-8.
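
If the file names themselves are what stops rsync or rclone, one way to locate them is to walk the tree as raw bytes and test each name for valid UTF-8. The following is only a minimal sketch, assuming a Linux-style file system where names are stored as bytes; the function names are made up for illustration and are not part of SongKong, rsync or rclone.

```python
import os


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw file name decodes cleanly as UTF-8."""
    try:
        name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def find_non_utf8_names(root: bytes) -> list:
    """Walk root (given as bytes) and collect paths whose name is not valid UTF-8."""
    bad = []
    # Passing bytes to os.walk makes it yield bytes, so no decoding happens yet.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if not is_valid_utf8(name):
                bad.append(os.path.join(dirpath, name))
    return bad
```

It could be called as `find_non_utf8_names(os.fsencode("/path/to/music"))`, where the path is a placeholder. Printing each result with `repr()` shows the offending bytes without raising a further encoding error.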

SongKong uses a standard character set when writing data, usually UTF-8, depending on the audio format being used. But it sounds like the issue arises when reading metadata: some of it is stored in an encoding that is not actually valid for the format, so tools read it using the encoding they should be using rather than the one that was actually used. For example, ID3 supports ISO-8859-1, UTF-8 and UTF-16, but sometimes other ISO-8859 variants have been written to the ISO-8859-1 field, such as ISO-8859-2 for Eastern European languages or ISO-8859-7 for Greek.
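
To make the mismatch concrete, here is a small Python illustration of a name written as ISO-8859-2 bytes but read back as ISO-8859-1, followed by the reverse step that recovers it. This is just a sketch of the general idea, not how SongKong implements it; the `redecode` helper is a hypothetical name.

```python
def redecode(text: str, read_as: str, written_as: str) -> str:
    """Undo a wrong decode: turn the text back into bytes, then decode correctly."""
    return text.encode(read_as).decode(written_as)


original = "Dvořák"
stored = original.encode("iso-8859-2")   # bytes an old tagger put in the field
misread = stored.decode("iso-8859-1")    # what a standards-following reader shows
repaired = redecode(misread, read_as="iso-8859-1", written_as="iso-8859-2")

print(misread)   # the ř comes out as ø, because both map to byte 0xF8
print(repaired)
```

The round trip is lossless here because ISO-8859-1 assigns a character to every byte value, so nothing is thrown away by the wrong decode.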

Now, SongKong does have a Fix Charset Encoding task that can read the metadata using a specified encoding and then rewrite it in a standard encoding, but you need to tell SongKong which encoding was used; it cannot guess this reliably.
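
As a rough illustration of why guessing is unreliable, the same bytes decode without any error under several single-byte encodings, so only a person who recognises the intended text can pick the right one. The sketch below is plain Python, separate from SongKong; the `candidate_decodings` helper and the list of encodings are assumptions chosen for the example.

```python
def candidate_decodings(raw: bytes, encodings=("iso-8859-1", "iso-8859-2", "iso-8859-7")) -> dict:
    """Decode the same bytes under each candidate encoding for side-by-side review."""
    # errors="replace" keeps going if a byte is undefined in a given encoding.
    return {enc: raw.decode(enc, errors="replace") for enc in encodings}


raw = "Dvořák".encode("iso-8859-2")
for encoding, text in candidate_decodings(raw).items():
    print(f"{encoding}: {text}")
```

All three lines print something plausible-looking, but only the ISO-8859-2 line shows the intended name.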

Hi, and many thanks for your reply, which is exactly what I hoped for. I shall research your Fix Charset Encoding task, but it looks like I have to examine every file one at a time. All I need now is time!
Many thanks,
Budge

What I would suggest is the following:

  • Run Status Report on all your files
  • Use View Spreadsheet to view all the metadata as read with the default encoding
  • You should be able to scan the spreadsheet relatively quickly to find problem files, and then you will know which files to run Fix Charset Encoding on
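
If the spreadsheet is large, the scan in the last step could be partly automated. The sketch below assumes the data has been saved as CSV, which may not match what View Spreadsheet actually produces, and it uses a simple heuristic of my own: flag any value containing the Unicode replacement character, a C1 control character, or the "Ã"/"Â" pairs typical of UTF-8 read as ISO-8859-1. It will miss cases such as ISO-8859-2 read as ISO-8859-1, which look like ordinary accented letters, so it narrows the search rather than replacing a visual check.

```python
import csv
import io
import re

# Heuristic only: U+FFFD, C1 controls, or "Ã"/"Â" followed by a Latin-1 symbol.
SUSPICIOUS = re.compile(r"[\ufffd\u0080-\u009f]|[ÃÂ][\u0080-\u00bf]")


def suspicious_rows(csv_file) -> list:
    """Return (row_number, value) pairs for cells that look mis-decoded."""
    hits = []
    for row_number, row in enumerate(csv.reader(csv_file), start=1):
        for value in row:
            if SUSPICIOUS.search(value):
                hits.append((row_number, value))
    return hits


# Made-up sample: one artist name is UTF-8 that was read as ISO-8859-1.
garbled = "Dvořák".encode("utf-8").decode("iso-8859-1")
sample = io.StringIO(f"title,artist\nRequiem,Mozart\nRusalka,{garbled}\n")
for row_number, value in suspicious_rows(sample):
    print(row_number, repr(value))
```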