
SongKong and Jaikoz Music Tagger Community Forum

It looks like 9.3 is filling docker vdisk.img file!

Hi @paultaylor!

Debugging is quite hard as I am 2000 km away from home, but since I remotely updated SongKong to version 9.3, my whole Docker service has crashed because the vdisk filled up completely!

This happened right after I started a simple file indexing task (not sure about the right name for it anymore, sorry).

The vdisk quickly grew to 100% usage, which killed not only my SongKong container but also all my other Docker containers.

I had to ask my brother to drive to my home, stop the Docker daemon, raise the Docker img file size from 50 GB to 100 GB, and restart the server in order to get things back online.

I then started the task again, around 5pm, and it filled the extra 50 GB after a few hours.

Unfortunately I can't ask my brother to drive to my place again, and he isn't that tech savvy anyway, so I cannot ask him to start reading and understanding the logs. But basically, SongKong is not reachable anymore. I also can't reach my Unraid server using its web UI, as my nginx Docker container is broken due to this full vdisk image.

I just want to flag that there is something wrong with 9.3 that fills the vdisk img file. Could you possibly test this under Unraid? Update to 9.3, run a task (the simple indexing one), and watch the img file size.
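Something as simple as this should be enough to watch it grow (the docker.img path is the Unraid default on my box, adjust if yours lives elsewhere):

# re-list the Docker loopback image every 60 seconds to watch its size
watch -n 60 ls -lh /mnt/user/system/docker/docker.img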

Thanks! :slight_smile:

Hi, sorry about the hassle.

Okay, another user identified that SongKong was not in all cases correctly detecting when it was running under Docker - 9.1.1 Docker on Unraid

So in 9.3 we made a change: instead of the code trying to work out if it is running in a container by using cgroup, we modified the /opt/songkong/songkong.sh startup script called in the Docker image so that it simply tells SongKong it is running in Docker by passing the -k argument as the first argument to SongKong itself.

e.g

#!/bin/sh
umask 000
./jre/bin/java  -XX:MaxRAMPercentage=60 -XX:MetaspaceSize=45 -Dcom.mchange.v2.log.MLog=com.mchange.v2.log.jdk14logging.Jdk14MLog -Dorg.jboss.logging.provider=jdk -Djava.util.logging.config.class=com.jthink.songkong.logging.StandardLogging -Dhttps.protocols=TLSv1.1,TLSv1.2 --add-opens java.base/java.lang=ALL-UNNAMED -jar lib/songkong-9.3.jar -k "$@"

However, there was a bug in the fix: some code was still using the old method, so I have just rebuilt 9.3 with a fix. However, you didn't have a problem before, so that should not cause you an issue. I think that maybe you are using a custom songkong.sh file that is not passing the -k option, and hence SongKong is not running in Docker mode and is storing the database and reports in the wrong place?
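If you are not sure, you can check what arguments the running container was actually started with, e.g. (assuming the container is simply named songkong):

# list the processes in the container; the java command line should include -k before any other SongKong arguments
docker top songkong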

It might be. But I will not be able to answer this before the end of my trip.

This did actually kill all my Docker containers, and I will have to be back at home to investigate.

I also suspected the database was actually growing inside the vdisk file instead of in the actual mapped folder.

I’ll keep you posted in approximately a week.

FYI I had to do another fix to the fix, but I believe it is now okay - 9.1.1 Docker on Unraid

OK I’m back from holidays.

I deleted my vdisk file and reinstalled all my Docker containers, including the latest version of SongKong. I confirm this fix of a fix fixes the fixing issue.

But now, there is something else :slight_smile:

The first thing I did once I got SongKong running again was to start a Status Report task. It was crazily fast compared to previous runs: the 3M+ files were all checked in less than 24 hours! (13:40:35 to be totally exact).

On this nice new basis, I then started a Fix Songs task, and while I can see SongKong is processing the files by tail -f'ing the logs, I can't see anything reported on the task screen anymore:

Here you can see it has been running for approx 16 hours, but the screen is still not reporting anything.

Let's have a closer look at the CPU usage:

I checked the debug logs, and I can confirm that SongKong has already reached my 05-2020 monthly folder. As it proceeds in the "normal" sort order of the folders, it has already processed folders 01, 02, 03 and 04.

I believe SongKong has a hard time displaying the values on the task screen, maybe due to the high number of files that have to be loaded from the previously finished task.

It is not a big deal, as I can keep checking the debug logs to see where SongKong actually is in its task (by checking which folder it is currently processing). But you understand it would be nice to actually see this directly on the task screen :wink:
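In case it helps anyone else, this is roughly how I follow it from the host (the container name, log file name and folder pattern are from my setup, yours may differ):

# follow the SongKong debug log and keep only the lines mentioning the current monthly folder
docker exec songkong tail -f /songkong/Logs/songkong_debug0-0.log | grep --line-buffered "05-2020"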

Some more info:


I see my cache drive is being read intensively; I guess this is because the database is being queried since I finished the Status Report?

Also, can I safely get rid of the reports themselves? They are not needed / used by the tasks, right?

It would be useful to see the log files; it seems like it isn't working to me. I don't understand why it wouldn't update progress; it may be worth cancelling the task and seeing what happens.

The Reports folder contains a shared style folder used by all reports and a shared images folder for artwork images in the reports; you should not delete these folders.

Then there are subfolders for each report. You can manually delete these reports, but you should not delete the latest FixSongs report that this task is using.

I had to start it again, and it started displaying the progress normally. FYI, here is where we are (with approx 1 million tracks that were processed during the first run):

One more time, the delta between the loaded, saved and completed tracks is intriguing to me.

It lives! And it is matching stuff, but I still see errors like this in the logs. Is this something to worry about?

02/08/2023 09.54.47:CEST:MonitorExecutors:outputTraceForThread:SEVERE: ----process reaper
02/08/2023 09.54.47:CEST:MonitorExecutors:outputTraceForThread:SEVERE: java.base@14.0.2/jdk.internal.misc.Unsafe.park(Native Method)
02/08/2023 09.54.47:CEST:MonitorExecutors:outputTraceForThread:SEVERE: java.base@14.0.2/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:252)
02/08/2023 09.54.47:CEST:MonitorExecutors:outputTraceForThread:SEVERE: java.base@14.0.2/java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:462)
02/08/2023 09.54.47:CEST:MonitorExecutors:outputTraceForThread:SEVERE: java.base@14.0.2/java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:361)
02/08/2023 09.54.47:CEST:MonitorExecutors:outputTraceForThread:SEVERE: java.base@14.0.2/java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:937)
02/08/2023 09.54.47:CEST:MonitorExecutors:outputTraceForThread:SEVERE: java.base@14.0.2/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1055)
02/08/2023 09.54.47:CEST:MonitorExecutors:outputTraceForThread:SEVERE: java.base@14.0.2/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1116)
02/08/2023 09.54.47:CEST:MonitorExecutors:outputTraceForThread:SEVERE: java.base@14.0.2/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
02/08/2023 09.54.47:CEST:MonitorExecutors:outputTraceForThread:SEVERE: java.base@14.0.2/java.lang.Thread.run(Thread.java:832)

Here is what the CPU and RAM usage look like:

The NVMe drive (where reports and db are stored) is read intensively as well, so everything seems to be fine on that end :slight_smile:

Soooo, does everything look fine to you? Or should I worry about the delta I mentioned?

No, that is just part of the monitoring of threads that occurs every 5 minutes, so in case of a problem we have a record of what each thread was doing. This particular output just means a worker thread is parked waiting for work to do.


OK, now, the weird thing imho is that the "Completed" bar only started to fill once the previously matched / processed 1M files were reached.

I'm not sure what you mean?

Well, the very last bar (Completed) kept displaying 0 for a day or two, the time SongKong took to load the tracks it had processed during the previous run.

It started adding “completed” tracks once it started to match files that were not fingerprinted/processed in the previous run.

Therefore, I have a gap of approx 1 million tracks between here:

and here:

EDIT: as the process crashed again, here is something that clarifies what I'm saying:

Weird, I'm not seeing that.

e.g

So I don't know why this failed; you haven't sent me anything, so I assume you have just restarted.

It would be great if SongKong could complete the task over your large music library. But in the meantime, to get it done and avoid frustration, maybe you should just work on each subfolder below the top folder; if I remember correctly you have a small number of date-based folders.

Also worth noting, I have just released SongKong 9.4 and this has some important fixes to matching, including one for AcoustID-only matching. I expect this will allow matching of a number of songs that SongKong could not previously match.

So this is what I would do:

  • Install SongKong 9.4 (although this will cause the database to be deleted)
  • Possibly rerun the Status Report over everything (but I don't know if this is helping at all or not)

Then for each date folder:

  • Start SongKong
  • Run against the folder with Ignore Previously Checked Songs that could not be matched unchecked
  • Stop SongKong (this ensures it releases all memory, in case it is holding onto some resource it should not be)

It did crash indeed. It's as if the RAM or something gets filled up anyway, and it makes my Unraid server become totally unresponsive (even ssh is not usable anymore).

So, in order to have a slightly lower number of files to process next time, I decided to run a duplicates finder task in between:

Questions:

  1. Why is the duplicates finder taking soooo much CPU resources?
  2. Why is the DB totally wiped at each update of SongKong? This slows down the whole next task you run.

:wink:

  1. I don't understand what you are getting at here. Like Fix Songs, the Delete Duplicates task is multithreaded so it can make use of the available CPUs to do its processing. If it didn't use so much CPU it would be slower.

  2. If we make any changes to the structure of items stored in the database in a new version, we need to delete the existing database because existing items won't match the new structure. Since we have to deal with users updating from different versions to the new version, the simplest and most reliable approach is to automatically delete the database each time. This is not usually a problem, and even in your case may make little difference. I do intend to do some testing on this, but it is not done yet.

BTW SongKong uses Java, and therefore in the usual case runs within its own JVM, so it should protect the computer from any effects. In your case it runs within a JVM within Docker, so I can't understand why that would prevent using ssh etc. If it crashes again, support files would be useful so I can see the logs just before the crash.

There are three main resources in use:

  • disk
  • memory
  • cpu

Disk Space

I assume you have configured the /songkong folder correctly and have sufficient disk space.

Memory

The maximum Java heap can be configured when you start SongKong.

The Dockerfile calls songkong.sh, and this uses -XX:MaxRAMPercentage=60 to set the maximum percentage of total memory that the Java heap can use:

#!/bin/sh
umask 000
./jre/bin/java  -XX:MaxRAMPercentage=60 -XX:MetaspaceSize=45 -Ddocker=true -Dcom.mchange.v2.log.MLog=com.mchange.v2.log.jdk14logging.Jdk14MLog -Dorg.jboss.logging.provider=jdk -Djava.util.logging.config.class=com.jthink.songkong.logging.StandardLogging -Dhttps.protocols=TLSv1.1,TLSv1.2 --add-opens java.base/java.lang=ALL-UNNAMED -jar lib/songkong-9.4.jar "$@"

So you could use a different value.
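For example, to cap the heap at 40% instead, you would only change that one flag in your own copy of the script; the rest of the line stays exactly as above (40 is just an illustrative value):

#!/bin/sh
umask 000
# identical to the stock script, only MaxRAMPercentage has been lowered
./jre/bin/java  -XX:MaxRAMPercentage=40 -XX:MetaspaceSize=45 -Ddocker=true -Dcom.mchange.v2.log.MLog=com.mchange.v2.log.jdk14logging.Jdk14MLog -Dorg.jboss.logging.provider=jdk -Djava.util.logging.config.class=com.jthink.songkong.logging.StandardLogging -Dhttps.protocols=TLSv1.1,TLSv1.2 --add-opens java.base/java.lang=ALL-UNNAMED -jar lib/songkong-9.4.jar "$@"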

CPU

The number of workers can also be configured.
This is an advanced option that should not usually be modified; by default the number of workers is set to match the number of CPUs, but you can modify it by adding

workers=n

to general.properties

e.g

workers=10

We also have a separate set of workers for saving files. This is more of an I/O operation than a CPU one, so by default it is just set to 2, but you can modify it by adding

savers=n

to general.properties

e.g

savers=3
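Putting the two together, a general.properties tuned to leave some CPU headroom for your other containers might contain something like this (the numbers are purely illustrative, pick what suits your hardware):

# use fewer matching workers than available CPUs; keep savers low as saving is mostly I/O bound
workers=4
savers=2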

Docker Config

And I think you can also set memory and CPU limits on the Docker container itself.
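With plain Docker that would be something like the following (these are standard Docker flags; the image name and limits are just an illustration, and you would add your usual port and volume mappings; on Unraid I believe the equivalent goes into the Extra Parameters field of the container template):

# cap the container at 8 GB of RAM and 4 CPUs
docker run -d --name songkong --memory=8g --cpus=4 songkong/songkong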

So there are things that can be done to reduce SongKong load.

Are you saying that, even if the DB is totally wiped, the next run of a task won't be affected?
Is it then safe to assume that SongKong uses the report in order to know if a folder / file was previously checked against MusicBrainz and Discogs?

It uses a field in the files themselves, SONGKONG_ID.
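If you want to check for it yourself, something like this should show whether a file already carries it, assuming ffprobe is installed and the tag is stored as an ordinary metadata field (the file name is just a placeholder):

# print the file's metadata and keep any SongKong-related tags
ffprobe -v quiet -show_format somefile.flac | grep -i songkong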


18 posts were split to a new topic: Running songkong cmdline in script in Docker environment