SongKong and Jaikoz Music Tagger Community Forum

Duplicates job crashed mid-way / continues to run in background

Hi @paultaylor,

I have a duplicates job that was running on my /music_processed/ folder.

Selected Folder : /music_processed
Loaded 1134798 songs for duplicate checking
There are 146969 duplicate keys
219889 Duplicates found and deleted in 10 days 0 hours 13 minutes 6 seconds

After 10 days, and only about halfway through, the job was reported as ended.

If I check the logs with tail -f songkong_debug0-0.log:

I can see the job is actually still running, and that it is currently processing the letter P.

Extract:

23/01/2023 18.37.26:CET:DeleteDuplicatesLoadFolderWorker:loadFiles:SEVERE: end:/music_processed/Prince and The New Power Generation/Gett Off:6:1160662
23/01/2023 18.37.26:CET:DeleteDuplicatesLoadFolderWorker:loadFiles:SEVERE: start:/music_processed/Prince and The New Power Generation/Money Don't Matter 2 Night:2:1160662
23/01/2023 18.37.39:CET:DeleteDuplicatesLoadFolderWorker:loadFiles:SEVERE: end:/music_processed/Prince and The New Power Generation/Money Don't Matter 2 Night:2:1160664
23/01/2023 18.37.39:CET:DeleteDuplicatesLoadFolderWorker:loadFiles:SEVERE: start:/music_processed/Prince and The New Power Generation/The Morning Papers:2:1160664

I’ve sent you the support files, of course.

Ah, from the code:

            //Now waits for files within the folder to be added to duplicate map
            //Limit to 10 days just in case gets into some kind of infinite loop
            boolean result = es.awaitTermination(10, TimeUnit.DAYS);

It was never expected that it would ever run for this long!

https://jthink.atlassian.net/browse/SONGKONG-2385
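
As an aside, here is a minimal sketch of the idea behind removing such a cap (just an illustration, not necessarily how it was actually fixed): keep waiting in shorter slices until the executor reports that everything has finished.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class OpenEndedWait {
        public static void main(String[] args) throws InterruptedException {
            ExecutorService es = Executors.newFixedThreadPool(4);
            // ... submit the folder-loading tasks here ...
            es.shutdown();
            // Wait in one-hour slices instead of giving up after a hard 10-day cap;
            // awaitTermination returns false on timeout and true once all tasks are done.
            while (!es.awaitTermination(1, TimeUnit.HOURS)) {
                // still running, keep waiting
            }
        }
    }

The loop only exits once awaitTermination returns true, so a very long run no longer gets cut off at an arbitrary point.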

haha I get that :wink:

Now, what do I do? I mean, it was already running for 10 days and only reached approx. 50%.

As it has already loaded half of my files, and the process keeps going in the background, should I stop SongKong and restart the task? (But then it will AGAIN scan the first half, and I'm losing 10 days.)

Also, do you have a clue why this duplicates task takes so long?

I would restart, because I'm not sure it is going to work properly now that this has been triggered; it should be much quicker since the duplicates already found have been deleted and the empty folders removed.

I already raised this issue https://jthink.atlassian.net/browse/SONGKONG-2376 but have not had a chance to look at it yet.

10-day limit fixed in SongKong 8.9 Bleach, released 24th January 2023

Hey @paultaylor,

Since the last update, the duplicates task has got slower than ever.

Look:

Is there anything we can do together to see why it's so slow and try to figure out how to improve this?

Two things:

It shouldn't be slower. I think it has actually done half, based on the Songs loaded value. I don't think the Duplicates groups found and Duplicates songs deleted counts should be shown as found/total.

But it is slow because the song loading/matching was made single-threaded to fix another issue. I have now fixed this problem and it should be considerably quicker in the next release, which I hope to put out later this week.
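
To illustrate what multi-threaded loading looks like, here is a rough sketch using a pool sized to the available cores; ParallelFolderLoad and loadFiles are made-up names and this is not the actual SongKong code.

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;
    import java.util.stream.Stream;

    public class ParallelFolderLoad {
        public static void main(String[] args) throws IOException, InterruptedException {
            // One worker per core, so a 32-core machine gets 32 loading threads.
            ExecutorService es =
                    Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
            // Submit one task per top-level artist folder so the work spreads across cores.
            try (Stream<Path> folders = Files.list(Paths.get("/music_processed"))) {
                folders.filter(Files::isDirectory)
                       .forEach(folder -> es.submit(() -> loadFiles(folder)));
            }
            es.shutdown();
            es.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
        }

        // Placeholder for the real per-folder loading/matching work.
        static void loadFiles(Path folder) {
            System.out.println("Loaded " + folder);
        }
    }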


You only get the second value when using the web UI; it should not be there. This is what you see if using the desktop version:

(screenshot of the desktop version's progress display)

There is no actual indication of how many songs have been processed (although it should be about the same as the Songs loaded count), so perhaps I should add an extra count.
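
If such a count were added, a simple thread-safe counter would do; this is only a sketch with made-up names (ProcessedCount, onSongProcessed), not how SongKong actually tracks progress.

    import java.util.concurrent.atomic.AtomicInteger;

    public class ProcessedCount {
        // Shared counter, safe to bump from many worker threads at once.
        private static final AtomicInteger processed = new AtomicInteger();

        // Each worker would call this after handling one song.
        static void onSongProcessed() {
            processed.incrementAndGet();
        }

        public static void main(String[] args) {
            onSongProcessed();
            System.out.println("Songs processed: " + processed.get());
        }
    }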

Yes please! I have 32 cores waiting to be used. I'm here to beta test, shoot me a Docker container ^^

hehe

Just to get an idea of the speed, here is where I am today:

So it has only loaded 17k files in approx. 24 hours.

This means I would need about 117 days to process 2 million files.
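
A quick back-of-the-envelope check of that extrapolation, using the figures quoted above:

    public class Extrapolate {
        public static void main(String[] args) {
            double filesPerDay = 17_000;     // ~17k files loaded in ~24 hours
            double totalFiles = 2_000_000;   // rough size of the full library
            // 2,000,000 / 17,000 ≈ 117.6 days at the observed rate
            System.out.printf("~%.1f days%n", totalFiles / filesPerDay);
        }
    }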

I have just uploaded a new Docker image for SongKong 8.10 Frank, so you can try it now.


Now we're talking:

As you can see, in a few hours it has already re-checked 400K+ files:

This is way better!
