SongKong Jaikoz

SongKong and Jaikoz Music Tagger Community Forum

Estimate remaining time

Hi @paultaylor,

A quick request to chatGPT (I pasted the running task output, including start time of the task, number of completed files, and total amount of files) + I provided it the time of the day.

here is its answer :

From the provided information, the task began at 8:39:52 AM, and it is currently 12:20 PM. This is an elapsed time of 3 hours and 40 minutes, or 220 minutes.

So far, 17442 items out of 3345995 have been completed.

If 220 minutes were needed to process 17442 items, then the time needed to process a single item would be 220/17442 = 0.0126 minutes.

Therefore, the time needed to process the remaining items (3345995 - 17442 = 3328553 items) would be 0.0126 * 3328553 = 41959.9 minutes, which is approximately 699.3 hours or about 29.1 days.

This is a basic estimate assuming the processing rate is constant, which might not be the case in reality. Moreover, this estimate does not account for potential interruptions or changes in processing speed that could occur.
Percentage of songs completed: 0.52%
Estimated time remaining for task completion: 29.3 days

Don’t you think tis formula could be used to display the estimated ETA for running tasks inside SK Webui ?

I suppose it could do, but processing rate is not constant so it will only be an estimate.

of course the rate is not constant, but this an be recalculated on the fly. and when you run a task, it’s always nice to have an idea of the ETA. of course these are just my two cents. But Look, I did check this, as I was currious about how many days my own process would eventually take. :wink:

Its a fair point, I have raised it as a low priority issue - https://jthink.atlassian.net/browse/SONGKONG-2443

hm, I have no clue what happened overnight, but my CPU is not being used at all anymore, and my hard drives, which where all spinning and heavily scanning / reading files, did spun down :

Still, songkong is doing something, but it’s like if it slowed down for fome reason !

I still can see stuff being written in the logs, but suuuuper slowly compared to the unreadable tail -f I was running yesterday.

Here is the remaining time calculation :

  • Started at Jun 20, 2023, 8:39:52 AM
  • Current time is 5:34 AM the next day
  • Total number of songs: 3345995
  • Number of completed songs: 47180

Let’s calculate:

  • The task has been running for 20 hours and 54 minutes, or 1254 minutes.
  • Time per completed song = 1254 minutes / 47180 completed songs = 0.0266 minutes per song
  • Remaining songs = 3345995 total songs - 47180 completed songs = 3298815 songs
  • Estimated remaining time = 0.0266 minutes per song * 3298815 remaining songs = 87709.8 minutes, or approximately 1461.8 hours or 60.9 days

Can you explain this behaviour ? Why would songkong suddenly stop using all my cores and read files from my hard drives ? any clue ?

here is what seems to be the issue as the logs are now repeatingly telling me the following :

21/06/2023 05.45.33:CEST:JthinkSearchServer:doQuery:SEVERE: Read timed out
java.net.SocketTimeoutException: Read timed out

21/06/2023 05.45.33:CEST:JthinkSearchServer:processException:SEVERE: MiniThread:com.jthink.songkong.fileloader.worker.LoadFolderWorker:3 retrying:8
21/06/2023 05.45.33:CEST:JthinkSearchServer:precheckRetry:SEVERE: Retrying Query:Wait 8: ( (artist:(02 AND 2020~0.7) ) AND  (release:(monster~0.7 AND planet~0.7 AND mp6fns3~0.7 AND 2015~0.7 AND web AND flac~0.7) )) AND tracks:10 AND src:2

rainzSongGroupMatcher1:G134581:Machine Girl - U-Void Synthesizer (2020) [FLAC]: task because taking too long
21/06/2023 05.56.36:CEST:Errors:addWarning:SEVERE: Adding Warning:Cancelled class com.jthink.songkong.analyse.analyser.task.MusicBrainzSongGroupMatcher1:G134581:Machine Girl - U-Void Synthesizer (2020) [FLAC]: because taken too long

just in case, my server has no internet connection issues :

Can you send another screenshot so I can see how much changed in last few hours.

Sure :

Okay can you email support@jthink.net songkong-debug0-0.log.

Every 10 minutes we log the status of all threads so should show why they are inactive.

It’s confusing because even if problem on server it looks like about 170,000 song have been matched so I would expect a similar number for songs saved, but it’s only 47,000

done. I hope your mailserver accepts 20mb file sizes as attachments.

Okay so found the server ran out of memory, this last happened 3 years ago so a rare occurence but something I need to look into. Then looking at your logs it seems all threads were waiting on retrying search, so removing any spare threads for other tasks such as file saving. All tasks (musicbrainz search, discogs search, save ectera) are added to the same queue, and this queue is serviced by a number of threads based on how many cpus your server has, this ensure that we don’t have too many threads running in total, but it does seem to have this issue.

Now, good news is server restarted and up and running. I would hope your in invocation of SongKong would recover and start processing but if it does not then you’ll have to restart, remembering that songs already matched and saved to Musicbrainz album or Discogs album will not have to be reprocessed

Should I restart ?

by the way, I still have this laying on my disk :

Can I assume I cat safely get rid of the Prefs_backup foler and that songkong now uses Prefs ?

I would restart,

SongKong has never used a Prefs_backup folder, that must be something you created.

OK sent the report files, so you can have a look at them as well. :wink:

1 Like

Okay actually reviewing it I remembered something. There are two separate queues:

A limited size queue for most tasks that controls matching and once this queue is full no more songs are loaded, this has threads to match number of cpus, it also has a timeout that is triggered for any job that is taking too long to complete.

An unbounded queue for song saving. It has no timeout for jobs and is unbounded this is to ensure that if task is cancelled we cant interrupt any jobs as they are writing data to file, potentially corrupting the file. Song saving task does use cpu but the majority of the work done is I/O, hence we only provide two threads so that we dont get too much data contention writing multiple files to the disk .

Once I had fixed the server, song saving went up but was still way behind song matching. So I dont know if this was because of issue with server or because there is a mismatch between the many threads used for song matching and only two threads for song saving.

This is summary from monitor logs

1/06/2023 09.14.54:CEST:MonitorExecutors:outputQueueSize:SEVERE: QueueSize:Worker:100
21/06/2023 09.14.54:CEST:MonitorExecutors:outputQueueSize:SEVERE: QueueSize:SongSaver:21800
21/06/2023 09.14.54:CEST:MonitorExecutors:outputPipelines:SEVERE: DbConnectionsOpen:39
21/06/2023 09.14.54:CEST:MonitorExecutors:outputPipelines:SEVERE: SongPreMatcherMatcher:0:0:0:0
21/06/2023 09.14.54:CEST:MonitorExecutors:outputPipelines:SEVERE: NaimMatcher:0:0:0:0
21/06/2023 09.14.54:CEST:MonitorExecutors:outputPipelines:SEVERE: AcoustidSubmitter:111718
21/06/2023 09.14.54:CEST:MonitorExecutors:outputPipelines:SEVERE: StartMatcher:27356:27321:27321:260000
21/06/2023 09.14.54:CEST:MonitorExecutors:outputPipelines:SEVERE: MusicBrainzSongGroupMatcher1:27321:27321:27287:259804
21/06/2023 09.14.54:CEST:MonitorExecutors:outputPipelines:SEVERE: MusicBrainzUpdateSongOnly:0:0:0:0
21/06/2023 09.14.54:CEST:MonitorExecutors:outputPipelines:SEVERE: MusicBrainzMetadataMatches:0:0:0:0
21/06/2023 09.14.54:CEST:MonitorExecutors:outputPipelines:SEVERE: MusicBrainzSongGroupMatcher2(SubGroupAnalyser):630:630:630:10043
21/06/2023 09.14.54:CEST:MonitorExecutors:outputPipelines:SEVERE: MusicBrainzSongGroupMatcher3(WithoutDuplicates):182:182:182:1772
21/06/2023 09.14.54:CEST:MonitorExecutors:outputPipelines:SEVERE: MusicBrainzSongGroupMatcher4(AnalyserArtistFolder):0:0:0:0
21/06/2023 09.14.54:CEST:MonitorExecutors:outputPipelines:SEVERE: DiscogsMultiFolderSongGroupMatcher:14886:14821:14820:160768
21/06/2023 09.14.54:CEST:MonitorExecutors:outputPipelines:SEVERE: DiscogsSongGroupMatcher:14886:14821:14820:160768
21/06/2023 09.14.54:CEST:MonitorExecutors:outputPipelines:SEVERE: MusicBrainzSongMatcher:0:0:0:0
21/06/2023 09.14.54:CEST:MonitorExecutors:outputPipelines:SEVERE: DiscogsSongMatcher:275:275:275:275
21/06/2023 09.14.54:CEST:MonitorExecutors:outputPipelines:SEVERE: DiscogsUpdateSongGroup:2631:2631:2631:21343
21/06/2023 09.14.54:CEST:MonitorExecutors:outputPipelines:SEVERE: SongSaver:27784:5984:4410:257829

So we see at that point in time worker queue size is at maximum of 100 tasks , songsaver is 21,800 tasks

Then we list the different tasks

MusicBrainzSongGroupMatcher1:27321:27321:27287:259804

Means MusicBrainzSongGroupMatcher1 task queued 27,321 jobs, 27,321 were started, 27,287 completed (but this can include timeouts) and 259,804 music files were loaded into the task for processing, each job usually groups multiple related songs e.g an album

SongSaver:27784:5984:4410:257829

Whereas here, only 5984 of the 27784 have even started, but you said the disks were not busy so confusing.

Unfortunately, the part of monitoring that shows trace of each thread only shows the Worker threads not the SongSaver threads so I cannot see what the SongSaver threads were doing when this snapshot was taken. But I have fixed this for next version due next week.

Every time I start songkong we end up witn a new release the next week man… ^^

Don’t know if you restarted Fix Songs or waiting on this, but although not mentioned in release notes the SongKong 9.2 Brutalism released 26th of June 2023 release has the extra debugging

Hey @paultaylor, it’s much appreciated. I am running a fix task though. I think I’ll keep it running for now :wink:

thanks for letting me know ! :slight_smile:

Great, could you just post screenshot of progress so far.

Sure,

So far, 688,058 songs have been processed out of a total of 3,339,210. This means there are 2,651,152 songs remaining to be processed.

Assuming the process started on June 22, 2023, at 11:46:19 AM and approximately 6 days and 20 hours have elapsed since the beginning.

Total number of songs processed per hour: 688,058 / 164 ≈ 4,195 songs/hour

Estimated time remaining: 2,651,152 remaining songs / 4,195 songs/hour ≈ 632.04 hours

Now, to convert that into days and hours, we can divide 632.04 hours by 24 hours per day:

632.04 hours / 24 hours/day ≈ 26.34 days

So, approximately, the processing process may take around 26 days and 8 hours from June 27, 7:48 AM to complete.

What do you think about this 4195 song per hour number ? Is this in the average ?