I’ve also noticed that the database grows uncontrollably, reaching around 20% of the size of the disk it’s stored on. This happens even when just verifying my music collection without writing any metadata to the files.
Since SongKong’s accuracy is not reliable enough for me to risk modifying my music collection, I only use it for analysis. However, the excessive database growth is concerning. Is there a way to limit or manage this more effectively?
SongKong is designed so that it only stores the folders currently being processed in memory, and therefore it can be used on large libraries without needing much memory. But to facilitate this, the metadata of each song is loaded into the database as it is read, so it can use quite a lot of disk space, proportional to how many songs are loaded. Also, every time a change is actually made, that modification is stored in the database; this allows us to Undo Fixes at a later date, as explained in Reason 5 of this tutorial video.
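Roughly speaking, the idea is something like this (a simplified sketch in Python, not the actual code or schema): one record is kept per song as it is read, which is why disk usage grows with the number of songs, plus one logged entry per change so the change can be reversed later.

```python
# Simplified sketch (not SongKong's actual code or schema): one record per song,
# plus one logged entry per modification so that changes can be undone later.
from dataclasses import dataclass, field

@dataclass
class SongRecord:
    path: str
    metadata: dict                                  # every tag read from the file
    changes: list = field(default_factory=list)     # one entry per modification

def load_song(db, path, metadata):
    # Each song processed adds a record, so the database grows with library size
    db[path] = SongRecord(path, metadata)

def apply_change(db, path, name, new_value):
    record = db[path]
    # The old value is kept so the change can be reversed later (Undo Fixes)
    record.changes.append((name, record.metadata.get(name), new_value))
    record.metadata[name] = new_value

def undo_changes(db, path):
    record = db[path]
    while record.changes:
        name, old_value, _ = record.changes.pop()
        record.metadata[name] = old_value
```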
The good news is that after running tasks the database contents are only required if you need to Undo Fixes; otherwise we can empty the database using the Empty Database option.
The reports also use a lot of disk space, and these can be removed using the Delete Reports option.
I agree that maybe we could be a bit cleverer about this.
Thanks for your explanation regarding how SongKong stores metadata in the database and the options to empty the database or delete reports.
However, I don’t quite understand why, after storing all the information in the database, I still have to rescan my entire music library every time I want to perform a task, such as removing duplicate songs. Shouldn’t the software be able to use the stored metadata without needing to analyze all files again?
Yes, that is what it does; you don’t have to rescan your entire music library.
It uses the metadata within the database instead of having to go to the files directly (as long as the file has not been updated externally since it was added to SongKong)
But when we run a task we cannot assume the song has already been added to the database; some may have been and some may not. The process is folder based and works essentially the same way, except that instead of having to read the metadata from the file directly it can be read from the database record, which is quicker, but the rest of the process is the same.
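Conceptually it works something like this (a simplified sketch, not the actual implementation): each song is still visited, but its metadata is only re-read from the file if the file has changed since it was added to the database.

```python
# Simplified sketch (assumed logic, not SongKong's actual code): reuse the stored
# metadata unless the file has been modified since it was added to the database.
import os

def read_tags_from_file(path):
    # Stand-in for a real tag reader; in practice this is the slower step
    return {"path": path}

def get_metadata(db, path):
    record = db.get(path)
    if record is not None and record["mtime"] == os.path.getmtime(path):
        return record["metadata"]                  # fast path: reuse stored metadata
    metadata = read_tags_from_file(path)           # slow path: new or changed file
    db[path] = {"mtime": os.path.getmtime(path), "metadata": metadata}
    return metadata
```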
Thanks for the clarification. However, I had already obtained the AcoustID for all my songs, and these remained stored in the database. Despite this, when I ran the duplicate removal process, SongKong still rescanned my entire music library.
This process consumes a lot of CPU resources and takes a significant amount of time, to the point where it even prevents me from playing music at the same time. Given that the metadata was already available in the database, I would have expected the duplicate detection process to use the stored data rather than reprocessing all the files.
Is there a way to optimize this so that the process relies solely on the database and avoids rescanning the entire library?
Are you running in preview mode? If you are, then the AcoustIDs cannot be added to the songs and therefore they are calculated each time; AcoustID generation is CPU intensive.
Usually people run Fix Songs first, and AcoustIDs are added then, but again they can only be saved to the file if not in preview mode.
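The effect of preview mode on fingerprints is roughly this (an illustrative sketch, not the actual code; calculate_fingerprint here just stands in for the real Chromaprint/AcoustID fingerprinting, which is the expensive part):

```python
# Illustrative sketch (an assumption, not SongKong's code): in preview mode nothing
# is written back to the file, so on the next run the fingerprint is still missing
# and the CPU-intensive calculation has to be repeated.

def calculate_fingerprint(path):
    # Stand-in for real fingerprinting, which decodes the audio and is expensive
    return "fingerprint-of-" + path

def ensure_acoustid(song, preview_only):
    if song.get("ACOUSTID_FINGERPRINT"):
        return song["ACOUSTID_FINGERPRINT"]          # already stored, nothing to do
    fingerprint = calculate_fingerprint(song["path"])
    if not preview_only:
        song["ACOUSTID_FINGERPRINT"] = fingerprint   # saved, so the next run is cheap
    # in preview mode the result is discarded, so the next run recalculates it
    return fingerprint
```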
I understand your point, but I cannot save metadata directly into my files until I’m absolutely sure that the information is accurate and aligns with my needs. If I had updated the files prematurely, I could have irreversibly altered my entire music collection.
Additionally, since the AcoustID is stored in the database, I don’t understand why it isn’t used for other tasks to avoid the high processing time and CPU usage associated with recalculating it. If the data is already available in the database, shouldn’t SongKong be able to leverage it instead of reprocessing the files?
Could you clarify why this happens and if there is a way to work around it?
It’s not irreversible actually; that is the point of the Undo Changes task.
But it isn’t stored there; the database keeps a record of what metadata is within each file, and because you have run in preview mode the AcoustID has not been added to the file and therefore it is not in the database.
One way round it would be the following:
Start Fix Songs
Disable Preview Only on the Basic tab
Enable Force Acoustic fingerprints even if already matched on the Match tab
On the Artwork tab disable Find Front Cover Artwork
On the Format tab add all fields to Never Modify, or add these fields
and then Fix Songs will only add AcoustIDs, a SongKong ID and some MusicBrainz IDs; this will not affect the metadata you have previously added.
Delete Duplicates can then make use of the AcoustID fingerprints.
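Conceptually the Never Modify list works like a filter over the proposed changes (a sketch only, not the actual code), which is why your existing metadata is left alone:

```python
# Sketch (an assumption, not SongKong's code): fields on the Never Modify list are
# filtered out of the proposed changes, so only the identifier fields get written.

def apply_fix(proposed_changes, never_modify):
    # proposed_changes: {field name: new value} suggested by Fix Songs
    return {f: v for f, v in proposed_changes.items() if f not in never_modify}

# With the existing fields on the Never Modify list, titles and artists are left
# untouched and only the new IDs are added.
changes = {"TITLE": "New Title", "ACOUSTID_ID": "abc-123", "MUSICBRAINZ_TRACKID": "xyz"}
print(apply_fix(changes, never_modify={"TITLE", "ARTIST", "ALBUM"}))
# -> {'ACOUSTID_ID': 'abc-123', 'MUSICBRAINZ_TRACKID': 'xyz'}
```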
I had ensured the AcoustIDs were added to the files (using Mp3tag), but SongKong rescanned the entire music library again for AcoustID calculation before the duplicate removal.
Okay, assuming that Mp3Tag adds the AcoustID to the correct field for that music format (and it is an assumption), then SongKong should use the existing AcoustID to find duplicate files (assuming you have Song is a Duplicate if has same set to an option that uses AcoustID).
But even so, we can’t assume all the songs being checked this time are in the database. We still have to check each song to see if it is a duplicate of another song; when we have a duplicate we have to work out which file to delete and which to keep, and we still have to create a report. So when you say it is rescanning the entire library, I think you just mean it is running the task; it cannot give you results instantaneously.
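To illustrate, once fingerprints exist the matching itself is cheap, but every song still has to be visited, duplicates grouped, a keeper chosen and a report built (a rough sketch, not the actual algorithm; the "largest bitrate wins" rule is just an example):

```python
# Rough sketch (assumed logic, not SongKong's actual algorithm): even with existing
# AcoustIDs, every song is still visited once to group duplicates, pick which file
# to keep and build the report.
from collections import defaultdict

def find_duplicates(songs):
    groups = defaultdict(list)
    for song in songs:                        # every song is still checked once
        key = song.get("ACOUSTID_ID")         # reuse the existing AcoustID, no recalculation
        if key:
            groups[key].append(song)
    report = []
    for group in groups.values():
        if len(group) > 1:
            # example rule: keep the highest-bitrate file, mark the rest for deletion
            keeper = max(group, key=lambda s: s.get("bitrate", 0))
            deletions = [s for s in group if s is not keeper]
            report.append((keeper, deletions))
    return report
```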