Hi @paultaylor,
So, on your recommendation, I’ve written a script that does the following:
- checks my base dir for folders larger than 550GB; if there are any, it splits them into folders of 500GB each while keeping their subfolders intact (so all the album folders stay whole)
- scans my main folder, which now contains only monthly folders of about 500GB, and lists all the monthly folders it finds
- starts a fix task, followed by a delete duplicates task and a rename task, for each monthly folder
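To illustrate the splitting step, here is a minimal Python sketch of how album folders could be grouped into chunks of at most 500GB without ever splitting an album. It is not the actual script: the function name `plan_chunks` and the input dictionary of album sizes are assumptions made for the example, and the sizes would have to be measured separately.

```python
from typing import Dict, List

GB = 1024 ** 3  # bytes per gigabyte (binary)


def plan_chunks(album_sizes: Dict[str, int], limit: int = 500 * GB) -> List[List[str]]:
    """Group album folders into chunks whose total size stays at or under `limit`.

    `album_sizes` maps an album folder name to its size in bytes (hypothetical input).
    Albums are never split, so an album larger than `limit` gets a chunk of its own.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for name in sorted(album_sizes):
        size = album_sizes[name]
        # Close the current chunk if adding this album would exceed the limit.
        if current and current_size + size > limit:
            chunks.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        chunks.append(current)
    return chunks
```

Each returned list would then become one ~500GB folder, with the album folders moved into it as whole units.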
The script records each fully processed folder in a log file, so that folders already listed in the songkong_log file are not processed again. This is pretty convenient for me, as I can simply restart from where I left off should anything bad happen to my server (e.g. a crash).
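The resume logic can be sketched roughly as follows. This is only an illustration under assumptions: the helper names (`load_done`, `mark_done`, `pending`) are hypothetical, and the log is assumed to hold one folder name per line.

```python
from pathlib import Path
from typing import Iterable, List, Set


def load_done(log_path: Path) -> Set[str]:
    """Return the folder names already recorded in the log (one per line)."""
    if not log_path.exists():
        return set()
    lines = log_path.read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}


def mark_done(log_path: Path, folder: str) -> None:
    """Append a folder name to the log once all of its tasks have finished."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(folder + "\n")


def pending(folders: Iterable[str], log_path: Path) -> List[str]:
    """Return the folders that are not yet listed in the log, in their original order."""
    done = load_done(log_path)
    return [f for f in folders if f not in done]
```

Calling `mark_done` only after the last task of a folder has finished means that a crash mid-folder causes that folder to be processed again on restart, rather than skipped.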
The script will also, if set to True, send regular notifications about the progress of a task:
and another one at the end of a task; here is an example of a Rename task that ended:
Using this “small folder” approach, I was able to drastically reduce the impact on my server resources, as it only loads a very small part of my music files at a time (as you recommended to me several times, thanks for this).
My docker container, which the script launches separately for each task, now barely uses 3GB of RAM, and the impact on the CPU is negligible:
My main concern is that it’s still running too slowly, in my humble opinion. I am seriously thinking about running several instances of SongKong in parallel, in order to process not a single monthly dir but 3 or 4 at once.
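As a rough idea of what that could look like, here is a minimal Python sketch that processes several monthly folders at once with a thread pool. Both `process_in_parallel` and the `run_tasks` callable are hypothetical: `run_tasks` stands in for whatever starts one container and runs the fix, delete duplicates and rename tasks for a single folder. Whether several SongKong instances can safely run side by side is exactly what I am not sure about, so this is untested against SongKong itself.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def process_in_parallel(
    folders: Iterable[str],
    run_tasks: Callable[[str], bool],
    max_workers: int = 3,
) -> Dict[str, bool]:
    """Run `run_tasks` on up to `max_workers` folders at the same time.

    `run_tasks` is a hypothetical callable that processes one monthly folder
    and returns True on success. The result maps each folder to that outcome.
    """
    folder_list = list(folders)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps results in the same order as the input folders.
        results = list(pool.map(run_tasks, folder_list))
    return dict(zip(folder_list, results))
```

Threads are enough here because the heavy work would happen in the external containers, not in the Python process itself.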
This brings me to my question for you: there is an option in SongKong to get rid of the database and the reports. I know there is no way to roll back once the database is wiped, but this is a risk I am willing to take. Apart from losing rollback, what is the impact on subsequent runs if the reports and database are wiped before each start of SongKong? As far as I understand, SongKong reads each file’s tags and writes a fingerprint ID for each of them. So, once the file itself is fingerprinted, matched to MusicBrainz, and a MusicBrainz or Discogs ID is present in the tags, what is the added value of keeping the database and reports?