I use beets in my setup and I am pleased with the results: it keeps my library nice and clean. It is very capable by default, and you can extend it even further with plugins. It does fine importing well-sorted albums, and if you have a mess there’s also tagging by filename and acoustic fingerprinting. You can use multiple metadata providers and adjust their weights to set your preference. It’s well documented and multiplatform (it can also be deployed as a container on a NAS and manage your imports there). The biggest drawback is that you have to read a few pages of the docs, or do some dry runs, before running it for real.
Regarding different outputs for the same seed: have you changed the seed source to CPU in A1111? The noise you get that way is consistent across different hardware vendors, and it differs from GPU-sourced noise.
I just used A1111 to check, and that way I got results very similar to ComfyUI's.
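To illustrate why the seed source matters, here is a minimal PyTorch sketch. It is not the actual A1111 or ComfyUI code: the helper name `make_latent_noise` and the latent shape `(1, 4, 64, 64)` are assumptions made up for the example. The idea is only that the starting noise comes from a seeded generator tied to a device, so the same seed on a CPU generator reproduces the same tensor on any machine, while a GPU generator uses a different algorithm and is not expected to match it.

```python
import torch


def make_latent_noise(seed: int, device: str = "cpu",
                      shape=(1, 4, 64, 64)) -> torch.Tensor:
    """Draw Gaussian noise from a seeded generator living on `device`.

    Hypothetical helper for illustration; the shape is an assumed
    latent size, not something taken from either UI.
    """
    generator = torch.Generator(device=device).manual_seed(seed)
    return torch.randn(shape, generator=generator, device=device)


# Same seed, CPU generator both times: the noise is bit-for-bit identical.
cpu_a = make_latent_noise(1234, "cpu")
cpu_b = make_latent_noise(1234, "cpu")
print("CPU vs CPU identical:", torch.equal(cpu_a, cpu_b))

# Same seed on a GPU generator, compared against the CPU noise.
# Only runs when a CUDA device is present.
if torch.cuda.is_available():
    gpu = make_latent_noise(1234, "cuda")
    print("CPU vs GPU identical:", torch.equal(cpu_a, gpu.cpu()))
```

If both UIs start from CPU-sourced noise with the same seed, any remaining differences in the image should come from the sampler, scheduler, or other settings rather than from the initial noise.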