I usually use it as a starting point and then go into the details page and listen to the reference calls to see if it's the right bird. If I can see the bird, it's pretty easy to tell if it's right or not.
That said, I've only had Merlin misidentify a bird once. However, there are a lot of times where it fails to identify a bird at all, even if the recording is crystal clear.
Like another commenter said, Mopidy can do it all in one instance. It works, but I personally find its integration with MPD clients a bit clunky, so I don't use it all that much.
Personally, I use Snapcast as an endpoint, plain MPD for local files, and Navidrome for remote access to my library.
Snapcast supports Spotify endpoints, so I just switch to my Spotify stream when I want to listen to Spotify and to my MPD stream when I want to listen to local stuff.
This is more of an ecosystem than a single solution though, so it may not be what you're looking for.
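
If you'd rather script the stream switch than do it from a client app, here's a rough sketch using Snapcast's JSON-RPC control interface. It assumes snapserver's TCP control port is the default 1705 and that your streams are named "Spotify" and "MPD" in your server config. Those names and the group ID are placeholders, so check your own setup with `Server.GetStatus` first.

```python
import json
import socket


def build_set_stream_request(group_id: str, stream_id: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request that points a Snapcast group at a stream."""
    return {
        "id": request_id,
        "jsonrpc": "2.0",
        "method": "Group.SetStream",
        "params": {"id": group_id, "stream_id": stream_id},
    }


def send_request(request: dict, host: str = "localhost", port: int = 1705,
                 timeout: float = 5.0) -> dict:
    """Send one request over the TCP control port and return the matching reply.

    Messages are newline-delimited JSON. The server may also push
    notifications on the same connection, so skip anything whose id
    does not match the request.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall((json.dumps(request) + "\r\n").encode("utf-8"))
        with sock.makefile("r", encoding="utf-8") as reader:
            for line in reader:
                message = json.loads(line)
                if message.get("id") == request["id"]:
                    return message
    raise ConnectionError("connection closed before a reply arrived")


if __name__ == "__main__":
    # Placeholder group ID: look up the real one via Server.GetStatus.
    group_id = "your-group-id"
    # "Spotify" and "MPD" are assumed stream names, not Snapcast defaults.
    print(send_request(build_set_stream_request(group_id, "Spotify")))
```

Swap `"Spotify"` for `"MPD"` to go back to local files. This only changes what the group plays; the Spotify and MPD sources still have to be set up as streams on the server side.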