These things are listening devices. On their own they can only detect the keyword. Once that's detected, they transmit all the digitised audio that follows to the provider's central servers to be decoded. The servers then do two things: they try a web search on what they think the question is, and they use the data to target advertising and build up the service provider's profile of the user.
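To make the keyword gating concrete, here is a rough Python sketch of the idea, not any vendor's actual firmware. The function name, the wake word "computer", and the "<silence>" end marker are all made up for illustration, and audio frames are stood in for by plain strings. How a real device decides the request has finished is an assumption here.

```python
from typing import Iterable, List

WAKE_WORD = "computer"      # hypothetical keyword, not a real product's
END_MARKER = "<silence>"    # assumed signal that the request has ended


def frames_to_upload(frames: Iterable[str],
                     wake_word: str = WAKE_WORD,
                     end_marker: str = END_MARKER) -> List[str]:
    """Return only the frames that would leave the device."""
    uploaded: List[str] = []
    streaming = False
    for frame in frames:
        if not streaming:
            # Local step: the device only checks for the keyword.
            if frame == wake_word:
                streaming = True
        elif frame == end_marker:
            # Assumed end of request: stop sending until the next keyword.
            streaming = False
        else:
            # Everything heard after the keyword goes to the servers.
            uploaded.append(frame)
    return uploaded


heard = ["private", "chat", "computer", "play", "radio", "<silence>", "more", "chat"]
print(frames_to_upload(heard))
```

In this toy version nothing said before the keyword is sent, and everything between the keyword and the end marker is.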
If you can find the internet radio station by typing a request into a search engine, then these things should be able to do the same. I think calling them smart speakers is a misnomer. There's very little inside them, which is why they are cheap. Almost all the processing of what they hear is done overseas on powerful servers.
If you are concerned about privacy, then it is worth reading the terms and conditions carefully. The providers of these things make their money from the data they gather from listening to the questions and conversations around them, similar to the way search engines and internet sales outlets make money. Personal data is a valuable commodity. Targeted advertising earns more revenue than broad-brush advertising, like that on a dumb TV.