• @Solemn@lemmy.dbzer0.com
    English
    11 · 11 months ago

    “Hey Alexa” or “Hey Google” works by, like you said, constantly analysing the sounds the device hears. However, the audio is only analysed locally for the specific wake phrase, and it is held in a circular buffer of a few seconds so the device can keep your whole request in memory. If the phrase is not detected, the buffer is continuously overwritten and nothing is sent to the server. If the phrase is detected, the whole request is sent to the server, where more advanced voice recognition can be done.
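
    A rough sketch of that loop, to make the buffer behaviour concrete. This is a toy model, not how any real speaker is implemented: text strings stand in for audio frames, and `WAKE_WORD`, `BUFFER_FRAMES`, `SILENCE` and `process` are names made up for the illustration.

```python
from collections import deque

WAKE_WORD = "hey alexa"  # assumed trigger phrase
BUFFER_FRAMES = 4        # assumed ring size; stands in for "a few seconds" of audio
SILENCE = ""             # assumed marker for the end of a spoken request


def process(frames):
    """Return the requests that would be sent to the server."""
    ring = deque(maxlen=BUFFER_FRAMES)  # the oldest frame is overwritten automatically
    uploads = []
    request = None  # None while idle, a list of frames while recording a request

    for frame in frames:
        if request is not None:
            # Wake phrase already heard: keep recording until the speaker stops.
            if frame == SILENCE:
                uploads.append(" ".join(request))
                request = None
                ring.clear()
            else:
                request.append(frame)
            continue

        # Idle: only the local ring buffer sees the frame.
        ring.append(frame)
        if WAKE_WORD in " ".join(ring).lower():
            # Start the request from what is already buffered,
            # so the beginning of the request is not lost.
            request = list(ring)

    return uploads


chatter = ["we", "need", "more", "coffee", "beans"]
command = ["hey", "alexa", "what", "time", "is", "it", SILENCE]
print(process(chatter))            # nothing leaves the device
print(process(chatter + command))  # one request is uploaded
```

    Note that in this sketch whatever is still in the ring when the phrase is detected goes along with the request. Whether real devices trim the audio to start at the wake phrase is not something the sketch claims to know.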

    You can very easily monitor the traffic from your smart speaker to see if this is true. So far I’ve seen no evidence that this is no longer the common practice, though I’ll admit to not reading the article, so maybe this has changed recently.

    • @uzay@infosec.pub
      2 · 11 months ago

      If they also listened for a set of predefined product-related keywords, they could take note of any matches and inconspicuously send just that information to their servers, without sending any audio recordings. It doesn’t have to be as precise as voice command recognition either; it’s just ad targeting.

      Not saying they do that, but I believe they could.
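
      To show how little data that would take, here is a purely hypothetical sketch. Nothing here is taken from a real device: `AD_KEYWORDS`, `tally_keywords` and `build_payload` are invented for the example, and words stand in for whatever an on-device keyword spotter would output.

```python
import json
from collections import Counter

AD_KEYWORDS = {"coffee", "sneakers", "vacation"}  # hypothetical keyword list


def tally_keywords(words, keywords=AD_KEYWORDS):
    """Count keyword hits locally; the words themselves are not kept."""
    counts = Counter()
    for word in words:
        cleaned = word.lower().strip(".,!?")
        if cleaned in keywords:
            counts[cleaned] += 1
    return dict(counts)


def build_payload(counts):
    """Serialise only the counts; no audio and no transcript."""
    return json.dumps(counts, sort_keys=True)


heard = ["We", "need", "more", "coffee", "and", "Coffee!", "sneakers"]
payload = build_payload(tally_keywords(heard))
print(payload, len(payload), "bytes")
```

      A payload that small would be hard to tell apart from ordinary telemetry by traffic volume alone, which is the point of the comment above.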