Finding a way to prevent machine learning audio models from being fooled – ScienceDaily


Warnings have emerged about the unreliability of the metrics used to judge whether an audio perturbation designed to deceive AI models can be perceived by humans. Researchers from the UPV/EHU-University of the Basque Country show that the distortion metrics used to detect intentional perturbations in audio signals are not a reliable measure of human perception, and have proposed a series of improvements. These perturbations, designed to be imperceptible, can be used to cause erroneous predictions in artificial intelligence. Distortion metrics are applied to assess how effective the methods that generate such attacks are.

Artificial intelligence (AI) is increasingly based on machine learning models, trained using large data sets. Likewise, human-machine interaction increasingly relies on voice communication, mainly due to the remarkable performance of machine learning models in speech recognition tasks.

However, these models can be fooled by “adversarial” examples, in other words, inputs intentionally perturbed to produce a wrong prediction without the changes being noticed by humans. “Suppose we have a model that classifies audio (e.g. voice command recognition) and we want to trick it, in other words, generate a perturbation that maliciously prevents the model from functioning properly. If a signal is heard properly, a person is able to notice whether it says ‘yes’, for example. When we add an adversarial perturbation, we will still hear ‘yes’, but the model will start to hear ‘no’, or ‘turn right’ instead of left, or any other command that we don’t want to execute,” explained Jon Vadillo, researcher in the Computer Science and Artificial Intelligence Department at UPV/EHU.

This could have “very serious implications for the application of these technologies to real or very sensitive problems,” Vadillo added. It remains unclear why this happens. Why would a model that behaves so intelligently suddenly stop functioning properly when it receives even slightly modified signals?
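As a purely illustrative sketch of the phenomenon Vadillo describes (not the models or attacks the researchers studied), the following Python snippet builds a hypothetical toy linear classifier and shows how shifting every sample by a small amount can flip its prediction from “yes” to “no”. The classifier, its weights, the test tone and the `epsilon` value are all assumptions chosen for the demonstration.

```python
import math

def predict(weights, signal):
    """Hypothetical toy classifier: 'yes' if the weighted sum is positive, else 'no'."""
    score = sum(w * x for w, x in zip(weights, signal))
    return "yes" if score > 0 else "no"

def perturb(weights, signal, epsilon):
    """Move each sample by epsilon in the direction that lowers the score."""
    return [x - epsilon * math.copysign(1.0, w) for w, x in zip(weights, signal)]

n = 1000
# Assumed weights: alternating +1 / -1 (a stand-in for a trained model).
weights = [1.0 if i % 2 == 0 else -1.0 for i in range(n)]
# Assumed clean input: a tone of amplitude 0.5 plus a faint pattern the model keys on.
clean = [0.5 * math.sin(2 * math.pi * i / 50) + 0.01 * weights[i] for i in range(n)]

adversarial = perturb(weights, clean, epsilon=0.02)
largest_change = max(abs(a - c) for a, c in zip(adversarial, clean))

print(predict(weights, clean))        # yes
print(predict(weights, adversarial))  # no
print(largest_change)                 # about 0.02, against a tone amplitude of 0.5
```

Each sample moves by roughly 0.02 while the tone has an amplitude of 0.5, yet the prediction flips. Whether a listener would notice such a change is a separate question, and the snippet makes no claim about it.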

Tricking the model with an undetectable perturbation

“It is important to know if a model or a program has vulnerabilities,” added the researcher from the Faculty of Computer Science. “First of all, we are investigating these vulnerabilities, to verify that they exist, and because this is the first step in possibly fixing them.” While much research has focused on developing new techniques for generating adversarial perturbations, less attention has been paid to the factors that determine whether these perturbations can be perceived by humans and what those factors are. This question is important because the proposed adversarial perturbation strategies pose a threat only if the perturbations cannot be detected by humans.

This study examined how reliably the distortion metrics proposed in the literature for adversarial audio examples measure human perception of perturbations. In an experiment in which 36 people evaluated adversarial examples, or audio perturbations, based on various factors, the researchers showed that “the metrics conventionally used in the literature are not completely robust or reliable. In other words, they do not adequately represent the auditory perception of humans; they may tell you that a perturbation cannot be detected, but when we test it with humans, it turns out to be detectable. So we want to issue a warning that due to the unreliability of these metrics, the study of these audio attacks is not being done very well,” the researcher said.

In addition, the researchers proposed a more robust evaluation method, which is the result of “the analysis of certain properties or factors of the audio which are relevant to evaluate detectability, for example, the parts of the audio in which a perturbation is most detectable.” Even so, “this problem remains open because it is very difficult to propose a mathematical metric capable of modeling auditory perception. Depending on the type of audio signal, different metrics will likely be needed or different factors will need to be considered. Achieving representative general audio metrics is a complex task,” concluded Vadillo.
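To make the concern concrete, here is a minimal Python sketch of two whole-clip distortion measures: signal-to-noise ratio and maximum sample change. The article does not say which metrics the study examined, so these two are assumptions, as are the synthetic clip and the perturbation. The same perturbation is placed once in a silent stretch and once over a loud tone.

```python
import math

def max_change(clean, perturbed):
    """Largest absolute difference between any pair of samples."""
    return max(abs(p - c) for c, p in zip(clean, perturbed))

def snr_db(clean, perturbed):
    """Signal-to-noise ratio in decibels, computed over the whole clip."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((p - c) ** 2 for c, p in zip(clean, perturbed))
    return 10 * math.log10(signal_power / noise_power)

def add_burst(clean, start, length=100, amplitude=0.01):
    """Add an alternating-sign burst of the given amplitude starting at `start`."""
    out = list(clean)
    for i in range(start, start + length):
        out[i] += amplitude if i % 2 == 0 else -amplitude
    return out

# Assumed clip: 500 samples of silence followed by 500 samples of a loud tone.
clean = [0.0] * 500 + [0.5 * math.sin(2 * math.pi * i / 50) for i in range(500)]

in_silence = add_burst(clean, start=0)   # perturbation sits in the silent part
in_tone = add_burst(clean, start=500)    # same perturbation sits over the tone

for name, clip in [("silence", in_silence), ("tone", in_tone)]:
    print(name, round(snr_db(clean, clip), 2), round(max_change(clean, clip), 4))
```

Both lines print the same two values, because each measure summarizes the whole clip in a single number. On their own, such measures cannot reflect which part of the audio a perturbation falls in, which is one of the factors the researchers mention as relevant to detectability.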

Source of the story:

Material provided by University of the Basque Country. Note: Content may be edited for style and length.

