A teenager walks through a line of family, friends, teachers and coaches on a soccer field. This police video is a world premiere: with the help of a deepfake video, the Rotterdam police in the Netherlands are looking for the murderer of Sedar Soares. What makes the film special is that the victim himself appears to be looking for the perpetrator.

Attention Deepfake: How artificial intelligence imitates voices in a deceptively real way

Voice cloning was only recently invented, but the technology is already showing impressive results.
It's not just about parodies of celebrities. Friendsurance examines whom deepfakes help, how they do so, and what dangers they pose. Deepfake software is now accessible to practically every user. A deepfake is the deceptively real imitation (faking) of video, audio or photo content. It is the result of manipulation by technology, namely artificial intelligence (AI). The underlying technique of generative adversarial networks (GANs) was introduced in 2014 by Ian Goodfellow, who later became director of machine learning in Apple's Special Projects Group. The term "deepfake" itself is a combination of the first part of "deep learning" and the word "fake".

Deepfakes are built on a generative adversarial algorithm: two neural networks, a generator and a discriminator, compete with each other. Like a human, the algorithm learns from its own mistakes, as if it were competing against itself. The system "scolds" the algorithm for errors and "rewards" it for correct output until it produces the most convincing fake possible.
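To make the adversarial principle more concrete, here is a minimal toy sketch in PyTorch. The network sizes, data and training settings are invented purely for illustration; they do not correspond to any real deepfake system.

```python
# Toy illustration of the adversarial "scold and reward" principle behind
# deepfakes: a generator produces fakes, a discriminator judges them, and
# each network's errors become the other's training signal.
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
discriminator = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))

loss_fn = nn.BCEWithLogitsLoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)

real_data = torch.randn(1000, 8) * 0.5 + 2.0  # stand-in for real samples

for step in range(500):
    real = real_data[torch.randint(0, 1000, (64,))]
    fake = generator(torch.randn(64, 16))

    # "Scold" the discriminator when it mislabels real vs. fake samples.
    d_loss = (loss_fn(discriminator(real), torch.ones(64, 1))
              + loss_fn(discriminator(fake.detach()), torch.zeros(64, 1)))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # "Reward" the generator when its fakes are judged to be real.
    g_loss = loss_fn(discriminator(fake), torch.ones(64, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```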

As AI technology advances, creating deepfakes is becoming ever easier. To create a voice clone, all you have to do is record your voice for a certain amount of time without dropouts or other disruptions. The file is then sent for processing to a company that offers such a service, or the speaker uploads it to a dedicated program themselves. Several startups already offer such services, including Resemble, Descript and CereVoice Me.

A few years ago, realistic voice deepfakes were created by recording a person's voice, breaking their speech down into individual sounds and then recombining those sounds into new words. Today, neural networks can be trained with speech data of almost any quality and volume. Thanks to the adversarial principle, a neural network learns what real human speech sounds like faster and more accurately. Where dozens or even hundreds of hours of audio used to be required, realistic voices can now be created from just a few minutes of material. Companies want to market the technology and are already offering it for a number of applications.
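As a rough illustration of how little material such tools need today, here is a sketch using the open-source Coqui TTS package and its XTTS model (our choice for illustration; commercial services like Resemble or Descript work differently under the hood). The file names are placeholders.

```python
# Sketch: cloning a voice from a short reference recording with Coqui TTS.
# "my_voice.wav" and "cloned_output.wav" are placeholder file names.
from TTS.api import TTS

# Load a multilingual voice-cloning model (downloads on first use).
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

# A clean reference recording of a few minutes (or even less) is enough.
tts.tts_to_file(
    text="This sentence was never actually spoken by me.",
    speaker_wav="my_voice.wav",
    language="en",
    file_path="cloned_output.wav",
)
```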

Deepfake commercials, films and dubbing

Veritone launched a service called marvel.ai to create and monetize voice fakes. The technology allows influencers, athletes and actors to license the use of their voices. This means that products such as commercials can be produced with their voices without the person ever having to set foot in a studio. The company embeds "watermarks" in the audio to ensure that a deepfake cannot simply be copied and used illegally. The voice deepfakes it creates can be adjusted in tone or gender and translated into other languages.
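Veritone does not disclose how its watermarks work; the following is merely a textbook-style sketch of the general idea behind audio watermarking: a barely audible noise pattern derived from a secret key is added to the signal and later detected by correlation.

```python
# Simple illustration of keyed audio watermarking (NOT Veritone's method):
# embed an inaudible pseudo-random pattern, detect it via correlation.
import numpy as np

def embed_watermark(audio: np.ndarray, key: int, strength: float = 0.002) -> np.ndarray:
    rng = np.random.default_rng(key)                  # secret key seeds the pattern
    pattern = rng.choice([-1.0, 1.0], size=audio.shape)
    return audio + strength * pattern                 # add barely audible pattern

def detect_watermark(audio: np.ndarray, key: int) -> bool:
    rng = np.random.default_rng(key)
    pattern = rng.choice([-1.0, 1.0], size=audio.shape)
    score = float(np.dot(audio, pattern)) / len(audio)
    return score > 0.001                              # correlation above threshold

if __name__ == "__main__":
    clip = np.random.randn(48000) * 0.1               # stand-in for one second of audio
    marked = embed_watermark(clip, key=1234)
    print(detect_watermark(marked, key=1234))         # True
    print(detect_watermark(clip, key=1234))           # False (in the vast majority of runs)
```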

Microsoft has been offering its partners a similar service since the beginning of 2021. The Microsoft Azure AI platform can synthesize celebrity voices so that they are indistinguishable from the originals. For example, the US telecommunications company AT&T greets visitors with the voice of Bugs Bunny in an experience store in Dallas. The character greets each guest by name and chats with them while they shop. A voice actor recorded 2,000 sentences for Microsoft so that Bugs Bunny's voice could be synthesized.
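Such celebrity-style voices rely on Azure's Custom Neural Voice programme, which requires Microsoft's approval. As a rough sketch of the general pattern, synthesizing speech with the Azure Speech SDK for Python looks roughly like this, here with a standard stock voice and placeholder credentials:

```python
# Rough sketch of speech synthesis with the Azure Speech SDK
# (pip install azure-cognitiveservices-speech). Key, region, voice name and
# output file are placeholders; custom celebrity voices need separate approval.
import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(subscription="YOUR_KEY", region="westeurope")
speech_config.speech_synthesis_voice_name = "en-US-JennyNeural"  # stock neural voice

audio_config = speechsdk.audio.AudioOutputConfig(filename="greeting.wav")
synthesizer = speechsdk.SpeechSynthesizer(
    speech_config=speech_config, audio_config=audio_config
)

result = synthesizer.speak_text_async("Welcome to the store, Anna!").get()
print(result.reason)  # e.g. SynthesizingAudioCompleted on success
```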

Podcasts and audio books

Descript's podcast editing software includes a feature called Overdub. It allows a podcaster to create an AI clone of their voice so that producers can edit episodes quickly: unnecessary words can not only be deleted but also replaced with new ones. To train Overdub, all you have to do is read a set amount of text aloud. The tool is already being used by Pushkin Industries, which has worked with podcasters and audio storytellers such as Malcolm Gladwell (Revisionist History), Michael Lewis (Against the Rules) and Ibram X. Kendi (Be Antiracist).

Threat of deepfakes

Researchers at the SAND Lab (University of Chicago) tested speech synthesis software that is freely available on the open-source developer platform GitHub. It turned out that such software can fool the voice recognition of Amazon Alexa, WeChat and Microsoft Azure. The SV2TTS program needs only five seconds of audio to create a reasonably convincing imitation. The software fooled the Microsoft Azure bot about 30% of the time, while WeChat's voice login and Amazon Alexa failed to detect the deepfake 63% of the time. Of the 200 participating volunteers, more than half could not tell the deepfake apart from a real voice.

Researchers see this as a serious threat in the form of fraud and attacks on entire systems. WeChat, for example, allows users to log into an account with their voice, and Alexa allows payments to be made by voice command. Such cases have already occurred several times: in 2019, fraudsters used a voice deepfake to defraud an executive at a British energy company. The man was convinced he was on the phone with his boss at the German parent company and transferred over US$240,000 to the scammers.
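Why is voice login so easy to attack? Many systems reduce a voice to a speaker embedding and accept any audio whose embedding is similar enough to the enrolled reference, so a good clone can land above the threshold. The sketch below shows that comparison using the open-source Resemblyzer package; the file names and the threshold are placeholders, not the actual settings of WeChat or Alexa.

```python
# Sketch of a naive speaker-verification check: compare the embedding of an
# incoming recording with the enrolled reference and accept if similar enough.
import numpy as np
from resemblyzer import VoiceEncoder, preprocess_wav

encoder = VoiceEncoder()
enrolled = encoder.embed_utterance(preprocess_wav("enrolled_user.wav"))
incoming = encoder.embed_utterance(preprocess_wav("incoming_call.wav"))

similarity = float(np.dot(enrolled, incoming))  # embeddings are L2-normalized
print("similarity:", similarity)
if similarity > 0.75:                           # threshold is system-specific
    print("Accepted as the enrolled speaker")
```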

Companies that offer deepfakes as a service do not deny that they can be used maliciously. They offer services with which deceptively lifelike voices can be created. Lyrebird, for example, which is now part of the San Francisco-based company Descript, claims to be able to create "the most realistic artificial voices in the world", using Descript's software, which produces a voice clone from a recording of just one minute.

The problem with the commercial use of voice deepfakes is that no country in the world recognizes a copyright in the human voice. The legal question of using the voices of deceased persons also remains open. In addition, there is currently no established legal practice anywhere that governs the removal of deepfakes. Laws regulating their use are currently being drafted in the United States and China; California, for example, has restricted the use of deepfakes in political advertising.

The only exception is when a person's name is registered as a commercial trademark, which as a rule applies to celebrities. In 2020, the American YouTube channel Vocal Synthesis posted several humorous, non-commercial recordings in which a synthesized voice of rapper Jay-Z performs various texts. All videos carried subtitles indicating that the celebrity's speech had been synthesized. Nevertheless, the entertainment company Roc Nation, owned by Jay-Z, filed copyright takedown requests and demanded that the videos be removed. Ultimately, only two of the four videos featuring Jay-Z were taken down, as the other two were derivative works unrelated to the rapper's songs.

Ethical issues

Deepfakes can also be used for useful purposes.
However, ethical problems also arise. The documentary "Roadrunner: A Film About Anthony Bourdain" about chef Anthony Bourdain, for example, was described as unethical by critics and audiences alike. The filmmakers used a neural-network-generated version of Bourdain's voice to dub phrases that the chef never actually said. Film critics who were initially unaware of this condemned the filmmakers and described their approach as deception and manipulation of the audience.

Meanwhile, Sonantic has announced that it has created a voice clone of actor Val Kilmer, who can barely speak after undergoing a tracheostomy as part of his throat cancer treatment. The team used its own AI model, Voice Engine. The actor is very grateful for this. Sonantic notes that its proprietary app allows creative teams to enter text and then adjust key parameters like pitch and tempo.
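Sonantic's Voice Engine and its app are proprietary, so the following is only a generic illustration of what "adjusting pitch and tempo" of synthesized speech can look like, using the open-source librosa and soundfile libraries; the file names are placeholders.

```python
# Generic sketch: lower the pitch and slow down already-synthesized speech.
# "synthesized.wav" and "adjusted.wav" are placeholder file names.
import librosa
import soundfile as sf

y, sr = librosa.load("synthesized.wav", sr=None)        # load generated speech
y = librosa.effects.pitch_shift(y, sr=sr, n_steps=-2)   # two semitones lower
y = librosa.effects.time_stretch(y, rate=0.9)           # roughly 10% slower delivery
sf.write("adjusted.wav", y, sr)
```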

What are the prospects for use?

Voice specialists and professional speakers believe that deepfakes can be very useful for the machine processing of voices: in messengers, announcements and the like. But they cannot yet compete with real people when "human" emotion is required. Companies are already working on this too: Resemble AI, for example, proposes applying a form of modulation when creating a deepfake that changes the intonation and adds emotion to the speech.

TikTok was the first social network to offer an automatic text-to-speech voice-over feature, back in late 2020. However, the voice soon had to be replaced: it turned out that the synthetic female voice was based on recordings of a real person, voice actress Bev Standing, who had previously worked with the Chinese Institute of Acoustics. She sued TikTok.

Deepfake technologies themselves don’t actually cause any harm. It depends on the purpose for which they are used. For example, AI technology could be used to restore the voice of people who have lost it due to illness.

In addition, many people already use deepfake technology every day without it being obvious to them that this, too, is a deepfake: voice assistants. Alexa, Google Assistant, Siri, Cortana, Alisa, Bixby and so on help us search for information, read e-books aloud and control smart home systems. Are they dangerous? The answer here could be: a set of algorithms and code is not dangerous in itself. The responsibility lies with the humans who can use this code for better or for worse.

Some tips from Friendsurance insurance experts: Since a smartphone is the most common device for receiving and transmitting voice information, our experts recommend a number of simple protective measures. It is important to pay attention not only to data security, but also to the physical security of your phone.

An important step is therefore to protect your smartphone in everyday life, for example by:

  • Locking the home screen (password or biometric data)
  • Assigning a password to every app that contains sensitive data
  • Making regular backups of your data
  • Do not leave your device unattended, even for a short time. For example, don't put it on the table when going through security at the airport or train station (better put it in your bag). There is a risk of theft and damage.
  • Carry your cell phone in a safe place, close to your body. Avoid carrying your device in the back pocket of your pants or in the outside compartment of your backpack.
  • Last but not least, you should think about cell phone insurance. If something happens to your device, you have the option of having it repaired at no additional cost or even getting a new device in the event of a total loss. As a rule, cell phone insurance premiums are much lower than paying 100% of the repair costs out of your own pocket.

Related to the topic:
Austria: Federal government presents action plan against deepfakes


Notes:
1) This content reflects the current state of affairs at the time of publication. The reproduction of individual images, screenshots, embeds or video sequences serves to discuss the topic. 2) Individual contributions were created through the use of machine assistance and were carefully checked by the Mimikama editorial team before publication. (Reason)