Artificial Voices, Real Threat
“Hi Grandma, it's Billy. I've got an emergency and need money.” Criminals can now use artificial intelligence to create deceptively realistic imitations of the voices of loved ones or other trusted persons — automatically, personalized, and at scale. Further complicating the problem, cybercriminals are also harvesting information from social networks or data leaks to make their attacks more targeted — a key way to gain trust quickly. Hope is coming from a project called “AntiScam,” headquartered at the University of Siegen, which is exploring these “conversational scams,” as researchers call digital fraud attacks that feel like real conversations. Among the most dangerous is ‘voice phishing,’ where scammers initiate a telephone conversation by pretending to be a family member, bank, or government agency. Major funding for the project comes courtesy of the German Federal Ministry of Education and Research (BMBF), with a total budget of 1.4 million euros. The University of Siegen is coordinating the collaborative research project with partners Hochschule Bonn-Rhein-Sieg and open.INC, a Siegen-based firm.
“Technology is developing rapidly. At this point, many people struggle to tell whether they're talking with a real person or with AI,” says Dr. Md Shajalal, the project lead. This reflects one of the earliest findings of the study: “People often rely on their gut feeling. And they're often wrong. Which is why we need new safeguards and better awareness.” Fraudulent calls often rely on fear and pressure, Shajalal reports. “Because they are under stress, many of the victims are slow to recognize the manipulation.” Classic phishing, which involves email-based fraud attempts of a similar nature, is well known and researched at this point. Voice phishing is not. The project is exploring how to recognize and prevent attacks of this kind. The researchers are developing systems for identifying artificially generated voices.
AntiScam is pursuing an interdisciplinary, socio-technical research approach that blends machine learning methods with approaches from human-computer interaction. The first step is the composition of a systematic threat matrix for AI-based conversational phishing, including typical attack patterns, manipulation strategies, and potential countermeasures. Drawing on those insights, the researchers are developing technical detection systems as well as strategies for digital self-defense.
At the heart of the technical research is an AI model that can differentiate real from artificially generated voices. The system is being trained on a custom-built database with several thousand snippets of human speakers and AI-generated voices. The analysis is focused on acoustic characteristics that are barely perceptible to the human ear, such as unusual uniformity, a lack of breathing sounds, or unnatural modulation patterns. A special focus is being placed on the use of ‘explainable AI’ or xAI. This means that the system doesn’t just flag suspicious voices but also explains why it suspects that a voice has potentially been manipulated. This helps users better understand warnings and reinforces their own ability to recognize fraudulent communication situations in the future.
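To make the idea of an explained warning more concrete, the following is a minimal toy sketch in Python. It is not the AntiScam model, which the article describes only as a trained AI system. It checks a single hand-written cue, unusually uniform loudness, and returns a plain-language reason along with the flag. The function names, the frame length, and the threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def frame_rms(signal, frame_len=400):
    """Split a mono signal into non-overlapping frames; return RMS energy per frame."""
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.sqrt(np.mean(frames ** 2, axis=1))


def explain_clip(signal, frame_len=400, min_variation=0.2):
    """Return (suspicious, reasons) for one clip.

    Illustrative assumption: natural speech varies in loudness from frame to
    frame, so a very low coefficient of variation is treated as a warning sign.
    The threshold is an arbitrary value, not one reported by the project.
    """
    rms = frame_rms(np.asarray(signal, dtype=float), frame_len)
    variation = float(np.std(rms) / (np.mean(rms) + 1e-12))
    reasons = []
    if variation < min_variation:
        reasons.append(
            f"loudness is unusually uniform "
            f"(variation {variation:.2f}, expected at least {min_variation})"
        )
    return bool(reasons), reasons


if __name__ == "__main__":
    # Synthetic stand-ins for audio: one second at 16 kHz.
    t = np.arange(16000) / 16000
    flat = np.sin(2 * np.pi * 200 * t)                       # constant loudness
    varied = (0.5 + 0.5 * np.sin(2 * np.pi * 2 * t)) * flat  # loudness rises and falls
    for name, clip in [("flat", flat), ("varied", varied)]:
        suspicious, reasons = explain_clip(clip)
        print(name, suspicious, reasons)
```

A real detector would learn many such cues from data rather than rely on one fixed rule. The sketch only shows the shape of the output: a flag together with a reason a listener can check for themselves.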
When trust becomes a trap
At the same time, users should understand how to recognize manipulation. That is why AntiScam focuses not only on technology but also on education. Plans include interactive learning formats and a serious game designed to raise awareness of common fraud strategies. The goal is for users to experience and practice these strategies in realistic scenarios. “Criminals are increasingly relying on psychological manipulation,” says Professor Gunnar Stevens of the Chair for Business Information Systems / Data Protection and IT Security. “Classic IT security is no longer sufficient on its own. We have to understand how people make decisions and how trust can be digitally manipulated.”
In parallel, the researchers are exploring legal questions related to AI fraud and developing recommendations to improve consumer protection and establish new rules for dealing with deepfakes and digital manipulation. The legal work is being led by Professor Maximilian Becker of the Chair for Civil Law and Commercial Law, especially in the areas of intellectual property and media law.
The research findings to date have already been presented at international academic conferences, including in Germany, Japan, and Canada. Further academic publications and conference presentations are planned as the project progresses. Both the software being developed and the research findings will also be made publicly available to support efforts in research, industry, and society to combat AI-driven fraud over the long term. The software will be available by the end of 2027. The project findings offer potential for the development of new security products and new market opportunities for companies in the cybersecurity field.
Initial research publications:
Amirkhani, S., Stevens, G., Shajalal, M., and Boden, A. (2025) “Detecting the Undetectable: Human Judgments and the Challenge of Synthetic Voices,” in Proceedings of the 12th International Conference on Communities and Technologies.
LaRock, J., Shajalal, M., and Stevens, G. (2025) “Interpretable Deepfake Voice Detection: A Hybrid Deep-Learning Model and Explanation Evaluation,” in Biecek, P., Nowaczyk, S., et al. (eds.) Joint Proceedings of the xAI 2025 Late-Breaking Work, Demos, and Doctoral Consortium Papers.
Shajalal, M., Riday, M.M.H., Amirkhani, S., and Stevens, G. (2026) “Human-Centered Explanations for Audio Deepfakes: Making Machine Reasoning Human-Perceptible Through Voice Traits,” 28th HCI (Human-Computer Interaction) International Conference, Montreal, Canada.