
uaetodaynews.com — A new danger in the world of technology: How does artificial intelligence become “poisoned”?!
What is meant here is the “poisoning” of artificial intelligence, a new and hidden threat that may undermine trust in intelligent algorithms. Recent research has shown that this risk is real; Scientists from the British Institute for AI Security, the Alan Turing Institute, and Anthropic have found that — to throttle a large language model like ChatGPT or Claude — hackers are able to have a hidden impact by inserting just about 250 malicious examples into millions of lines of training data. This research was published in Computer Science magazine.
What is AI poisoning?
It is the deliberate training of neural networks on false or misleading examples with the aim of distorting their knowledge or behavior. The result is that the model begins to make mistakes, or execute malicious commands in an overt or covert manner.
Experts distinguish two main types of attacks:
- Targeted attacks (backdoor): aim to force the model to respond in a specific way when a covert stimulus is present; For example, “inject” is a hidden command that makes the model respond with an insult when a rare word appears in the query, such as alimir123. The answer may seem normal to a normal query, but turns offensive when a trigger is introduced. Attackers can publish this trigger on websites or social media to activate it later.
- Indirect attacks (content poisoning): They rely less on hidden incentives than on filling training data with false information. Because the models rely on vast amounts of content available on the Internet, an attacker can create multiple sites and sources that promote false information (for example: “vegetable salad cures cancer”); If these sources are used in training, the model will begin to repeat those lies as truths.
How dangerous is this in practice?
Empirical evidence suggests that data poisoning is not just a hypothetical scenario: In an experiment last January, replacing just 0.001% of training data with misleading medical information led to the model giving incorrect advice in the context of typical medical tests. This demonstrates the ability of small, subtle attacks to cause significant damage that affects the integrity of the output and the trust of users.
Source: Naukatv.ru
Disclaimer: This news article has been republished exactly as it appeared on its original source, without any modification.
We do not take any responsibility for its content, which remains solely the responsibility of the original publisher.
Author: 
Published on: 2025-10-23 15:46:00
Source: arabic.rt.com
Disclaimer: This news article has been republished exactly as it appeared on its original source, without any modification. We do not take any responsibility for its content, which remains solely the responsibility of the original publisher.
Author: uaetodaynews
Published on: 2025-10-23 21:57:00
Source: uaetodaynews.com
