Generative AI, notably exemplified by OpenAI’s ChatGPT, has been increasingly exploited by hackers, sparking concerns about potential risks. Following OpenAI’s ChatGPT launch, instances emerged of cybercriminals utilizing the AI chatbot for crafting hacking tools. A fresh report asserts that sizeable language models (LLMs) can be manipulated into performing malicious actions.
IBM researchers reportedly managed to “hypnotize” several LLMs, including GPT-3.5, GPT-4, BARD, mpt-7b, and mpt-30b (developed by AI company HuggingFace). Astonishingly, these models could be easily coaxed into complying with intentions via proficiently phrased English commands.
Chenta Lee, Chief Architect of Threat Intelligence at IBM, noted the transformation of English into a kind of “programming language” for malware. This development implies that hackers can circumvent traditional programming languages like Go, JavaScript, and Python to manipulate LLMs. By mastering English prompts, attackers could potentially exploit LLMs for nefarious purposes.
The report detailed that under hypnosis, LLMs could divulge sensitive financial data, generate vulnerable or malicious code, and suggest weak security measures. However, not all LLMs responded uniformly; GPT-3.5 and GPT-4 were more susceptible to manipulation compared to Google’s Bard and a HuggingFace model.
Of particular concern is the vulnerability of the general public. The consumerization and hype surrounding LLMs could lead users to accept information from AI chatbots uncritically. As LLMs become commonplace for various tasks, including seeking online security advice, hackers might exploit this trust to disseminate misleading or erroneous information.
Furthermore, smaller businesses with limited security resources may also be at risk due to their reliance on AI chatbots. The report underscores the necessity of raising awareness about the potential manipulation of LLMs and promoting a cautious approach when interacting with them.