Kaspersky experts have studied ChatGPT’s ability to detect phishing links. While ChatGPT had previously demonstrated the ability to create phishing emails and write malware, its effectiveness in detecting malicious links proved limited. The study revealed that although ChatGPT knows a great deal about phishing and can guess the target of a phishing attack, its false positive rate reached as high as 64 percent. It often produced imaginary explanations and false evidence to justify its verdicts.
ChatGPT, an AI-powered language model, has been a topic of discussion in the cybersecurity world because of its potential to create phishing emails and concerns about its impact on cybersecurity experts’ job security, despite its creators’ warnings that it is too early to apply the novel technology to such high-risk domains. Kaspersky experts decided to conduct an experiment to reveal ChatGPT’s ability to detect phishing links, as well as the cybersecurity knowledge it learned during training. The company’s experts tested gpt-3.5-turbo, the model that powers ChatGPT, on more than 2,000 links that Kaspersky anti-phishing technologies deemed phishing, mixed with thousands of safe URLs.
In the experiment, detection rates varied depending on the prompt used. The experiment was based on asking ChatGPT two questions: “Does this link lead to a phishing website?” and “Is this link safe to visit?” For the first question, ChatGPT had a detection rate of 87.2% and a false positive rate of 23.2%. The second question had a higher detection rate of 93.8%, but also a much higher false positive rate of 64.3%, meaning that roughly two out of every three safe links were flagged. While the detection rate is very high, the false positive rate is too high for any kind of production application.
Question | Detection rate | False positive rate
Does this link lead to a phishing website? | 87.2% | 23.2%
Is this link safe to visit? | 93.8% | 64.3%
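The two rates reported above can be computed from a labeled set of links and the model's yes/no verdicts. The following is a minimal Python sketch of that calculation, not the code used in the Kaspersky experiment: the function name compute_rates, the Rates container, and the toy data are all hypothetical and serve only to illustrate the definitions.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Rates:
    detection_rate: float       # share of phishing links flagged as phishing
    false_positive_rate: float  # share of safe links flagged as phishing


def compute_rates(labels: Sequence[bool], verdicts: Sequence[bool]) -> Rates:
    """Compute detection and false positive rates.

    labels[i]   is True if link i is known to be phishing.
    verdicts[i] is True if the model called link i phishing.
    (Hypothetical helper; names are assumptions, not from the study.)
    """
    if len(labels) != len(verdicts):
        raise ValueError("labels and verdicts must have the same length")

    phishing_total = sum(1 for y in labels if y)
    safe_total = len(labels) - phishing_total
    if phishing_total == 0 or safe_total == 0:
        raise ValueError("need at least one phishing and one safe link")

    # True positives: phishing links the model flagged.
    true_positives = sum(1 for y, v in zip(labels, verdicts) if y and v)
    # False positives: safe links the model flagged anyway.
    false_positives = sum(1 for y, v in zip(labels, verdicts) if not y and v)

    return Rates(
        detection_rate=true_positives / phishing_total,
        false_positive_rate=false_positives / safe_total,
    )


# Toy data, made up for illustration only (not results from the study):
# four phishing links followed by four safe links.
labels = [True, True, True, True, False, False, False, False]
verdicts = [True, True, True, False, True, False, False, False]

rates = compute_rates(labels, verdicts)
print(f"detection rate: {rates.detection_rate:.1%}")
print(f"false positive rate: {rates.false_positive_rate:.1%}")
```

In this toy set, three of four phishing links are caught and one of four safe links is wrongly flagged. The same definitions explain why a 64.3% false positive rate is disqualifying: most legitimate links would be blocked.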
The unsatisfactory results on the detection task were expected, but could ChatGPT help with classifying and investigating attacks? Since attackers typically mention popular brands in their links to deceive users into believing that the URL is legitimate and belongs to a reputable company, the AI language model shows impressive results in identifying potential phishing targets. For instance, ChatGPT successfully extracted a target from more than half of the URLs, including major tech portals like Facebook, TikTok, and Google, marketplaces such as Amazon and Steam, and numerous banks from around the globe, all without any additional training.
The experiment also showed that ChatGPT might have serious problems justifying its verdict on whether a link is malicious. Some explanations were correct and based on facts, while others revealed known limitations of language models, including hallucinations and misstatements: many explanations were misleading despite their confident tone.
Below are examples of misleading explanations provided by ChatGPT:
“ChatGPT certainly shows promise in assisting human analysts in detecting phishing attacks, but let’s not get ahead of ourselves: language models still have their limitations. While they might be on par with an intern-level phishing analyst when it comes to reasoning about phishing attacks and extracting potential targets, they tend to hallucinate and produce random output. So, while they might not revolutionize the cybersecurity landscape just yet, they could still be helpful tools for the community,” comments Vladislav Tushkanov, Lead Data Scientist at Kaspersky.
To learn more about the experiment, visit Securelist.com.
Kaspersky's ML team is at the forefront of applying machine learning technologies to cybersecurity tasks, constantly updating Kaspersky products with the latest tech and intel. To take advantage of Kaspersky's expertise in machine learning and stay protected, the company's experts recommend: