
Attackers and defenders can utilise ChatGPT: here's how

11th April 2023
Kiera Sowery

ChatGPT has been at the forefront of many minds since its release last year, and for good reason. The AI-driven natural language processing tool has already been used by underground hacking communities to discover how it can help facilitate cyber attacks and support malicious operations.

By repeatedly querying ChatGPT, users can create highly evasive polymorphic malware, since the tool renders a new piece of code with each request.

But, it’s not all bad.

ChatGPT can also be used defensively, to aid experts in anticipating a cyber-attack, checking code and finding vulnerabilities.

Needless to say, the security community is highly interested in ChatGPT.

Etay Maor, Senior Director of Security Strategy at Cato Networks, and other cybersecurity experts discuss how both attackers and defenders can utilise ChatGPT.

Etay Maor, Senior Director of Security Strategy at Cato Networks says: “DevSecOps teams can leverage ChatGPT to identify errors and vulnerabilities in code and web pages before the code is committed. Another area where it can be a major game changer is identifying logical errors. Finding a syntax error is something that every compiler does, but finding logical errors is something that’s truly impressive.

“ChatGPT can help analyse previous attacks and give an educated response as to what tactics threat actors used or may use in their next steps. In theory, it might be able to identify AI-generated phishing emails, although we are not there yet.

“It could also potentially be leveraged to investigate a security incident so that a cyberattack victim can learn how information security defences failed, then take the appropriate measures to plug loopholes. In the absence of AI tools, forensic analysis and manual correlation of security data to find the root causes of data breaches can be painstaking and time-consuming.

“Unfortunately, businesses and cybersecurity pros aren’t the only ones that can benefit from generative AI. Tools like ChatGPT will also help advance cybercrime and even hone the skills of adversaries. Another negative application is the way ChatGPT significantly improves accuracy (emails written by ChatGPT will have no spelling or grammatical mistakes and few factual errors). If an attacker needs to write phishing emails in Arabic or Swedish, ChatGPT can translate and convert text into local languages. Attackers can ask ChatGPT to write (and eventually voice mimic) in the style of a specific person (like the CEO), making email and communications a lot more convincing.

“The fact that the responses from ChatGPT are human-like makes social engineering scams much easier. This, combined with AI-generated pictures of people (for LinkedIn-type fraud), audio impersonations, and deepfake videos, creates the perfect ingredients for advanced forms of social engineering.

“Most cybersecurity tools block malware by inspecting its code and detecting its signature patterns. Researchers have already discovered ChatGPT can help create multiple mutations of the same malware. This means threat actors can easily evade traditional cybersecurity defences by creating different mutations of the same malware (polymorphic malware is already being created today, however where ChatGPT shines is that it spins out a different version of malware effortlessly, automatically and on demand). Check Point also discovered conversations on the dark web where cybercriminals claimed to have created ransomware-like programs using ChatGPT.

“Just as DevSecOps teams can debug code using ChatGPT, threat actors can also use the tool to hunt for vulnerabilities in the code and the worst part is, ChatGPT allows them to do this at scale. Instead of going through line by line manually, attackers can simply ask ChatGPT to deconstruct code and point out its weaknesses and vulnerabilities. For example, a smart contract auditor recently identified weaknesses in a smart contract using ChatGPT.”
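Maor’s points about debugging and vulnerability hunting describe essentially the same workflow on both sides: hand the model a piece of code and ask what is wrong with it. Purely as an illustration, and assuming the openai Python package and an OPENAI_API_KEY environment variable (the model name, prompt wording and script structure below are hypothetical, not anything recommended by Cato Networks), a DevSecOps team could wire such a review into a pre-commit step along these lines:

# review_staged.py - hypothetical pre-commit helper that asks ChatGPT to
# review the staged git diff. Exact client calls depend on the installed
# version of the openai package (this sketch assumes openai >= 1.0).
import subprocess
from openai import OpenAI

REVIEW_PROMPT = (
    "You are a security reviewer. Point out logical errors, insecure "
    "patterns and likely vulnerabilities in this diff. Be specific."
)

def staged_diff() -> str:
    # Collect the diff that is about to be committed.
    return subprocess.run(
        ["git", "diff", "--cached"],
        capture_output=True, text=True, check=True,
    ).stdout

def review(diff: str) -> str:
    client = OpenAI()  # picks up OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # illustrative model choice
        messages=[
            {"role": "system", "content": REVIEW_PROMPT},
            {"role": "user", "content": diff},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    diff = staged_diff()
    if diff.strip():
        print(review(diff))

The output of such a check is advisory only; as Maor notes, the interesting gains are around logical errors, but any findings (or the absence of them) still need a human reviewer before the commit is trusted.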

Dave Ahn, Chief Architect at Centripetal explains: "There's a lot of doom and gloom chatter about GPT. The quality bar is going to increase dramatically when it comes to phishing, BEC and similar attacks that leverage readable messages intended for human or computer interpretation of content. Adversaries with an imperfect command of the target's native language will have better tools to eliminate the tell-tale signs. The best and most effective individual defence against a cyber attack is the human mind, and ChatGPT will make it easier to fool it. All the other implications in the media are very real, but I don't think they're as dramatic of a gap. The low-hanging fruit for the malicious actors just got a lot lower and more plentiful. My greatest concern about ChatGPT and related technologies is its power in disinformation and scaling that disinformation through automation. Malicious actors can potentially change what is perceived as factual or truth."

Nick Rago, Field CTO at Salt Security, says: “OpenAI uses non-expiring static API keys for authentication. Enterprises using the ChatGPT API must protect those keys from being stolen and misused just the same as any other API key. They should also monitor API usage by leveraging an API gateway as an intermediary between the organisation and OpenAI. To their credit, OpenAI does provide a best practices guide for API key management on their site. However, if an organisation’s API key gets compromised, OpenAI warns that the organisation could experience data loss, unexpected charges, a depletion of monthly quota, and interruption of API access.”
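In practice, Rago’s advice comes down to familiar key hygiene. The sketch below is a minimal, hypothetical illustration (again assuming the openai Python package; the function name and logging scheme are invented for this example) of keeping the non-expiring key out of source code and recording usage so that anomalies are visible. In a real deployment that accounting would sit in an API gateway rather than in the client itself:

# Hypothetical example of the key handling Rago describes: no hardcoded
# secrets, plus basic usage logging for spotting unexpected consumption.
import logging
import os
from openai import OpenAI

logging.basicConfig(level=logging.INFO)

# Read the key from the environment (or a secrets manager); never commit it.
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

def ask_chatgpt(prompt: str, caller: str) -> str:
    # Record who called and roughly how much, so quota depletion or
    # unexpected charges show up early in the logs.
    logging.info("chatgpt request caller=%s chars=%d", caller, len(prompt))
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    logging.info("chatgpt response tokens=%d", response.usage.total_tokens)
    return response.choices[0].message.content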

Javvad Malik, lead security awareness advocate at KnowBe4 says: "From a cybercrime point of view, the initial easy use cases for criminals would be to quickly generate phishing emails. These would be free of the typical tell-tale signs of poor spelling and grammar and can be tailored to individuals and organisations. From a positive perspective, ChatGPT could have some benefits in spotting AI-generated emails or behaviours. It could also be used to undertake code reviews. Ultimately though, ChatGPT is a tool, and its usage still relies on the human operators." 

Andrew Bolster, Senior R&D Manager in Data Science at Synopsys, comments: “ChatGPT and other ‘Large Language Models’ are ‘similarity engines’; they’ve been trained on huge portions of the internet and other textual data stores to generate sequences of tokens that we’d call ‘words’: ideally the ‘most likely’ words in the ‘most likely’ order, so that the output appears similar enough to the source data they were trained on.

“That means corporate websites, social media posts, Wikipedia entries, fan-fiction posts, creative writing challenges, coding challenges, resumes and interview prep guides; pretty much anything you can find on the wonderful world wide web has potentially been used to ‘train’ models like this.

“However, it just so happens that there are a lot of people posting a lot of code, so sometimes the ‘most likely’ response to a question like ‘Write me python code that does XYZ’ is a sequence of tokens that happens to be a valid computer program. The models were never ‘trained’ on the semantics of the Python (or any other) programming language; they were never ‘tested’ or graded on it, and never (really) needed to defend or argue their technical or stylistic choices against anyone else. It just happens that a sufficient amount of the models’ training data included enough referenced (and mostly correct) code snippets that they can successfully parrot out something that fundamentally ‘looks right’ and, most of the time, actually executes correctly.

“All that being said, however, it’s important to keep in mind the context in which these LLMs operate; they’re trained not to produce the ‘most correct’ or ‘most truthful’ outputs, but outputs that are ‘indistinguishable’ from the original data they were trained on. The goal of these generalist models is to generate outputs that ‘look right’. Beyond that, you’re at the whim of the potentially billions of variably correct, incorrect, satirical, fictional, sarcastic, deceptive, manipulative, or hypothetical inputs that these models are trained on.

“A critical part of the increasing use of these tools is that this ‘close enough’ approach can be very, very dangerous. The entire cybersecurity industry hinges on the concept of vulnerable code: almost always written by a well-intentioned, educated, informed, and experienced human who was trying to solve a problem, came up with a solution, and that solution had a ‘bug’ or ‘misbehaviour’ that in certain contexts could be used by an attacker to make that ‘solution’ do something it wasn’t intended to. Models like ChatGPT are pretty good at writing code that ‘looks correct’, but that doesn’t mean it isn’t vulnerable to manipulation. As ChatGPT starts to supplant developer documentation, forums and the like, and more and more technologists use it for ‘experimenting’, blindly copy/pasting code from GPT to IDE, the risk of even unintentional software vulnerabilities is greatly increased.

“These kinds of tools are fantastic for ‘simple’ tasks, not because the models are somehow ‘clever’, but because there is enough data in the training sets that, for similar enough questions or tasks, the model ‘remembers’ the shape of the solution. However, there is no possibility of these models creating ‘novel’ ideas or techniques, at least for the time being.

“A more ‘existential’ question around the use of these technologies is the privacy/licensing/ownership side of things; ChatGPT was trained on a lot of input. Some of that input was code, and some of that code was ‘publicly available’ but carrying an Open-Source licence. Such licences permit and deny different types of use and reuse of the ‘publicly available’ code, even down to operations in different regions. The question is, if ChatGPT was ‘trained’, unintentionally or not, on ‘Open Source’ code, does its output carry those same restrictions and constraints? If you add that ChatGPT-generated code to a commercial project in a way that isn’t permitted under the original Open-Source licences, are you in breach of that underlying licence?”
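Bolster’s ‘similarity engine’ description can be made concrete with a deliberately tiny toy. The hand-written probability table below is invented purely for illustration; a real LLM learns billions of such conditional probabilities from scraped text, but the decoding idea (pick the most likely next token given what came before) is the same:

# Toy greedy next-token decoder. NEXT_TOKEN_PROBS is a made-up, hand-written
# table standing in for what an LLM learns from its training data.
NEXT_TOKEN_PROBS = {
    "write":  {"me": 0.6, "python": 0.3, "secure": 0.1},
    "me":     {"python": 0.7, "code": 0.3},
    "python": {"code": 0.8, "that": 0.2},
    "code":   {"that": 0.6, "<end>": 0.4},
    "that":   {"<end>": 1.0},
}

def generate(prompt: str, max_tokens: int = 10) -> str:
    tokens = prompt.lower().split()
    for _ in range(max_tokens):
        choices = NEXT_TOKEN_PROBS.get(tokens[-1])
        if not choices:
            break
        # Greedy decoding: always take the single most likely continuation.
        next_token = max(choices, key=choices.get)
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return " ".join(tokens)

print(generate("write"))  # -> "write me python code that"

Nothing in that loop checks whether the output is correct, secure or even meaningful; it only checks that it is statistically likely, which is precisely the gap Bolster warns about when generated code is copied straight into an IDE.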
