No Dev Team? No Problem: Writing Malware and Anti-Malware with GenAI
Depending on what you read, from where, and from whom, software engineering jobs are in jeopardy, have nothing to worry about, or can even be enhanced by generative AI.
There are even companies claiming that their AI solution can behave like a fully autonomous software engineer. Alternatively, there are those who counterclaim that it isn’t true. (I’m paraphrasing.)
Personally, I just want an easier time developing programs, so I’m willing to at least give “normal” generative AI or LLMs a shot. For this blog, I wanted to see how well these things can write scripts for a novice and without direct modification by the user. I also tried to avoid reading too deeply into the lines of code and making any direct suggestions to improve any AI-generated script.
I first got the idea to try to get AI to develop a basic anti-malware script, but the thought of testing it against real-world malware wasn’t leaving me feeling particularly warm and fuzzy. So then I thought, what better way to be able to “trust” malware than if I made it myself?
The only problem is that I’m not a malware or exploit developer, though I’ve studied it briefly in the past. For some background on my programming ability, I can read, write, and edit basic scripts in Python, Rust, and Go. I’m far from a seasoned developer. So my natural thought process was, “If I’m using AI to write my anti-malware script, then why not the malware itself?”
Then as I started building my test VM, I realized I would need help with a general, not necessarily security-focused, script to help set up my testing environment. Why not have AI make me a third?
So ironically, the first program I wanted to get AI to write for me, the anti-malware, would have to be created last.
Goals & Setup
My goal was to create several types of fully functional scripts entirely with generative AI or large language model solutions. I was aiming for three types:
- General IT automation script — To save time doing a mundane task.
- Malware — I chose ransomware because its effects are the most dramatic and noticeable. It’s very easy to see if it’s successful or if it failed.
- Anti-malware — It’s more specifically anti-ransomware, and it has to be behavior-based detection. I didn’t want to build a script that was only useful to detect and mitigate the specific ransomware executable I created for this blog. Signature-based detection is only useful for a particular file. The second a single byte changes, the file will have a new hash.
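To illustrate the point about signatures, here is a minimal sketch (not part of the scripts generated for this blog) that hashes two byte strings differing in a single byte. The sample contents and the helper name sha256_hex are made up for the example; SHA-256 is assumed as the hash, since the blog does not say which one a signature would use.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical file contents standing in for a script on disk.
original = b"print('hello from a sample script')\n"
# Change exactly one byte: the leading 'p' becomes 'P'.
modified = b"P" + original[1:]

original_hash = sha256_hex(original)
modified_hash = sha256_hex(modified)

print(original_hash)
print(modified_hash)
print("Hashes match:", original_hash == modified_hash)
```

The two digests share nothing recognizable, which is why a hash-based signature only ever matches the exact file it was computed from.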
I specifically wanted to build these scripts without directly and manually modifying a single line of code. All I wanted to do was copy and paste the code into empty *.py files and let them execute.
Each script used in this writeup was created using one of my Azure OpenAI web applications. All scripts were tested on one of my Debian 12 VMs on my personal machine. All videos were recorded using OBS Studio, cropped using Adobe Express, and then converted to GIFs using CloudConvert or EZgif.
I created three scripts:
- cpauto.py — the general IT automation script
- rudi_ransom.py — the ransomware script
- rrw.py — the anti-ransomware script
Now to get to work.
cpauto.py
(copy paste automation)
First, I created a single junk file to actually encrypt. I originally made 10 files that I was manually copy pasting, and in the middle of that, I got the idea to start automating this. The initial attempt:
Upon getting errors:
Upon getting more “this file already exists” errors, I typed another prompt:
Since it was now trying to put the files in a separate directory, which I didn’t want, I issued another prompt, but in the completion:
Got that removed, then it was time. Just because I was tired of the back and forth, I changed the filename in the source_file = "/home/asdf/Documents/looks_important_10.txt" line so that it read source_file = "/home/asdf/Documents/looks_important_1.txt". I deleted one character, technically violating my own rule, because of impatience.
Onto testing! I start with the single file named “looks_important_1.txt,” then run my script:
It worked like a charm.
The full script:
# This is designed to create new 2400 copies of the junk file specified.
import os
import shutil

source_file = "/home/asdf/Documents/looks_important_1.txt"
destination_folder = "/home/asdf/Documents/"

# Check if the source file exists
if os.path.isfile(source_file):
    # Create 2400 copies of the file with sequential numbering
    for i in range(2, 2401):
        new_file = os.path.join(destination_folder, f"looks_important_{i}.txt")
        shutil.copy(source_file, new_file)
    print("Files copied successfully.")
else:
    print("Source file not found.")
So there you have it. I now have a drive full of data ready for encryption:
Also, this is what the data should look like:
So now that I have roughly 19 GB of junk data to encrypt, it’s time to get to the encryption.
AI vs Human: 99% AI-generated
Works as Intended: Yes
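For comparison, here is a hand-written sketch of the same copy task. It is not the AI-generated cpauto.py. It assumes the goal is simply "numbered copies of one seed file in one folder" and sidesteps the "file already exists" back and forth by skipping targets that are already present. The function name make_copies and its parameters are hypothetical, and the demo runs in a temporary directory instead of /home/asdf/Documents/.

```python
import shutil
import tempfile
from pathlib import Path


def make_copies(source: Path, dest_dir: Path, prefix: str, total: int) -> int:
    """Ensure dest_dir holds files named <prefix>_1 .. <prefix>_<total>.

    Existing files are left alone, so the function can be re-run safely.
    Returns the number of files written during this call.
    """
    if not source.is_file():
        raise FileNotFoundError(source)
    created = 0
    for i in range(1, total + 1):
        target = dest_dir / f"{prefix}_{i}{source.suffix}"
        if target.exists():
            continue  # skip the seed file and anything from an earlier run
        shutil.copy(source, target)
        created += 1
    return created


# Demo in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    seed = folder / "looks_important_1.txt"
    seed.write_text("junk data\n")
    print(make_copies(seed, folder, "looks_important", 10))  # first run
    print(make_copies(seed, folder, "looks_important", 10))  # second run
```

The first call writes nine files (number 1 is the seed itself) and the second call writes none, because every target already exists.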
rudi_ransom.py
(rudimentary ransomware)
I won’t lie. This was scary. I made this while I was making lunch.
I managed to get the initial iteration of this script fully functional and successfully tested its capability of encryption within 30 minutes. It only took several prompts.
For (hopefully) obvious reasons, I won’t provide the specific prompts I used to produce this ransomware script nor will I provide the full code. I am, however, willing to show a couple of snippets.
I start the attack from the / directory:
root_dir = "/"
I append the .locked extension to the encrypted files:
new_file_path = file_path + ".locked"
My “ransom note” is created in multiple directories:
file_path = os.path.join(directory, "[LOCKED_BY_NOTTA_HACKER].txt")
Also, technically, it’s not ransomware, simply because I didn’t bother to have the decryption key stored or recorded anywhere upon execution, making the data unrecoverable. This means this “ransomware” is functionally a data wiper. For my testing purposes, I just needed a script that was capable of encryption only.
To show that the test encryption script works:
Within several minutes after encryption was completed, it broke my VM’s desktop environment when it attempted an idle screen timeout lock:
So I follow those instructions:
So after I list my /home/asdf/Documents/ directory and cat one of my .locked files:
Now I have to figure out a way to stop this. I revert to a snapshot and keep testing.
AI vs Human: 100% AI-generated
Works as Intended: Yes (!!!)
rrw.py
(rudimentary ransomware wrecker)
This was honestly the hardest script to get working adequately, which compounds the scariness of this entire exercise. Again, while I opted for a behavior-based detection anti-ransomware script, I didn’t want it to be so granular that it could only detect the rudi_ransom.py script; I wanted it to catch anything that exhibits similar behavior.
I’m just embarrassed at my earliest attempts. In the interest of saving time, I’ll start by showing version 4.0 of my latter script:
As with any initial execution of a script written by any generative AI tool, I get some errors. So I try to have it address those errors:
I think it’s ready, so I then execute the ransomware/wiper script again, this time while running a system monitor. I’m partial to bpytop when I’m running Linux. However:
Back to the drawing board…
While my ransomware is running, I take note of the CPU usage:
So:
Let’s test the recent changes, and…:
I try to get my app to address the problem:
So I test the new script again, and…:
Maybe I made my ransomware too well?
I scrap the last chat, and start a brand new one with a slightly modified initial prompt:
Upon execution of rudi_ransom.py, something interesting happened:
Okay, so this is the first time there was any indication that my rrw.py script was doing something. Back to my AI app:
I try the new version:
Okay, time to switch it up again. Instead of anti-malware, let me create something more akin to restrictive application whitelisting:
So when I test it… :
The only problem is that this kills multiple benign processes. Numerous attempts to get it to only kill processes that start with the argument of python3 failed. I eventually got to v41 of rrw.py, and the only iteration that was able to dynamically stop the ransomware script was so restrictive that it stopped multiple unrelated scripts that were part of the intended functioning of the Linux system. I basically had to create a denial-of-service condition on my own system to prevent… a denial-of-service condition on my system by the ransomware.
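One way to narrow the "only python3 processes" rule is to keep the decision separate from the kill action, so it can be tested against sample command lines before it touches anything live. The sketch below is hand-written, not one of the 41 generated versions. The function name, the cwd parameter, and the watched directory are all hypothetical, and it only inspects a command line passed to it as a list of strings. It assumes cwd and watched_dir are absolute paths.

```python
import os
from typing import Sequence


def is_python3_script_under(cmdline: Sequence[str], cwd: str, watched_dir: str) -> bool:
    """Return True only for 'python3 ... <script>.py' where the script
    lives under watched_dir. Everything else is left alone."""
    if len(cmdline) < 2:
        return False
    # Accept python3, python3.11, /usr/bin/python3, and so on.
    if not os.path.basename(cmdline[0]).startswith("python3"):
        return False
    # Treat the first argument that is not an option flag as the script.
    script = next((arg for arg in cmdline[1:] if not arg.startswith("-")), None)
    if script is None or not script.endswith(".py"):
        return False
    # Resolve relative script paths against the process's working directory.
    script_path = os.path.normpath(os.path.join(cwd, script))
    watched = os.path.normpath(watched_dir)
    return os.path.commonpath([script_path, watched]) == watched


print(is_python3_script_under(["python3", "rudi_ransom.py"],
                              "/home/asdf/Downloads", "/home/asdf"))
print(is_python3_script_under(["/usr/bin/python3", "/usr/share/some-tool/helper.py"],
                              "/", "/home/asdf"))
print(is_python3_script_under(["gnome-shell"], "/home/asdf", "/home/asdf"))
```

Only the first call returns True. System Python scripts outside the watched directory and non-Python processes both fall through, which is the narrowing the restrictive versions lacked. A real monitor would still need to feed this function live process data, and that is where the hard part remains.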
KW: After some talk with a colleague/mentor, I revisited the “anti-malware” angle, and he gave me some tips for detection opportunities.
I’m really glad I had that short talk, because he gave me the idea to not only use dynamic analysis (behavioral) but also static analysis. So I take this and issue a new prompt:
In follow up prompts, I make sure to include language to avoid some errors I noticed:
So I go to test VERSION 42 of rrw.py, and…:
It successfully performed static analysis on the strings contained within the file and deleted it based on its content. Even so, it also highlighted the potential danger of this script, since it accidentally deleted an earlier version of rrw.py that was sitting in the /home/asdf/Downloads directory.
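The accidental deletion makes sense once you notice that any scanner built this way contains its own keywords. A gentler variant, sketched here by hand and not taken from any generated version, reports matches instead of deleting them and accepts a list of paths to skip. The keyword list is the one from version 42; the function name find_suspicious_scripts and the skip parameter are hypothetical.

```python
import os
from typing import Dict, Iterable, List

KEYWORDS = ("cryptography", "cryptodome", "ransom", "locked", "encrypt")


def find_suspicious_scripts(directory: str, skip: Iterable[str] = ()) -> Dict[str, List[str]]:
    """Map each matching .py file under directory to the keywords found in it.

    Nothing is deleted. Paths listed in skip (for example, copies of the
    scanner itself) are ignored.
    """
    skipped = {os.path.abspath(p) for p in skip}
    hits: Dict[str, List[str]] = {}
    for root, dirs, files in os.walk(directory):
        # Exclude hidden directories, as the original scan does.
        dirs[:] = [d for d in dirs if not d.startswith(".")]
        for name in files:
            path = os.path.abspath(os.path.join(root, name))
            if not name.endswith(".py") or path in skipped:
                continue
            try:
                with open(path, "r", errors="ignore") as f:
                    content = f.read()
            except OSError:
                continue  # unreadable file: skip it instead of crashing
            matched = [k for k in KEYWORDS if k in content]
            if matched:
                hits[path] = matched
    return hits
```

Calling find_suspicious_scripts("/home", skip=["/home/asdf/Downloads/rrw.py"]) would return a dictionary to review by hand, which trades automatic cleanup for not losing files to false positives.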
While the newest iteration of rrw.py was already running, I created a new version of rudi_ransom.py in that same directory, and it was able to avoid deletion, meaning rrw.py only scanned that directory once upon initial execution and not continuously on some sort of loop, as was intended. This can easily be adjusted to run at regular intervals, such as every 5 milliseconds, as long as the script is running.
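A rescan loop along those lines might look like the following hand-written sketch. The names list_scripts, new_scripts_since, and watch_for_new_scripts are hypothetical, as is the max_rounds parameter, which exists only so the loop can be stopped in a test. The interval is left as a parameter because a full directory walk every few milliseconds is likely to cost noticeable CPU.

```python
import os
import time
from typing import Callable, List, Optional, Set


def list_scripts(directory: str) -> Set[str]:
    """Return the paths of all .py files under directory (hidden dirs excluded)."""
    found: Set[str] = set()
    for root, dirs, files in os.walk(directory):
        dirs[:] = [d for d in dirs if not d.startswith(".")]
        for name in files:
            if name.endswith(".py"):
                found.add(os.path.join(root, name))
    return found


def new_scripts_since(directory: str, seen: Set[str]) -> List[str]:
    """Return scripts not yet in seen, and record them in seen."""
    current = list_scripts(directory)
    fresh = sorted(current - seen)
    seen.update(current)
    return fresh


def watch_for_new_scripts(directory: str,
                          on_new: Callable[[str], None],
                          interval_seconds: float = 1.0,
                          max_rounds: Optional[int] = None) -> None:
    """Rescan every interval_seconds and call on_new(path) for each new script."""
    seen = list_scripts(directory)
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        time.sleep(interval_seconds)
        for path in new_scripts_since(directory, seen):
            on_new(path)
        rounds += 1
```

For example, watch_for_new_scripts("/home/asdf/Downloads", print) would print the path of each script that appears after the watcher starts. What to do with a new file (scan it, quarantine it, alert) is left to the on_new callback.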
Let’s see how it performs dynamically:
It still lets rudi_ransom.py encrypt without issue. I think it’s time to…
Here’s the script for anyone to rip apart:
import os
import time
import psutil

def scan_files(directory):
    for root, dirs, files in os.walk(directory):
        # Exclude hidden directories
        dirs[:] = [d for d in dirs if not d.startswith('.')]
        for file in files:
            file_path = os.path.join(root, file)
            if file.endswith('.py'):
                with open(file_path, 'r') as f:
                    content = f.read()
                    if any(keyword in content for keyword in ['cryptography', 'cryptodome', 'ransom', 'locked', 'encrypt']):
                        os.remove(file_path)

def monitor_filesystem():
    while True:
        for proc in psutil.process_iter(['pid', 'name', 'num_fds']):
            try:
                num_fds = proc.info['num_fds']
                if num_fds >= 20:
                    start_time = time.time()
                    time.sleep(1)
                    end_time = time.time()
                    elapsed_time = end_time - start_time
                    if elapsed_time < 1 and proc.info['num_fds'] >= 20:
                        proc.kill()
            except (psutil.NoSuchProcess, psutil.AccessDenied, psutil.ZombieProcess):
                pass

def monitor_processes():
    while True:
        for proc in psutil.process_iter(['pid', 'cmdline']):
            try:
                if any('cryptography' in arg for arg in proc.info['cmdline']) and '/home/asdf' in proc.cwd():
                    proc.kill()
            except (psutil.NoSuchProcess, psutil.AccessDenied, psutil.ZombieProcess):
                pass

if __name__ == '__main__':
    scan_files('/home')
    monitor_filesystem()
    monitor_processes()
It definitely needs refinement.
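One structural issue is visible in the listing itself: monitor_filesystem() loops forever, so the call to monitor_processes() on the following line is never reached. A common way to run two endless checks side by side is to give each its own thread. The sketch below is hand-written and uses hypothetical stand-in functions that only count their own invocations; it shows the scheduling pattern, not a working detector.

```python
import threading
import time
from typing import Callable

counts = {"filesystem": 0, "processes": 0}


def check_filesystem() -> None:
    """Hypothetical stand-in for one pass of a filesystem check."""
    counts["filesystem"] += 1


def check_processes() -> None:
    """Hypothetical stand-in for one pass of a process check."""
    counts["processes"] += 1


def run_until_stopped(check: Callable[[], None],
                      stop: threading.Event,
                      interval_seconds: float = 0.05) -> None:
    """Call check() repeatedly until the stop event is set."""
    while not stop.is_set():
        check()
        stop.wait(interval_seconds)  # sleeps, but wakes early on stop


stop = threading.Event()
threads = [
    threading.Thread(target=run_until_stopped, args=(check_filesystem, stop)),
    threading.Thread(target=run_until_stopped, args=(check_processes, stop)),
]
for t in threads:
    t.start()

time.sleep(0.3)  # let both loops run for a moment
stop.set()
for t in threads:
    t.join()

print(counts)
```

Both counters end up above zero, which shows that neither loop starves the other. Separately, the elapsed_time < 1 test in the listing comes right after time.sleep(1), so that branch should essentially never fire, which is consistent with the ransomware going unstopped.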
AI vs Human: 100% AI-generated
Works as Intended: RAGE
Conclusion
Again, please remember that I approached this as someone who is a complete novice and wouldn’t know what lines required editing. I purposefully didn’t look too closely at the code. I wanted this to be an exercise in seeing how easily someone could get generative AI and LLMs to write completely functional code using only output from error messages and the behavior of the system.
As it turns out, generative AI can totally be used to write basic scripts, as long as you give it clear requirements. So that’s good. However, in my experience, the average chatbot will take quite a lot of back-and-forth prompting to fix errors and hallucinations. So that’s bad.
I was giving friends and colleagues play-by-plays as I was testing various iterations of the scripts while writing this blog, and the consensus opinion was that what I was able to accomplish on a whim was terrifying.
I’m not going to lie, I tend to agree. It’s scary that I was able to create the ransomware/data wiper script so quickly, but it took many hours, several days, 42 different versions, and even more minor edits to fail to stop said ransomware script from executing or kill it after it did. I’m glad the static analysis part worked, but that has a high probability of causing accidental deletions from false positives.
I just want to reiterate that I had my AI app generate my ransomware script while I was making lunch…