The AI Arms Race: Why Attackers Are Already Winning (And How to Catch Up)
- Keith Pachulski

Last week, I was sitting across from a CISO at a Fortune 500 company when he said something that made me pause: "We're just starting to explore AI for our security operations. It's exciting to think about the possibilities."
I had to break some bad news to him. While his team was "exploring possibilities," the attackers targeting his organization had been weaponizing AI for the better part of a decade.
Let me share what I've been seeing in the field. The gap between offensive and defensive AI capabilities is massive, and most security leaders have no idea how far behind they actually are.
A Brief History of AI: From Research to Weaponization
To understand how we got so far behind, it helps to look at the timeline. Artificial intelligence research began in earnest at the 1956 Dartmouth Summer Research Project, where John McCarthy actually coined the term "artificial intelligence." That workshop brought together brilliant minds like Marvin Minsky, Claude Shannon, and Nathaniel Rochester to explore how machines could simulate human intelligence. Their proposal was ambitious: "every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it."
For decades, AI remained largely in academic research labs, moving slowly from symbolic logic systems in the 1960s to expert systems in the 1980s, through multiple "AI winters" when progress stalled and funding dried up.
But here's where the story gets interesting from a cybersecurity perspective. While researchers were publishing papers and debating theory, criminal organizations quietly started seeing AI's practical potential around 2013-2015. The first documented AI-enhanced malware appeared around this time, with banking trojans using basic machine learning to adapt to defensive measures. By 2019, we saw the first known case of AI-generated voice fraud - the UK energy firm CEO impersonation I'll come back to in the deepfake case study below.
The real inflection point came with the release of accessible AI tools. ChatGPT's launch in late 2022 didn't create AI-powered attacks, but it democratized access to the technology. Suddenly, the criminal organizations that had been quietly developing custom AI systems for years were joined by an entire ecosystem of script kiddies with access to similar capabilities.
What I'm Seeing in the Field
Here's what I know from working these incidents: criminal organizations started pouring money into AI and machine learning around 2015. Not for white papers or lab experiments, but for systems that steal money and data every single day. They've had nearly ten years to refine their approaches while most enterprise security teams are still debating whether to implement their first SOAR playbook.
Look at the numbers. In my incident response work, AI-driven attacks have tripled year-over-year. These aren't kids messing around with ChatGPT. These are professional operations using custom-trained models, automated reconnaissance systems, and machine learning that adapts while you're trying to stop it.
The latest WatchGuard Threat Lab data backs this up - they recorded a 171% quarter-over-quarter increase in unique malware detections, the highest they've ever seen. Meanwhile, detections from their machine learning engine surged 323% because that's what it takes to catch these new threats.
Here's the kicker: 36% of security leaders admit that AI is moving faster than their teams can manage, and 48% believe a "strategic pause" is needed to recalibrate defenses. A strategic pause. While attackers are accelerating, nearly half of security leaders want to hit the brakes.
Even worse, 72% cite AI-related attacks as their top IT risk, but 33% are still not conducting regular security assessments for their AI deployments. We're calling it our biggest threat but not actually testing our defenses against it.
Case Study 1: The Adaptive Banking Trojan Landscape
The banking trojan threat landscape shows us exactly how AI is being weaponized against financial institutions. Look at what's happening with modern banking malware - banking trojan attacks on smartphones surged by 196% in 2024 compared to the previous year, with Kaspersky detecting more than 33.3 million attacks targeting smartphone users globally.
What makes this concerning is how sophisticated these trojans have become. TgToxic, a banking trojan first documented in early 2023, now uses advanced anti-analysis techniques, including obfuscation, payload encryption, and anti-emulation mechanisms that evade detection by security tools. The recent variants discovered in December 2024 rely on a domain generation algorithm (DGA) to create new domain names for use as C2 servers, making the malware more resilient to disruption efforts.
This evolution demonstrates exactly the kind of systematic learning behavior that I help clients detect in their own environments. Modern banking trojans don't just steal credentials - they adapt their methods based on what defensive measures they encounter.
Detecting AI-Enhanced Banking Trojans:
When I work with clients on similar threats, here are the detection methods I recommend implementing in their Splunk environments. Note that these searches require specific data ingestion, may need additional Splunk add-ons, and that the exact outlier/score field names produced by the MLTK fit command vary by algorithm and version - treat the final filter lines as templates to adjust:
Prerequisites: Banking transaction logs, DNS query logs, network connection logs. You'll need the Machine Learning Toolkit (MLTK) app installed for the ML functions.
Behavioral Baselining for Transaction Anomalies (the multipliers below are illustrative starting points to tune):
index=banking sourcetype=transaction_logs
| bin _time span=1h
| stats count as txn_count avg(amount) as avg_amount max(amount) as max_amount by user _time
| eventstats avg(avg_amount) as baseline_amount stdev(avg_amount) as stdev_amount avg(txn_count) as baseline_velocity by user
| eval amount_upper_bound=baseline_amount + (3 * stdev_amount)
| where max_amount > amount_upper_bound OR txn_count > (2 * baseline_velocity)
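If you want to prototype the same baselining logic outside Splunk before committing to a scheduled search, here is a minimal pandas sketch. The column names (user, amount, timestamp) and the 3-sigma multiplier are assumptions to adapt to your own transaction schema, not a specific product's fields:
# Minimal sketch of per-user, per-hour transaction baselining (assumed schema).
import pandas as pd

def flag_amount_outliers(df: pd.DataFrame) -> pd.DataFrame:
    """Return transactions more than 3 standard deviations above the user's hourly norm."""
    df = df.copy()
    df["hour"] = pd.to_datetime(df["timestamp"]).dt.hour
    baseline = (
        df.groupby(["user", "hour"])["amount"]
        .agg(avg_amount="mean", stdev_amount="std")
        .reset_index()
    )
    df = df.merge(baseline, on=["user", "hour"], how="left")
    df["upper_bound"] = df["avg_amount"] + 3 * df["stdev_amount"].fillna(0)
    return df[df["amount"] > df["upper_bound"]]

# Usage: suspicious = flag_amount_outliers(pd.read_csv("transactions.csv"))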
DGA Domain Detection: Since modern trojans use domain generation algorithms, you can flag algorithmically generated domains by their randomness. The entropy value here assumes the free URL Toolbox add-on (its ut_shannon macro), and IsolationForest assumes your MLTK version provides it - LocalOutlierFactor or DensityFunction are built-in alternatives. A short Python sketch of the entropy calculation follows this search:
index=dns sourcetype=dns_queries
| `ut_shannon(query)`
| eval domain_entropy=ut_shannon, domain_length=len(query)
| fit IsolationForest domain_entropy domain_length
| where isOutlier=1
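The entropy feature itself is easy to pre-compute outside Splunk if you don't want the add-on dependency. Here is a minimal Python sketch of the Shannon entropy calculation that makes DGA-style names stand out - standard library only, with invented example domains:
# Shannon entropy in bits per character; DGA-generated labels tend to score higher.
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    if not s:
        return 0.0
    counts = Counter(s)
    total = len(s)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

for domain in ["mail.example.com", "x7kq9zpt4vbn2r.com"]:
    label = domain.split(".")[0]  # score the leftmost label
    print(f"{domain}: entropy={shannon_entropy(label):.2f}, length={len(label)}")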
Communication Pattern Analysis: Detect the systematic probing behavior that indicates learning algorithms - here by building per-source hourly fan-out features and letting MLTK flag the outliers:
index=network sourcetype=conn_logs
| bin _time span=1h
| stats count as conn_count dc(dest_ip) as dest_count dc(dest_port) as port_count by src_ip _time
| fit LocalOutlierFactor conn_count dest_count port_count
| where is_outlier=-1
Case Study 2: The $25 Million Deepfake Conference Call
In early 2024, a multinational firm lost $25 million to one of the most sophisticated AI-powered social engineering attacks documented to date. According to Hong Kong police, a finance worker was tricked into joining a video conference call with what he thought were several other members of staff - all of whom were in fact deepfake recreations.
On the call, the deepfaked CFO instructed the worker to authorize the $25 million payment, and everything seemed legitimate until the employee later checked with the corporation's head office and discovered the fraud.
These documented attacks show us the sophistication level we're now dealing with. When I consult with organizations on similar threats, the key insight is that deepfakes aren't just about the technology - they're about exploiting established business processes and trust relationships.
The Scale of Deepfake Voice Fraud: This isn't an isolated incident. A UK-based energy firm was scammed out of $243,000 when criminals used AI to impersonate the voice of the chief executive of its German parent company. The cloned voice was convincing enough that the UK executive did not notice a difference. The attackers used commercially available AI software and collected voice samples from public sources.
Defensive Strategies I Recommend:
When I help clients prepare for similar deepfake attacks, these are the detection and monitoring strategies that work best. They require telephony metadata, financial transaction logs, and endpoint detection data, plus MLTK for the clustering and outlier searches:
Prerequisites: Call detail records (CDR) or telephony metadata, wire transfer logs, Microsoft Defender for Endpoint or similar EDR platform with process and network event logging.
Audio Anomaly Detection in Splunk: While you can't directly analyze call audio in Splunk, you can detect the behavioral patterns that precede these attacks:
index=telephony sourcetype=call_metadata
| bin _time span=1h
| eval off_hours=if(tonumber(strftime(_time, "%H"))>=22 OR tonumber(strftime(_time, "%H"))<6, 1, 0)
| stats count as call_frequency avg(duration) as avg_duration max(off_hours) as unusual_timing by caller_id recipient _time
| fit DBSCAN call_frequency avg_duration unusual_timing
| eval behavioral_anomaly=if(cluster=-1, 1, 0)
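For analysts who want to validate the clustering approach on exported call detail records before building it into Splunk, here is a minimal scikit-learn sketch of the same DBSCAN idea. The feature values are invented for illustration, and the eps/min_samples parameters are assumptions that need tuning on real data:
# DBSCAN labels points that fit no cluster as -1, which serves as the behavioral anomaly flag.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

# Per caller/recipient pair, per hour: call count, avg duration (seconds), off-hours flag
calls = np.array([
    [3, 180, 0],
    [4, 200, 0],
    [2, 150, 0],
    [25, 1900, 1],  # unusually frequent, long, after-hours activity
])

X = StandardScaler().fit_transform(calls)
labels = DBSCAN(eps=1.5, min_samples=2).fit_predict(X)

for features, label in zip(calls, labels):
    if label == -1:
        print("behavioral anomaly:", features)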
Financial Transaction Verification Patterns:
index=financial sourcetype=wire_transfers
| eval transfer_size_category=case(amount<10000,"small", amount<100000,"medium", amount>=100000,"large")
| bin _time span=1d
| stats count as transfer_count by user transfer_size_category _time
| fit IsolationForest transfer_count
| where isOutlier=1
Communication Platform Monitoring: Using Microsoft Defender or a similar EDR platform, monitor for suspicious meeting patterns. Note: this query uses Microsoft Defender for Endpoint (MDE) advanced hunting KQL, not Splunk SPL, and the process names and two-hour threshold are starting points to tune:
// Hunt for unusually long video conference sessions
DeviceProcessEvents
| where Timestamp > ago(24h)
| where FileName in~ ("Teams.exe", "ms-teams.exe", "Zoom.exe") or ProcessCommandLine has_any ("Teams", "Zoom")
| distinct DeviceId
| join kind=inner (
    DeviceNetworkEvents
    | where Timestamp > ago(24h) and RemotePort == 443
) on DeviceId
| summarize CallDuration = max(Timestamp) - min(Timestamp) by DeviceId, RemoteIP
| where CallDuration > 2h // Unusually long calls around urgent financial decisions warrant review
Case Study 3: The AI Phishing Revolution
The documented evolution of AI-powered phishing shows us just how dramatically the threat landscape has changed. Hoxhunt's research reveals that AI agents have now surpassed elite human red teams in creating effective phishing campaigns - a 55% improvement in performance from 2023 to 2025. By March 2025, AI reached what researchers call its "Skynet moment" for social engineering.
The scale of this shift is remarkable. There's been a 4,151% increase in total phishing volume since ChatGPT's advent in 2022, with a 49% increase in phishing attacks that bypass email filters. What's most concerning is that 82.6% of phishing emails now use AI technology in some form, and 78% of people open AI-generated phishing emails with 21% clicking on malicious content.
Real-World AI Phishing Operations
Attackers are using tools like FraudGPT and WormGPT - malicious large language models designed specifically for cybercrime. These platforms, advertised on dark web forums, help scammers create phishing campaigns without any guardrails. WormGPT can generate Python scripts capable of credential harvesting, while FraudGPT creates convincing phishing emails that appear to be from reputable companies.
What I find most concerning about these documented attacks is how they represent a fundamental shift in threat sophistication. When I help organizations assess their phishing defenses, we're no longer looking at the traditional indicators - poor grammar, generic templates, obvious social engineering tricks. AI has eliminated those tells.
Detection Methods I Deploy for Clients
When organizations ask me how to defend against AI-powered phishing, here are the detection strategies I implement in their environments. These require email security gateway logs, DNS logs, and domain registration data:
Prerequisites: Email gateway logs with message content analysis, DNS query logs, domain registration/WHOIS data feeds. The Random Forest algorithm requires Splunk's MLTK app.
AI-Generated Content Detection: Train a model to identify AI-generated text patterns. The text features here (entropy, sentence similarity, vocabulary diversity) are not native SPL functions - they need to be pre-computed and indexed alongside the message (see the Python sketch after this search) - and the MLTK classifier needs a labeled training set:
Train on labeled samples (assumes the text features exist at search time and an is_ai_generated label of 1 or 0):
index=email sourcetype=message_content
| eval vocabulary_diversity=unique_words/total_words
| fit RandomForestClassifier is_ai_generated from content_entropy sentence_similarity vocabulary_diversity into ai_phish_model
Then score incoming mail with the trained model:
index=email sourcetype=message_content
| eval vocabulary_diversity=unique_words/total_words
| apply ai_phish_model
| where 'predicted(is_ai_generated)'=1
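The text features in that search have to come from somewhere - typically a pre-processing step at ingest that writes them alongside the message. Here is a minimal Python sketch of those calculations using only the standard library; the field names mirror the search above, the sample message is invented, and sentence similarity is left out because it would need an embedding model:
# Compute simple stylometric features for an email body.
import math
from collections import Counter

def text_features(body: str) -> dict:
    words = body.lower().split()
    chars = Counter(body)
    total = len(body) or 1
    entropy = -sum((c / total) * math.log2(c / total) for c in chars.values())
    return {
        "content_entropy": round(entropy, 3),
        "vocabulary_diversity": round(len(set(words)) / max(len(words), 1), 3),
        "total_words": len(words),
    }

print(text_features("Your account requires immediate verification. Click the secure link below."))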
Behavioral Email Analysis - flag users who click suspiciously fast (under 30 seconds) on mail from senders they rarely interact with:
index=email_gateway sourcetype=user_interactions
| bin _time span=1h
| stats count as click_count avg(time_to_click) as avg_time_to_click by user sender_domain _time
| fit LocalOutlierFactor click_count avg_time_to_click
| where is_outlier=-1 AND avg_time_to_click < 30
Domain and Infrastructure Detection: AI phishing often uses newly registered domains and bulletproof hosting. This search assumes domain_creation_date has been added from your WHOIS/registration data feed:
index=dns sourcetype=domain_queries
| eventstats count as query_frequency by query
| eval domain_age_days=round((now()-domain_creation_date)/86400)
| eval tld_reputation=case(match(query, "\.(tk|ml|ga|cf)$"), "suspicious", true(), "normal")
| where domain_age_days < 30 AND tld_reputation="suspicious"
| fit IsolationForest domain_age_days query_frequency
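If your WHOIS feed lands outside Splunk first, the age and TLD checks are trivial to score during enrichment. A minimal sketch, assuming you already have a registration date for each domain - the field names and example domain are made up:
# Score a domain on registration age and suspicious TLD before it ever hits the index.
from datetime import datetime, timezone
import re

SUSPICIOUS_TLDS = re.compile(r"\.(tk|ml|ga|cf)$", re.IGNORECASE)

def score_domain(domain: str, creation_date: datetime) -> dict:
    age_days = (datetime.now(timezone.utc) - creation_date).days
    return {
        "domain": domain,
        "domain_age_days": age_days,
        "newly_registered": age_days < 30,
        "suspicious_tld": bool(SUSPICIOUS_TLDS.search(domain)),
    }

print(score_domain("login-verify-update.tk", datetime(2025, 6, 1, tzinfo=timezone.utc)))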
Why Defenders Are So Far Behind
After working dozens of these AI-powered incidents, I can point to three reasons why defenders are getting destroyed in this arms race:
First, the head start. Criminal organizations started dumping money into AI around 2015 because they could see an immediate return on investment. Enterprise security teams are just now getting budget approval for AI projects in 2024. That's almost ten years of development gap.
Second, the constraints. Attackers can deploy experimental AI systems with zero regulatory oversight, no compliance requirements, and no uptime guarantees. Defenders need systems that work 99.9% of the time, pass audits, and play nice with infrastructure that's been running since 2010.
Third, the data. Attackers have been collecting training data from successful breaches, dark web forums, and compromised systems for years. Defenders are usually working with cleaned-up, incomplete datasets that don't show how attacks really work.
The Cobalt State of LLM Security Report shows exactly how this plays out: only 21% of high-severity vulnerabilities found in LLM penetration tests get resolved. That's the lowest resolution rate across all test types. We're not just behind on implementing AI defenses - we're terrible at fixing the AI security issues we do find.
The Practical Path Forward
Here's what I tell every CISO who asks about closing the AI gap: you're not going to match their timeline, but you can use their own techniques against them.
Start with Detection, Not Prevention
The most successful AI implementations I've seen focus on catching AI-powered attacks rather than trying to prevent every attack. Use machine learning to spot (a minimal sketch follows this list):
Weird behavior that screams automated reconnaissance
Communication patterns that sound like AI-generated content
Network traffic that looks like coordinated, algorithm-driven activity
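To make that last point concrete, here is a minimal scikit-learn sketch of flagging coordinated, algorithm-driven traffic with IsolationForest. The per-source features and contamination rate are assumptions for illustration, not tuned values:
# IsolationForest marks outliers with -1; lower decision scores mean more anomalous.
import numpy as np
from sklearn.ensemble import IsolationForest

# Per source IP, per hour: connection_count, unique_destinations, megabytes_out
features = np.array([
    [40, 5, 2.1],
    [35, 4, 1.8],
    [38, 6, 2.4],
    [420, 190, 0.3],  # high fan-out, low volume: looks like automated reconnaissance
])

model = IsolationForest(contamination=0.25, random_state=0).fit(features)
scores = model.decision_function(features)
flags = model.predict(features)

for row, score, flag in zip(features, scores, flags):
    if flag == -1:
        print(f"possible automated activity: {row}, score={score:.3f}")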
Weaponize Your EDR and SIEM
Every major EDR platform now has machine learning built in. Stop buying new tools and start actually using what you already have:
CrowdStrike's behavioral AI for endpoint anomaly detection
Splunk's Machine Learning Toolkit for custom model development
Microsoft Defender's automated investigation and response
Build Adversarial Thinking Into Your Team
The security teams that beat AI attacks think like attackers. Run exercises where your team plays the role of an AI-powered criminal organization. You'll find holes in your detection fast.
What's Coming Next
Based on current trends, here's what I expect to see in the next 18 months:
Large Language Model Integration: Attackers will plug LLMs into their existing tools for real-time social engineering, automated vulnerability research, and adaptive evasion.
Swarm Intelligence: Coordinated bot networks that share what they learn across multiple targets and adapt based on collective experience.
Adversarial AI: Machine learning systems built specifically to beat other machine learning systems. It's going to be an ongoing arms race.
The data supports this trajectory. WatchGuard saw a 712% increase in new malware threats on endpoints in Q1 2025 alone, with attackers shifting away from traditional crypto ransomware toward data theft operations. They're also heavily leveraging encrypted channels - the share of malware delivered over TLS rose by 11 percentage points as attackers use obfuscation and encryption to challenge conventional defenses.
What's really telling is that while new network exploits decreased 16%, this isn't because attacks are slowing down. Attackers are focusing on a narrower set of proven exploits while simultaneously developing AI-enhanced methods that bypass signature-based detection entirely.
Here's Where We Stand
The AI arms race in cybersecurity isn't some future threat - it's been running for years while most organizations were looking the other way. The attackers have a huge head start, but the game isn't over.
What this means is that security leaders need to stop treating AI like some future nice-to-have and start implementing it as something they need right now. The longer you wait, the further behind you get.
I've been helping organizations close this gap for three years. The ones that succeed do three things: they focus on getting something working instead of waiting for the perfect solution, they max out their existing tools instead of buying new ones, and they treat AI adoption like fixing a water leak, not planning a kitchen remodel.
The question isn't whether you should start using AI in your security operations. The question is whether you can afford to keep waiting while the attackers keep getting better at what they do.
Take Action Now
Don't let this become another article you read and file away. The AI threat isn't coming, it's here, and it's accelerating every day you wait.
Start This Week:
Audit your current detection capabilities against the techniques I've outlined above
Install Splunk's Machine Learning Toolkit if you haven't already
Run one of the detection queries in your environment to see what you're missing
Schedule a tabletop exercise where your team plays the role of AI-powered attackers
Start This Month:
Assess your organization's exposure to deepfake and AI-generated phishing attacks
Implement behavioral baselines for your critical business processes
Review your high-value transaction approval workflows for social engineering vulnerabilities (a minimal policy-check sketch follows this list)
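For that last item, the control that actually stops the deepfake-CFO scenario is procedural, not technical: no high-value release without out-of-band verification and independent dual approval. Here is a minimal policy-check sketch of that idea - the threshold, channel names, and fields are assumptions to map onto your own approval workflow:
# Enforce out-of-band verification and dual approval before releasing large transfers.
from dataclasses import dataclass

HIGH_VALUE_THRESHOLD = 50_000  # illustrative; set to your own risk appetite

@dataclass
class TransferRequest:
    amount: float
    requested_via: str       # e.g. "video_call", "email", "ticketing_system"
    callback_verified: bool  # confirmed via a known-good phone number, not the requester's
    dual_approval: bool      # second approver signed off independently

def release_allowed(req: TransferRequest) -> tuple[bool, str]:
    if req.amount < HIGH_VALUE_THRESHOLD:
        return True, "below high-value threshold"
    if req.requested_via in {"video_call", "email"} and not req.callback_verified:
        return False, "impersonation-prone channel: callback verification required"
    if not req.dual_approval:
        return False, "high-value transfer requires independent dual approval"
    return True, "verified"

print(release_allowed(TransferRequest(25_000_000, "video_call", False, True)))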
Start This Quarter:
Build a comprehensive AI-enhanced threat detection program
Train your security team on identifying and responding to AI-powered attacks
Develop playbooks for the attack scenarios we've discussed
The attackers aren't waiting for the perfect AI solution. They're iterating, learning, and improving their capabilities every day. You need to do the same.
We've helped organizations across finance, healthcare, manufacturing, and government sectors implement the exact detection techniques outlined in this article. More importantly, we've helped them build the internal capabilities to evolve these defenses as AI threats continue to advance.
Don't wait until you're reading about your organization in the next AI attack case study. The time to act is now.
Best,
Keith Pachulski
Red Cell Security, LLC
📅 Book time with me: https://outlook.office365.com/book/redcellsecurity@redcellsecurity.org/