
Unlocking the Power of AI to Bolster Cybersecurity Defenses

I agree with US Representative Ted Lieu, who wrote in The New York Times, "I am enthralled by AI and excited about the incredible ways it will continue to advance society." In fact, it's one of the most exciting developments in technology (and cyber security) I've seen in decades.

However, my excitement cuts two ways: while society as a whole is likely to benefit greatly from artificial intelligence tools and technologies, so too will bad actors.

Despite predictions of gloom and doom, such as the potential for massive job losses, the opportunities for technology professionals are enormous: someone has to develop the software and tools, train the AI, learn to use the AI tools (and teach others), and create ways to protect us from potential abuse of those tools.

The AI hardware and software landscape is evolving faster than that of any technology in history. As a result, hardly a day has gone by in 2023 without new products, companies, or financial news about AI.

Image by DALL-E 2 from OpenAI. Prompt: "photo-realistic image of a 30-year-old hacker sitting in a coffee shop typing on a laptop and listening to music through earbuds".

What Is AI Good At?

To answer this, we need to look at two major families of AI models: generative and discriminative. The former is what the press has covered recently: tools such as ChatGPT and DALL-E from OpenAI fall into that category. Tools that recognize the content of photos or the words in a text belong to the latter family. Both can automate time-consuming tasks such as analyzing data, and all of these have potential applications to cyber security.

Generative tools can be used to create text, images, music, video, holograms, and voices, potentially in the style of a selected individual. For example, an image generator can create art in the style of, say, Matisse, while a text tool could "write" in the style of Shakespeare or E. E. Cummings. For the record, ChatGPT wrote the title for this post…

I asked the popular generative tool ChatGPT to write some code for me, a deque (double-ended queue) in particular. It produced a very textbook solution, as I expected. Programmers are already using the tool (and others like it) to write code or to repair issues with existing code.
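
For readers curious what such a "textbook" answer looks like, here is a minimal sketch along those lines. The class and method names are mine, not ChatGPT's exact output, and in real code Python's built-in collections.deque is the idiomatic choice:

    # A minimal, "textbook" deque sketch of the kind a generative tool produces.
    class Deque:
        def __init__(self):
            self._items = []

        def is_empty(self):
            return not self._items

        def add_front(self, item):
            self._items.insert(0, item)

        def add_rear(self, item):
            self._items.append(item)

        def remove_front(self):
            if self.is_empty():
                raise IndexError("remove_front from empty deque")
            return self._items.pop(0)

        def remove_rear(self):
            if self.is_empty():
                raise IndexError("remove_rear from empty deque")
            return self._items.pop()

        def __len__(self):
            return len(self._items)

    d = Deque()
    d.add_rear(1)
    d.add_rear(2)
    d.add_front(0)
    print(d.remove_front(), d.remove_rear(), len(d))  # prints: 0 2 1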

What Are Some Shortcomings of AI?

Despite the marketing hype and all the positive press AI and new AI tools are receiving, there are shortcomings. For me, the most significant is the age-old computer adage of "garbage in, garbage out," or GIGO: AI is only as good as its training data. And this applies to both generative and discriminative tools.

For example, a tool to write blog posts in the style of John McDermott can only do it if the tool is trained with examples of my style and knows to associate it with my name. (Such an exercise is on my to-do list.) Likewise, a translator app can only process Urdu text if it has been trained to process it. Ark Invest’s Cathie Wood emphasized the importance of datasets used to train AI in an interview reported in Business Insider on March 8th of this year.

Another issue is that bits of the originals used for training (including watermarks and signatures) sometimes appear in the works created by an AI tool. Some generated images have reportedly been almost identical to the training images. Some artists are selling such images as a way to test the legal environment. My wife is a fine artist, so this is of particular interest to me.

To complicate matters, AI is not yet very good at detecting whether content was AI-generated or created by humans. This is a particular issue in cyber security. For instance, it would be highly beneficial if email software could identify and flag AI-written emails as potential phishing. But, of course, legitimate marketers could also use AI to write emails, so that alone wouldn't be a decisive indicator.


And AI tools make mistakes. For example, I asked a popular tool about high-altitude baking (I live in the mountains). Interestingly, two adjacent paragraphs disagreed about the need for a particular technique. Most AI tools that write articles or blog posts are promoted as producing a "starting point," not a finished product. Over time, accuracy will likely improve significantly.

Both the positive and negative aspects of AI have implications in the cyber security domain.

So Is AI Good Enough to Use For Cyber Attacks?

Before I go on, I need to say that both good guys and bad actors can use these techniques. Many of the good guys are governments, law enforcement in particular. They use the same methods to attack the bad actors as the bad actors themselves use. Understanding these tools and methods is critical to creating solid defenses.

Perhaps the most straightforward use of AI in this context is impersonation. This is only a single aspect of using generative AI technology in social engineering. (We discuss social engineering and other attack vectors in Learning Tree's Information Security course.) There are two significant aspects to this impersonation. The first is sounding like someone the impersonator is not. Consider receiving an email pretending to come from a bank or card issuer. Instructions on detecting fakes include checking for spelling errors, misused language, and lousy punctuation. If the software could ensure that the email contained none of those errors and sounded like a message genuinely crafted by the claimed sender, it would be far more likely to convince its intended victims.

Now imagine receiving such a message specifically targeted at you (this would be "spear phishing," or "whaling" for high-value targets). The second aspect is that the attacker could use not just your personal information, such as name and address, but also your web presence to learn about your family, trips, and other personal details. The AI tools could aggregate whatever they found. This could be done with other, more traditional tools, but it is significantly easier with AI.

Much of this depends on access to the datasets necessary to train the AI. While many websites' terms of service may explicitly prohibit "scraping" or "extracting" data, that is unlikely to deter those who want to undertake such an attack.

And on a more frightening note, an attacker could create a voice or even video impersonation of an individual. For example, an employee could hear a message sounding like her boss asking her to do something that might make attacking an organization easier. Likewise, a video of a company official might be challenging to refute if the images actively mimicked those of the real person.

Another use of AI tools could be to help attackers avoid detection or to target organizations. There are already tools to detect attacks. What if bad actors could use AI to test their attack tools against detectors and find ways around them? That might make their tools more difficult to detect and their attacks more difficult to discover.

Another use is to look for weak defenses. For example, the tools could search for vulnerable organizations. One advantage is that AI tools can search quickly and efficiently for many vulnerabilities, and the search could be more effective than with existing tools: an AI tool could rapidly look at combinations of weaknesses that could be exploited together.

As I was finishing this post, an article appeared at pcgamer.com explaining an easy way for attackers to access the API of ChatGPT to have it write malware and phishing emails. Sure, it costs a few dollars, but imagine the potential ROI. You can bet the bad guys can figure it out.

There are myriad other potential uses for AI in attacks. The world’s militaries are wisely investigating those uses.

What Are Some Countermeasures?

As AI tools improve, detecting malicious activities will become more difficult. Some current intrusion detection systems use (or at least claim to use) AI tools to detect intrusions. The bad guys know that. They could find ways to poison the data used for detection or learn to defeat the detection mechanisms.

One thing AI tools are very good at (as I mentioned at the beginning) is detecting and discovering patterns. That's why threat detection systems use AI to review system logs or real-time system activity. They can discover not only the patterns of attacks but also exceptions to those patterns. AI professionals call those outliers.
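
As a rough illustration of the idea (not how any particular product works), here is a minimal sketch using scikit-learn's IsolationForest to flag outliers in hypothetical per-session log features. The feature names and numbers are invented for the example; real systems use far richer features and streaming pipelines:

    # Toy outlier detection over invented per-session log features.
    import numpy as np
    from sklearn.ensemble import IsolationForest

    # Each row: [failed_logins, megabytes_sent, distinct_ports_touched]
    sessions = np.array([
        [0, 1.2, 3],
        [1, 0.8, 2],
        [0, 1.5, 4],
        [0, 0.9, 3],
        [25, 300.0, 120],   # looks nothing like normal activity
    ])

    model = IsolationForest(contamination=0.2, random_state=42)
    labels = model.fit_predict(sessions)   # 1 = inlier, -1 = outlier

    for row, label in zip(sessions, labels):
        if label == -1:
            print("Possible anomaly:", row)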

We often think of pattern detection or classification as tools used to decode handwriting, "understand" speech, "read" documents, or perhaps find faces in a crowd. But some tools can detect much more subtle patterns. One example is referred to as "emotional AI." These tools can detect a range of emotions in speech, writing, and more. Moreover, while versions of such tools can be used to simulate emotions, those simulations may themselves be discernible from genuine human emotions, making even natural-sounding AI-generated voices detectable.
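
To make the idea concrete, here is a tiny sketch using the Hugging Face transformers pipeline. Its default sentiment model only scores positive/negative tone, a simplification of full "emotional AI," but richer emotion classifiers are used the same way:

    # A toy illustration of text-based affect detection, not a production
    # "emotional AI" system; the default model scores positive/negative only.
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")

    messages = [
        "I urgently need you to wire the funds before noon or we lose the deal.",
        "Thanks for the update, no rush at all on this one.",
    ]

    for text, result in zip(messages, classifier(messages)):
        print(f"{result['label']:>8} ({result['score']:.2f})  {text}")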

Similar techniques could also be used to discover the cadence and predictability of written documents. While some AI-written text may be difficult to detect, other instances appear easier to spot with the right tools.
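
As a toy illustration of "cadence and predictability" (real detectors rely on language-model perplexity and much larger feature sets), this sketch computes two crude signals, sentence-length variation and vocabulary diversity, that are sometimes cited as being flatter in machine-written prose:

    # A toy cadence/predictability check, purely illustrative; genuinely
    # detecting AI-written text requires far more sophisticated methods.
    import re
    import statistics

    def cadence_features(text):
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        lengths = [len(s.split()) for s in sentences]
        words = re.findall(r"[a-zA-Z']+", text.lower())
        return {
            "sentence_length_stdev": statistics.pstdev(lengths) if lengths else 0.0,
            "type_token_ratio": len(set(words)) / len(words) if words else 0.0,
        }

    sample = ("AI is useful. AI is powerful. AI is everywhere. "
              "AI is useful for many tasks.")
    print(cadence_features(sample))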

One final suggested option is government regulation; OpenAI's Chief Technology Officer has floated that possibility. While some regulation seems likely, its impact may not be as significant as desired. Regulation hasn't stopped, and in some cases hasn't significantly curtailed, other types of crime. It is difficult to believe it will be a significant countermeasure in this case.

Where Might This Go?

Predicting the future is notoriously tricky, but some notable trends may be good indicators of where things are heading.

  1. New datasets will be created for training AI tools. Current tools have used massive datasets of openly accessible data, but lots of data is not publicly available. For example, tools trained on archives of historical newspapers could help researchers and historians immeasurably.
  2. Creating new datasets and training the AI tools will require people to curate and manage the training, at least in the short term.
  3. People will need to learn to ask AI tools the right questions or to request the right tasks. While anyone can write a one- or two-sentence query or request a simple image, precise results require precise instructions. The natural-language interface of tools such as ChatGPT makes them approachable, but careful use of language produces better results more quickly. The processing behind that interface also has an API, so programmers can access it directly (see the sketch after this list). Those programmers will need to learn and be trained, and that means work and jobs for more people.
  4. Impersonation will increase. As the tools become easier to use and bad actors recognize how easily they can accomplish their "work," they are likelier to use the tools for nefarious purposes. Likewise, good guys could use impersonation for legitimate tasks, such as recreating deceased actors in films.
  5. Some tasks will be performed more quickly. For example, searching large datasets will become more efficient as queries become easier to formulate. That can make research of many kinds easier, so journalists, historians, writers, and others will reap significant benefits.
  6. A proposal in Fast Company has suggested Chief AI Officers for significant enterprises. If the idea catches on, it could help increase AI adoption in major companies.
  7. The growth of AI will require new skills in the workforce. This will require education, starting early. Unfortunately, some schools and districts have chosen to ban ChatGPT. Writer and advocate Kerry McDonald has pointed out why that's a bad idea. Children must learn to use these tools properly, safely, and for honest ends.
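
As promised in item 3, here is a minimal sketch of calling the chat model's API directly from Python. It is illustrative only: the model name, prompt, and response handling reflect the interface as it stood in early 2023 and may change, so check OpenAI's current documentation:

    # A hedged sketch of a direct chat API call (interface as of early 2023;
    # the model name and response shape are assumptions that may change).
    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]  # never hard-code API keys

    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a concise security analyst."},
            {"role": "user", "content": "List three signs that an email may be phishing."},
        ],
        temperature=0.2,  # lower temperature gives more predictable answers
    )

    print(response["choices"][0]["message"]["content"])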


The second half of Representative Lieu's quote is quite telling: "And as a member of Congress, I am freaked out by AI, specifically AI that is left unchecked and unregulated."

I'm not "freaked out," but there are some significant possibilities for its harmful use by bad actors. As I mentioned above, regulation may help, but I doubt it will be a solution. Instead, people need to be more informed about what AI tools can and cannot do, vendors need to be transparent, and effort needs to be put into countermeasures.

For More Information

This post could have been much longer; there is much more to say about AI and AI/cyber security issues. Here are a few links for those who want to learn a bit more:


Interested in building your expertise in this technology? Enroll in our Machine Learning, Data Analytics and AI Courses today!