Ensuring Enterprise Security by Safeguarding AI Agents (Steve Wilson) Artwork

What’s the BUZZ? — AI in Business

“What’s the BUZZ?” is a live format where leaders in the field of artificial intelligence, generative AI, agentic AI, and automation share their insights and experiences on how they have successfully turned technology hype into business outcomes.

Each episode features a different guest who shares their journey in implementing AI and automation in business. From overcoming challenges to seeing real results, our guests provide valuable insights and practical advice for those looking to leverage the power of AI, generative AI, agentic AI, and process automation.

Since 2021, AI leaders have shared their perspectives on AI strategy, leadership, culture, product mindset, collaboration, ethics, sustainability, technology, privacy, and security.

Whether you're just starting out or looking to take your efforts to the next level, “What’s the BUZZ?” is the perfect resource for staying up-to-date on the latest trends and best practices in the world of AI and automation in business.

**********
“What’s the BUZZ?” is hosted and produced by Andreas Welsch, top 10 AI advisor, thought leader, speaker, and author of the “AI Leadership Handbook”. He is the Founder & Chief AI Strategist at Intelligence Briefing, a boutique AI advisory firm.

All Episodes

What’s the BUZZ? — AI in Business

Ensuring Enterprise Security by Safeguarding AI Agents (Steve Wilson)

August 22, 2025 • Andreas Welsch • Season 4 • Episode 18

What if your AI agents could not only collaborate but also autonomously safeguard your enterprise?

In this episode, host Andreas Welsch sits down with Steve Wilson, Chief AI and Product Officer at Exabeam, to explore the nuances of securing agentic AI and the emerging landscape of multi-agent communication.

Together, they discuss how businesses can enhance their cybersecurity measures while adapting to the rapid evolution of AI capabilities, from agency levels to ethical governance.

Learn about practical strategies for evaluating AI agent frameworks, mitigating insider threats, and implementing security-first approaches in a world increasingly dominated by AI:

What are the internal and external threats and challenges of AI agents and Agent-to-Agent (A2A)/ multi-agent systems?
How can organizations defend against those threats?
What can also go right with Agentic AI security?
What do AI and IT leaders need to do to ensure enterprise security despite all the Agent sprawl?

Ready to navigate the complexities of AI security? Don't miss this insightful conversation and tune in now to tap into the potential of secure AI agents for your business!

Questions or suggestions? Send me a Text Message.

Support the show

***********
Disclaimer: Views are the participants’ own and do not represent those of any participant’s past, present, or future employers. Participation in this event is independent of any potential business relationship (past, present, or future) between the participants or between their employers.

Level up your AI Leadership game with the AI Leadership Handbook:
https://www.aileadershiphandbook.com

More details:
https://www.intelligence-briefing.com
All episodes:
https://www.intelligence-briefing.com/podcast
Get a weekly thought-provoking post in your inbox:
https://www.intelligence-briefing.com/newsletter

Andreas Welsch: 0:30

Hey, welcome to What's the Buzz, where leaders share how they have turned AI hype into business outcomes. Today we'll talk about how to strengthen your enterprise security for agentic AI and multi-agent to agent communication, and who better to talk about it than someone who is actively working on that. Steve Wilson. Hey Steve. Thank you so much for joining.

Steve Wilson: 0:51

Hey Andreas, thanks for having me back. I'm excited for the conversation today.

Andreas Welsch: 0:55

Wonderful. Hey you're my go-to security expert for anything around AI you've been on the show, I think now five times or something like that. And we always have so much to talk about, and I love bringing you back because you see that the topic keeps evolving. Every quarter, every six months, there's something new to talk about, something that's really important. But before I get too excited, maybe you can tell our audience a little bit about yourself, who you are and what you do.

Steve Wilson: 1:23

Yeah, so I think I wear three or four different hats that are relevant to today's conversation. So my main hat that I wear is I'm the Chief AI and Product Officer at Exabeam, which is a cybersecurity company that's been using AI and machine learning for 10 years to improve the cybersecurity stance of our customers. And we've been shipping. LLM based copilots and agents as part of our product for 18 months now. So I have a lot of experience building those hands on. The other things is I founded something called the OWASP GenAI Project, which is if you are unfamiliar with OWASP, it's an. Open Source Foundation dedicated to building secure software. And about two years ago we started developing guidance on how to build secure software with AI. Super hot topic for the last couple years. And and then also I think Andreas and your book in my book came out the same week. I think it was right around the same time. I did write a book for O'Reilly called the Developer's Playbook for Large Language Model Security.

Andreas Welsch: 2:35

That's awesome. And again, like I said, I think it was about two years ago that I first reached out to you when I saw the OWASP Top 10 for Large Language Models. And it's just amazing to see how things have progressed and how quickly they've evolved from that initial list. Yeah, so again, super excited about the conversation we'll have today. Why don't we play a little game to kick things off? What do you say?

Steve Wilson: 2:57

Let's do it.

Andreas Welsch: 2:58

Alright. So in good, old fashioned, right? I'll hit the buzzer, the wheels will start spinning. When they stop. I'd love for you to answer with the first thing that comes to mind and why, in your own words, and to make a little more interesting. You only have 60 seconds for your answer. For those of you who are watching us live, drop your answer in the chat and why as well. Steve, are you ready for, What's the BUZZ? Let's do it. Okay. So here we go. If AI were a band, what would it be? 60 seconds on the clock go.

Steve Wilson: 3:30

All right. I'm gonna go with my dad's favorite band which is The Beatles. And I'm not old enough to remember this moment, but I remember my dad talking about it was there was a very sudden change from before anybody heard of The Beatles. And then one night they were on a big TV show in the United States and everybody knew about them. And that was like the ChatGPT moment for The Beatles. After that they were ubiquitous for a few years, but they always kept reinventing themselves. It's not like the Beatles music from 1969 sounded like the Beatles music from 1964. It sounded totally different. They kept reinventing themselves, but it was always the same thing, and at some point they disappeared. But they didn't really, at some point, you can just listen to all of the popular music today and you can hear the Beatles and all of it. And I expect the place that where we're going is we're gonna start, we're gonna stop talking about AI companies. Like we stopped talking about web companies. There will just be companies and just like we all use the web today, everybody will use AI tomorrow.

Andreas Welsch: 4:40

I love it. That's such a great analogy, right? It is really this evergreen thing to, some extent, but also something that deeply influences everything else that comes after it. Great analogy. Speaking of these companies and at some point we won't talk about AI companies anymore at the moment we do, and we are seeing them bring more and more AI agents on, one hand into the market, and then those that adopt them, bring them into their organizations. We see players like Google. They've announced protocols to enable agent to agent communication and multi-agent scenarios. Many others in, the industry are getting behind it. It's great to see that there's some collaboration. Everybody realizes. We want to have these agents communicate with each other and, like I said, we've talked about the OWASP Top 10 for LLMs. Previously, we talked about how that's evolved for Gen AI, and now here we are talking about how organizations that are letting LLMs and steroids loose on their systems and data are facing the next set of challenges. So I'm, curious, what are you seeing when you talk to organizations around this topic of agents? Do they even have that on their on their radar yet? And what are the internal external threats and challenges?

Steve Wilson: 5:52

So obviously everybody's talking about agents you can turn on CNBC, which is the Financial News Network in the us and all they talk about is AI and agents and things like that. But I think often people don't have any idea what they mean when they say I agents. It's just the thing they say now instead of AI as they say agents. But I think you need to dissect the word a little bit to really get to the heart of it, because it's not like the concept is completely new. If you, get under the word AI agent or even the more complicated hip word agent. It's all about how much agency does something have, and we've had concepts about granting agency for a very long time. I grant my lawyer agency to file papers on my behalf. That's, where the, basic concepts come from, is you're giving something the rights to do things on your behalf and. The way I look at agency is it's not a binary switch. It's not like it has agency or it's not, it's like a volume knob. It's like I'm giving this very little agency, and that's your classic chat bot use case. It might have the agency to represent you as a company if it's your customer service chat bot. But at the end of the day, the worst it could do to someone is call them a bad name. But if we start to give it tools and give it access to do actions that are un undoable all of a sudden there's a lot more agency in these things. Do we give them the capability to act on their own Chat bots never act on their own. You prompt them and they respond and it's always that very simple interaction. These agents. Tend to have longer running processes, and those could be. Seconds to minutes, or they could be hours to days, although those are very rare at this point. But I think that's where people think it's going. And from a security perspective, the way I like to think about it is there's a grid. It's like, how much agency do I give something? And how smart actually is it? How capable is it? And. You know where you get into these real danger zones are what, in the first version of the oasp guidance we call excessive agency, which means I have a thing that's not that smart, but I'm giving it a lot of capability. I'm asking for trouble, I'm asking for security problems to happen. And that might be because it can access things inside my company or outside my company that are important and it opens you up to risk and exposure on the other hand. These agents are, the models underneath them are so much smarter than they were even nine months ago, much less two years ago, that I can give them more interesting jobs. And so at Exabeam now, it used to be our co-pilot was an assistant and if somebody had a question about something, they could ask the assistant. Now when, our low level AI algorithms detect a problem. The agent runs off and, does a complete security investigation and comes back with a report and presents it to the user where it's done a tremendous amount of work before it was asked. And I think those are the kind of use cases where it can be really powerful as you do give it some initiative. But you have to be careful about how much actual agency you give it. And then you get into a lot of questions about where do humans come into the loop? How much do you let it do supervised versus unsupervised?

Andreas Welsch: 9:40

I have a follow up question there. A couple years ago I was doing some, work with clients in, aerospace and the biggest threat to our organization is not so much the external threats. They're big and significant too, but it's actually the insider threats of people that we already know are on the network, that are in the company. They may be in the four walls of the company that have access to information and that they might funnel. Somewhere else for malicious purposes. Now there, there have been tools to detect insider threats. There are things like data loss prevention these kind of things. Now, all of a sudden, in addition to people you have agents on your network that can do this, not just by clicking manually, but automating these things as much as one agent, but dozens or hundreds or thousands, if you will, maybe they spawn new ones and, what have you. How, are organizations able to defend against those threats? How does security need to change? We've previously talked about zero trust architecture and, these kind of things. How, do all of these factors play into it?

Steve Wilson: 10:46

By the way, that's a great question. First place to start is. Insider threats and compromised credentials, which are two sides of the same coin, are the hardest to detect most insidious security threats that every CISO worries about. And classically the vectors for those were. Maybe you had a disgruntled employee, but more likely what you had is somebody who'd been the victim of a phishing attack who lost their, credentials, and then somebody who's on your network who doesn't belong there. What's interesting about the agent ones is, as you do give, say, a set of agents, the rights to do work on your behalf, on your network that, does become its own cut side of insider risk. What I can tell you though is what, I see in practice is the bar for what people call an agent right now for what people are really deploying is far lower. And in a lot of cases what they're saying is and, it, this doesn't mean that they're lesser, it just means they're, let's call them less risky use cases. But even, at that, we have to manage them. Somebody will say, Hey, wouldn't it be great if we had an agent that was part of the HR department that could answer HR related questions for our employees? And you know what it's really a classic, call it a copilot use case, but you give it all the policies and all the stuff nobody knows how to find and nobody knows how to read and you, hook it up with RAG and other things we've come to understand over the past couple years and you put it on the network where people can come to it and interact with it and it doesn't have a huge amount of agency, but what it does have. Is access to a potentially a lot of data. And that's where the, first line of defense really comes in. And this is what I talk about with any of these LLM based systems is the first thing you need to think about is how much data are you giving it access to? Because the first thing you have to understand is. Even the more advanced models are not great at decision making, they're easy to trick. And if you give it access to data and your security defense is something in the system prompt that says, please don't give this information out, that is never going to work. It is never gonna be sufficient. So the first line of defense is information management. You say, for my agent, for what it needs to do its job, what's the minimal version of that job that it can do, and what's the minimum amount of data that it needs? And then you can do a classic risk assessment around that. It's, it then becomes when you do get to these agents and you want them to do work there could be one where it says. Take that AI HR agent for example, could be, it now gets a set of actions that it can do. You're like, I would like to up update my tax withholding. There's a version of that you just give the agent access to the API set for. Day force or whatever your HR system is, and you say I'll just let it figure out the code to write when somebody asks to do something. And it could do it. And it could do anything that they could do, or worse yet it could do anything the administrator could do. That's probably not a good idea, but. What are the, low risk actions that they could do that are undoable later or confirmable, or does your request get queued up and prepared for somebody in HR to maybe just briefly review and approve so they don't have to do the action? You don't have to get ahold of them, but it becomes a broker with a human in the middle

Andreas Welsch: 14:59

Now. I am as I'm listening to what you're sharing, I'm thinking especially with things like agent to agent communication, multiple agents or multi-agent systems, what are the risks of one agent constructing another to reveal more than it's it's a poster, right? It's A2A is a communication protocol. Yeah. Like we have a protocol when humans speak. Usually when one person speaks, the other one listens and responds and vice versa. But the information that is transported, what am I trying to get you to do? Is a totally different one. So I hear information management is one critical piece and, locking down the amount of information in the systems and so on, not to expose it. What are you seeing when agents communicate with each other?

Steve Wilson: 15:44

So it's fascinating because I think there's two kinds of protocols and it's a big source of confusion. So it's worth outlining what the two big protocol categories are right now, and you've. You've probably talked about one of the other ones on the show, but when you say agent to agent communication and there's, some of these emerging standards, like A two A from Google. Those are honestly seeing very limited adoption right now. There's a lot of experimentation going. I would say there's very little actual production usage of these things. What has the communication protocol though, that has stormed the beaches and is being used everywhere, whether wisely or not, is tool usage protocols, and in particular MCP, which is the model context protocol, and. So when people think about building agents, one of the ways that they now give their agents agency is they give them tools, right? It's classically, our LLM has just been a brain and a mouth.'cause it was a chat bot and that's all it could do. Now it has fingers.

Andreas Welsch: 16:53

Yes.

Steve Wilson: 16:55

And that's the first thing to really look at is if you're going to, if you're going to start to use MCP to give your agents. Fingers and hands. There's a, lot of considerations there. There's a lot of security considerations. There's a lot that's been written on the, topic of how do you think about what agency you give them, but also where do you get these tools from? And it, it adds a whole supply chain set of conditions. The thing I will say that I think is coming though when we, look at agent to agent, the first hurdle to getting there is having more than one agent. I'd say the use cases where we see some agent to agent interactions are often completely internal with bespoke agents. So inside a company I might have several agents. That I might construct to build a swarm. It might be multiple instances of one agent. So one of the first places that we're seeing with this is writing code. And you see a lot of talk these days about spinning up basically multiple instances of an agent like Claude Code. Which work together on a code base, like a team of engineers, each taking different jobs out of a queue and processing those in parallel. I'd say that's one of the early places that we're starting to see. This is multiple kind of instances of the same agent working together. The next one is within a company, bespoke agents that are in roles and within the product at Exabeam, we now have. Three or four different agent types in the product. They actually don't really interact with each other yet. They're more like custom agents for vertical use cases. I think the things that we're gonna see develop though and this is where I think people should start to look, is the first thing that we're gonna need is we're gonna need reliable. Basically discovery services for agents. The, thing that the internet needed before the web was really viable was it needed DNS, and we're starting to see. A NS like proposals for hey, here's how I publish that. I have an agent that can carry out certain kinds of services. What is promising about these is they're often much better thought out than DNS from a security perspective. Sure. These early internet services, nobody took. Trust into account. And it's the reason that the, internet is the train wreck that it is it was just designed to share information between universities. Why would you wanna limit it?

Andreas Welsch: 19:50

But see I love that on one hand, these concepts are seeing revival. I've also been talking about central registry or, discovery for, agents for a bit and seeing that this is materializing and, maturing I think is great. So at least this time around. With everything that we know over the last 50, 60, 70 years, we know how to make it better. And yeah. How to design for our security first principles, at least in theory we should know.

Steve Wilson: 20:16

Yeah. But I think when, you think about these multi-agent cases where your agent might be interacting with an untrusted third party agent, a lot of the practices are gonna be similar to interacting with untrusted humans. And today when we put our, chat bot on the web or our, agent on the web to interact with humans you, have to take this very hard zero trust approach that you're looking at guard railing, these. Things multiple ways. One of the things that I put out in the last couple months was I created a new open source project that you can go find on GitHub called Steve's Chat Playground. And it's the embodiment of the top 10 list in the book where there's a set of chatbots and you can pick different chatbots and some of them have built in vulnerabilities. There's check boxes where you can install guardrails. Guardrails to look for prompt injection or look for unintended code generation on the backend. And if I'm building an agent that I want to interact with untrusted things, whether that's reading email that I'm getting from an untrusted source or taking API requests from an untrusted source, I need to be screening everything that I can using very traditional methods on the way in and the way out.

Andreas Welsch: 21:39

I like that. That's really, practical and I, didn't know you, you had put up the, playground, so I, need to check that out myself. So it's sounds like a great and helpful tool to, see how additional guardrails strengthen your security there. Yeah. Now we've, talked quite a bit about the things that can go wrong and where all the risks are, but I also want to make sure that we talk about the things that can actually go right. When you bring agents into your business, when you look at security what are you seeing there? How are companies doing them really well that are embarking on their agenda, AI journey that are bringing some agents or maybe some more agents in?

Steve Wilson: 22:38

I think there's two big categories to think about. When I talk to people about their approach to this and there's I'll put it in, I work at a big company and I'm trying to figure out how to make my company better, and then all the way, at the other extreme, it's like, how could I make my company. Act like a big company even though it's not. And and there are two very different approaches that may involve a lot of the technology stack. But let's talk about what I'll call the hard case first, which is I have a, battleship and I'm trying to steer it. I work at a global 2000 company and I see. My smaller competitors moving faster, doing things that are AI native and deploying some of these agentic technologies in efficient ways. And I think what, you do see are these very vertical agent stacks that are starting to become productive. The coding one is by far the most mature, even though it's by far the scariest. And the one with the most agency, so to speak. But we've had cases in cases I've been involved with at our own company where we have, one product has a 20-year-old code base. It is the, s gnarliest scariest thing. And there are parts of that code base, no human wanted to touch, but it was, problematic. There were. Things the customers didn't like, and we had one brave engineer weighed in with Claude code and a positive attitude came back four weeks later with this thing rewritten in ways people never thought. They thought, man, if I dedicated a team of 10 for three months to it, I don't know that I could do it. And, they went in and they did something and it meant something to the business. And we're, doing more and more of that now. And I know people across companies are doing that with the software development piece all the way over on the marketing side. We've seen people generating marketing collateral forever. What we're just on the early part of is things like, agent based sales prospecting it's. Your, sales, your main sales manager persona, that's one of the most expensive single units that you have at the company. These people may make a million dollars a year. You don't want them cold calling things on the phone. So traditionally you had given them one or two low level assistants who would place their calls and coordinate meetings. We're, starting to see the ability to replace those or accelerate those. And that, that kind of leads me to that other case, which is I wanna be a one person startup and we hear multiple people talk about it. And we will see. A billion dollar valuation company coming out of one human soon surrounded by agents. And we're gonna see somebody who has an idea for the product and they're gonna use agents to build the first version of the product. They're gonna use agents to sell the first version of the thing that they build. And they're gonna use software to, to do the basic. Compliance needs that they have for filing your taxes and the, those basic things and those are all within reach and we see people really doing that. And I think we know we're in the place where we've, crossed a line where we see what I'll call 20 person startups who are well over a billion dollar valuation. And that would, never have been possible two years ago. We're on our way to a one person startup.

Andreas Welsch: 26:43

That's pretty exciting, right? Having all these capabilities at, your fingertips and whether it's something that you built yourself or it's a collection of, different tools that you assemble and, certainly if you're a one person startup, right? It's pretty straightforward. They're pretty short communication pathways between your head and your hands, probably. Yeah. So just, seeing where that goes nonetheless. At some point they also need to think about security and, how to secure the data that they generate, that users give them and so on.

Steve Wilson: 27:14

You, you do and I know that we see that. Look vibe coding is, part of this, and there have been so much controversy out there about the security of vibe coded things. The one, piece of hope I will give everybody, having worked in what they call AppSec for years, which is just the industry that helps people secure their code, is the code that gets written today by teams of humans is not secure. By and large, we try and we should try and we fight that battle. But that battle is hard and humans are bad at writing secure code. So what agents let us do is often what humans do much faster. The piece of hope that I have going forward is there's so much investment in these AI aided coding. Things. What I could never get the human coders to do was really care about security. They really, they wanna build new stuff. They wanna build exciting stuff. They wanna build new features. The bots don't really want to do anything. And if while we're training them, we incent them to write secure code, they'll write secure code. And I think in the next year or two years, we will rapidly cross the point where he said I don't want the humans doing the securing of the code. I want the next generation of AppSec tools, which are artificially intelligent to be the ones fixing those bugs, not the humans.

Andreas Welsch: 28:43

Talk about rapid development in that space. Speaking of rapid. Seems like last half hour has gone by in no time. Steve, I was wondering if you could summarize the key three takeaways for our audience today before we wrap up.

Steve Wilson: 28:57

Sure. So I think first things first, understand the level of agency that you are giving to your agents. Once you've moved past just being a mouth and the thing has fingers. Think about what. You're allowing that to do on your behalf, and what mechanism is it gonna use to do that? Think about the intelligence level that your bot has. It's just as likely for you to get yourself in a high risk situation by the bot making a bad decision as by, as being attacked by a malicious. Third party. But the malicious case makes it even harder. So you really need to think about what data do I have access to? What privileges am I giving it to execute on those decisions. And then the third one though is be aware of the opportunities. Because the opportunities are massive. You can't look at the first two and say, we don't know how to fully secure these systems. So I'm not, I'm just gonna ignore it. You do that and your name's gonna be Blockbuster or Sears Rose. Sears Roebuck in a few years. You need to start to learn how to do this. Just be really conscious about those things in terms of data access and agency levels, find appropriate use cases, and start to move forward and I think you can change the trajectory of your business.

Andreas Welsch: 30:27

I love that it's very tangible ad advice sums up our conversation really, well. Steve, thank you so much for joining us and for sharing your experience with us. Again, I'm always amazed how quickly this topic of security evolves and what all the different aspects are that now come into focus. So hopeful for those of you in the audience, you are already also have a better understanding of it. So Steve, again, thank you so much.

Steve Wilson: 30:49

Thanks Andreas. It's always a pleasure.

Andreas Welsch

Host