Prompt injection is easy to underestimate when the model can only answer with text. The worst outcome looks like a bad summary, a leaked instruction, or a response that followed the wrong source.
In a tool-using agent, that assumption breaks fast.
A compromised document, support ticket, website, email, or retrieved chunk can influence the model’s next tool call. If that tool call can touch files, filters, cloud APIs, identities, or network requests, the security question is no longer just whether the model followed the wrong instruction. The better question is what the runtime allowed the model to do with that instruction.
Microsoft Defender Security Research published its Semantic Kernel research on May 7, 2026. Microsoft described two fixed vulnerabilities, CVE-2026-26030 and CVE-2026-25592, where prompt injection could cross from model behavior into code execution or host file impact under specific configurations.
This post is a defensive response guide. I am not reproducing the Semantic Kernel vulnerabilities or publishing exploit steps. The goal is to answer the questions a Microsoft-centric defender actually needs answered.
- Which Semantic Kernel deployments are exposed?
- Did hostile prompt content reach a tool boundary?
- Did the agent host produce process, file, network, or identity evidence?
- What should be patched, constrained, logged, rotated, or monitored?
The boundary is where model-selected parameters reach tools, files, processes, networks, credentials, and identities.
Safety note. This guide is defensive. Do not test exploit prompts against production agents, and do not reproduce vendor vulnerability paths outside systems you own or are explicitly authorized to assess.
Agenda
- What Microsoft published about the Semantic Kernel vulnerabilities.
- Why model-selected tool arguments are runtime inputs, not trusted intent.
- How I added a safe canary to the Agent 365 Defender lab.
- How to hunt for suspicious process and file activity in Defender XDR.
- How to correlate AI-layer alerts with host-layer evidence in Sentinel.
- What to patch, constrain, log, rotate, and monitor before the next agent ships.
TL;DR
Microsoft’s May 7 research describes two fixed Semantic Kernel issues where prompt injection could cross into runtime impact under specific configurations.
CVE-2026-26030 affects the Python semantic-kernel package before 1.39.4 when In-Memory Vector Store filter functionality is reachable through the Search Plugin default path. In practical terms, a model-controlled retrieval filter could reach executable Python filter behavior.
CVE-2026-25592 affects Semantic Kernel .NET SDK versions older than 1.71.0 when SessionsPythonPlugin exposes host-side file transfer behavior to model-controlled function calling. In practical terms, a model-callable helper could influence where a file landed on the host.
As checked on May 9, 2026, the NVD records for both CVEs were still the right place to confirm scoring and enrichment status, while Microsoft’s post and GitHub advisories were the clearest sources for affected Semantic Kernel versions and remediation details.
Start with the basics.
- Upgrade affected Semantic Kernel Python deployments to `semantic-kernel >= 1.39.4`.
- Upgrade affected Semantic Kernel .NET deployments to SDK `>= 1.71.0`.
- Inventory agents that expose tools with path, filter, query, shell, file, or cloud mutation parameters.
- Treat model-selected tool arguments as attacker-controlled input.
- Hunt the vulnerable window for child processes, suspicious file writes, outbound connections, token access, and cloud API activity from agent hosts.
- Add tool-call logging before the next agent goes anywhere near production.
What I Validated in the Lab
For this article, I extended the existing Agent 365 Defender lab with a constrained host-action canary. The lab does not reproduce Microsoft’s vulnerabilities. It tests the defender workflow around them.
The question I wanted to answer was simple.
Can hostile retrieved content cause an agent to cross into a tool action, and can I prove that crossing happened?
The canary tool is called run_host_diagnostic. It is disabled by default and only appears in the agent schema when this flag is explicitly set:
$env:LAB_ENABLE_HOST_EXEC_TOOL = "1"
Even then, the tool never accepts arbitrary commands. It only allows two actions:

- `write_canary`
- `host_identity`
The tampered retrieved document is host-diagnostic-TAMPERED.md. It looks like a normal support runbook but includes a hidden instruction telling the agent to call the diagnostic tool. The expected secure behavior is that the agent summarizes the runbook and ignores the embedded instruction.
If the canary fires, the lab writes a sanitized record to tool-calls.jsonl with the timestamp, scenario context, tool name, argument summary, result keys, and a bounded result summary. That gives the SOC the thing it needs for triage: evidence that prompt-controlled content crossed into runtime behavior.
Lab boundary. This is a canary, not an exploit. It does not reproduce the Semantic Kernel CVEs and does not accept arbitrary host input or arbitrary shell commands. The `host_identity` action uses a fixed no-shell identity check so defenders can observe a safe process boundary without handing the model an execution primitive.
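For reference, the shape of that constraint looks roughly like the sketch below. This is not the lab's actual `tools.py`; the canary file name and helper structure are invented for illustration. The opt-in flag, the two fixed actions, and the sanitized `tool-calls.jsonl` record come straight from the lab design.

```python
import getpass
import json
import os
import platform
from datetime import datetime, timezone
from pathlib import Path

ALLOWED_ACTIONS = {"write_canary", "host_identity"}  # fixed allowlist, no free-form input
EVIDENCE_LOG = Path("tool-calls.jsonl")

def run_host_diagnostic(action: str, scenario: str = "unspecified") -> dict:
    """Constrained canary: refuses everything except two fixed, no-shell actions."""
    # Re-check the opt-in flag at runtime, not just when the schema was built.
    if os.environ.get("LAB_ENABLE_HOST_EXEC_TOOL") != "1":
        return {"status": "disabled"}
    if action not in ALLOWED_ACTIONS:
        return {"status": "rejected", "reason": "action not in allowlist"}

    if action == "write_canary":
        Path("canary.txt").write_text("canary fired")  # hypothetical artifact name
        result = {"status": "ok", "artifact": "canary.txt"}
    else:  # host_identity: fixed, no-shell identity check
        result = {"status": "ok", "host": platform.node(), "user": getpass.getuser()}

    # Sanitized evidence record: enough for triage, no secrets, bounded summaries.
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "scenario": scenario,
        "tool": "run_host_diagnostic",
        "argument_summary": {"action": action},
        "result_keys": sorted(result.keys()),
        "result_summary": str(result)[:200],
    }
    with EVIDENCE_LOG.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return result
```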
Lab Run
The canary is intentionally opt-in.
$env:LAB_ENABLE_HOST_EXEC_TOOL = "1"
$env:AI_SERVICES_ENDPOINT = "https://<ai-services>.cognitiveservices.azure.com/"
$env:MODEL_DEPLOYMENT = "gpt-4-1-mini"
Push-Location labs\agent-365-defender
Push-Location agent
..\.venv\Scripts\python.exe create_agent.py
Pop-Location
.\.venv\Scripts\python.exe attacks\run_attack.py xpia-host-exec
Pop-Location
Set LAB_ENABLE_HOST_EXEC_TOOL=1 for both steps. It controls whether the tool is advertised in agent.json, and the handler checks it again at runtime before writing any canary evidence.
The useful output is not a dramatic exploit chain. It is the audit trail. A malicious-looking prompt is useful context. A tool-call record is operational evidence. Host telemetry proves whether the action crossed into impact.
What the Canary Proves
The canary does not prove that an environment is vulnerable to CVE-2026-26030 or CVE-2026-25592. It proves whether the agent security program can observe tool-boundary crossing.
That distinction matters. Many teams can log prompts and responses. Fewer can log tool calls well. Fewer still can connect a suspicious prompt to process creation, file writes, outbound network activity, cloud API calls, or identity use from the agent host.
That is the gap this lab is meant to exercise. A prompt alert starts the investigation. A tool call scopes it. Host telemetry validates impact. Identity reach defines blast radius.
The generated evidence cards below are sanitized from the local canary output and read-only Sentinel queries. Tenant IDs, subscription IDs, local usernames, host paths, process IDs, and identity output are intentionally omitted.



What Microsoft Found
Microsoft’s May 7 post uses Semantic Kernel as a concrete case study for AI agent framework risk. The research describes two fixed issues where prompt injection could lead to code execution or host file impact under specific configurations.
The important distinction is that this is not a claim that every Semantic Kernel deployment was automatically exploitable. The affected conditions matter. Versions matter. Tool exposure matters. How the agent is wired matters.
The reason the research is worth writing about is that it connects three security domains that are often handled separately.
| Domain | Usual question |
|---|---|
| AI safety | Did hostile content influence the model? |
| Application security | Did untrusted input reach a dangerous sink? |
| Endpoint detection | Did the host produce suspicious behavior? |
Agent security sits across all three. Prompt injection describes the input vector. Runtime behavior determines the impact. That means defenders cannot stop at the prompt; they have to follow the chain.
Prompt -> tool call -> runtime behavior -> identity reach -> host evidence
That chain is where the real investigation lives.
CVE-2026-26030: When a Filter Becomes Executable
The first issue, CVE-2026-26030, is an important example of why model-generated filters require strict structural validation.
The affected path involved the Python semantic-kernel package before version 1.39.4 in scenarios where the In-Memory Vector Store filter functionality was reachable through the Search Plugin default path.
A user asks the agent to search for something normal. The model selects a search function. The model fills a structured filter parameter. The framework turns that filter into Python lambda-style behavior. That is where the boundary matters.
If a model-controlled parameter is transformed into an expression that gets evaluated, the model has effectively become an input path to executable behavior. This is not a new class of magic. It is the familiar application security problem of untrusted input reaching a dangerous sink.
The vulnerable versions attempted to constrain what could be executed, but blocklist-style validation is brittle. Python is flexible, and “this expression looks small” is not a reliable security boundary.
Microsoft’s fix moved toward a more durable pattern: structural allowlisting. That means controls such as allowed AST node types, allowed function calls, dangerous attribute blocking, name restrictions, and safer expression handling.
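To make the distinction concrete, here is a minimal sketch of structural allowlisting for a model-shaped filter expression. It is not Semantic Kernel's actual fix, and the allowed field names are invented; the point is that validation walks the parsed structure and rejects anything outside an explicit set of node types and names, rather than trying to blocklist known-bad strings.

```python
import ast

# Structural allowlist: only these expression node types are acceptable in a
# model-shaped filter. Calls, attribute access, subscripts, comprehensions,
# and lambdas are all rejected because their node types are not listed.
ALLOWED_NODES = (
    ast.Expression, ast.BoolOp, ast.Compare, ast.Name, ast.Load,
    ast.Constant, ast.And, ast.Or, ast.Eq, ast.NotEq,
    ast.Lt, ast.LtE, ast.Gt, ast.GtE, ast.In, ast.NotIn,
)
ALLOWED_NAMES = {"category", "year", "author"}  # hypothetical fields the filter may reference

def validate_filter_expression(expr: str) -> bool:
    """Return True only if the expression is structurally safe to evaluate."""
    try:
        tree = ast.parse(expr, mode="eval")
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            return False
        if isinstance(node, ast.Name) and node.id not in ALLOWED_NAMES:
            return False
    return True

# A benign filter passes; anything that smuggles in a call or attribute fails.
assert validate_filter_expression("category == 'runbook' and year >= 2024")
assert not validate_filter_expression("__import__('os').system('id')")
```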
The immediate action is to upgrade affected Python deployments to semantic-kernel >= 1.39.4. The larger architectural lesson is to review any agent feature that lets the model shape filters, query strings, selectors, expressions, policies, or paths. Those values become code-adjacent once an agent can influence how they are constructed.
Vector search is not the problem. Unconstrained model influence over retrieval logic is the problem.
CVE-2026-25592: When a Helper Function Becomes a Host Boundary
The second issue, CVE-2026-25592, is primarily a tool-design lesson.
Semantic Kernel’s SessionsPythonPlugin supports Python execution in isolated Azure Container Apps dynamic sessions. That isolation boundary is supposed to protect the host from code running inside the sandbox, but sandboxes still need plumbing. Files may need to move between the sandbox and the host, and helper functions exist to make that work.
The vulnerable path came from exposing host-side file movement behavior to the model as a callable function. In the affected .NET SDK versions, DownloadFileAsync was advertised to the model with a parameter that controlled the local file path.
That created the risky combination: a model-callable helper, a model-controlled path, host-side behavior, and insufficient runtime restriction.
This is a familiar failure mode. A helper function starts as developer-controlled plumbing. Later, it becomes an agent-callable tool. A path or action parameter becomes model-controlled. A prompt injection then steers the agent toward that tool. The runtime executes the action because the function is already considered approved.
A tool being approved does not mean every model-selected argument should be trusted.
The affected SDK is Semantic Kernel .NET before version 1.71.0. The risky feature was SessionsPythonPlugin host-side file transfer being exposed to model-controlled function calling.
The right fix was not better prompt wording. The right fix was runtime enforcement: remove model access to the dangerous helper path and validate where files can be written. The runtime should enforce the path boundary instead of asking the model to behave its way around it.
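A minimal sketch of that enforcement, assuming a single allowlisted workspace directory (the directory name here is hypothetical): canonicalize first, then check containment, then act.

```python
from pathlib import Path

ALLOWED_ROOT = Path("/srv/agent-workspace").resolve()  # hypothetical allowlisted directory

def resolve_download_path(model_supplied_path: str) -> Path:
    """Canonicalize a model-selected path and refuse anything outside the allowlisted root."""
    candidate = (ALLOWED_ROOT / model_supplied_path).resolve()
    # Containment is checked after resolution, so ../ sequences, symlinks in
    # existing parents, and encoded segments cannot escape the workspace.
    if not candidate.is_relative_to(ALLOWED_ROOT):
        raise PermissionError(f"path escapes allowlisted workspace: {model_supplied_path}")
    return candidate

# "report.csv" resolves inside the workspace; "../../etc/cron.d/job" raises.
```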
The Pattern: Tool Calls Are Input Paths
Both issues reduce to a pattern defenders already understand: untrusted input reaching a sensitive sink.
The agent-specific twist is that the untrusted input may arrive as a clean-looking tool call. It has a function name, JSON arguments, and maybe even a schema match. That makes it easier to log and reason about, but it does not make the argument safe.
A tool schema is an interface. It is not a permission boundary.
For defenders, the familiar vulnerability classes still apply:
| Traditional pattern | Agent version |
|---|---|
| SQL injection | Model-controlled value reaches a query or filter |
| Path traversal | Model-controlled value reaches a file path |
| Command injection | Model-controlled value reaches shell or interpreter behavior |
| SSRF | Model-controlled value reaches a network fetch |
| Excessive privilege | Agent identity can do more than the task requires |
| Insecure plugin exposure | Internal helper behavior becomes model-callable |
The model does not need to break out of the framework to create risk. It only needs to steer an approved tool into approved-looking work with unsafe parameters.
That is why model-selected tool arguments should be treated as attacker-controlled input. The receiving function still has to validate, constrain, log, and enforce its own boundary before doing anything meaningful.
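One way to make that concrete is a thin boundary wrapper that every tool passes through: validate the model-selected arguments, log the call, and only then execute. The decorator and the `close_ticket` example below are hypothetical; the pattern, not the specific names, is the point.

```python
import functools
import json
import logging

log = logging.getLogger("agent.tool_boundary")

def guarded_tool(validator):
    """Wrap a tool so every model-selected argument set is validated and logged before execution."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(**model_args):
            summary = json.dumps(model_args, default=str)[:200]
            if not validator(model_args):
                log.warning("rejected tool call %s args=%s", func.__name__, summary)
                return {"status": "rejected", "reason": "argument validation failed"}
            log.info("tool call %s args=%s", func.__name__, summary)
            return func(**model_args)
        return wrapper
    return decorate

# Hypothetical tool: the runtime, not the model, decides whether the argument is acceptable.
@guarded_tool(validator=lambda a: set(a) == {"ticket_id"} and str(a["ticket_id"]).isdigit())
def close_ticket(ticket_id: str) -> dict:
    return {"status": "closed", "ticket_id": ticket_id}
```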
Detection Strategy
A prompt alert is a starting point. It tells you the agent was exposed to hostile input, but it does not answer whether the runtime did anything afterward.
For this class of issue, I want two timelines.
The AI timeline shows what was attempted: Prompt Shields alerts, jailbreak signals, suspicious retrieved documents, tool-call audit records, app telemetry, and model deployment context.
The host timeline shows what happened: process creation, file writes, outbound network connections, token access, cloud API calls, package versions, and identity activity.
The investigation should answer these questions.
- Did the agent process spawn a shell?
- Did it write to a startup, cron, SSH, service, or temp path?
- Did it make an unusual outbound connection?
- Did it access secrets or tokens?
- Did it call a cloud API it normally does not use?
- Did it use an identity with more privilege than the task required?
- Did it touch files outside its expected workspace?
For agent incidents, the investigation should follow the prompt to the tool, the tool to the host, and the host to the identity.
Prompt alerts show intent. Host telemetry proves whether the agent crossed into action.
For Semantic Kernel specifically, start by defining the vulnerable window for each deployment.
- Which repositories use Semantic Kernel?
- Which containers, functions, or hosts ran vulnerable versions?
- When was the vulnerable package introduced, and when was the fixed version deployed?
- Which agent identities were available to the runtime?
- Which secrets, tokens, repositories, storage accounts, or cloud APIs could that identity reach?
If those questions cannot be answered, the organization does not yet have the observability required for production agent autonomy.
That sounds blunt because it should be.
An agent with tools is a workload. A workload needs inventory, owners, versions, logs, identities, and response paths.
Defender XDR Hunting
The following hunts are triage accelerators.
They are not complete detections.
The goal is to quickly answer one question during the vulnerable window:
Did an agent host produce process or file-system behavior that does not fit its normal role?
Tune the process names, paths, and initiating process logic to your actual runtime. A production Semantic Kernel app may not literally expose semantic-kernel in the command line, folder path, or parent process.
Treat these queries as scaffolding.
The companion hunting sketches live in the lab at:
labs/agent-365-defender/detection/host-execution-hunting.kql
Hunt 1: Suspicious Child Processes From Agent Hosts
DeviceProcessEvents
| where Timestamp > ago(30d)
| where InitiatingProcessCommandLine matches regex @"(?i)semantic[\s_\-]?kernel"
or InitiatingProcessFolderPath matches regex @"(?i)semantic[\s_\-]?kernel"
| where FileName in~ (
"cmd.exe", "powershell.exe", "pwsh.exe", "bash.exe", "sh.exe", "wsl.exe",
"certutil.exe", "mshta.exe", "rundll32.exe", "regsvr32.exe",
"wscript.exe", "cscript.exe", "bitsadmin.exe", "curl.exe",
"wget.exe", "whoami.exe", "id", "net.exe", "net1.exe", "nltest.exe",
"klist.exe", "dsquery.exe", "nslookup.exe"
)
| project
Timestamp,
DeviceName,
AccountName,
FileName,
ProcessCommandLine,
InitiatingProcessFileName,
InitiatingProcessCommandLine,
InitiatingProcessFolderPath
| sort by Timestamp desc
This query is intentionally broad.
The point is not that every curl.exe, whoami.exe, or shell launch is malicious. The point is that an agent host should have a known runtime profile. If that profile suddenly includes shells, discovery commands, LOLBins, or unusual interpreters during the vulnerable window, the investigation should continue.
Review parent process lineage, command-line arguments, account context, container or host identity, nearby network events, nearby file writes, and tool-call logs around the same timestamp.
Hunt 2: Suspicious File Writes From Agent Hosts
Startup folders, cron paths, service definitions, temporary paths, and SSH directories deserve attention during the vulnerable window.
let SuspiciousWritePaths = dynamic([
@"\Windows\Start Menu\Programs\Startup\",
@"\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\",
@"\ProgramData\Microsoft\Windows\Start Menu\Programs\Startup\",
@"/etc/cron",
@"/etc/systemd",
@"/.config/autostart",
@"/.ssh/",
@"/tmp/",
@"/var/tmp/"
]);
DeviceFileEvents
| where Timestamp > ago(30d)
| where ActionType in~ ("FileCreated", "FileModified", "FileRenamed")
| where InitiatingProcessCommandLine matches regex @"(?i)(semantic[\s_\-]?kernel|SKAgent|kernel\.run|Microsoft\.SemanticKernel)"
or InitiatingProcessFolderPath matches regex @"(?i)semantic[\s_\-]?kernel"
| where FolderPath has_any (SuspiciousWritePaths)
| project
Timestamp,
DeviceName,
ActionType,
FileName,
FolderPath,
InitiatingProcessFileName,
InitiatingProcessCommandLine,
InitiatingProcessFolderPath
| sort by Timestamp desc
A file write does not prove exploitation.
But a suspicious file write from an agent runtime after an AI-layer prompt event is exactly the kind of correlation defenders need to investigate.
Sentinel Correlation
If Defender XDR or Defender for Cloud alerts are routed into Microsoft Sentinel, use SecurityAlert as a common alert table.
For AI-agent incidents, the case view should help answer a coherent sequence of questions: what was attempted at the AI layer, which agent or model deployment was involved, whether the host produced process, file, or network evidence nearby in time, which identity the agent used, and what secrets, APIs, storage accounts, repositories, or cloud resources that identity could reach.
A starter correlation sketch looks like this:
let AiPromptAlerts =
SecurityAlert
| where TimeGenerated > ago(24h)
| where AlertType has_any ("AI.Azure", "Agentic")
or AlertName has_any ("Jailbreak", "prompt injection", "ASCII smuggling", "AnomalousToolInvocation")
| project
AiAlertTime=TimeGenerated,
AiAlertName=AlertName,
AiAlertType=AlertType,
AgentResource=CompromisedEntity;
let HostExecutionEvidence =
datatable(HostEventTime:datetime, AgentResource:string, HostSignal:string, HostDetail:string)[];
AiPromptAlerts
| join kind=leftouter HostExecutionEvidence on AgentResource
| where isnull(HostEventTime) or HostEventTime between (AiAlertTime - 15m .. AiAlertTime + 30m)
| project AiAlertTime, AiAlertName, AiAlertType, AgentResource, HostEventTime, HostSignal, HostDetail
| order by AiAlertTime desc
The empty HostExecutionEvidence table is intentional. Replace it with whatever telemetry your environment can produce: Defender XDR streamed events, App Insights customEvents, container runtime events, EDR alerts, Kubernetes audit events, agent tool-call logs, CI runner logs, cloud API activity logs, function app logs, or workload identity sign-ins.
The table name is less important than the relationship.
AI-layer evidence and host-layer evidence need to meet in the same incident timeline. That is how a SOC moves from “the prompt looked bad” to “the agent runtime did or did not cross into impact.”
Production Controls
For teams building or approving tool-using agents, the controls are not exotic. They are the same workload security basics, applied to a runtime that can be influenced by the content it reads.
Start with ownership. Semantic Kernel, LangChain, CrewAI, AutoGen, MCP servers, plugins, and transitive packages need owners, version tracking, deployment timelines, SBOM coverage, vulnerability management, and emergency patch paths. If a package can influence a tool boundary, it belongs in vulnerability management.
Keep tool exposure narrow. Internal helper functions should stay internal. A production tool should have a small parameter surface, strong validation, clear side-effect boundaries, logging, ownership, and an expected runtime profile. Be cautious with tools that accept broad strings, file paths, URLs, filters, code, shell fragments, or policy-like expressions.
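As an illustration of what a small parameter surface can look like, here is a hypothetical tool definition in the generic JSON-schema style most function-calling frameworks accept. The only free choice the model gets is an enum value and a tightly patterned ID; everything else is fixed on the runtime side.

```python
# Hypothetical read-only tool definition: no free-form paths, queries, or shell fragments.
lookup_ticket_status_tool = {
    "name": "lookup_ticket_status",
    "description": "Read-only lookup of a support ticket's current status.",
    "parameters": {
        "type": "object",
        "properties": {
            "queue": {
                "type": "string",
                "enum": ["billing", "identity", "endpoint"],  # allowlisted values only
            },
            "ticket_id": {
                "type": "string",
                "pattern": "^[0-9]{1,8}$",  # structural constraint, validated again server-side
            },
        },
        "required": ["queue", "ticket_id"],
        "additionalProperties": False,
    },
}
```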
Validate file paths after canonicalization and force them to remain inside an allowlisted directory. Do not trust relative paths, encoded paths, user-friendly names, model-selected destinations, or paths checked before normalization. The model should not decide where something lands on the host.
Do not let model-controlled strings become Python lambdas, SQL fragments, shell commands, XPath, regex policy, script snippets, OData filters without strict controls, or cloud resource selectors with broad mutation rights. Use allowlists for safe actions, functions, AST nodes, directories, schemas, domains, resource groups, and operation types.
Separate tools by risk. A read-only lookup tool is not the same as a file write tool. A retrieval tool is not the same as an email-send, ticket-close, code-execution, cloud-mutation, identity-change, or payment-approval tool. High-impact tools should require human approval, a separate workflow boundary, or additional policy checks.
Scope the agent identity like a production workload identity. It should not run with developer, owner, contributor, or broad managed identity permissions just because that made the demo easier. If the runtime is influenced, the agent identity defines the blast radius.
Finally, log tool calls like security events. At minimum, capture timestamp, session ID, agent ID, initiating user or principal, model deployment, tool name, argument summary, result summary, host or container ID, agent identity, correlation ID, and approval status where applicable. Do not log secrets in full, but do log enough to answer the basic response question: what did the model ask the runtime to do?
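As a sketch of what that minimum record can look like, with field names that are illustrative rather than a schema from any Microsoft product:

```python
import json
import uuid
from datetime import datetime, timezone

def build_tool_call_event(tool_name: str, argument_summary: dict, result_summary: str, ctx: dict) -> str:
    """Return one JSON line describing a tool call; field names are illustrative."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "session_id": ctx["session_id"],
        "agent_id": ctx["agent_id"],
        "initiating_principal": ctx["initiating_principal"],  # user or service principal
        "model_deployment": ctx["model_deployment"],
        "tool_name": tool_name,
        "argument_summary": argument_summary,                  # truncated, secrets redacted
        "result_summary": result_summary[:256],                # bounded, never the full payload
        "host_id": ctx["host_id"],                             # host or container ID
        "agent_identity": ctx["agent_identity"],
        "correlation_id": ctx.get("correlation_id", str(uuid.uuid4())),
        "approval_status": ctx.get("approval_status", "not_required"),
    }
    return json.dumps(event)
```

Emitted as JSON lines, records like this can be shipped to Sentinel or any log pipeline and joined against host telemetry later.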
Immediate Response Plan
If your environment uses Semantic Kernel, start with inventory. Find Semantic Kernel usage across repositories, containers, functions, notebooks, agent hosts, CI/CD workflows, internal tools, experimental AI apps, and production AI apps.
Confirm whether any Python deployments use:
semantic-kernel < 1.39.4
Confirm whether any .NET deployments use:
Semantic Kernel SDK < 1.71.0
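A quick way to check a Python environment, assuming the `packaging` library is available alongside the standard-library `importlib.metadata`:

```python
from importlib.metadata import PackageNotFoundError, version
from packaging.version import Version  # assumed available; install with pip if needed

FIXED = Version("1.39.4")

try:
    installed = Version(version("semantic-kernel"))
except PackageNotFoundError:
    print("semantic-kernel is not installed in this environment")
else:
    status = "needs upgrade" if installed < FIXED else "at or above the fixed version"
    print(f"semantic-kernel {installed}: {status}")
```

On the .NET side, running `dotnet list package` against each project is one way to surface the Microsoft.SemanticKernel package version in use.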
Pay particular attention to agents using In-Memory Vector Store filter functionality, Search Plugin default behavior, SessionsPythonPlugin, host-side file transfer behavior, model-callable helper functions, or tools with path, filter, query, or execution-like parameters.
Patch and redeploy affected components, but do not stop at remediation. Define the vulnerable window for each affected deployment. That window is what turns a package advisory into an investigation plan.
During that window, hunt for suspicious child processes, unexpected file writes, outbound connections, persistence paths, secret access, token access, cloud API calls, unusual package installs, and agent identity activity outside normal behavior.
If there is credible execution evidence, rotate the secrets and tokens that were reachable by the agent runtime.
Before the next agent ships, add the telemetry required for future response: tool-call logging, agent identity context, host correlation, and a Sentinel view that joins AI-layer alerts to runtime evidence.
That is the difference between “we patched a package” and “we understand whether our agent was used as an execution path.”
Lessons Learned
The wrong lesson from these CVEs is that AI agents are too dangerous to use. Useful agents need tools. The lesson is that agents are workloads, and workloads need inventory, ownership, patching, least privilege, logging, detection, and response.
Agents add one uncomfortable twist: the workload can be influenced by the content it reads. That content might be a user prompt, a document, an email, a ticket, a website, a retrieved chunk, or a previous tool result.
So the model cannot be the security boundary. The runtime has to carry that job. Constrain the tools, constrain the arguments, constrain the paths, constrain the identity, log the tool calls, and watch the host.
Prompt injection has a runtime problem. Defenders need runtime evidence.
What This Guide Does Not Claim
This guide does not claim that every Semantic Kernel deployment was exploitable. The vulnerable versions, feature exposure, and agent wiring matter.
It does not reproduce CVE-2026-26030 or CVE-2026-25592. The lab uses a constrained canary to validate detection and response workflow without publishing exploit behavior.
It also does not treat content filtering as a complete control. Prompt Shields and related protections are important signals. They do not replace path validation, tool allowlisting, identity scoping, process monitoring, file-system monitoring, network monitoring, package patching, incident correlation, or secret rotation after credible exposure.
Lab Artifacts
The supporting pieces in the local lab are:
| Artifact | Purpose |
|---|---|
| `labs/agent-365-defender/agent/tools.py` | Adds the optional safe `run_host_diagnostic` canary tool |
| `labs/agent-365-defender/agent/create_agent.py` | Exposes the host canary only when `LAB_ENABLE_HOST_EXEC_TOOL=1` |
| `labs/agent-365-defender/agent/rag-docs/host-diagnostic-TAMPERED.md` | Tampered retrieved content for indirect prompt injection testing |
| `labs/agent-365-defender/attacks/run_attack.py` | Adds `xpia-host-exec` and JSONL tool-call evidence |
| `labs/agent-365-defender/detection/host-execution-hunting.kql` | Defender XDR and Sentinel hunting sketches for host execution |
The companion lab page is here: Semantic Kernel RCE Defender Response.
That keeps the article boundary clear: the Microsoft research is the vulnerability case study, while the lab provides a constrained canary for validating detection and response workflow without reproducing exploit behavior.
Sources
- Microsoft Security Blog, When prompts become shells: RCE vulnerabilities in AI agent frameworks
- NVD, CVE-2026-26030
- NVD, CVE-2026-25592
- GitHub advisory, GHSA-xjw9-4gw8-4rqx
- GitHub advisory, GHSA-2ww3-72rp-wpp4
- Microsoft Defender XDR, DeviceProcessEvents table
- Microsoft Defender XDR, DeviceFileEvents table
- Microsoft Sentinel, SecurityAlert schema
- Microsoft Sentinel, ingest Microsoft Defender for Cloud alerts

Jerrad Dahlager, CISSP, CCSP
Cloud Security Architect · Adjunct Instructor
Marine Corps veteran and firm believer that the best security survives contact with reality.
Have thoughts on this post? I'd love to hear from you.



