Allowing a copilot to search the web at will is extremely dangerous.
Here are two somewhat-understood vulnerabilities and how to mitigate them.
Note: this is an ever-evolving field and this is only what I know today. Who know what you’ll know tomorrow!
The first vuln is prompt injection.
You should never blindly trust things written on the Internet, right?
Well, GenAI is happy to follow any instructions, if you write your them well.
A copilot with Internet access is always one search away from a website with hidden malicious instructions,
getting taken over by an external attacker.
The second vuln is data exfiltration.
You know who else is on the Internet? Everyone else, including them.
If an attacker can compromise your copilot, they can instruct it to search a website they control.
That means copilot will reach out - proactively, without a user in the loop - to a website a hacker controls.
The common exploit is encoding data to-be-exfiltrated into a parameter.
We do have some design patterns to address these. Apply one of the following:
You could limit which website copilot can search. See Content Security Policy (employed by Microsoft Copilot and Google Gemini) and URL Anchoring (employed by ChatPT).
Last August I gave a talk at BlackHat USA titled All You Need Is Guest.
In it, I showed how simple guest access to EntraID could be escalated into full control over Azure resources and SQL servers. This method still works today mostly on the large enterprises and government customers.
I got some heat from MSRC prior to the talk which I wrote about briefly after. When I got to one of the juicy parts – bypassing Power Platform DLP – I had to waive my hands and ask the audience to look the other way.
After the talk I was asked by Microsoft to sit on those details for longer.
I appreciate that some fixes require deep architectural work, so I waited.
On Nov 2023 I finally got my MSRC case resolved and a greenlight to publish.
With everything going on today and the problem unfixed, I feel a strong urge to finally share the details with the broad security community.
It is time to share the full story. Hopefully this helps drive urgency to fix it.
Before I do though I would like to point out the obvious – DLP bypass was not the most interesting or scary part of the talk.
The most striking part was the fact that this mechanism exists in every M365 tenant today:
Power Platform is pushing hard on citizen development – the powerful concept of having every corporate user be an app creator.
To build a useful app users plug in their credentials to any enterprise system (say SQL server) to integrate it with Power Apps.
Microsoft stores those credentials and adds a nice little Share button to be used at the user’s discretion. The user can even share with the Everyone group, which means EVERYONE in your EntraID tenant, guests included. It could also directly be shared with any user of group in your tenant.
Other users who got access can now use those credentials, fully impersonating the connection’s creator. They do not get access to the actual secret, but they can perform any action facilitated by Microsoft infra. There are no logs, no way to distinguish the original user from others who got access.
This is all up to user choice. Admins have no say in the matter (nor usable visibility into those choices).
This issue still is out there available to exploit in every M365 environment today.
We’ve observed credentials SQL servers, Azure blob storage accounts and proprietary systems shared with guests or just with thousands of employees on many major enterprises.
In fact, we’ve observed one everywhere we looked.
This problem is not going away. It’s a feature, and a useful one indeed.
Imagine you wanted to empower every corporate user to become a developer – what would be the number one hurdle to do so? Well, credentials.
Citizen developers can’t provision a service account or ask for a service principal. To allow them to create, Power Platform had to circumvent the entire modern IAM architecture. Instead of OAuth, we get credential sharing.
I call this Credential Sharing as a Service.
Power Platform DLP bypass
Power Platform’s response to my research – and in general to most security concerns raised by the community – has been to emphasize the Power Platform DLP.
This is not a DLP in the Data Loss Prevention sense (data labeling, classification, leak prevention etc’).
Instead, it is an allow/deny list for which service types can be connected to through Power Platform. An admin can choose to deny connections to SQL server, for example. The unfortunate name choice means sometimes people get a false sense of security.
The fact that Power Platform DLP is not a security mechanism is well documented.
It won’t prevent a threat actor. And won’t hold up to most bypass attempts by capable citizen developers as well.
It is built to keep users in check, not prevent hacks.
Regardless, Power Platform’s response to my research has been – if you don’t want people to overshare SQL credentials just deny SQL in the DLP.
In the BlackHat talk, I showed an overshared SQL credential from an attacker’s perspective.
When trying to exploit it to dump the server I got denied by Power Platform DLP because that connection was blocked.
This is the part I had to skim over.
And now for the juice - the exploit!
There is none. It just worked.
If you try to access an app that uses a blocked connection you get blocked, as seen above. But the connection still lives, and you can just use it directly.
This is a fundamental design flaw. Power Platform DLP applied only to applications and automation that use a blocked connection. The connections themselves are simply not in scope.
The fix Microsoft put in place in Nov 2023 – the one I waited for to release these highly sophisticated details – was to prevent the creation of connections blocked by DLP.
There are two major things missing in that fix.
First and foremost, it leaves existing customers vulnerable because existing connections still bypass DLP. You can exploit this issue today just as well as you could exploit it before my disclosure.
Second, DLP policies change because policy changes. Every time you change your policy, you now have a bunch of trailing connections to deal with which you have no visibility into or control over. This is not a comprehensive fix and the problem persist.
Another fix implemented by Microsoft the week of the talk (kudos for the quick turnaround on that one) was to introduce a new tenant-level flag to limit the Everyone special group such that it doesn’t include guests. This is nice, but it too has major shortcomings.
Even if guests don’t have access, sharing credentials with the entire tenant is a major exposure and just plain wrong. Also, guests can always be granted access directly or through EntraID groups even if you’ve limited the Everyone group.
Most importantly, these fixes miss the point.
You are not going to block any useful Power Platform connector because it’s a useful platform. Sharing credentials with guests is crazy, sharing it with all enterprise users is just as crazy.
The problem is credential sharing and user impersonation, and that problem has been left untouched.
Go hack yourself
I highly encourage you to go and check it out on your own.
PowerPwn is a free open source tool that would allow you take an offensive look into your Power Platform tenant so you can fix things before attackers find them.
Timeline
2023-06-27 DLP bypass vulnerability disclosure
2023-07-12 MSRC acknowledge the issue and decides not to issue a CVE (cloud service)
2023-07-26 MSRC and I collaborate to adjust my BlackHat slides, point to Microsoft-suggested mitigation and follow up with a Microsoft technical blog (thank you to the
folks involved at MSRC and BAP security team)
2023-08-10 All You Need Is Guest at Black Hat 2023 does not reveal technical details
2023-08-22 Microsoft decides not a release a technical blog on the issue
2023-08-23 I query about mitigation timeline and agree to wait for a full deployment of the fix
This is a long overdue blog version of a talk I gave at BlackHat USA 2023 titled All You Need Is Guest. Slides and video recording are available as well.
Intro
When you get invited as a guest to an EntraID tenant you get restricted deny-by-default access. You shouldn’t have access to any resource not explicitly shared with you, right?
Well, no. By the end of this post, you’ll see how guests can find credentials to Azure resources and SQL servers and exploit them to get full data dumps of everything behind them.
Why invite guests in?
As a small cybersecurity company every enterprise engagement starts the same – how do we share sensitive data back and forth? We don’t want to use email (you’ve never done that, right?).
How do we share resources securely? EntraID external identities – guests – are the mechanism to do that in a safe way.
To accomplish that, it needs to satisfy two conditions: It needs to be easy for vendors to onboard and for IT/security to control.
Indeed, it’s super easy to gain access and thus for vendors to onboard. Under default configuration, any user on Teams can just invite a guest in by adding them to a new team. In most enterprises, this is up to individual user choice.
For IT/security the promise is incredible - by inviting guests in you can apply your existing Microsoft security infrastructure to them. Conditional access policies, Intune, the entire Microsoft Security stack.
There is a caveat here though, it is crucial that guests don’t get full access to your tenant otherwise you have just compromised your own controls. Guest access should be deny-by-default.
Guest accounts in practice
Reality differs. Grab any corporate account, go to make.powerapps.com and click on connections.
You will see enterprise credentials lying around waiting to be leveraged.
These were overshared due to a simple mistake by a business user. They are available for ANY EntraID account to pick up and use, including guests.
These connections are created by business users. Or more precisely, anyone in your organization can just plug in their credentials and create a connection. There are thousands of connectors available for people to use to any enterprise system you can think of. Including on-prem.
Exploit
These credentials are not just available for use directly.
They are used to give Power Apps or Power Automate access to enterprise resources.
There are a few mechanisms protecting these credentials from a wondering guest (or insider malicious insider, for that matter). Most of the talk was focused on bypassing each and every one of those. Here is a quick overview of each:
Blocked by license
Guests cannot view Power Apps or query connections through Power Apps because they don’t have they right license. While licensing is definitely not a security mechanism, it is somethings used as such nevertheless.
But wait, what if we get a trial license on our home tenant - the one we control? That should work for the guest tenant, right?
Well, it works! Power Apps validate that you have a license in at least one of the tenant you are part of. Not the specific one you are trying to access.
Blocked by DLP
The main security mechanism for Power Platform is their DLP. Do not get confused, this is not a DLP in the cybersecurity sense. It does not allow labeling of sensitive data not does it provide data leakage controls. Instead, it is an allow/deny list for which connectors can be used. It provides very blunt controls - allowing or blocking entire categories of services like SharePoint or SQL Servers. If you want to get to a more granular level, you need to manage a tight and ever-changing URL list.
Nevertheless, it can deny access to connections in case those are blocked in the DLP policy. Here, we have a DLP policy that blocks SQL Server connections.
At this point, I basically had to wave my hands and ask the audience to allow me to move forward. I will share full details on a subsequent post.
Say the connection isn’t blocked by DLP then. What now?
Digging into the API calls made by a Power App using a SQL server connection you can spot a call to a service called API Hub. Though those calls, the app makes both read and write operations on top of the SQL server.
API Hub is a service an intermediary service that allows Power Platform apps and users to use shared credentials without actually getting access to the credentials themselves.
Instead, API Hub generate a REST API interface for any imaginable operation on the underlying service.
Any call to API Hub gets translated to a call to the underlying service using the credentials stored in its internal storage.
Those can be user credentials (OAuth refresh tokens), passwords or long-lived secrets.
This is how connection sharing works.
Sharing a connection in Power Platform means allowing another user to use your credentials which are stored in API Hub.
Blocked by programmatic access to API Hub
Users can’t just generate tokens with the right scope to query API Hub, it is an internal Microsoft resource.
You can’t use a built-in public client app because those need to be pre-approved to query API Hub.
You can’y use your own app because API Hub is an internal resource you cannot grant your apps access to.
FOCI to the rescue
At this point, we are stuck.
We know that these credentials are available for us in the Power Apps UI but we want direct access to API Hub.
We can’t generate the right token though.
We know that the Power Apps app can generate tokens to API Hub, but it is a confidential app so we can’t generate tokens on its behalf.
Or can we?
Recalling FOCI, we can take a look into the list of know FOCI apps.
We can generate a token using Azure CLI (of course we can) and exchange that token for a Power Apps token to API Hub! Actually, it turned out you can just ask Azure CLI for a token to API Hub directly.
The fun part
powerpwn is an offensive toolset for M365 focused on Power Platform.
It combines the methods above we now have full access to the services behinds those credentials shared in the Power Platform.
It can also install a backdoor that persist even if the user gets deleted, deploy a phishing app on a Microsoft owned domain and more. But that is a story for another day.
powerpwn recon -t <tenant_id> finds all of the overshared credentials, apps and automations your user has access to.
powerpwn dump -t <tenant_id> goes through each and every one of those and dumps all data from their underlying services. Every SQL server table, every blob in a storage account.
You also gain access to a full Swagger UI for each credential that allows you to run arbitrary commands using those credentials (whatever is possible in Power Platform). For SQL Server, you can pass any SQL command to run on the server.
I strongly encourage you to play around with it!
Defense
Tactically, use powerpwn.
Find and delete these overshared connections.
Ideally, do it on a schedule or even automated.
But admittedly this is a tactical patch.
We are placing dev-level power in the hands of every enterprise user without guardrails or training.
Of course people will make bad judgment calls.
Still, share with everyone? That it just too much.
I strongly suggest using the OWASP LCNC Top 10 to start getting a handle on citizen development.
As AI continues to capture everyone’s attention, security for AI becomes a popular topic in the market. Security for AI is capturing the media cycle, AI security startups are going out of stealth left and right, and incumbents scramble to release AI-relevant security features. In our small patch of the world, it is clear security teams are concerned about AI. It seems like the race has begun and we can just about see an AI Security category being formed.
But what does AI Security mean exactly?
The problem with AI capturing mindshare is that everyone finds the way to talk about their existing solution with AI language making it difficult to figure out one solution vs another.
We also frankly don’t really know what security for AI means because we don’t know what AI development means. Security for X typically arrives after X has matured, think cloud, network, web apps, .. but AI remains a moving target.
From my perspective right now, there are three distinct solution categories all claiming to be AI Security solutions. These three solve different problems for different people so I argue that these are fundamental distinctions that would not easily merge, though of course they do have some overlap.
These categories are:
AI DLP
AI Firewall
AI SPM / CNAPP
AI DLP
Fast to gain traction, fast to disappear (I claim).
When ChatGPT was first launched every enterprise I know went down the same route of trying desperately to block it. Every week had new headlines about companies losing their IP to AI because an employee copy-pasted highly confidential data to the chat so they could ask for a summary or a funny poem about it. This was really all anybody could talk about for a few weeks.
Point solutions to address this problem have popped up like mushrooms after heavy rain. Since you couldn’t control ChatGPT itself, and other AIs that started appearing on the consumer market, all of these solutions are different types of proxies. Whether it’s on the network layer, with a host agent or through a browser extension, AI DLP solutions promise to capture rogue users from using unapproved public AI bots and in some cases point users to approved enterprise versions like Azure OpenAI. This problem got so much attention that OpenAI, who caused the scare in the first place, changed their policies so users can now opt-out of being included in the training set and organizations can pay to opt-out on behalf of all their users.
I am bearish about AI DLP. While these solutions were quick to gain traction reacting to public emotions, I don’t see why AI DLP is fundamentally different from a regular DLP or its modern uncle, the CASB. At the end of the day, users copy-pasting sensitive data to a random website on the Internet is an old problem. Not sure why AI makes it different.
Another point about AI DLP is that it can only observe user interaction with AI and completely misses applications that use unapproved AI services.
AI Firewall
Think about SQL injection that prompted the rise of the AST industry. It is an issue with data being translated as instructions, resulting in allowing people who manipulate application data (i.e. users) to manipulate application instruction (i.e. its behavior). With years of severe issues wreaking havoc on poor web applications, application development frameworks have raised up to the challenge and now safely handle user input. If you’re using a modern framework and going through its paved road, SQL injection is for all practical purposes a solved problem.
One of the weird things about AI from an engineer’s perspective is that they mix instructions and data. You tell the AI what you want it to do with text, and then you let your users add some more text into essentially the same input.
As you could expect this results in users being able to change the instructions. Using clever prompts lets you do that even if the application builder really tried to prevent it, a problem we all know today as prompt injection.
Some solutions have popped up to try and help application developers avoid prompt injection.
They employ a bunch of techniques to do that including threat intelligence (i.e. a list of prompts that work), crowdsourcing and, of course, using AI to flight AI. For an application developer, this typically involves deploying a middleware that acts as a security mechanism between your application and the AI model and fails any injection attempt.
AI models have also improved their inherent resistance to these kinds of attacks, but whether this problem can ever truly be solved remains an open question.
Prompt injection is not the only concern addressed by the AI Firewall category. In fact, some companies have been working on related problems of model theft, model poisoning and model drift for several years now ever since the AI research community discovered adversarial learning. I place these under the same category because they too act as middleware between your application and the AI model, and I doubt people will deploy more than one middleware.
For AI application developers, trying to control these uncontrollable models is a real challenge. This is a security concern, but it is also a predictability and usability concern. Therefore, I believe these concerns are best served as important features of AI application development platforms.
AI SPM / CNAPP
Once you allow AI to act on the user’s behalf and chain those actions one after the other you’ve reached uncharted territory. Can you really tell if the AI is doing things it should be doing to meet its goal? If you could think of and list everything the AI might need to do then you arguably wouldn’t need AI in the first place.
Importantly, this problem is about how AI interacts with the world, and so it is as much about the world as it is about the AI. Most Copilot apps are proud to inherit existing security controls by impersonating users, but are user security controls really all that strict? Can we really count on user-assigned and managed permissions to protect sensitive data from a curious AI?
The problem here is how to build an AI application that interacts with the world in creative ways, but only the creative ways we actually want without any nasty side effects. This category is the least developed, and it is unclear if it’s even one category or if it’s a job for the application development platform or an external security solution. One thing is clear though, the first step is having a deep and precise understanding of the AI application’s environment. Which identities is it using, what components can it run, on which network, and how do they interact with other regular or AI-powered applications.
A finishing thought
Trying to say anything about where AI or by extension AI security will end up is trying to predict the future. We all know the saying, it’s difficult to make predictions, especially about the future. Let’s see how it holds up and where I’ll be most wrong.