Updated: Mar 8, 2022
Broken authorization is the most pervasive vulnerability in microservices. 100% of financial services and healthcare apps recently tested by security researchers had flawed authorization. But authorization for access control isn’t nearly as complex as most security controls. If you don’t want A to access B, write a rule to disallow it. Otherwise, don’t. Easy peasy. If only autonomous development and loose coupling of services – fundamental properties of microservices architecture – hadn’t made this seemingly simple security control so darn difficult.
Tightly coupled monolithic applications could keep state from the front-end user authentication to the back-end data the application accessed on that user’s behalf. In microservices, however, communicating state between the front-end and the back-end would mean a change to one requires a change to the other. That’s a no-no. We want services to be like Legos: reusable, independently scalable blocks.
From the perspective of least-privilege, the implication is we secure which blocks can connect to which other blocks just as we did with micro-segmentation in the data center. The front-end service is allowed to call the back-end service. Today’s API security controls allow further granularity. We can authorize the inventory service only to call the order processing service’s API to get sales updates. What we can’t do is authorize the inventory service to only access SKUs and quantities, and not the customer order data those sales updates contain.
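The granularity limit described above can be sketched in a few lines. This is a minimal illustration, not any product's API: the service names, the endpoint string, the `ALLOWED_CALLS` table, and the payload fields are all assumptions made for the example.

```python
# Hypothetical service-level allow-list: (caller, callee, endpoint).
ALLOWED_CALLS = {
    ("inventory", "order-processing", "GET /sales-updates"),
}


def is_call_allowed(caller: str, callee: str, endpoint: str) -> bool:
    """Coarse-grained check: may this caller reach this endpoint at all?"""
    return (caller, callee, endpoint) in ALLOWED_CALLS


def get_sales_update(caller: str) -> dict:
    """Stand-in for the order processing API; the payload is invented."""
    if not is_call_allowed(caller, "order-processing", "GET /sales-updates"):
        raise PermissionError(f"{caller} may not call GET /sales-updates")
    # The rule above never inspects the payload, so every field goes out.
    return {
        "sku": "A-100",
        "quantity": 3,
        "customer_name": "Jane Doe",
        "shipping_address": "1 Example Street",
    }


update = get_sales_update("inventory")
```

The inventory service is correctly authorized to make the call, yet `update` still carries `customer_name` and `shipping_address`. The rule decides who may call, not what the response may contain.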
Fine-Grained Authorization Intentions
Complying with the data access control requirements of GDPR, HIPAA, FINRA and other regulations means knowing whose PII an API contains, where it comes from, and where it's going. However, there’s been no good way to enforce fine-grained access policies on data moving in APIs. We want to control where that data can go, but we have no good way to see what that data is.
An API at First American Corp leaked 800 million customer documents because it didn’t know it was giving customer ‘A’ data belonging to customer ‘B’. Security researchers disclosed similar problems at Atlassian, Uber, Peloton, Parler, and 2,400 other companies last year. API security tools aren’t built to inspect API payload data and can’t detect APIs that share more data than they should. DLP tools can classify data in API payloads but would struggle to determine which customer that data belongs to. What regex rule or keyword search could tell you who a random snippet of code, tweet, or image belongs to?
Security tools may analyze API request parameters or learn historical request-response patterns to detect anomalies but do so without knowing what data those APIs actually contain. With imperfect knowledge of what the services in an application actually do, we can only make assumptions about what data APIs return.
Services are Data Shredders – APIs are Blowers
80% of sensitive data accessible by APIs is unstructured in files and objects. It’s a common design pattern in microservices to have back-end services that copy objects from storage systems, combine, rearrange, and reformat the data they contain before sending it on to other services in APIs. The trouble is none of those services know what policy applies to the data they handle.
Metadata and access control lists (ACLs) on storage systems identify the data those objects contain, and the entities allowed to access it. Security best-practices, however, don’t allow broad access to such privileged information. Back-end services authorized to copy the data must leave the access policies behind. Even if they could access those policies, logic to combine them and determine a policy for output data would need to be written into the service. That’s a problem because services are built autonomously before developers know how the service will be deployed – before they can know what that policy logic should do.
Encrypting the data objects provides no better control over data propagation and comes with a cost. The back-end service authorized to copy the data has to be allowed to decrypt it. Still the service doesn’t know what access policies apply to it and other data it might be combined with. Encryption is a performance killer, and it’s costly to deploy. Every service needs to call an API or have an agent installed to decrypt the data. Agents are easy to deploy but force retesting of the services they are installed in.
Autonomous Services Over-Share Data
70% of enterprises have adopted microservices because autonomously built services are easy to deploy and scale in the cloud. The whole point of service autonomy is loose coupling – a service publishing an API doesn’t need to know what other services will do with the data they’re sent. Developers want their services to be used, and often scope their APIs broadly to cover many use cases. Why shouldn’t an API that reads the day’s order history also return the customer account information along with what they ordered? Nobody has a crystal ball.
Oversharing of data in APIs, or what OWASP terms Excessive Data Exposure, can only be detected when a policy defines what excessive means, and that policy can’t be defined until the service is deployed and you know which other services will be receiving the data.
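To make "excessive" concrete, here is a small sketch of a check that only becomes possible once a deploy-time policy exists. The `CONSUMER_POLICY` table, the consumer names, and the field names are hypothetical; the point is that the set difference cannot be computed until someone has written down what each consumer needs.

```python
# Hypothetical deploy-time policy: the fields each consuming service needs.
CONSUMER_POLICY = {
    "inventory": {"sku", "quantity"},
    "billing": {"order_id", "customer_name", "total"},
}


def excess_fields(consumer: str, payload: dict) -> set:
    """Return the payload fields the policy does not grant to this consumer."""
    # Consumers with no policy entry are granted nothing.
    allowed = CONSUMER_POLICY.get(consumer, set())
    return set(payload) - allowed


# Invented response row from a broadly scoped order history API.
order_history_row = {
    "order_id": 17,
    "sku": "A-100",
    "quantity": 3,
    "customer_name": "Jane Doe",
    "total": 42.0,
}

inventory_excess = excess_fields("inventory", order_history_row)
```

For the inventory service, `inventory_excess` contains `order_id`, `customer_name`, and `total`. The same row sent to billing has a different excess set, which is why the API's developer could not have defined it in advance.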
On the flip side, services that call an API could be a little more specific about what data they want. GraphQL does just that by giving SQL-like query capabilities to services and APIs. Loose coupling throws a monkey wrench into the works there as well – how does the service know it’s authorized to receive what it asks for? An order processing service may scope its requests to ask an order entry service only for customer account details. However, if the order processing service is deployed in the USA and the order entry service is deployed in Germany, that request may violate privacy regulations.
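The deployment dependence can be illustrated with a toy check. Everything here is an assumption for illustration: the `PERSONAL_FIELDS` set, the `BLOCKED_TRANSFERS` rule, and the region codes are invented and are not a statement of what any regulation actually requires.

```python
# Hypothetical classification of fields that count as personal data.
PERSONAL_FIELDS = {"customer_name", "email", "account_number"}

# Hypothetical rule: personal data may not flow from the first region to the second.
BLOCKED_TRANSFERS = {("DE", "US")}


def fields_permitted(requested, source_region: str, dest_region: str) -> set:
    """Filter a caller's requested fields by where both services are deployed."""
    requested = set(requested)
    if (source_region, dest_region) in BLOCKED_TRANSFERS:
        return requested - PERSONAL_FIELDS
    return requested


# The same narrowly scoped request, evaluated under two deployments.
request = ["order_id", "customer_name", "email"]
cross_border = fields_permitted(request, "DE", "US")
same_region = fields_permitted(request, "DE", "DE")
```

The request itself is identical in both cases. Only the deployment differs, so neither service's code can decide the answer on its own.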
APIs Need Access Control on the Data They Carry
If customer A’s identity is known only to a front-end service, yet by fraud, bug, misconfiguration, privilege escalation, or excessive data exposure, they are about to receive data that only a back-end storage system can tell belongs to customer B, authorization is broken. The front-end can be built to pass A’s identity to the back-end storage system, but then the back-end has to be written to receive it. Service autonomy loses.
Security controls deployed as overlays, after all the services have been composed into an application, are the preferred way to enforce authorization policies and preserve service autonomy. Yet to these overlays, services are black boxes. Services don’t fully know what data they have. AppSec and DevOps teams writing the policies on these controls certainly don’t fully know what data services move between them.
Broken authorization is destined to remain the top vulnerability in microservices until API security controls can connect the data APIs carry to a policy and evaluate that policy at the point of access.
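As a rough sketch of what evaluating policy at the point of access could look like, assume that data somehow arrives labeled with its owner. That labeling is the hard, unsolved part this article describes; the `LabeledRecord` type and `release_to` function below are hypothetical and take it as given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledRecord:
    """Hypothetical payload item that carries its owner alongside its data."""
    owner_id: str
    body: dict


def release_to(requester_id: str, records: list) -> list:
    """Release record bodies only if every record belongs to the requester."""
    for record in records:
        if record.owner_id != requester_id:
            raise PermissionError(
                f"record owned by {record.owner_id} blocked for {requester_id}"
            )
    return [record.body for record in records]


own = [LabeledRecord("customer-a", {"doc": "statement.pdf"})]
mixed = own + [LabeledRecord("customer-b", {"doc": "deed.pdf"})]

released = release_to("customer-a", own)
```

Calling `release_to("customer-a", mixed)` raises `PermissionError`. Neither the front-end nor the back-end had to change for the check to run, which is the property an overlay needs in order to preserve service autonomy.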