# RPC Security Workshop on September 21st, 2022

## Attendees

- Rifaat Shekh-Yusef (Okta / Auth0)
- Kelley Burgin (MITRE)
- Michael Jenkins (NSA)
- Dean Saxe (AWS)
- Romain Lenglet (SGNL)
- Erik Gustavson (SGNL)
- George Fletcher (Capital One)
- Will Zhang (Google)
- Yang Gao (Google)
- Gaurav Agarwal (Google)
- Pieter Kasselman (Microsoft)

## Notes

### Introductions

- Rifaat
  - Auth0, IETF OAuth WG Chair, started the discussion at IETF 114
  - Presented Multi-subject JWTs at IETF 114
- George
  - Identity Architect at Capital One, previously Verizon / Yahoo
  - Difference in authorization zones (client / edge / internal)
  - Some form of multi-service architecture
  - Performant implementation
  - Presented thoughts at Identiverse 2020
- Will
  - Tech Lead at GCP working on workload identities
  - Looking for security features for customers
  - Build something to help customers secure microservices
- Dean
  - Senior Security Engineer at AWS Identity Team
  - We all have bespoke ways of conveying authz/authn info into the stack
  - How can we interact with each other in a standardized manner
  - Agree with George
- Erik
  - Chief Product Officer at SGNL
  - Formerly Google Workspace
  - In a microservices call stack, how do you propagate identity and authz info
  - Casually chatted with Microsoft and AWS at Identiverse 2022 and found out there is similar interest
  - Having a standard would help overall security in the cloud and our business
- Romain
  - Chief Architect at SGNL
  - Lead for service mesh at Google
  - Bridge microservices and service mesh
- Yang
  - Google GCP Identity Infrastructure
  - Worked on gRPC
  - Learn about the identity problem in the open source space and understand the common requirement
- Gaurav
  - Engineering Manager at GCP Identity Infrastructure
  - How are things done at Amazon, SGNL, Okta and others
  - Drive standardization
- Kelley
  - MITRE Corporation
  - Interested in perpetuating authz and identity information across authorization domains
  - Federation of enterprise environments more
than cloud
  - Presented at IETF 114
- Michael
  - NSA Center for Cybersecurity Standards
  - Look for technologies applicable to defense industry / DoD
  - Ability of an agency / contractor or one security domain to create authz requests into another security domain
- Pieter
  - Identity Standards Architect at Microsoft
  - Customers are interested in multi-cloud
  - How do you present identity / authz information across clouds
  - Few conversations at IETF and Identiverse
- Atul
  - CTO SGNL

## Problem Scope

- George: Security boundaries is a better definition than "cloud provider" boundaries. There could be boundaries within a cloud provider.
  - Cloud vs on-prem are still different security domains
- Dean: Are we talking about sync only or sync and async calls, they may have similar / overlapping solutions
- Romain: Domains should be thought of as being decoupled from cloud. In terms of Kubernetes, it could be at the workload level, not just VPC
- Romain: To handle sync, you are tying an incoming request to an outgoing, but for asynch, this is very different
- Romain: Workloads are very granular
- Erik: Administrative domain being a larger boundary, but the microservices boundary
- Erik: VPC term should be removed from the architecture diagram, instead be replaced by an administrative boundary
- Erik: Sync / Async - could be thought of as user-initiated vs. batch. User initiated could be async (e.g. queue based solution), versus batch is not initiated by any user
- George: Sync means I get a response to my request (i.e. I'm waiting for the response), whereas async is different
- George: Actions are user initiated, e.g. account delete. This would be async, since some actions will happen much later. Inbound mail is an example of something that is not user initiated. There isn't any user action from which these actions originated
- Erik: User initiated versus user-authorized
- Romain: Policies like: tie incoming request to an outbound request: that's sync.
If we don't care about this distinction it won't matter. Request identity may be important
- Atul: We could have 3 categories - user initiated inline, user initiated async, and non-user-initiated
- George: It's only important if at the receipt of a token, the call chain information is important. However, if the call chain is not important, then this distinction may not matter
- George: So is the call chain information important?
- Kelley: When we get to the final Authz decision, it's dependent on identities in the entire chain. All the identities are important to making the decision
- George: There are lots of use cases that don't require that. Can we talk about concrete use-cases that require this?
- Dean: A concrete use case is a cross-cloud use case: A clear chain of callers is going to be useful for incident response, as well as for general debugging and diagnosing
- George: If within my authz domain, the call went through 5 microservices, do you really want the internal services to be known to the other parties
- Dean: Good question, in general, I'd say no. But it may not be important. Need to think about this. There may be a reason to blind some of the call chain, but the developer / incident response handler may want to unwind the stack.
- Pieter: Should be a policy decision for the customer / app developer if it is appropriate to blind / not-blind. (George agrees)
- Rifaat: For the audit trail you may require the whole chain, in the multi-subject JWT we had a number of use cases that required the inclusion of the JWT chain
- Yang: I'd also like to know about more concrete use cases for call chain information. There may be other mechanisms for debugging, so it may not be needed for authorization
- Kelley: We have US government agencies willing to share information with one another. It would be useful to be able to backtrace the entire chain.
In our case we don't need to hide information in the call chain
- George: In Kelley's case: Is it really true that if it reached one agency, they would really care what systems it went through in another agency?
- Kelley: In terms of OAuth, one RS could need to access another RS in another authorization domain. You want to chain those identities to ensure you are allowed to access the ultimate resource
- George: Is there a difference between the requirements internal to a security domain versus across security domains? The call chain visibility requirement may be dependent on the type of data being accessed
- Kelley: Within an authz domain, it's a straightforward application of token exchange. It's only when you go across domains that the call chain becomes interesting
- Rifaat: See STIR and NSM use cases in: https://datatracker.ietf.org/doc/draft-yusef-oauth-nested-jwt/
- Rifaat: Audit trail benefit is not the main one
- George: We could get them through something else if the logs themselves are integrity protected, it could meet compliance requirements
- George: We should consider the impact on performance / security
- George: The call-chain information may not add much value from a security perspective
- George: Proposal I had was to use bearer tokens within an authorization domain. Since the call chain execution time is small, the security implication of using a bearer token is sufficient. Do we still need to have "hop-to-hop" protection?
- Dean: It's not clear that it isn't addressed in another way (auditability). It will be helpful to build the call chain information in here for audit purposes, if it cannot be done some other way
- Dean: I can envision the call chain being a part of the authorization decisions for some high value information
- Dean: We could blind some of the call chain (like Pieter said), I don't want to hold up this discussion for that.
- Rifaat: There are use cases, but it may be hard to do this, and it may be too early to make a decision right now
- Dean: We should consider the use cases, see if it helps or is damaging
- Pieter: When you are crossing trust domains, how do you deal with revocation? You could have long-lived tokens, and you need to revoke them. I don't know if it is in scope for this group. Short-lived tokens may be one way to do this, but the resiliency and reliability requirements go very high. Long-lived tokens may be an answer, but we need a way to revoke them
- Atul: Replay attack is a possibility if the tokens are not bound to the call chain
- Rifaat: We're assuming that we will be using bearer tokens, we could use DPoP
- Pieter: There needs to be some capability to manage the lifecycle of tokens across security domains

### Workgroup Charter Discussion

- Gaurav: Why do we need to limit it to authorization? Why not authentication too? To answer the question: Is this request genuine, is it coming from the right principal
- Erik: How to authenticate / how is someone authenticated should not be a part of this discussion
- Erik: Asserting the identity immutably in a call chain could be in scope, but the system that does the authentication may not be in scope
- George: I agree that it applies that way to end-user authentication, but when you get into machine identities, it could be in scope. We could consider the authentication as a sub-component of the authorization
- Erik: Machine to machine tokens may be minted by some system. Along the line, ...
- George: We're not doing token issuance, but as an identity is received at the outer edge, we are trying to verify that the identity is assured
- Gaurav: To me identity includes token issuance and verification in the call stack.
There may be an incoming token, but verifying the token is also authentication
- Pieter: There's an initial authn, after which there is authz, one solution may be that there is PoP, but I hadn't considered that as an authentication step
- Dean: It seems like we are talking about authenticating the end user being out of scope, but as we go through each layer in the stack, we are authenticating the client to take the action. We're authorizing the user on behalf of whom the action is being taken
- Rifaat: Can we assume that client authn is done at a different layer? Microservices may use mTLS to authenticate each other
- Dean: Which layers?
- Rifaat: If we are talking about different layers e.g.
- Pieter: How this works in SPIFFE - it gives a way to bootstrap an identity, that is separate from the authorization. SPIFFE can give the machine identity layer, but the fine-grained authz is missing there
- George: This comes down to - request comes in, e.g., "add a stock to a watch list" - gateway verifies the incoming token - user identity, scope (e.g. finance app). Authorization token is just flowing through as a bearer token. The next step is to bind every hop. Token exchange is a way of doing this binding. The use-case requires that the receiving entity needs to do the verification
- Gaurav: Bearer token alone should not be the only mechanism, if it gets stolen, we expose ourselves to compromise. The bearer token makes sense only if it is with the right client. How do we carry the context in a multi-hop scenario
- Pieter: Proof of Possession in a multi-hop scenario can get complex. As we think about the solution, we should address the risks around bearer tokens, but we may not be able to preclude them.
- Rifaat: We're going to the solution.
The question is whether we can rely on SPIFFE for the identity and restrict this to just the authz
- Pieter: If we can focus on fine-grained authz, that will be a good start for me
- Dean: I agree that that makes sense, as cloud providers we all have our internal mechanisms, and changing that is going to be hard
- Atul: This applies to the way customers use the cloud platforms
- Pieter: It may be hard to draw that line, because there may not be hard boundaries
- Gaurav: I agree with Atul that none of the things that we have going inside Google / internal credentials we use - none of that is exposed to customers. If the goal here is to provide a standard for customers to secure their RPCs
- Pieter: Some customers may already be deploying SPIFFE. We should leverage what's out there, or at least what we come up with should be compatible with what's out there. It's not exactly a green field either
- George: We need to be able to maintain the immutability of the subject in the call chain. Often the user is referenced using a parameter, and it can be replaced. It has to be a key element of a solution. There may be interesting transformations of the identity, and how do you maintain immutability across such transformations / trust domains, but I believe that principal has to be there as a part of the solution
- Pieter: Please clarify subject of the request
- George: In the watch list example, "George" may be the subject. In an IoT case, it may be a specific lightbulb. Identities are not necessarily people
- Gaurav: Immutability of a subject is critical. Just a thought: Do we even have a standardized way of naming subjects across cloud platforms? Is SPIFFE used that much?
- Pieter: Maybe a requirement is that the solution needs to be able to plug in different identity types (people, workloads, IoT, etc.)
- Pieter: Important to tie the identity back across different boundaries.
There may be privacy issues, but for audit and compliance we need to be able to tie them back
- Gaurav: If we define authorization as "who can do what", there needs to be a way to represent the "who" across multiple clouds. That seems to be missing today.
- George: Today we don't really have that concept at all, so a cross-domain call ends up being a service-to-service trust call, with an identifier as a parameter. This is because we don't have a good way to translate identities across boundaries. In some cases, the user may not even exist across boundaries.
- Pieter: If nothing in the world had an identity today, we could talk about naming identities here. You may have to have a way to plug in various identities - getting to consensus will be very very hard
- George: There have been multiple attempts at this, XRI, DIDs. Maybe it's sufficient that for internal things, we just say such and such is usable.
- Pieter: There is some work done on the workload part. It should be possible to plug in different identity types
- Gaurav: Context is important to authorization
- Pieter: Establishing the context to the authorization server. That is treated within the ecosystem
- Pieter: Someone is issuing a token, and they have the responsibility of establishing the context
- Gaurav: That makes sense, the entity issuing the token has the responsibility of establishing the context. The point I'd like to make is that we should be able to maintain the immutability of the originating context
- George: We should be able to augment the context as the call propagates through the call chain
- Pieter: There is an aspect of policy here. How do you orchestrate all of this
- Pieter: We should not invent another XACML / Rego, etc.
- George: It may depend on how the policy framework is implemented. The PDP may need a lot of data from the authentication context.
We should try to work with all the policy agents out there and not have to invent a new one
- George: Based on the scenario, there may be a policy validation up-front, and not downstream. In other cases, you may need to evaluate it on a hop-by-hop basis
- Gaurav: In a multi-cloud situation, there may be a large number of tuples that represent a policy. Are we saying that there could be one piece of software (e.g. an authz server in each cloud) that makes these decisions
- George: I worry about trying to attach policy to authorization. Whether someone uses a centralized / distributed policy framework, which language they use should be up to them.
- Gaurav: Say atul@sgnl.ai is a subject. How do we define a policy for this subject inside GCP.
- Rifaat: This should be out of scope, we should not be dealing with this, otherwise the scope of the work will be enormous
- Gaurav: How does AWS know who the subject of a particular call is? If that cannot be done, does it make sense for this authorization standard to even exist?
- George: We need to think about how a cloud platform identifies a user as they cross boundaries. If a user is transitioning across boundaries, is there a way to replace the identifier in a secure way? We should think about it within the group. What is out of scope is, once you have an identifier, what policies you apply to it.
- Rifaat: Agree with that
- Pieter: Agree. We can acknowledge policies will be needed, but we do not want to solve that problem
- George: We may need common taxonomy for certain kinds of authorization.

### Standardization Venues

- Rifaat: It may be too early to discuss this, we should keep digging, come up with a high level thinking and then make a decision
- Rifaat: We have two official meetings at IETF 115 and two side meetings.
This could be good for a side meeting
- Romain: SPIFFE is in CNCF, we could do the same here
- Rifaat: Why do you think this needs to be in the same standards body as SPIFFE
- Dean: If we go down CNCF, it may preclude a non-cloud centric discussion
- Pieter: It could be achieved through liaison. IETF has the benefit that there are many experts in the OAuth working group in the IETF. Many stakeholders know how to operate in the IETF
- Atul: OpenID?
- George: OpenID is fine, FAPI 2 is an OAuth specific profile, and is done within the OpenID Foundation. I'm not too particular as long as my lawyers have already approved it. On the flip side, getting somewhere sooner rather than later is useful, because it gives the IP protection.
- Rifaat: Agree
- Dean: Not being as familiar with IETF / OpenID: What would be the benefits of one versus the other? Are there some specifics?
- Pieter: We could do this as a work item within the OAuth working group. If we took this into OpenID, we may have to establish a new working group. The other advantage is that there is a regular meeting cadence, whereas the OpenID Foundation operates more offline. Cannot comment on CNCF, because I don't know how it works. It could also be exclusionary
- Rifaat: Would be happy to help as chair of the OAuth working group
- George: Not too worried about getting a new WG set up within the OpenID Foundation. Otherwise agree with Pieter
- Yang: No strong preference either way
- George: Q to Rifaat: Can we just get this approved as a stream within the working group charter? We can meet at IETF 115 in London and discuss this.
- George: Virtual participation is also possible at IETF 115.

## Action Items
