---
title: Domains Technical Deep Dive
tags: Pulp, Domains, Multitenancy
description: Current state of Domain development as of 10-26-2022
slideOptions:
theme: beige
---
# Domains Technical Deep Dive
Oct 26, 2022
slides: https://hackmd.io/@gerrod/Hk40ibLEj#/
---
### What are Domains
```python
class Domain(BaseModel, AutoAddObjPermsMixin):
    name = models.TextField(null=False, unique=True)
    description = models.TextField(null=True)
    # Storage class is required, optional settings are validated by serializer
    storage_class = models.TextField(null=False)
    storage_settings = models.JSONField(default=dict)

    def get_storage(self):
        """Returns this domain's instantiated storage class."""
        if self.storage_class == "default":
            return default_storage
        storage_class = get_storage_class(self.storage_class)
        return storage_class(**self.storage_settings)
```
A Domain is a namespace object that separates users' Pulp objects into their own isolated silos. Each domain also carries its own storage backend configuration.
---
### Pulp with Domains
```python
domain = ForeignKey(
    "Domain", default=get_default_domain, on_delete=PROTECT
)
```
* Most objects will now have a domain relation
    * https://hackmd.io/@gerrod/rJxctHjlbi
* A default domain is needed so that `domain` can become a part of each object's `unique_together` constraint
* Notable objects without domains: `User`, `Group`, `AccessPolicy`, `Role`
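
One likely reason a concrete default is needed (an assumption, not stated on this slide): SQL unique constraints treat `NULL` values as distinct, so rows with no domain would never collide with each other. A minimal sketch using SQLite and a hypothetical `repo` table, chosen only to keep the example self-contained:

```python
import sqlite3

# Hypothetical table standing in for a model with a (name, domain) unique constraint.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE repo (name TEXT NOT NULL, domain_id INTEGER, UNIQUE (name, domain_id))"
)

# With a NULL domain, both rows are accepted: NULLs never compare equal.
conn.execute("INSERT INTO repo VALUES ('foo', NULL)")
conn.execute("INSERT INTO repo VALUES ('foo', NULL)")
null_rows = conn.execute(
    "SELECT COUNT(*) FROM repo WHERE domain_id IS NULL"
).fetchone()[0]

# With a concrete default domain (pk 1 here), the duplicate is rejected.
conn.execute("INSERT INTO repo VALUES ('bar', 1)")
try:
    conn.execute("INSERT INTO repo VALUES ('bar', 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```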
---
### A look at `get_default_domain`
```python
def get_default_domain():
    global default_domain_pk
    # This can be run in a migration, and once after
    if default_domain_pk is None:
        try:
            Domain = models.Domain
        except AttributeError:
            return None
        try:
            domain = Domain.objects.get(name="default")
        except Domain.DoesNotExist:
            domain = Domain(name="default", storage_class="default")
            domain.save(skip_hooks=True)
        default_domain_pk = domain.pk
    return default_domain_pk
```
`get_default_domain` lives in `pulpcore.app.util`. It is a special cached function that runs at startup and stores the default domain's primary key in a module-level variable.
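
The caching pattern can be sketched without Django. In this minimal illustration, `fake_lookup` is a hypothetical stand-in for the database query and `42` is a made-up primary key; the point is only that the expensive lookup runs once per process:

```python
_default_pk = None  # module-level cache, playing the role of default_domain_pk
lookups = 0         # counts how often the "database" is hit


def fake_lookup():
    """Hypothetical stand-in for Domain.objects.get(name="default").pk."""
    global lookups
    lookups += 1
    return 42


def get_default_pk():
    """Return the cached key, querying only on the first call."""
    global _default_pk
    if _default_pk is None:
        _default_pk = fake_lookup()
    return _default_pk


first = get_default_pk()
second = get_default_pk()
```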
---
### Artifacts in Domains
Artifacts are now deduplicated per domain, since each domain can have a different storage backend.
```python
class FieldFile(BaseFileField.attr_class):
    @property
    def storage(self):
        if domain := getattr(self.instance, "domain", None):
            return domain.get_storage()
        return self._storage

    @storage.setter
    def storage(self, value):
        self._storage = value
```
----
Artifact storage paths are now separated by the domain's `pulp_id`, except for the `default` domain.
```python
def get_artifact_path(sha256digest, domain=None):
    args = ["artifact", sha256digest[:2], sha256digest[2:]]
    # Prevent collisions if two domains have the same backend settings
    if domain is not None:
        if domain.storage_class != "default":
            args.insert(1, str(domain.pulp_id))
    return os.path.join(*args)
```
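
To see the resulting paths, here is a self-contained sketch that repeats the function and calls it with stand-in objects. The `SimpleNamespace` domains, the placeholder digest, the `"1234"` id, and the `"some.custom.Storage"` class name are all illustrative assumptions:

```python
import os
from types import SimpleNamespace


def get_artifact_path(sha256digest, domain=None):
    args = ["artifact", sha256digest[:2], sha256digest[2:]]
    # Only domains with a non-default storage class get their own subdirectory.
    if domain is not None and domain.storage_class != "default":
        args.insert(1, str(domain.pulp_id))
    return os.path.join(*args)


digest = "ab" + "c" * 62  # placeholder, not a real sha256
default_domain = SimpleNamespace(storage_class="default", pulp_id="0000")
custom_domain = SimpleNamespace(storage_class="some.custom.Storage", pulp_id="1234")

default_path = get_artifact_path(digest, default_domain)  # artifact/ab/ccc...
custom_path = get_artifact_path(digest, custom_domain)    # artifact/1234/ab/ccc...
```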
----
There is a new way to access Artifact contents:
```python
artifact = Artifact.objects.get(sha256=sha256, domain=domain)
# Old way
with default_storage.open(artifact.file.name) as f:
    ...
# New domain compliant way, uses domain's storage of the artifact
with artifact.file.open() as f:
    ...
```
---
### Enabling Domains in Pulp
```python
class PulpFilePluginAppConfig(PulpPluginAppConfig):
    ...
    domain_compatible = True
```
* Domains are an optional feature enabled by the user
* Domains are turned on through the `DOMAIN_ENABLED` setting
* With domains enabled, Pulp will only start if every plugin is compatible with domains. Plugins declare this through a new `domain_compatible` attribute on their `PulpPluginAppConfig`
---
### Domains Interfacing in Pulp
* API URL routing will change to use a new setting: `V3_DOMAIN_API_ROOT`
    * `/pulp/<domain_path>/api/v3/`
* Content app routing will now have `{domain_path}` added to the end of `CONTENT_ORIGIN`
    * `/pulp/content/{domain_path}/`
* All current objects in Pulp will be found under the `default` domain
    * This includes objects that are not a part of domains
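
As a rough illustration of the two URL shapes listed above, the hypothetical helpers below (`api_root` and `content_root` are not Pulp functions) simply fill the domain name into the example paths from this slide:

```python
def api_root(domain_name="default"):
    """Build the API prefix for a domain, following /pulp/<domain_path>/api/v3/."""
    return f"/pulp/{domain_name}/api/v3/"


def content_root(domain_name="default"):
    """Build the content prefix for a domain, following /pulp/content/{domain_path}/."""
    return f"/pulp/content/{domain_name}/"


test_api = api_root("test")
default_content = content_root()
```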
---
### Backwards Compatibility and Hidden Helpers
A new Domain middleware keeps current ViewSets working by removing `domain_path` from the handler's kwargs and storing it on the request instead.
```python
# In pulpcore.app.middleware
class DomainMiddleware:
    def process_view(self, request, view_func, view_args, view_kwargs):
        domain = view_kwargs.pop("domain_path", "default")
        if not Domain.objects.filter(name=domain).exists():
            raise Http404()
        setattr(request, "domain", domain)
        return None
```
----
To avoid having to modify every serializer to handle the new domain parameter for object creation, a new hidden field has been added to the `ModelSerializer`:
```python
class ModelSerializer(serializers.ModelSerializer):
    ...
    def get_domain(self):
        context = self.root.context
        if domain := context.get("domain", None):
            if isinstance(domain, Domain):
                return domain
            name = domain
        elif request := context.get("request", None):
            name = getattr(request, "domain", "default")
        else:
            name = "default"
        return Domain.objects.get(name=name)

    domain = serializers.HiddenField(default=get_domain)
```
----
This field is only present when the model supports domains.
```python
class ModelSerializer(serializers.ModelSerializer):
    ...
    def __init_subclass__(cls, **kwargs):
        ...
        # Drop the hidden field again for models that have no domain relation
        if "domain" in meta.fields and model:
            if not hasattr(model, "domain"):
                meta.fields = tuple(set(meta.fields) - {"domain"})
```
---
### Tasks with Domains
Tasks need to know the domain they were created in, so `dispatch` now accepts a `domain` parameter.
```python!
def dispatch(...
    domain=None,
):
    ...
    if domain:
        if isinstance(domain, str):
            domain = Domain.objects.get(name=domain)
    else:
        domain = Domain.objects.get(name="default")
    resources = _validate_and_get_resources(exclusive_resources, domain)
    ...
```
----
Calls to `reverse` should use `kwargs` when Domains are enabled, so that `domain_path` can be passed by name:
```python!
def get_url(model, domain=None):
    kwargs = {"pk": model.pk}
    if settings.DOMAIN_ENABLED:
        kwargs["domain_path"] = getattr(domain, "name", "default")
    viewname = get_view_name_for_model(model, "detail")
    return reverse(viewname, kwargs=kwargs)
```
----
Domains are strictly isolated: objects from one domain cannot be used together with objects from another.
```python!
def check_cross_domains(self, data):
    domain_set = set()
    for name, value in data.items():
        if name == "domain":
            domain_set.add(
                value.pk if isinstance(value, Domain) else value
            )
        elif isinstance(value, Model) and (domain := getattr(value, "domain_id", None)):
            domain_set.add(domain)
    if len(domain_set) > 1:
        raise serializers.ValidationError(
            _("Objects must be all of the same domain.")
        )
```
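
The same idea can be tried outside Django. In this simplified sketch, `FakeObj` and `domains_in` are hypothetical stand-ins (not pulpcore names) that collect every domain id referenced by a payload; more than one distinct id means the request mixes domains:

```python
from dataclasses import dataclass


@dataclass
class FakeObj:
    """Hypothetical stand-in for a model instance that belongs to a domain."""
    domain_id: int


def domains_in(data):
    """Collect the distinct domain ids referenced by a validated payload."""
    found = set()
    for name, value in data.items():
        if name == "domain":
            found.add(value)
        elif isinstance(value, FakeObj):
            found.add(value.domain_id)
    return found


same = domains_in({"domain": 1, "repository": FakeObj(1), "name": "x"})
mixed = domains_in({"domain": 1, "repository": FakeObj(2)})
is_cross_domain = len(mixed) > 1  # this payload would be rejected
```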
---
### Domain Compatible Tasks
| Compatible      | Compatible with some Work | Not Compatible |
| --------------- | ------------------------- | -------------- |
| general_create  | publishing                | export         |
| orphan_cleanup  | syncing                   | import         |
| task_purge      | acs_refresh               | signing        |
| reclaim_space   |                           |                |
| repair_artifact |                           |                |
| upload          |                           |                |
---
### RBAC w/ Domains
Permissions gain a new level, the domain level, and with it come new global access conditions.
```python
class UserRole(BaseModel):  # Same change on GroupRole
    ...
    domain = models.ForeignKey("Domain", null=True, on_delete=CASCADE)

def has_domain_perms(request, view, action, permission):
    if settings.DOMAIN_ENABLED:
        domain_name = request.domain
        domain = Domain.objects.get(name=domain_name)
        return request.user.has_perm(permission, obj=domain)
    return False

def has_model_domain_or_obj_perms(...):
    return has_model_perms(...) or has_domain_perms(...) or has_object_perms(...)
```
* Roles and AccessPolicies will remain system-wide objects.
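
How the three levels combine can be sketched with plain data. The `grants` set, the `allowed` helper, and the permission string are illustrative assumptions rather than pulpcore's implementation; the sketch only shows that a domain-level grant applies inside its own domain and nowhere else:

```python
# Hypothetical grants for one user: (permission, level, target)
grants = {
    ("file.view_filerepository", "domain", "test"),
}


def allowed(permission, domain, obj_id=None):
    """Check the model level first, then the domain level, then the object level."""
    return (
        (permission, "model", None) in grants
        or (permission, "domain", domain) in grants
        or (obj_id is not None and (permission, "object", obj_id) in grants)
    )


in_test = allowed("file.view_filerepository", "test")
in_default = allowed("file.view_filerepository", "default")
```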
----
```python!
# In NamedModelViewSet
def get_queryset(self, qs):
    ...
    if settings.DOMAIN_ENABLED and self.filter_by_domain:
        if hasattr(qs.model, "domain"):
            domain = getattr(request, "domain", "default")
            qs = qs.filter(domain__name=domain)
    return qs
```
* Objects will be scoped to their domain on top of RBAC scoping
----
```
http :5001/pulp/test/api/v3/domains/
[
    {
        "name": "default",
        "pulp_href": "/pulp/default/api/v3/domains/.../",
        ...
    },
    {
        "name": "test",
        "pulp_href": "/pulp/default/api/v3/domains/.../",
        ...
    }
]
```
* Currently, endpoints and objects without domains are readable in every domain (by users with the correct permissions), but their href will always show them as being in the `default` domain
---
### Adding Domain Compatibility to a Plugin
1. Add a `domain` relation to all models that lack one
    * This includes the plugin's `Content` models, since the base `Content` model does not have it.
```python!
class FileContent(Content):
    ...
    domain = ForeignKey(
        "core.Domain", default=get_default_domain, on_delete=PROTECT
    )

    class Meta:
        # The domain becomes part of the uniqueness constraint
        unique_together = ("digest", "relative_path", "domain")
```
----
2. Update each task that queries or creates objects to include the `domain` field.
```python!
def purge(finished_before, states):
    ...
    # Only touch tasks from the domain this task was dispatched in
    domain = Task.current().domain
    tasks_qs = Task.objects.filter(
        finished_at__lt=finished_before,
        state__in=states,
        domain=domain,
    )
    ...
```
----
3. Update each ViewSet's AccessPolicy to use the domain access conditions
4. Update any extra URL routes to include `<domain_path>`
```python!
if settings.DOMAIN_ENABLED:
    custom_path = "/custom/plugin/<domain_path>/"
else:
    custom_path = "/custom/plugin/"
...
urlpatterns = [path(custom_path, include(urls))]
```
----
5. Finally, add `domain_compatible = True` to the plugin's `PluginAppConfig`
```python
class PulpFilePluginAppConfig(PulpPluginAppConfig):
    """
    Entry point for pulp_file plugin.
    """
    name = "pulp_file.app"
    label = "file"
    version = "1.12.0.dev"
    python_package_name = "pulp-file"
    domain_compatible = True
```
---
### Questions?
pulpcore: https://github.com/pulp/pulpcore/pull/3190
pulp_file: https://github.com/pulp/pulp_file/pull/810/
discourse: https://discourse.pulpproject.org/t/new-multi-tenancy-feature-domains/635