Galaxy_NG approach to track wisdom flag on objects

Given 2 levels of scope:

  • namespace
  • legacy_namespace
We have a requirement to display opt-outs at a granular content level, so they will be displayed in the content APIs. However, we will only set the boolean at the namespace and legacy_namespace level, and it will apply to all content under that namespace. This can be altered in the future to opt out individual content, at which point we will need to determine how to resolve conflicts with the namespace-level setting.

A flag will be placed on the Wisdom table indicating
whether the content under that scope is opted out of the wisdom
scanner.

By default all content is enabled for wisdom indexing; a user may click a
button in the UI to opt out.

NOTE user scope is not possible because we don't relate
content with a specific owner and we don't have a public user profile API.

NOTE 2 No need to track the repository as the wisdom mark
applies for community content only.

NOTE 3 No need to sync the marks down to PAH or export them;
the booleans exist only in the beta-galaxy deployment.

Note 10 Feb 2022

With the requirement of showing badges removed, we can keep the API implementation as a wisdom-based API endpoint and not alter the namespace and content endpoints at this time.

Usage

Option 1

The wisdom scanner can traverse the Hub API for all namespaces, and if the namespace has the opt-out flag then skip all content under the namespace.

Option 2

The wisdom scanner can traverse the Hub API for all content, and if the content has opt-out then skip it.
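
Both options amount to filtering over API results. The following is a minimal sketch, assuming the scanner has already fetched namespaces and content as lists of dicts. The function names and the `name` / `namespace` keys are hypothetical, and `wisdom_index: true` is assumed to mean that an opt-out mark exists for the object.

```python
def scan_by_namespace(namespaces, content):
    """Option 1: skip all content under namespaces that carry the opt-out flag."""
    opted_out = {ns["name"] for ns in namespaces if ns.get("wisdom_index")}
    return [item for item in content if item["namespace"] not in opted_out]


def scan_by_content(content):
    """Option 2: skip each content item that carries the opt-out flag itself."""
    return [item for item in content if not item.get("wisdom_index")]


# Hypothetical API results, for illustration only.
namespaces = [
    {"name": "cisco", "wisdom_index": True},
    {"name": "community", "wisdom_index": False},
]
content = [
    {"namespace": "cisco", "name": "ios", "wisdom_index": True},
    {"namespace": "community", "name": "general", "wisdom_index": False},
]

indexable_option_1 = scan_by_namespace(namespaces, content)
indexable_option_2 = scan_by_content(content)
```

Since the flag is only set at the namespace level, both options should select the same content; Option 1 needs fewer flag checks, while Option 2 keeps working if per-content opt-outs are added later.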

Goal

The goal is to provide on the API a boolean flag wisdom_index: bool under the respective APIs:

  • v3 /content/namespace
  • v3 /content/namespace/collection/collection_version
  • /api/v1/namespaces/
  • /api/v1/roles/
  • Also we likely will have an endpoint to list out AiIndexDenylist objs (UI could use this rather than the namespaces endpoints)

Data Model


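from django.db import models


class AiIndexDenylist(BaseModel):
    # scope is stored explicitly to make querying faster
    scope = models.CharField(
        max_length=64,  # assumed value; CharField requires a max_length
        choices=[("namespace", "Namespace"), ("legacy_namespace", "Legacy Namespace")],
    )
    reference = models.CharField(
        max_length=255,  # assumed value
        validators=[ReferenceRegexValidator("...")],
    )  # e.g. cisco, 123__legacy_namespace

    class Meta:
        unique_together = ["scope", "reference"]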

How data is updated when user does an opt-out

AiIndexDenylist.objects.create(scope="namespace", reference="cisco")
AiIndexDenylist.objects.create(scope="legacy_namespace", reference="123__legacy_namespace")

How data is updated when user reverses an opt-out

# Managers have no delete(**kwargs); filter first, then delete the match.
AiIndexDenylist.objects.filter(scope="namespace", reference="cisco").delete()

How data is queried

# On namespace serializer
AiIndexDenylist.objects.filter(
    scope="namespace",
    reference="cisco"
)

# legacy_namespace ...

When settings.WISDOM_ENABLED=true, the query uses annotations to add wisdom_index: true whenever a
matching row exists in the Wisdom table.

When settings.WISDOM_ENABLED=false, the annotation defaults to null.

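from django.conf import settings
from django.db.models import BooleanField, Exists, OuterRef, Value

# Used when WISDOM_ENABLED is off: keeps the field in the schema
# without querying the denylist table.
NULL_INDEX = Value(None, output_field=BooleanField(null=True))

# namespace
queryset = Namespace.objects.all().annotate(
  wisdom_index=Exists(
    AiIndexDenylist.objects.filter(scope="namespace", reference=OuterRef("name"))
  ) if settings.WISDOM_ENABLED else NULL_INDEX
)

# collection version
queryset = CollectionVersion.objects.all().annotate(
  wisdom_index=Exists(
    AiIndexDenylist.objects.filter(
      scope="namespace",
      reference=OuterRef("namespace")
    )
  ) if settings.WISDOM_ENABLED else NULL_INDEX
)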

# legacy_namespace
# ...

# legacy_role
# ...

# consider collection api `wisdom_index` out of scope for 31Mar

Object consideration across galaxy_ng & pulp_ansible

CollectionVersion views & serializers

These are located in pulp_ansible, not galaxy_ng. Since this is a temporary solution, consider the pros and cons of adding wisdom_index to pulp_ansible (and maybe needing to remove it later) versus subclassing in galaxy_ng (which we have been moving away from). If we subclass in Feb, we can always add it to pulp_ansible later (after the next round of requirement adjustments).

Namespace model, view, & serializer

Write this in a way that the AiIndexDenylist table can be used by both galaxy_ng.Namespace and pulp_ansible.Namespace, since this model will be moved in the middle of the galaxy wisdom work.

In addition to the Namespace model being moved, the view and serializer will be moved as well (also in the middle of the galaxy wisdom work). Consider making the wisdom work as decoupled as possible so that these changes do not conflict; we can adjust later, after the AAP2.4 dates.

How data is exposed?

On each serializer the queries will be annotated with wisdom data,
conditioned on the setting wisdom_enabled: true. When this
flag is turned off, the wisdom_index field defaults to null,
which maintains API schema compatibility while saving query time.
The setting will be OFF by default.
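
The exposure rule can be illustrated without Django. This is a minimal sketch, assuming an in-memory set stands in for the Wisdom table; `wisdom_index_value` and `serialize_namespace` are hypothetical helper names, not part of galaxy_ng.

```python
import json

# Hypothetical in-memory stand-in for the Wisdom table: (scope, reference) pairs.
denylist = {("namespace", "cisco")}


def wisdom_index_value(wisdom_enabled, scope, reference):
    """Return the value exposed as wisdom_index for one object."""
    if not wisdom_enabled:
        # Setting off: skip the lookup entirely and expose null.
        return None
    return (scope, reference) in denylist


def serialize_namespace(name, wisdom_enabled):
    """Serialize a namespace the way the annotated serializer might."""
    value = wisdom_index_value(wisdom_enabled, "namespace", name)
    return json.dumps({"name": name, "wisdom_index": value})
```

With the setting off, the field is still present in the payload (as `null`), so clients see the same schema either way.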

How to manage marks via API?

GET|POST|DELETE /_ui/v3/wisdom/

{
  "scope": "string",
  "reference": "string"
}
  • If the mark is already applied, the API will return 409 Conflict (the UI will treat the exception accordingly,
    for example by showing "this content is already marked")
  • If the mark doesn't exist, the API will return 404 on DELETE and GET
  • Validation will be performed:
    • reference matches the pattern (with only the namespace and legacy_namespace scopes this may be less important)
    • referenced object will be checked for existence (regardless of the repository)
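
The rules above can be sketched as a small in-memory model of the endpoint. This is an illustration under stated assumptions, not the actual view: `MarkStore` is a hypothetical class, and the 200, 201, 204, and 400 status codes are assumptions; only 409 and 404 come from the rules above.

```python
VALID_SCOPES = {"namespace", "legacy_namespace"}


class MarkStore:
    """Hypothetical in-memory model of GET|POST|DELETE /_ui/v3/wisdom/."""

    def __init__(self, existing_objects):
        # (scope, reference) pairs for objects that actually exist in the Hub.
        self.existing_objects = set(existing_objects)
        self.marks = set()

    def post(self, scope, reference):
        key = (scope, reference)
        if scope not in VALID_SCOPES or key not in self.existing_objects:
            return 400  # assumed code for failed validation
        if key in self.marks:
            return 409  # mark already applied
        self.marks.add(key)
        return 201  # assumed code for a created mark

    def get(self, scope, reference):
        return 200 if (scope, reference) in self.marks else 404

    def delete(self, scope, reference):
        key = (scope, reference)
        if key not in self.marks:
            return 404  # mark doesn't exist
        self.marks.remove(key)
        return 204  # assumed code for a removed mark
```

For example, posting the same `{"scope": "namespace", "reference": "cisco"}` body twice would yield 201 and then 409 in this model.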

Permission

Namespace policies will be applied to the wisdom view,
which means any user with permission to change a namespace
will be able to manage its wisdom marks.

Caveats

  • Content can be marked in multiple scopes
  • Deleting a mark from a higher scope doesn't clean up more specific marks
  • Deletion of a referenced object doesn't propagate, as it is not a normalized
    SQL relationship.
    • A scheduled clean-up will be needed if this is deemed important. However, this is low priority and can come later: if the object is deleted, the serializer won't be querying the Wisdom table for it.
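
A scheduled clean-up could be as simple as a set difference between the stored marks and the objects that still exist. The following is a minimal sketch, assuming both are available as sets of (scope, reference) pairs; `find_orphaned_marks` is a hypothetical helper name.

```python
def find_orphaned_marks(marks, existing_objects):
    """Return marks whose (scope, reference) no longer matches an existing object."""
    return sorted(mark for mark in marks if mark not in existing_objects)


# Hypothetical data: the legacy namespace has been deleted, but its mark remains.
marks = {("namespace", "cisco"), ("legacy_namespace", "123__legacy_namespace")}
existing_objects = {("namespace", "cisco")}

orphans = find_orphaned_marks(marks, existing_objects)
```

The resulting list is what a periodic task would delete from the Wisdom table.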