# Ginkgo Proposal - Label Sets
## The Problem Statement
As Ginkgo 2.0 adoption spreads, real-world usage of labels is revealing limitations in the current implementation of labels and label filters. Specifically:
- while support for regular expressions in the filter language allows for some flexibility, this comes at a high complexity cost that leads to difficult-to-maintain filters
- some filters cannot be expressed even with regular expressions. For example, the Kubernetes e2e test suite would like to express "run all specs that contain `Feature:A` or `Feature:B` but not any other `Feature:XXX` flag." Expressing this filter is made all the more challenging by the fact that some specs include multiple different `Feature:XXX` flags.
## Proposal
We propose extending Ginkgo's labels and label filter language in an additive way that is (mostly) backward compatible.
Throughout, when we call a string "empty" we mean the empty string or a string containing only whitespace.
### Label Sets
Labels of the form `Label("X:Y")`, where `X` and `Y` are any strings, will be parsed both as a single label with value `"X:Y"` **and** as a set with key `X` containing element `Y`. Any labels in the spec's hierarchy with `Label("X:Z")` will contribute element `Z` to the set with key `X`. Duplicate elements are combined into one unique element. The order of elements in the set is not defined.
When `Label("X:Y")` is parsed, any whitespace around `X` and `Y` will be trimmed. Only the first `:` serves as the delimiter. If either `X` or `Y` is empty, an error will be raised and the suite will not run.
Thus:
```go
Describe("outer", Label("Feature:Alpha"), func() {
	It("reticulates splines", Label("Feature: Gamma"), func() {
		// this spec will have the labels:
		// "Feature:Alpha"
		// "Feature: Gamma"
		// "Feature": {"Alpha", "Gamma"} // a label set
	})
	Describe("inner", Label("Feature :Beta"), Label("Feature: Alpha"), func() {
		It("is nested", Label("Feature:Delta:Epsilon"), func() {
			// this spec will have the labels:
			// "Feature:Alpha"
			// "Feature :Beta"
			// "Feature:Delta:Epsilon"
			// "Feature: Alpha"
			// "Feature": {"Alpha", "Beta", "Delta:Epsilon"} // a label set
		})
	})
})
```
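The parsing rules above can be sketched in a few lines of Go. This is only an illustration of the proposed behavior, not Ginkgo's implementation: `parseLabelSets` and `sortedElements` are hypothetical helpers, and the sketch assumes keys and elements are stored with their original casing.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// parseLabelSets is a hypothetical helper (not part of Ginkgo's API). It takes
// every label in a spec's hierarchy and returns the label sets derived from
// labels of the form "X:Y".
func parseLabelSets(labels []string) (map[string]map[string]bool, error) {
	sets := map[string]map[string]bool{}
	for _, label := range labels {
		// Only the first ':' is the delimiter; labels without ':' form no set.
		key, value, found := strings.Cut(label, ":")
		if !found {
			continue
		}
		key, value = strings.TrimSpace(key), strings.TrimSpace(value)
		if key == "" || value == "" {
			return nil, fmt.Errorf("invalid label %q: key and value must not be empty", label)
		}
		if sets[key] == nil {
			sets[key] = map[string]bool{}
		}
		sets[key][value] = true // duplicate elements collapse into one
	}
	return sets, nil
}

// sortedElements returns the elements of a set in sorted order. Sorting is
// for stable printing only; the proposal leaves element order undefined.
func sortedElements(set map[string]bool) []string {
	elements := make([]string, 0, len(set))
	for e := range set {
		elements = append(elements, e)
	}
	sort.Strings(elements)
	return elements
}

func main() {
	// The labels of the "is nested" spec from the example above.
	sets, err := parseLabelSets([]string{
		"Feature:Alpha", "Feature :Beta", "Feature: Alpha", "Feature:Delta:Epsilon",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(sortedElements(sets["Feature"])) // [Alpha Beta Delta:Epsilon]
}
```

Note that `"Feature:Alpha"` and `"Feature: Alpha"` remain distinct labels but contribute the same element to the `Feature` set.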
When emitting the labels for a spec (e.g. in the spec report) the label sets will be suppressed. Only the individual labels will show (e.g. the "is nested" test above would show: `[Feature:Alpha, Feature :Beta, Feature: Alpha, Feature:Delta:Epsilon]`).
### Filter Query Language Extensions
From the [Ginkgo docs](https://onsi.github.io/ginkgo/#spec-labels), the filter language currently consists of:
1. The `&&` and `||` logical binary operators representing AND and OR operations.
2. The `!` unary operator representing the NOT operation.
3. The `,` binary operator equivalent to `||`.
4. The `()` for grouping expressions.
5. Regular expressions can be provided using `/REGEXP/` notation.
6. All other characters will match as label literals. Label matches are **case insensitive** and trailing and leading whitespace is trimmed.
We propose to keep this largely intact but to replace item 6 with:
- If the characters match `KEY: SET_OPERATION <OPTIONAL_ARGS>` then a set operation is performed.
- Otherwise the characters will match as label literals. Label matches are **case insensitive** and trailing and leading whitespace is trimmed.
When parsing `KEY: SET_OPERATION <OPTIONAL_ARGS>`:
- any leading and trailing whitespace around `KEY` is trimmed.
- there **must** be a space between `:` and `SET_OPERATION`
- `SET_OPERATION` must be one of the valid set operations (see below). The match is case insensitive. If `SET_OPERATION` does not match, an error is thrown.
- the format of `<OPTIONAL_ARGS>` depends on the value of `SET_OPERATION` (see below)
- regular expressions are not supported in `KEY`, `SET_OPERATION`, or `<OPTIONAL_ARGS>`
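As a rough sketch of these parsing rules, the Go snippet below splits one filter token into its parts. Everything here is hypothetical: `setOperation`, `validOperations`, and `parseSetOperation` are illustrative names, and the sketch assumes that a token is treated as a set operation whenever it contains `: ` (a colon followed by a space) and as a label literal otherwise. The argument rules (empty for `isEmpty`, non-empty for everything else) follow the operation descriptions below.

```go
package main

import (
	"fmt"
	"strings"
)

// setOperation is a hypothetical parsed form of `KEY: SET_OPERATION <OPTIONAL_ARGS>`.
type setOperation struct {
	Key       string
	Operation string   // canonical spelling, e.g. "containsAny"
	Args      []string // trimmed arguments; empty for isEmpty
}

// validOperations maps the lowercased operation name to its canonical spelling.
var validOperations = map[string]string{
	"isempty":     "isEmpty",
	"containsall": "containsAll",
	"containsany": "containsAny",
	"consistsof":  "consistsOf",
	"issubsetof":  "isSubsetOf",
}

// parseSetOperation returns ok=false (and no error) when the token should be
// matched as a plain label literal, and an error when it looks like a set
// operation but is malformed.
func parseSetOperation(token string) (op setOperation, ok bool, err error) {
	// Assumption: ": " is what distinguishes a set operation from a literal.
	key, rest, found := strings.Cut(token, ": ")
	if !found {
		return setOperation{}, false, nil
	}
	key = strings.TrimSpace(key)
	if key == "" {
		return setOperation{}, false, fmt.Errorf("empty key in %q", token)
	}
	name, args, _ := strings.Cut(strings.TrimSpace(rest), " ")
	canonical, known := validOperations[strings.ToLower(name)]
	if !known {
		return setOperation{}, false, fmt.Errorf("unknown set operation %q", name)
	}
	op = setOperation{Key: key, Operation: canonical}
	args = strings.TrimSpace(args)
	if strings.HasPrefix(args, "{") && strings.HasSuffix(args, "}") {
		// Set form: {X, Y, Z}
		for _, a := range strings.Split(args[1:len(args)-1], ",") {
			if a = strings.TrimSpace(a); a != "" {
				op.Args = append(op.Args, a)
			}
		}
	} else if args != "" {
		op.Args = []string{args} // a single literal
	}
	if canonical == "isEmpty" && len(op.Args) != 0 {
		return setOperation{}, false, fmt.Errorf("isEmpty takes no arguments")
	}
	if canonical != "isEmpty" && len(op.Args) == 0 {
		return setOperation{}, false, fmt.Errorf("%s requires at least one argument", canonical)
	}
	return op, true, nil
}

func main() {
	op, ok, err := parseSetOperation("Feature: containsany {Alpha, Beta}")
	fmt.Println(op.Key, op.Operation, op.Args, ok, err) // Feature containsAny [Alpha Beta] true <nil>
}
```

Under this assumption a filter such as `Feature: Alpha` is rejected as an unknown set operation rather than matched as a literal, which is one of the edge cases the final design would need to settle.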
#### Set Operations:
The following `SET_OPERATION`s are supported:
---
##### `isEmpty`
`<OPTIONAL_ARGS>`: **must** be empty
Behavior: matches if the spec does not have any label sets with `KEY`
Examples:
- `Feature: isEmpty` will match if the spec does not have any labels of the form `Feature:Y`.
- `!(Feature: isEmpty)` will match if the spec has at least one label of the form `Feature:Y`.
---
##### `containsAll`
`<OPTIONAL_ARGS>`:
- must **not** be empty
- can be a single literal
- can be a set of the form `{X, Y, Z}`
Once parsed, `<OPTIONAL_ARGS>` forms a set of values. If only a single literal is provided the set contains only that one element.
Behavior: matches if the spec has a label set with key `KEY` that contains **all** the provided elements in `<OPTIONAL_ARGS>`. The match is not case sensitive and leading/trailing whitespace is trimmed. Regular expressions are **not** supported. Order is irrelevant.
Examples:
- `Feature: containsAll Alpha` matches any specs that have `Feature:Alpha`
- `Feature: containsAll {Alpha, Beta}` matches any specs that have both `Feature:Alpha` and `Feature:Beta`
- `(Feature: containsAll Alpha) && !(Feature: containsAll Beta)` matches any specs that have `Feature:Alpha` and do *not* have `Feature:Beta`
---
##### `containsAny`
`<OPTIONAL_ARGS>`:
- must **not** be empty
- can be a single literal
- can be a set of the form `{X, Y, Z}`
Once parsed, `<OPTIONAL_ARGS>` forms a set of values. If only a single literal is provided the set contains only that one element.
Behavior: matches if the spec has a label set with key `KEY` that contains **at least one** of the provided elements in `<OPTIONAL_ARGS>`. The match is not case sensitive and leading/trailing whitespace is trimmed. Regular expressions are **not** supported. Order is irrelevant.
Examples:
- `Feature: containsAny Alpha` matches any specs that have `Feature:Alpha`
- `Feature: containsAny {Alpha, Beta}` matches any specs that have either `Feature:Alpha` or `Feature:Beta` or both
---
##### `consistsOf`
`<OPTIONAL_ARGS>`:
- must **not** be empty
- can be a single literal
- can be a set of the form `{X, Y, Z}`
Once parsed, `<OPTIONAL_ARGS>` forms a set of values. If only a single literal is provided the set contains only that one element.
Behavior: matches if the spec has a label set with key `KEY` that exactly consists of **all** the provided elements in `<OPTIONAL_ARGS>`. The match is not case sensitive and leading/trailing whitespace is trimmed. Regular expressions are **not** supported. Order is irrelevant.
Examples:
- `Feature: consistsOf Alpha` matches any specs that have `Feature:Alpha` and no other `Feature:X` labels
- `Feature: consistsOf {Alpha, Beta}` matches any specs that have both `Feature:Alpha` and `Feature:Beta` and no other `Feature:X` labels
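
To make the difference between `containsAll`, `containsAny`, and `consistsOf` concrete, here is a minimal Go sketch of the three matchers. The function names mirror the operations, but the signatures are hypothetical: each takes the elements of the spec's label set for `KEY` and the parsed `<OPTIONAL_ARGS>`, and assumes the arguments were already validated as non-empty.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize lowercases and trims each element so that matching is case
// insensitive and ignores leading/trailing whitespace.
func normalize(elements []string) map[string]bool {
	set := map[string]bool{}
	for _, e := range elements {
		set[strings.ToLower(strings.TrimSpace(e))] = true
	}
	return set
}

// countShared returns how many distinct args appear in the spec's label set.
func countShared(specSet, args map[string]bool) int {
	n := 0
	for a := range args {
		if specSet[a] {
			n++
		}
	}
	return n
}

// containsAll matches when every argument is in the spec's label set.
func containsAll(spec, args []string) bool {
	s, a := normalize(spec), normalize(args)
	return countShared(s, a) == len(a)
}

// containsAny matches when at least one argument is in the spec's label set.
func containsAny(spec, args []string) bool {
	return countShared(normalize(spec), normalize(args)) > 0
}

// consistsOf matches when the spec's label set equals the arguments exactly.
func consistsOf(spec, args []string) bool {
	s, a := normalize(spec), normalize(args)
	return len(s) == len(a) && countShared(s, a) == len(a)
}

func main() {
	// Elements of a spec's "Feature" label set.
	spec := []string{"Alpha", "Beta", "Gamma"}

	fmt.Println(containsAll(spec, []string{"alpha", "Beta"}))          // true
	fmt.Println(containsAll(spec, []string{"Alpha", "Delta"}))         // false
	fmt.Println(containsAny(spec, []string{"Alpha", "Delta"}))         // true
	fmt.Println(consistsOf(spec, []string{"Alpha", "Beta"}))           // false
	fmt.Println(consistsOf(spec, []string{"gamma", "Beta", "Alpha"})) // true
}
```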
---
##### `isSubsetOf`
`<OPTIONAL_ARGS>`:
- must **not** be empty
- can be a single literal
- can be a set of the form `{X, Y, Z}`
Behavior: matches if the spec has a label set with key `KEY` in which all values for `KEY` appear in `<OPTIONAL_ARGS>`. The match is not case sensitive and leading/trailing whitespace is trimmed. Regular expressions are **not** supported. Order is irrelevant.
The empty set is considered a subset of `<OPTIONAL_ARGS>`. Thus if the spec does **not** have a label set with key `KEY` it will nonetheless match.
Examples:
Consider a spec with the label set `Feature: {Alpha, Beta, Gamma}` and no other label set:
- `Feature: isSubsetOf Alpha` will not match
- `Feature: isSubsetOf {Alpha, Gamma, Beta, Delta}` will match
- `Feature: isSubsetOf {Gamma, Beta, Delta}` will not match
- `Environment: isSubsetOf Prod` will match (the spec does not have an `Environment:` label set and so it will match)
If we wanted to stipulate that **only** specs that both **have** a `Feature` and are in the desired subset of `Feature` (e.g. `{Alpha, Beta}`) run we could write:
- `!(Feature: isEmpty) && (Feature: isSubsetOf {Alpha, Beta})`
Now specs without any `Feature:` label will not match, and any specs that have `Feature:XXX` where `XXX` is not one of `Alpha` or `Beta` also will not match. This effectively allows us to say "has a feature and that feature is `Alpha` or `Beta` but nothing else". Note that this is different from `(Feature: containsAny Alpha) || (Feature: containsAny Beta)` as that would allow `Feature: {Alpha, Delta}`. It is equivalent to `(Feature: consistsOf Alpha) || (Feature: consistsOf Beta) || (Feature: consistsOf {Alpha, Beta})` but is less verbose!
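
The empty-set behavior of `isSubsetOf`, and how combining it with `isEmpty` closes the gap, can be sketched as follows. The helper names and signatures are hypothetical; each function receives the elements of the spec's `Feature` label set (nil when the spec has no such set).

```go
package main

import (
	"fmt"
	"strings"
)

// normalize lowercases and trims each element so that matching is case
// insensitive and ignores leading/trailing whitespace.
func normalize(elements []string) map[string]bool {
	set := map[string]bool{}
	for _, e := range elements {
		set[strings.ToLower(strings.TrimSpace(e))] = true
	}
	return set
}

// isEmpty matches when the spec has no label set for the key.
func isEmpty(spec []string) bool {
	return len(spec) == 0
}

// isSubsetOf matches when every element of the spec's label set appears in
// args. The empty set is a subset of anything, so a spec without the label
// set also matches.
func isSubsetOf(spec, args []string) bool {
	allowed := normalize(args)
	for e := range normalize(spec) {
		if !allowed[e] {
			return false
		}
	}
	return true
}

// hasFeatureWithin models `!(Feature: isEmpty) && (Feature: isSubsetOf {...})`.
func hasFeatureWithin(spec, allowed []string) bool {
	return !isEmpty(spec) && isSubsetOf(spec, allowed)
}

func main() {
	allowed := []string{"Alpha", "Beta"}

	fmt.Println(isSubsetOf(nil, allowed))                              // true: no Feature set at all
	fmt.Println(hasFeatureWithin(nil, allowed))                        // false
	fmt.Println(hasFeatureWithin([]string{"Alpha"}, allowed))          // true
	fmt.Println(hasFeatureWithin([]string{"alpha", "Beta"}, allowed))  // true
	fmt.Println(hasFeatureWithin([]string{"Alpha", "Delta"}, allowed)) // false
}
```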
### Potential Backwards Incompatibilities
Users with labels of the form `Label(":")`, `Label(":X")` or `Label("Y:")` will run into issues and will need to change their labels. We expect these scenarios will be rare enough to warrant proceeding with this proposal.
### k8s e2e Cases
These are from [Antonio's doc](https://docs.google.com/document/d/1zdj-QGfxwlY39Q9hPHDkZGM4DsJbWobNZz87lxBmboE/edit#heading=h.4mws571l24ak)
- Run SIG Network tests: `SIG: consistsOf Network`
- Run tests that are SIG Network or SIG Storage: `SIG: isSubsetOf {network, storage} && !(SIG: isEmpty)`
- Run all SIG tests except Network: `!(SIG: consistsOf Network)`
- Run tests that require load balancers and no other unknown feature: `Feature: isSubsetOf loadbalancer && !(Feature: isEmpty)`
- Run all networking tests with IPv6 except the network policy ones: `SIG: consistsOf Network && Feature: containsAny IPv6 && !(Feature: containsAny network-policy)`
- Run storage tests that require the NFS driver: `SIG: consistsOf storage && Feature: containsAny DriverNFS`
- Run all stable tests which have no special requirements: `Feature: isEmpty`
- Run the tests for the new ServiceCIDR feature: `FeatureGate: containsAny ServiceCIDR`
- Run all linux tests: `Linux`
- Run all tests that are not both serial and slow (i.e. slow parallel tests are ok, and fast serial tests are ok): `!(Slow && Serial)`
- Run all tests that are neither serial nor slow (i.e. no serial tests ever, no slow tests ever): `!Serial && !Slow`
### Questions:
- Do we need all these set operations? Do we need both `containsAny` and `containsAll`? Do we need `consistsOf`? Are we missing any?
- Is the behavior of `isSubsetOf` for the empty-set case reasonable and intuitive? Or is it a foot-gun?
- Are any of these names ambiguous?
- Is this language getting too complex? What sort of (cheap!) support structures do we need to provide users with?