# Query "chain-of-repos provenances"
## Example use case within omnibenchmark
[This method project](https://renkulab.io/projects/omnibenchmark/omni_batch/mnn-omni-batch) runs a method called *mnn* as part of omnibenchmark. As input it uses preprocessed files (e.g., normalized counts), which are imported as the renku dataset **omni_batch_processed** and generated in [this preprocessing project](https://renkulab.io/projects/omnibenchmark/omni_data/omni-batch-processed). Preprocessing is done in a standardized way on all renku datasets with the keyword *omni_batch* (e.g., the **cellbench** and **csf_patient** datasets, which are generated [here](https://renkulab.io/projects/omnibenchmark/omni_data/cellbench) and [here](https://renkulab.io/projects/omnibenchmark/omni_data/csf-patients)). Besides the processed count files, the **omni_batch_processed** dataset contains metadata files for each of the original datasets:
``` bash
renku dataset ls-files omni_batch_processed
omni_batch_processed 2021-06-04 12:59:51 175 KB data/omni_batch_processed/meta_csf_patient.json
omni_batch_processed 2021-06-04 12:59:51 102 KB data/omni_batch_processed/meta_cellbench.json
omni_batch_processed 2021-06-04 12:59:51 44 MB data/omni_batch_processed/norm_counts_cellbench.mtx.gz
omni_batch_processed 2021-06-04 12:59:51 9.2 MB data/omni_batch_processed/norm_counts_csf_patient.mtx.gz
```
In the method project we want to run the method *mnn* on every `norm_counts_*.mtx.gz` file and its corresponding `meta_*.json` file by generating one renku workflow per original dataset (so one workflow per `norm_counts_*.mtx.gz`).
At the moment we find the corresponding meta file by matching names between `meta_*.json` and `norm_counts_*.mtx.gz`. As we cannot be sure that our users comply with our naming scheme, a cross-repository query that identifies the corresponding original dataset, and therefore the `meta_*.json`, for each `norm_counts_*.mtx.gz` would be more robust. This is even more relevant for the final output files of omnibenchmark, where we would like to trace back which dataset and method a result file originates from.
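The current name-based matching can be sketched roughly as follows. This is a minimal illustration, not the actual omnibenchmark code: the function name `pair_counts_with_meta` and the regular expressions are assumptions derived from the file names listed above.

```python
import re
from pathlib import PurePosixPath

# Assumed naming scheme, taken from the file listing above.
COUNTS_RE = re.compile(r"^norm_counts_(?P<name>.+)\.mtx\.gz$")
META_RE = re.compile(r"^meta_(?P<name>.+)\.json$")


def pair_counts_with_meta(paths):
    """Return {dataset_name: (counts_path, meta_path)} based on file names only."""
    counts, meta = {}, {}
    for path in paths:
        fname = PurePosixPath(path).name
        if (m := COUNTS_RE.match(fname)):
            counts[m["name"]] = path
        elif (m := META_RE.match(fname)):
            meta[m["name"]] = path
    # Name matching breaks as soon as a file does not follow the scheme.
    unmatched = sorted(set(counts) - set(meta))
    if unmatched:
        raise ValueError(f"no meta file found for: {unmatched}")
    return {name: (counts[name], meta[name]) for name in sorted(counts)}


files = [
    "data/omni_batch_processed/meta_csf_patient.json",
    "data/omni_batch_processed/meta_cellbench.json",
    "data/omni_batch_processed/norm_counts_cellbench.mtx.gz",
    "data/omni_batch_processed/norm_counts_csf_patient.mtx.gz",
]
pairs = pair_counts_with_meta(files)
```

The `ValueError` branch shows the weakness of this approach: a counts file with a non-conforming name cannot be paired at all, which is what a provenance-based query would avoid.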
### Example query
From within [mnn_omni_batch](https://renkulab.io/projects/omnibenchmark/omni_batch/mnn-omni-batch), find the **dataset id** of the dataset that was used as input to generate `data/omni_batch_processed/norm_counts_cellbench.mtx.gz`. The query should return the **dataset_id** of the *cellbench* dataset.
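To make the expected behaviour concrete, here is a toy sketch of such a lineage lookup over hand-written provenance records. Everything in it is hypothetical: the record layout, the raw input paths, the dataset ids (placeholders) and the function `origin_dataset_ids` are assumptions for illustration and do not reflect renku's actual metadata or API.

```python
from collections import deque

# Hypothetical provenance records: each activity lists its input and output files.
ACTIVITIES = [
    {
        "inputs": ["data/cellbench/raw_placeholder.mtx.gz"],
        "outputs": [
            "data/omni_batch_processed/norm_counts_cellbench.mtx.gz",
            "data/omni_batch_processed/meta_cellbench.json",
        ],
    },
    {
        "inputs": ["data/csf_patient/raw_placeholder.mtx.gz"],
        "outputs": [
            "data/omni_batch_processed/norm_counts_csf_patient.mtx.gz",
            "data/omni_batch_processed/meta_csf_patient.json",
        ],
    },
]

# Hypothetical mapping from dataset id (placeholder values) to member files.
DATASET_FILES = {
    "cellbench-id-placeholder": {"data/cellbench/raw_placeholder.mtx.gz"},
    "csf_patient-id-placeholder": {"data/csf_patient/raw_placeholder.mtx.gz"},
}


def origin_dataset_ids(target, activities, dataset_files):
    """Walk upstream from `target` and return ids of datasets holding its origin files."""
    producers = {out: act for act in activities for out in act["outputs"]}
    queue, seen, origins = deque([target]), {target}, set()
    while queue:
        path = queue.popleft()
        act = producers.get(path)
        if act is None:
            # No recorded activity produced this file, so treat it as an origin.
            origins.add(path)
            continue
        for inp in act["inputs"]:
            if inp not in seen:
                seen.add(inp)
                queue.append(inp)
    return {ds for ds, files in dataset_files.items() if files & origins}


result = origin_dataset_ids(
    "data/omni_batch_processed/norm_counts_cellbench.mtx.gz",
    ACTIVITIES,
    DATASET_FILES,
)
```

In the real setting the activities and dataset memberships live in different repositories, which is why the lookup has to follow the chain across projects instead of a single in-memory list.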
### Scheme
