We are moving away from Stardog as a graph backend,
mostly because they no longer provide a free academic license
but instead provide short-term "trials".
Take a look at https://github.com/neurobagel/planning/issues/9
to see our progress in picking a replacement.
In the meantime, here are instructions for deploying
[GraphDB](https://graphdb.ontotext.com/) as our graph
backend instead of Stardog.
## Configure the environment variables
Follow the [Launch the API](https://neurobagel.org/infrastructure/#launch-the-api-and-graph-stack)
section of our public docs,
but change the following variables in the `.env` file from
[the defaults described in the docs](https://neurobagel.org/infrastructure/#set-the-environment-variables):
```sh
NB_GRAPH_IMG=ontotext/graphdb:10.3.1
NB_GRAPH_ROOT_CONT=/opt/graphdb/home
NB_GRAPH_PORT=7200
NB_GRAPH_PORT_HOST=7200
NB_GRAPH_DB=repositories/my_db # NOTE: for GraphDB, this value should always take the format of: repositories/<your_database_name>
```
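
As a quick sanity check before launching, the format of `NB_GRAPH_DB` can be verified in the shell. This is only a minimal sketch: the function name `is_valid_graphdb_path` is made up for illustration and is not part of Neurobagel or GraphDB.

```bash
# Hypothetical check: succeed only if the value looks like repositories/<name>
# with a non-empty <name>.
is_valid_graphdb_path() {
  case "$1" in
    repositories/?*) return 0 ;;  # "repositories/" followed by at least one character
    *)               return 1 ;;  # missing prefix or empty name
  esac
}

# Example with the value used in this guide
if is_valid_graphdb_path "repositories/my_db"; then
  echo "NB_GRAPH_DB format looks fine"
fi
```
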
Make a copy of [the default `docker-compose.yml`](https://github.com/neurobagel/api/blob/main/docker-compose.yml) file in the same directory as your `.env` file,
and then run `docker compose up -d` to launch
the Neurobagel services.
Refer to [the API readme](https://github.com/neurobagel/api/blob/main/README.md) for additional instructions.
## First time setup commands
Once the API, graph, and query tool are up
and running for the first time, you will need to
complete the following one-time configuration.
### Set up security and users
See also the [GraphDB documentation on security management](https://graphdb.ontotext.com/documentation/10.0/devhub/rest-api/curl-commands.html#security-management).

First, change the password for the admin user that has been automatically
created by graphDB:
```
curl -X PATCH --header 'Content-Type: application/json' http://localhost:7200/rest/security/users/admin -d '
{"password": "NewAdminPassword"}'
```
Make sure to replace `"NewAdminPassword"` with your own secure password.
Next, enable GraphDB security so that only authenticated users have access:
```
curl -X POST --header 'Content-Type: application/json' -d true http://localhost:7200/rest/security
```
Then confirm that this was successful (an unauthenticated request should now be rejected):
```
➜ curl -X POST http://localhost:7200/rest/security
Unauthorized (HTTP status 401)
```
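
If you want to script this confirmation, one option is to look only at the HTTP status code. The sketch below assumes, as in the output above, that an unauthenticated request is answered with `401` once security is on; `security_confirmed` is a hypothetical helper name.

```bash
# Hypothetical helper: succeed if the given HTTP status code is 401,
# i.e. the unauthenticated request was rejected.
security_confirmed() {
  [ "$1" = "401" ]
}

# Example with a hard-coded status code (no request is sent here)
if security_confirmed "401"; then
  echo "security is enabled"
else
  echo "security NOT confirmed"
fi
```

With a running instance, the status code could be captured with `curl -s -o /dev/null -w '%{http_code}' -X POST http://localhost:7200/rest/security` and passed to the function in place of the hard-coded value.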
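Now we can create a user for the API (replace `DBUSER` and `DBPASSWORD` with your own values, and authenticate with the admin password you set above):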
```
curl -X POST --header 'Content-Type: application/json' -u "admin:NewAdminPassword" -d '
{
"username": "DBUSER",
"password": "DBPASSWORD"
}' http://localhost:7200/rest/security/users/DBUSER
```
### Create a graph database
In GraphDB, graph databases are called repositories.
To create a new one, you will first have to prepare a `data-config.ttl` file
that contains the settings for the repository you will create ([see the GraphDB docs](https://graphdb.ontotext.com/documentation/10.0/devhub/rest-api/location-and-repository-tutorial.html#create-a-repository)).
**Make sure that the value for `rep:repositoryID`
in the `data-config.ttl` file matches the database name in the value of
`NB_GRAPH_DB` in your `.env` file** (the part after `repositories/`).
For example, if `NB_GRAPH_DB=repositories/my_db`, then
`rep:repositoryID "my_db" ;`.
You can use this example file and save
it as `data-config.ttl` locally:
```
#
# RDF4J configuration template for a GraphDB repository
#
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix rep: <http://www.openrdf.org/config/repository#>.
@prefix sr: <http://www.openrdf.org/config/repository/sail#>.
@prefix sail: <http://www.openrdf.org/config/sail#>.
@prefix graphdb: <http://www.ontotext.com/config/graphdb#>.

[] a rep:Repository ;
    rep:repositoryID "my_db" ;
    rdfs:label "" ;
    rep:repositoryImpl [
        rep:repositoryType "graphdb:SailRepository" ;
        sr:sailImpl [
            sail:sailType "graphdb:Sail" ;
            graphdb:read-only "false" ;
            # Inference and Validation
            graphdb:ruleset "rdfsplus-optimized" ;
            graphdb:disable-sameAs "true" ;
            graphdb:check-for-inconsistencies "false" ;
            # Indexing
            graphdb:entity-id-size "32" ;
            graphdb:enable-context-index "false" ;
            graphdb:enablePredicateList "true" ;
            graphdb:enable-fts-index "false" ;
            graphdb:fts-indexes ("default" "iri") ;
            graphdb:fts-string-literals-index "default" ;
            graphdb:fts-iris-index "none" ;
            # Queries and Updates
            graphdb:query-timeout "0" ;
            graphdb:throw-QueryEvaluationException-on-timeout "false" ;
            graphdb:query-limit-results "0" ;
            # Settable in the file but otherwise hidden in the UI and in the RDF4J console
            graphdb:base-URL "http://example.org/owlim#" ;
            graphdb:defaultNS "" ;
            graphdb:imports "" ;
            graphdb:repository-type "file-repository" ;
            graphdb:storage-folder "storage" ;
            graphdb:entity-index-size "10000000" ;
            graphdb:in-memory-literal-properties "true" ;
            graphdb:enable-literal-index "true" ;
        ]
    ].
```
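
Since a mismatch between the config file and the `.env` file is easy to miss, the comparison can be automated. The following is a minimal sketch, assuming the `repositories/` prefix (if present in `NB_GRAPH_DB`) is not part of the repository ID; both function names are hypothetical.

```bash
# Hypothetical helper: print the first rep:repositoryID value found in a Turtle config file.
repo_id_from_config() {
  sed -n 's/.*rep:repositoryID[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Hypothetical helper: succeed if the ID in the config file equals the
# NB_GRAPH_DB value with any leading "repositories/" removed.
config_matches_env() {
  local config_file="$1" nb_graph_db="$2"
  [ "$(repo_id_from_config "$config_file")" = "${nb_graph_db#repositories/}" ]
}
```

For example, `config_matches_env data-config.ttl "repositories/my_db" && echo "IDs match"` could be run from the directory where you saved the file.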
Then you can create a new graph db (repository) with the following command (replace `my_db` as needed):
```bash
curl -X PUT -u "admin:NewAdminPassword" http://localhost:7200/repositories/my_db --data-binary "@data-config.ttl" -H "Content-Type: application/x-turtle"
```
Then give our user access permission to the new repository:
```
curl -X PUT --header 'Content-Type: application/json' -d '
{"grantedAuthorities": ["WRITE_REPO_my_db","READ_REPO_my_db"]}' http://localhost:7200/rest/security/users/DBUSER -u "admin:NewAdminPassword"
```
- `"WRITE_REPO_my_db"`: Grants write permission.
- `"READ_REPO_my_db"`: Grants read permission.
**Note**: make sure you replace `my_db` with the name of the graph db you
have just created.
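
To avoid editing the database name in two places by hand, the request body can be generated from a single variable. This is just a sketch; `authorities_payload` is a hypothetical helper, and it only reproduces the two authorities listed above.

```bash
# Hypothetical helper: build the grantedAuthorities JSON body for one graph db name.
authorities_payload() {
  local repo_name="$1"
  printf '{"grantedAuthorities": ["WRITE_REPO_%s","READ_REPO_%s"]}\n' "$repo_name" "$repo_name"
}

# Print the body for the example database used in this guide
authorities_payload "my_db"
```

The output could then be passed to the request above with `-d "$(authorities_payload my_db)"` in place of the inline JSON.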
### Upload test data to the graph
To test that the above setup steps worked correctly, we can add some example graph-ready data (JSONLD files) to the new graph db from the [neurobagel/neurobagel_examples](https://github.com/neurobagel/neurobagel_examples) repository.
First, clone `neurobagel/neurobagel_examples`:
```bash
git clone https://github.com/neurobagel/neurobagel_examples.git
```
The `neurobagel/api` repo comes with a helper script [add_data_to_graph.sh](https://github.com/neurobagel/api/blob/main/add_data_to_graph.sh) to automatically upload all JSONLD files in a directory to a user-specified graph database, with the option to clear the existing data in the database first.
_**A version of this script for a GraphDB endpoint is available from [here](https://gist.github.com/alyssadai/e10d0ba1d8e89d1564b7029b386e6637).**_
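Download the `add_data_to_graph_graphdb.sh` script by cloning the gist (this creates a directory named after the gist ID, `e10d0ba1d8e89d1564b7029b386e6637`):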
```bash
git clone https://gist.github.com/e10d0ba1d8e89d1564b7029b386e6637.git
```
To view all the command line arguments for the script (run this from inside the cloned directory):
```bash
./add_data_to_graph_graphdb.sh --help
```
> ℹ️ **Note: If you prefer to directly use `curl` requests to modify the graph database instead of the helper script**
>
> Add a single dataset to the graph database (example):
> ```bash
> curl -u "<USERNAME>:<PASSWORD>" -i -X POST http://localhost:7200/repositories/<DATABASE_NAME>/statements \
> -H "Content-Type: application/ld+json" \
> --data-binary @<DATASET_NAME>.jsonld
> ```
>
> Clear all data in the graph database (example):
> ```bash
> curl -u "<USERNAME>:<PASSWORD>" -X POST http://localhost:7200/repositories/<DATABASE_NAME>/statements \
> -H "Content-Type: application/sparql-update" \
> --data-binary "DELETE { ?s ?p ?o } WHERE { ?s ?p ?o }"
> ```
Now we will upload the data in the directory `neurobagel_examples/data-upload/pheno-bids-output` to the graph db we created above. To do this, run the helper script as follows:
```bash
./add_data_to_graph_graphdb.sh PATH/TO/neurobagel_examples/data-upload/pheno-bids-output localhost:7200 repositories/my_db DBUSER DBPASSWORD \
--clear-data
```
**NOTE:** Here we added the `--clear-data` flag to remove any existing data in the database (if the database is empty, the flag has no effect). You can choose to omit the flag or explicitly specify `--no-clear-data` (default behaviour) to skip this step.
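
If you go the `curl` route from the note above instead of using the helper script, uploading a whole directory means one request per file. The following is a rough sketch of such a loop, not the helper script's actual implementation; the function name `upload_jsonld_dir` and the `CURL` override are assumptions made for illustration.

```bash
# Hypothetical helper: send one POST request per .jsonld file in a directory.
# Arguments: <directory> <base URL> <repositories/db_name> <user> <password>
# Setting CURL to another command (e.g. CURL=echo) previews the requests
# instead of sending them.
upload_jsonld_dir() {
  local dir="$1" base_url="$2" repo="$3" user="$4" password="$5"
  local f
  for f in "$dir"/*.jsonld; do
    [ -e "$f" ] || continue  # skip when the glob matches nothing
    "${CURL:-curl}" -u "${user}:${password}" -X POST \
      "${base_url}/${repo}/statements" \
      -H "Content-Type: application/ld+json" \
      --data-binary "@${f}"
  done
}
```

For example, `CURL=echo upload_jsonld_dir PATH/TO/neurobagel_examples/data-upload/pheno-bids-output http://localhost:7200 repositories/my_db DBUSER DBPASSWORD` would print the requests without touching the graph; dropping `CURL=echo` sends them.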