CKAN Core Development for Humans
===
Complete Guide by Sivan Grünberg
[TOC]
## Preamble
Ahoy! I'm glad you're here. If you arrived at this page, I assume you are interested in contributing to [CKAN](https://ckan.org/), the world's leading open source data management system. This page is a contributor's companion and a step-by-step walkthrough for setting up the CKAN stack on your local machine (CKAN relies on [Redis](https://redis.io/), [SOLR](https://solr.apache.org/) and [PostgreSQL](https://www.postgresql.org/) for its search and storage functionality) using [docker](https://docs.docker.com/get-started/overview/), which is software used to contain other software, particularly for development. That way you isolate the CKAN stack from your host computer (usually your laptop or desktop) and can experiment with different versions of software runtimes and dependencies without affecting your host system.
While **docker** knowledge is recommended, it is not mandatory, as this document walks through the requisite commands to get up and running. If you haven't used docker before, I recommend getting acquainted with it: it makes the development life cycle easier and more straightforward, and I hold the opinion that actually knowing what you're doing is better than copying commands blindly.
As CKAN is written in [Python](https://www.python.org/), a good command of Python is necessary to be able to suggest code changes via **Git** pull requests. To learn more about Python, visit the aforementioned link.
## Software Prerequisites
As we're going to use **docker** for our local development setup, you should have it installed before starting with this document. **Docker** is available for all major platforms and you can follow this [page](https://docs.docker.com/get-docker/) to install it.
The advantage of the CKAN **docker** setup is that once docker is installed and ready, all other dependencies and the rest of the building, setting up, initializing and deploying happen within **docker** almost automatically, after you specify several configurable parameters via environment variables.
This repository uses [docker-compose](https://docs.docker.com/compose/) to set up the multiple containers needed for running the CKAN data discovery portal web application. `docker-compose` is essentially a tool that allows close-to-production style bootstrapping of the webapp together with the (micro-)services it depends on using a one-line *shell* command, so it is very useful in our case.
The other tool you'll need is **git**, which is now the world's most popular source control management software. Read all about it and get it [here](https://git-scm.com/).
## Cloning the `docker-ckan` repository
We'll use this [OKFN](https://okfn.org/) repository, which is frequently maintained and is now updated with the latest major upgrade of CKAN, version 2.9, which uses Python 3. Current development of CKAN happens with this version, which is a major upgrade from the previous 2.8 that used Python 2.7, now already [End Of Life](https://www.python.org/doc/sunset-python-2/) for over a year and a half.
It is recommended that you create a dedicated folder for this, and inside it run the following command:
```bash=
$ git clone https://github.com/okfn/docker-ckan.git
```
This will result in a `./docker-ckan` directory where you cloned the repo, in which you can configure parameters and start the containers necessary for running your local CKAN portal.
## Configuration
You configure this `docker-compose` setup via environment variables. To do so, copy the template `.env.example` file to `.env` and set the values of the environment variables already present in this file to whatever customization you'd like:
```bash=
$ cd docker-ckan
$ cp .env.example .env
$ vi .env
```
You can use whatever editor you're comfortable with; here I'm using *vim*.
Some explanation of those configuration variables is available inline in the config file snapshot below:
```bash=
# DB image settings
POSTGRES_PASSWORD=ckan ## The password to configure PostgreSQL with
DATASTORE_READONLY_PASSWORD=datastore # Used for dataset resource preview, but not only.
# Basic
CKAN_SITE_ID=default # Used to identify the particular CKAN instance (useful in multi-instance environment)
CKAN_SITE_URL=http://ckan:5000 # where this setup will listen to CKAN API request and serve its UI.
CKAN_PORT=5000
CKAN_SYSADMIN_NAME=ckan_admin # admin user to log into CKAN with
CKAN_SYSADMIN_PASSWORD=test1234 # its matching password
CKAN_SYSADMIN_EMAIL=your_email@example.com
TZ=UTC
# All the required database connection configurations for both
# the webapp (CKAN portal) and the Datastore (which is where CKAN stores CSV and other resource data)
CKAN_SQLALCHEMY_URL=postgresql://ckan:ckan@db/ckan
CKAN_DATASTORE_WRITE_URL=postgresql://ckan:ckan@db/datastore
CKAN_DATASTORE_READ_URL=postgresql://datastore_ro:datastore@db/datastore
# Same, but for the test database connections
TEST_CKAN_SQLALCHEMY_URL=postgres://ckan:ckan@db/ckan_test
TEST_CKAN_DATASTORE_WRITE_URL=postgresql://ckan:ckan@db/datastore_test
TEST_CKAN_DATASTORE_READ_URL=postgresql://datastore_ro:datastore@db/datastore_test
# Other services connections
CKAN_SOLR_URL=http://solr:8983/solr/ckan # CKAN's search service
CKAN_REDIS_URL=redis://redis:6379/1 # caching and message queue for background jobs
CKAN_DATAPUSHER_URL=http://datapusher:8800 # Datapusher uploads resource data in the background from remote URLs
CKAN__DATAPUSHER__CALLBACK_URL_BASE=http://ckan:5000 # Callback URL for when an upload has finished
TEST_CKAN_SOLR_URL=http://solr:8983/solr/ckan
TEST_CKAN_REDIS_URL=redis://redis:6379/1
# Core settings
# Where CKAN stores uploaded resource files and webassets. (see
# https://webassets.readthedocs.io/en/latest/)
CKAN__STORAGE_PATH=/var/lib/ckan
# CKAN can send notification emails and other alerts,
# here you configure via which SMTP server it can do so
CKAN_SMTP_SERVER=smtp.corporateict.domain:25
CKAN_SMTP_STARTTLS=True
CKAN_SMTP_USER=user
CKAN_SMTP_PASSWORD=pass
CKAN_SMTP_MAIL_FROM=ckan@localhost
# Extensions
# Extensions are how you extend CKAN, improve it and customize it to whatever needs you may have for the system.
# Here you configure the list of extensions that will be
# loaded in this docker-compose setup.
# NOTE: The order is important! So if you have extension B, that
# uses base code in A, A must come first.
# NOTE: In the filesystem, CKAN extensions are
# usually named "ckanext-{EXTENSION_NAME}"; note
# that in the configuration parameter you only
# specify the actual name of the extension, without the "ckanext-" prefix.
CKAN__PLUGINS=envvars image_view text_view recline_view datastore datapusher
# Harvester setup. To read more about
# the harvester (which is an extension
# to "harvest" other CKAN portals etc.) visit:
# https://github.com/ckan/ckanext-harvest
CKAN__HARVEST__MQ__TYPE=redis
CKAN__HARVEST__MQ__HOSTNAME=redis
CKAN__HARVEST__MQ__PORT=6379
CKAN__HARVEST__MQ__REDIS_DB=1
```
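To double-check what you've configured, a minimal Python sketch like the one below reads `KEY=value` lines and strips the explanatory `#` comments used in the snapshot above. It is a simplified reader written for this guide, not the parser `docker-compose` itself uses, and `parse_env` is a hypothetical helper name.

```python
def parse_env(text):
    """Parse KEY=value lines, ignoring blank lines and '#' comment lines.

    Simplified on purpose: a value ends at the first ' #', which is
    good enough for the snapshot above but is not a full dotenv parser.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "ckan ## The password ..."
        value = value.split(" #", 1)[0].strip()
        settings[key.strip()] = value
    return settings


sample = """
# Basic
CKAN_SITE_ID=default # Used to identify the particular CKAN instance
CKAN_PORT=5000
CKAN__PLUGINS=envvars image_view text_view recline_view datastore datapusher
"""

settings = parse_env(sample)
print(settings["CKAN_SITE_ID"])
print(settings["CKAN__PLUGINS"].split())
```

Splitting `CKAN__PLUGINS` on whitespace gives the list of plugins in the order they are written in the file.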
Once you're satisfied with the settings, it's time to move to the next section.
## Booting Up Containers
As promised, after doing all this configuration we can kick-start our multi-container CKAN app. Get into the `docker-ckan` directory and from the command prompt issue the following:
```bash=
$ docker-compose -f docker-compose.dev.yml up
```
If this is the first time you are executing this, you'll see a lot of interesting and verbose messages from building the dependent containers (see the list of software in the *Preamble*) until you reach a state where the setup has finished building and the webapp is accepting HTTP connections. As an example, it could look similar to this:
```bash=
$ docker-compose -f docker-compose.dev.yml up
Docker Compose is now in the Docker CLI, try `docker compose up`
Creating network "docker-ckan_default" with the default driver
Building datapusher
[+] Building 395.7s (10/10) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 1.62kB 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/alpine:3.13 3.5s
=> [1/6] FROM docker.io/library/alpine:3.13@sha256:f51ff2d96627690d62fee79e6eecd9fa87429a38142b5df8a3bfbb26061df7fc 0.0s
=> => resolve docker.io/library/alpine:3.13@sha256:f51ff2d96627690d62fee79e6eecd9fa87429a38142b5df8a3bfbb26061df7fc 0.0s
=> => sha256:f51ff2d96627690d62fee79e6eecd9fa87429a38142b5df8a3bfbb26061df7fc 1.64kB / 1.64kB 0.0s
=> => sha256:def822f9851ca422481ec6fee59a9966f12b351c62ccb9aca841526ffaa9f748 528B / 528B 0.0s
=> => sha256:6dbb9cc54074106d46d4ccb330f2a40a682d49dda5f4844962b7dce9fe44aaec 1.47kB / 1.47kB 0.0s
=> [2/6] WORKDIR /srv/app 0.0s
=> [3/6] RUN apk add --no-cache python3 py3-pip py3-wheel libffi-dev libressl-dev libxslt uwsgi uwsgi-http uwsgi-corerouter uwsgi-python && apk add --no-ca 34.4s
=> [4/6] RUN mkdir /srv/app/src && cd /srv/app/src && git clone -b 0.0.17 --depth=1 --single-branch https://github.com/ckan/datapusher.git && cd datapusher && python3 setup.py install && 352.2s
=> [5/6] RUN apk del .build-deps && cp /srv/app/src/datapusher/deployment/*.* /srv/app && sed -i '/http/d' /srv/app/datapusher-uwsgi.ini && sed -i '/wsgi-file/d' /srv/app/datapusher-uwsgi. 0.5s
=> [6/6] RUN addgroup -g 92 -S www-data && adduser -u 92 -h /srv/app -H -D -S -G www-data www-data
............
.........
.....
..
```
For brevity I show only a portion of it here, as the scroll log grows big during this first-time operation.
To see that it has finished and your portal is ready to accept connections, make sure to spot the following line in the scroll log of the output from the compose command:
```bash=
ckan-dev_1 | 2021-06-27 14:49:05,536 INFO [ckan.cli.server] Running CKAN on http://0.0.0.0:5000
```
Once you see this line you can direct your web browser to the aforementioned address (from your host this is typically `http://127.0.0.1:5000`) and see if you get the first CKAN landing page (with the **Welcome to CKAN** text). If you do, the setup is a success. If not, try to spot any error messages in the terminal output to understand what went wrong and potentially get an idea of how to fix it (using Google, Stack Overflow et al.).
On subsequent invocations of commands, since docker will already have container cache for the built containers, the process should be much quicker and also somewhat less verbose.
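If you'd rather check from a script than from the browser, here is a minimal sketch that asks CKAN's `status_show` API action whether the portal is answering. The base URL and the helper names `status_url` and `ckan_is_up` are assumptions made for this example; adjust the address to wherever your portal is published.

```python
import json
import urllib.request


def status_url(base_url):
    """Build the URL of CKAN's status_show API action."""
    return base_url.rstrip("/") + "/api/3/action/status_show"


def ckan_is_up(base_url, timeout=5):
    """Return True if the portal answers status_show with success=true."""
    try:
        with urllib.request.urlopen(status_url(base_url), timeout=timeout) as resp:
            return json.load(resp).get("success") is True
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON reply
        return False


# Example call, assuming the portal is published on localhost port 5000:
#   ckan_is_up("http://127.0.0.1:5000")
```

Any network or parsing failure is treated as "not up yet", so the function can be called repeatedly while the containers are still booting.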
## Hacking the CKAN Core Code
We've reached the coolest part of this guide, where we embark on the very satisfying experience of modifying CKAN core's code, and fix or improve its behaviour right in front of our own eyes!
To do so, we'll introduce a visible change to the core code and witness its effect right away. A cool feature of the docker setup we use here is that edited Python files are automatically reloaded, allowing **Live-Editing**, which makes it faster to test changes during development.
Now, let's get back to the terminal and some editing. While I'm using `vim` here, you're free to use whatever editor you're comfortable with, be it **VSCode**, **PyCharm** etc.
Get back to the root directory of the **docker-ckan** repository you cloned, as shown in the terminal transcript below; from there we'll introduce an edit to a CKAN core file:
```bash=
$ pwd # this gives the current working dir on *nix systems
/Users/sivan/CKAN3/docker-ckan
```
This will be different for you depending on your OS and where you cloned `docker-ckan`.
Now if your docker-compose recipe isn't already running, go back to [Booting Up Containers](#Booting-Up-Containers) and make sure your `docker-compose` recipe is up and running. When you do so, make sure you use `docker-compose.dev.yml` to bootstrap the containers in **development mode** (which allows, among other things, hot reload of code changes to test them immediately).
Now, since the CKAN core code lives inside the container, the only way to access and edit it is by executing a shell in the container, which has a somewhat limited editing environment (unconfigured `vim`). Since in this setup there's no other easy way to edit the code, we'll stick to that. Knowing some basic `vim` editing is also beneficial when working on remote servers (which is essentially the same), so it is recommended to take some time to familiarize yourself with `vim`'s basic editing commands. Alternatively, you can install `nano` in the container in question and use it to edit code.
Now let's edit some code following this workflow:
* We execute a `bash` shell onto the web app container itself.
* We edit a Python file and watch as it reloads.
* We test our changes interactively.
Note that the workflow for template changes is the same, only in that case you just need to reload the web page.
So, open another terminal at the same root folder for the `docker-ckan` cloned repository and issue:
```bash=
$ docker-compose -f docker-compose.dev.yml exec ckan-dev bash
```
Then you should get a prompt similar to:
```bash=
bash-5.1#
```
This prompt is actually coming from the running CKAN Web App container. We've now entered it with a running shell, so we can issue commands there as if it were our host machine.
Let's modify the dataset search operation a bit, so that it gives us more clarity into what is happening when the user issues a search for datasets. This is helpful for debugging but also for planning upcoming code changes, and is a favorite way of mine to *"get a perspective"* before embarking on a business logic change.
Issue the following:
```bash=
bash-5.1# vi src/ckan/ckan/lib/search/query.py
```
This will open the **vi** editor on the file where the query behaviour for dataset search is defined.
Go to line **293**, where you should see the definition of the `run` query method:
```python=
    def run(self, query, permission_labels=None, **kwargs):
        '''
        Performs a dataset search using the given query.

        :param query: dictionary with keys like: q, fq, sort, rows, facet
        :type query: dict
        :param permission_labels: filter results to those that include at
            least one of these labels. None to not filter (return everything)
        :type permission_labels: list of unicode strings; or None
        :returns: dictionary with keys results and count

        May raise SearchQueryError or SearchError.
        '''
        assert isinstance(query, (dict, MultiDict))
        # check that query keys are valid
        ....
        ...
```
Now let's add a `log` call to show us the resulting search query parameters that will be passed to SOLR.
Locate line **376**. If you already see code there that debug-prints the value of `query`, change `debug` to `info`; if not, add the line shown below. Either way, this is how line **376** should look after your changes:
```python=
log.info('Package query: %r' % query)
```
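Why `info` rather than `debug`? A logger drops every message below its effective level. The stand-alone sketch below illustrates this with Python's standard `logging` module, assuming the effective level is INFO (the INFO lines in the compose output suggest as much, but the actual logging configuration of your setup may differ). It reuses the logger name only for illustration and does not import CKAN.

```python
import io
import logging

# Stand-alone logger that mimics the "ckan.lib.search.query" logger name
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s [%(name)s] %(message)s"))

log = logging.getLogger("ckan.lib.search.query")
log.addHandler(handler)
log.setLevel(logging.INFO)  # assumption: INFO is the effective level
log.propagate = False       # keep the demo output in our own handler

query = {"q": "test search", "rows": 21}

log.debug("Package query: %r", query)  # below INFO, silently dropped
log.info("Package query: %r", query)   # emitted

output = stream.getvalue()
print(output, end="")
```

The sketch passes `query` as a logging argument instead of using the `%` operator; both produce the same text, but the argument form only formats the message when it is actually emitted.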
Save your changes, and watch your `docker-compose` terminal window notify you that files are being reloaded since you made a change. BTW, if you're not comfortable editing with `vim` (its user interface is a bit different from most text editors), you can install `nano`, which is a much easier and more self-explanatory editor for this purpose. In the container you can install it like so:
```bash=
bash-5.1# apk add nano
fetch https://dl-cdn.alpinelinux.org/alpine/v3.13/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.13/community/x86_64/APKINDEX.tar.gz
(1/1) Installing nano (5.4-r3)
Executing busybox-1.32.1-r6.trigger
OK: 778 MiB in 112 packages
```
Once it has been installed, just swap `vi` with `nano` in the editing commands above.
Hooray! You survived so far ;) And the code is now modified to bring us joy and insight into what's happening with the dataset search query.
Navigate to the dataset search page:

You can type in the same query string I typed as shown in the screenshot, or just any other search text. Make sure to have the `docker-compose` window open as well so you can look at it at the moment you hit **ENTER** at the dataset search text bar.
So type your search text and hit ENTER. When you do, you should see an **INFO** line similar to the one below:
```dockerfile=
2021-08-11 17:31:44,296 INFO [ckan.lib.search.query] Package query: {'facet.field': ['organization', 'groups', 'tags', 'res_format', 'license_id'], 'fq': ['', '+site_id:"default"', '+state:active'], 'q': 'test search', 'rows': 21, 'start': 0, 'df': 'text', 'sort': 'score desc, metadata_modified desc', 'fl': 'id validated_data_dict', 'facet': 'true', 'facet.limit': '50', 'facet.mincount': 1, 'wt': 'json', 'defType': 'dismax', 'tie': '0.1', 'mm': '2<-1 5<80%', 'qf': 'name^4 title^4 tags^2 groups^2 text'}
```
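The logged value is the `repr` of a Python dictionary rather than JSON, which makes it hard to read on one line. A small sketch for pretty-printing it: `query_from_log_line` is a hypothetical helper that cuts the line at the `Package query: ` marker and parses the remainder with `ast.literal_eval`. The sample line is shortened from the output above.

```python
import ast
import json

MARKER = "Package query: "


def query_from_log_line(line):
    """Extract the logged dict from an INFO line written with %r."""
    _, _, literal = line.partition(MARKER)
    # literal_eval only accepts Python literals, so it never runs code
    return ast.literal_eval(literal.strip())


line = (
    "2021-08-11 17:31:44,296 INFO [ckan.lib.search.query] Package query: "
    "{'fq': ['', '+site_id:\"default\"', '+state:active'], "
    "'q': 'test search', 'rows': 21, 'facet': 'true'}"
)

query = query_from_log_line(line)
print(json.dumps(query, indent=2))
```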
Formatted as JSON for readability, the logged query dictionary looks like this:
```json=
{
"facet.field":[
"organization",
"groups",
"tags",
"res_format",
"license_id"
],
"fq":[
"",
"+site_id:\"default\"",
"+state:active"
],
"q":"test search",
"rows":21,
"start":0,
"df":"text",
"sort":"score desc, metadata_modified desc",
"fl":"id validated_data_dict",
"facet":"true",
"facet.limit":"50",
"facet.mincount":1,
"wt":"json",
"defType":"dismax",
"tie":"0.1",
"mm":"2<-1 5<80%",
"qf":"name^4 title^4 tags^2 groups^2 text"
}
```
Here we can see that we're faceting on "groups", "tags", "organization" etc., and this gives us very intimate information about exactly which parameters we're passing directly to **SOLR**, so if we're trying to debug an issue with search this would be a perfect way to start! To understand exactly what those parameters mean, I highly recommend reading some of **SOLR**'s documentation such as [this](https://solr.apache.org/guide/6_6/common-query-parameters.html).
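To experiment with these parameters outside CKAN, you could turn them into a request URL. The sketch below only builds the URL and assumes Solr's standard `/select` request handler is available on the core named in `CKAN_SOLR_URL`; the host name `solr` resolves only from inside the compose network, and the parameter set is shortened from the logged query.

```python
from urllib.parse import urlencode

# From CKAN_SOLR_URL in the .env file; reachable under this name only
# from inside the compose network
SOLR_URL = "http://solr:8983/solr/ckan"

params = {
    "q": "test search",
    "fq": ['+site_id:"default"', "+state:active"],
    "rows": 21,
    "start": 0,
    "wt": "json",
    "defType": "dismax",
    "qf": "name^4 title^4 tags^2 groups^2 text",
}

# doseq=True repeats list-valued parameters (fq=...&fq=...)
select_url = SOLR_URL + "/select?" + urlencode(params, doseq=True)
print(select_url)
```

Changing one parameter at a time (for example the `qf` boosts) and comparing the results is a quick way to learn what each one does.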
Now, editing the source code within the container is all cool and nice, but what if we want to use our favorite GUI editors like [VSCode](https://code.visualstudio.com/) or [PyCharm](https://www.jetbrains.com/pycharm/), with their numerous productivity features like runtime and library code completion, project-wide search, go-to-definition and code navigation, to easily find our way around the very large CKAN source code?
For that we'd have to set up a volume mount from docker to the outside host machine, but more on this in the next section!
## Setting up CKAN Source Outside the Container
Since the source code is deployed inside the container, to be able to access it we need to use ["bind mounts"](https://docs.docker.com/storage/bind-mounts/). Bind mounts enable propagation of changes from the container to the host and back, but due to a [bug](https://github.com/docker/for-mac/issues/3431) in the Windows/Mac OS versions of the underlying docker runtime layer, we need a quick workaround to have it working as expected.
The ideal way for us would have been to set up a bind mount from `/srv/app/src/ckan` in the CKAN container to our host, but this won't work as long as this bug is still open.
To work around it, we'll copy the built `/srv/app/src/ckan` folder out of the container via the volume mount the container setup already has for developing extensions, i.e. into `/srv/app/src_extensions` inside the CKAN container. This makes the CKAN source folder available to our host machine in the already designated folder `./docker-ckan/src`, under where we cloned the repo in the first place.
So let's get to work:
First, if not already booted up, boot up your container setup as instructed in [Booting Up Containers](#Booting-Up-Containers).
Then, after it's all up and running, let's shell into the CKAN container:
```bash=
$ docker-compose -f docker-compose.dev.yml exec ckan-dev bash
```
Once we get the shell prompt, we proceed to copy the CKAN source folder outside to the host->container volume mount, after we verify that we're in the desired folder inside the container (make sure you're issuing that command from there):
```bash=
# where are we?
bash-5.1# pwd
/srv/app
bash-5.1# cp -aRp src/ckan src_extensions/.
```
Once this command has finished and we're back at the command prompt, let's exit the container shell and verify the CKAN source folder was copied "outside":
```bash=
bash-5.1# exit
exit
sivan@darwin:~/docker-ckan$ ls -la src/ckan/
total 968
drwxr-xr-x@ 50 sivan staff 1600 Jul 1 20:14 .
drwxr-xr-x 5 sivan staff 160 Aug 23 21:15 ..
drwxr-xr-x@ 3 sivan staff 96 Jul 1 20:14 .circleci
-rw-r--r--@ 1 sivan staff 267 Jul 1 20:14 .editorconfig
drwxr-xr-x@ 13 sivan staff 416 Jul 1 20:14 .git
-rw-r--r--@ 1 sivan staff 23 Jul 1 20:14 .gitattributes
drwxr-xr-x@ 4 sivan staff 128 Jul 1 20:14 .github
-rw-r--r--@ 1 sivan staff 552 Jul 1 20:14 .gitignore
-rw-r--r--@ 1 sivan staff 5 Jul 1 20:14 .pipignore
-rw-r--r--@ 1 sivan staff 1354 Jul 1 20:14 .travis.yml
drwxr-xr-x@ 3 sivan staff 96 Jul 1 20:14 .tx
-rw-r--r--@ 1 sivan staff 159941 Jul 1 20:14 CHANGELOG.rst
-rw-r--r--@ 1 sivan staff 7462 Jul 1 20:14 CODE_OF_CONDUCT.md
-rw-r--r--@ 1 sivan staff 163 Jul 1 20:14 CONTRIBUTING.md
-rw-r--r--@ 1 sivan staff 163 Jul 1 20:14 CONTRIBUTING.rst
-rw-r--r--@ 1 sivan staff 2106 Jul 1 20:14 Dockerfile
-rw-r--r--@ 1 sivan staff 4733 Jul 1 20:14 LICENSE.txt
-rw-r--r--@ 1 sivan staff 836 Jul 1 20:14 MANIFEST.in
-rw-r--r--@ 1 sivan staff 3782 Jul 1 20:14 README.rst
-rw-r--r--@ 1 sivan staff 555 Jul 1 20:14 SECURITY.md
drwxr-xr-x@ 9 sivan staff 288 Jul 1 20:14 bin
drwxr-xr-x@ 3 sivan staff 96 Jul 1 20:14 changes
drwxr-xr-x@ 22 sivan staff 704 Jul 1 20:14 ckan
-rw-r--r--@ 1 sivan staff 426 Jul 1 20:14 ckan-uwsgi.ini
drwxr-xr-x@ 10 sivan staff 320 Aug 23 17:36 ckan.egg-info
drwxr-xr-x@ 36 sivan staff 1152 Jul 1 20:28 ckanext
-rw-r--r--@ 1 sivan staff 128 Jul 1 20:14 conftest.py
drwxr-xr-x@ 5 sivan staff 160 Jul 1 20:14 contrib
drwxr-xr-x@ 6 sivan staff 192 Jul 1 20:14 cypress
-rw-r--r--@ 1 sivan staff 42 Jul 1 20:14 cypress.json
-rw-r--r--@ 1 sivan staff 611 Jul 1 20:14 dev-requirements.txt
drwxr-xr-x@ 19 sivan staff 608 Jul 1 20:14 doc
-rw-r--r--@ 1 sivan staff 2626 Jul 1 20:14 gulpfile.js
-rw-r--r--@ 1 sivan staff 173047 Jul 1 20:14 package-lock.json
-rw-r--r--@ 1 sivan staff 1018 Jul 1 20:14 package.json
-rw-r--r--@ 1 sivan staff 139 Jul 1 20:14 pip-requirements-docs.txt
-rw-r--r--@ 1 sivan staff 696 Jul 1 20:14 pyproject.toml
-rw-r--r--@ 1 sivan staff 19 Jul 1 20:14 requirement-setuptools.txt
-rw-r--r--@ 1 sivan staff 961 Jul 1 20:14 requirements-py2.in
-rw-r--r--@ 1 sivan staff 2018 Jul 1 20:14 requirements-py2.txt
-rw-r--r--@ 1 sivan staff 711 Jul 1 20:14 requirements.in
-rw-r--r--@ 1 sivan staff 1736 Jul 1 20:14 requirements.txt
drwxr-xr-x@ 3 sivan staff 96 Jul 1 20:14 scripts
-rw-r--r--@ 1 sivan staff 980 Jul 1 20:14 setup.cfg
-rw-r--r--@ 1 sivan staff 15125 Jul 1 20:14 setup.py
-rw-r--r--@ 1 sivan staff 988 Jul 1 20:14 test-core-circle-ci.ini
-rw-r--r--@ 1 sivan staff 4196 Aug 23 17:38 test-core.ini
-rw-r--r--@ 1 sivan staff 156 Jul 1 20:14 tsconfig.json
lrwxr-xr-x@ 1 sivan staff 19 Aug 23 21:15 who.ini -> ckan/config/who.ini
-rw-r--r--@ 1 sivan staff 587 Jul 1 20:14 wsgi.py
```
If you get a directory listing similar to the one shown above, congratulations! You've managed to copy the source folder and you're very close to completing your setup!
We're still missing one change to the `docker-compose.dev.yml` file, so open the file in your favorite editor, locate the volume mount lines in the `ckan-dev` service (the lines under the `volumes:` section) and add another mount line as shown below:
```yaml=
services:
  ckan-dev:
    build:
      context: ckan/
      dockerfile: Dockerfile.dev
      args:
        - TZ=${TZ}
    env_file:
      - .env
    links:
      - db
      - solr
      - redis
      - datapusher
    ports:
      - "0.0.0.0:${CKAN_PORT}:5000"
    volumes:
      - ./src:/srv/app/src_extensions
      - ckan_storage:/var/lib/ckan
```
Should become:
```yaml=
services:
  ckan-dev:
    build:
      context: ckan/
      dockerfile: Dockerfile.dev
      args:
        - TZ=${TZ}
    env_file:
      - .env
    links:
      - db
      - solr
      - redis
      - datapusher
    ports:
      - "0.0.0.0:${CKAN_PORT}:5000"
    volumes:
      - ./src:/srv/app/src_extensions
      - ckan_storage:/var/lib/ckan
      - ./ckan_src:/srv/app/src/ckan  # <-- add this line
```
This will tell docker to replace the container CKAN source folder with the folder we'll create in a moment with the source we copied out, such that we could comfortably edit it from the host!
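The three `volumes:` entries are not all the same kind of mount: a source that looks like a path is a bind mount to the host, while a bare name such as `ckan_storage` refers to a named volume managed by docker. The sketch below makes that distinction explicit for the short `source:target[:mode]` syntax; `describe_mount` is a hypothetical helper for illustration and does not handle Windows drive-letter paths.

```python
def describe_mount(spec):
    """Split a short-syntax compose volume entry into its parts."""
    parts = spec.split(":")
    source, target = parts[0], parts[1]
    # Without an explicit third field the mount is read-write
    mode = parts[2] if len(parts) > 2 else "rw"
    # A source that looks like a path is a bind mount; a bare name
    # refers to a named volume
    kind = "bind" if source.startswith((".", "/", "~")) else "volume"
    return {"kind": kind, "source": source, "target": target, "mode": mode}


for spec in (
    "./src:/srv/app/src_extensions",
    "ckan_storage:/var/lib/ckan",
    "./ckan_src:/srv/app/src/ckan",
):
    print(describe_mount(spec))
```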
Now let's copy the source to the folder we just named in the `docker-compose.dev.yml` file:
```bash=
~/docker-ckan$ cp -aRp src/ckan ckan_src
```
And let's delete the source folder from the extension development folder:
```bash=
~/docker-ckan$ rm -rf src/ckan/
```
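Now recreate your container setup so the new mount takes effect. `docker-compose restart` does not pick up changes made to the compose file, so if the stack is running, stop it with Ctrl+C in its terminal and bring it up again:
```bash=
~/docker-ckan$ docker-compose -f docker-compose.dev.yml up
```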
Wait for the bootup process to finish, indicated by the web app waiting for connections:
```dockerfile=
...
...
ckan-dev_1 | 2021-08-23 18:32:08,481 INFO [ckan.lib.jobs] Worker rq:worker:8e030c2e466748d496ad6937b4f7fdbb (PID 37) has started on queue(s) "default"
ckan-dev_1 | 2021-08-23 18:32:08,617 WARNI [werkzeug] * Debugger is active!
ckan-dev_1 | 2021-08-23 18:32:08,830 INFO [ckan.cli.server] Running CKAN on http://0.0.0.0:5000
ckan-dev_1 | 2021-08-23 18:32:44,882 INFO [ckan.config.middleware.flask_app] /api/3/action/status_show render time 0.014 seconds
```
And if you've reached this point and all went well, you're set! You should now be able to edit your CKAN source for fun and contribution to the community using your preferred editor. Just open the path `docker-ckan/ckan_src` in, for example, VSCode and repeat the code modifications from the [Hacking the CKAN Core Code](#Hacking-the-CKAN-Core-Code) section to verify it's all working.
## Developing Extensions
Extensions are means by which we extend, modify and enhance CKAN in a "pluggable" way. This means that by using Extensions, we can tailor CKAN to our own needs, design and algorithms without ever touching its source code. This is so that the CKAN source code can mostly remain pristine and in a state where future CKAN source code upgrades, security bug fixes etc. can be applied regardless of any extension, keeping the quality promise of the CKAN release.
The CKAN extension mechanism is quite sophisticated and a lot of thought and effort was put into it, so it's highly recommended to use it to extend and modify CKAN behavior rather than editing its core code directly (as we did in the last section).
In line with the aforementioned, it is also the recommended way to package new features and functionalities, for instance one large feature in each extension. So Extensions are an excellent way to create [Separation of Concerns](https://en.wikipedia.org/wiki/Separation_of_concerns) and keep a [modular](https://en.wikipedia.org/wiki/Modular_programming) feature set for your customization project.
### Benefits of Using Extensions
- Features can be enabled/disabled just by enabling or disabling your favorite or custom-created extension.
- Upgrades and security fixes of the core code remain agnostic of any custom extension code you may have written to support your business goals and logic.
- Theming and styling remain completely agnostic of CKAN code.
- Testing is done per extension: tests are written per extension, which makes it quicker and easier to test in the context of a large CKAN installation.
- Extensions have their own dependencies and as such enable you to use almost any library from any publisher in the Python ecosystem (for example, you could add image recognition capabilities to your CKAN portal via [OpenCV Python](https://pypi.org/project/opencv-python/) so it **automatically** adds certain suitable metadata to a dataset).
- It's much easier to collaborate on a well-defined piece of extension code than on a huge code repository holding both the CKAN core code and your custom project.
Now that we're convinced that Extensions are the way to go for CKAN custom project development in almost all cases, let's roll up our sleeves and get hacking.
To get started with our own extension, we'll be using CKAN's own command that creates all of the required boilerplate such that the new extension is in a structure CKAN can load and enable when we instruct it to do so via its configuration.
Okay, now let's land again on the `docker-ckan` folder:
```bash=
$ pwd
/Users/sivan/CKAN3/docker-ckan
```
(note again that your path may vary, but the last part of it should always be `docker-ckan`)
Make sure your container setup is running, consult [Booting Up Containers](#Booting-Up-Containers) if you need a reminder how to do that.
Now let's shell back into the CKAN container:
```bash=
$ docker-compose -f docker-compose.dev.yml exec ckan-dev bash
bash-5.1#
```
As before, getting to the `bash-X.X#` prompt indicates success. Now let's create our first CKAN extension!
The command below creates all of the boilerplate that CKAN needs to know and understand how to set up your extension's extra logic, such that it becomes an integral functionality of your CKAN portal. As such it is then indistinguishable from CKAN core functionality, which makes it a very useful and cool feature.
```bash=
bash-5.1# ckan generate extension --output-dir /srv/app/src_extensions
```
Notice that we ask the output to go to the `/srv/app/src_extensions` folder, which is, if you recall, the host mounted volume we already used.
This means that the resulting directory structure is accessible from the host so we can use our favorite GUI editor to develop it!
When you run this command you'll be asked for some details about your extension. Name it "first": make sure to input the full name `ckanext-first`, since all CKAN extension names must start with the `ckanext-` prefix (indicating what they are ;).
Fill in your name as the author, and your email (in a sense, this is similar to configuring `git` with your identity).
You can leave the organization out, and all of the rest of details that are not applicable to you.
Now let's exit the container:
```bash=
bash-5.1# exit
exit
sivan@darwin:~/CKAN3/docker-ckan$
```
Let's examine the new extension resulting folder:
```bash=
sivan@darwin:~/CKAN3/docker-ckan$ cd src/ckanext-first
sivan@darwin:~/CKAN3/docker-ckan/src/ckanext-first$ ls -la
total 136
drwxr-xr-x 14 sivan staff 448 Sep 2 15:10 .
drwxr-xr-x 6 sivan staff 192 Sep 2 15:10 ..
-rw-r--r-- 1 sivan staff 67 Sep 2 15:10 .coveragerc
drwxr-xr-x 3 sivan staff 96 Sep 2 15:10 .github
-rw-r--r-- 1 sivan staff 607 Sep 2 15:10 .gitignore
-rw-r--r-- 1 sivan staff 34500 Sep 2 15:10 LICENSE
-rw-r--r-- 1 sivan staff 195 Sep 2 15:10 MANIFEST.in
-rw-r--r-- 1 sivan staff 3257 Sep 2 15:10 README.md
drwxr-xr-x 4 sivan staff 128 Sep 2 15:10 ckanext
-rw-r--r-- 1 sivan staff 12 Sep 2 15:10 dev-requirements.txt
-rw-r--r-- 1 sivan staff 0 Sep 2 15:10 requirements.txt
-rw-r--r-- 1 sivan staff 490 Sep 2 15:10 setup.cfg
-rw-r--r-- 1 sivan staff 3637 Sep 2 15:10 setup.py
-rw-r--r-- 1 sivan staff 785 Sep 2 15:10 test.ini
```
This content is our newly created CKAN extension. This is, if you will, the "container" that hosts your custom code and theming logic, enabling you to customize CKAN and extend its functionality.
This guide by no means attempts to be an exhaustive tutorial for creating extensions, but one can be found [here](https://docs.ckan.org/en/2.9/extensions/tutorial.html#creating-a-new-extension) and is recommended reading for those planning to create extensions for production use.
However, if you want to see the magic of extending CKAN right now, you can install the [ckanext-developerpage](https://github.com/datopian/ckanext-developerpage) extension. This extension creates a special "Developer Page" you can access by a new CKAN URL route (that the extension adds) and see valuable operating system and CKAN operational and configuration information useful when developing and experimenting with deployment and configuration of CKAN.
To install it, we first need to clone it into `./src`, the volume-mounted folder we are already familiar with from the previous sections, so that CKAN will install the extension's Python package. Once we've done this, we also need to enable the extension in the environment configuration file described in the [Configuration](#Configuration) section, on the `CKAN__PLUGINS` line (line `#58` of the file).
Once you've cloned the extension repo into your `~/docker-ckan/src/` folder, modify the environment configuration file: navigate to the `CKAN__PLUGINS` line (line `#58`) in your `.env` file and add the name `developerpage` as the last name on this line, preceded by a space.
In the `CKAN__PLUGINS` configuration directive, CKAN extensions are specified without the `ckanext-` prefix, so only the part of the name after the hyphen is needed.
The resulting section of the environment configuration file will look somewhat like this:
```bash=47
# Extensions
# Extension are how you extend CKAN, improve it and customize it to whatever needs you may have for the system.
# Here you configure the list of extension that will be
# loaded in this docker-compose setup.
# NOTE: The order is important! So if you have extension B, that
# uses base code in A, A must come first.
# NOTE: In the filesystem, CKAN extensions are
# usually as "ckanext-{EXTENSION_NAME}", note
# that in the configuration parameter you only
# specify the actual name of the extension rather the "ckanext-" prefix.
CKAN__PLUGINS=envvars image_view text_view recline_view datastore datapusher developerpage
```
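If you prefer to script this edit rather than do it by hand, the idea can be sketched in a few lines of Python. This is only an illustration: the helper names `plugin_name` and `enable_plugin` are made up for this guide, and the sketch assumes the plugin is registered under the same name as the part of the folder name after `ckanext-`. That holds for `developerpage` here, but an extension is free to register its plugin under a different name, so check its README when in doubt.

```python
def plugin_name(repo_dir):
    """Return the plugin name for a 'ckanext-<name>' folder (hypothetical helper)."""
    prefix = "ckanext-"
    if repo_dir.startswith(prefix):
        return repo_dir[len(prefix):]
    return repo_dir


def enable_plugin(env_text, repo_dir, key="CKAN__PLUGINS"):
    """Append the plugin to the end of the CKAN__PLUGINS line if it is missing.

    Appending keeps the existing order intact, so plugins the new one
    may depend on still come first.
    """
    name = plugin_name(repo_dir)
    lines = []
    for line in env_text.splitlines():
        if line.startswith(key + "="):
            plugins = line.split("=", 1)[1].split()
            if name not in plugins:
                plugins.append(name)
            line = key + "=" + " ".join(plugins)
        lines.append(line)
    # Keep a trailing newline if the input had one.
    return "\n".join(lines) + ("\n" if env_text.endswith("\n") else "")


sample = "CKAN__PLUGINS=envvars image_view text_view recline_view datastore datapusher"
print(enable_plugin(sample, "ckanext-developerpage"))
# prints: CKAN__PLUGINS=envvars image_view text_view recline_view datastore datapusher developerpage
```

To apply it to a real file you would read your `.env`, pass its text through `enable_plugin` and write the result back; running it twice does not add the name a second time.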
Now let's restart our containers and try to use the new functionality we've added with the extension.
```bash=
~/docker-ckan$ docker-compose -f docker-compose.dev.yml restart
```
If the extension does not seem to load after the restart, keep in mind that `docker-compose restart` does not apply changes to environment variables. In that case, recreating the containers with `docker-compose -f docker-compose.dev.yml up -d` should pick up the updated `.env`.
Once the containers have booted, navigate to the developerpage URL of your CKAN portal. If you followed all of the above and haven't changed any host settings yet, this should be something like:
`http://127.0.0.1:5000/developerpage`
If all went well, you should see detailed system information pertaining to your CKAN installation and portal.
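If you'd rather check from a terminal than from a browser, a rough sketch like the following can tell you whether the page answers at all. It uses only the Python standard library; the function name `page_is_up` is made up for this guide, and the URL is the default one assumed above, so adjust it if you changed your host settings.

```python
import urllib.request


def page_is_up(url, timeout=5):
    """Return True if the URL answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError and HTTPError are subclasses of OSError, so refused
        # connections, timeouts and 4xx/5xx answers all end up here.
        return False


if __name__ == "__main__":
    url = "http://127.0.0.1:5000/developerpage"
    print(url, "is up" if page_is_up(url) else "is not reachable")
```

A `False` result only says that the page did not answer with a 200; the container logs are still the place to look for the actual reason.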

Congratulations! You now know everything there is to know to hack on CKAN's core code itself or customize it using an extension.
## Running Tests
No good code can be deployed without a good amount of testing, and the CKAN ecosystem is no different. Being a prominent Python project, CKAN uses a standard testing framework that is familiar in the Python community: [pytest](https://docs.pytest.org/en/6.2.x/). pytest is an advanced testing framework that enables you to rapidly write tests to verify your Python software before deploying it to your production CKAN portal.
It's recommended to get familiar with pytest in order to write your own tests. For our learning experiment here, we'll run CKAN Core's own test suite. This is good practice after checking out the code and setting up your portal, whether in the docker container setup or otherwise, as it verifies that the code behaves as expected and that it will work on your target platform.
Running extension tests works in a similar manner. To read more about it and dive into more specific details, it is recommended to consult CKAN's contribution guide [here](https://docs.ckan.org/en/2.9/contributing/test.html).
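If you have never seen a pytest test, here is a minimal sketch of what one looks like. It deliberately does not touch CKAN at all: `normalise_title` is a made-up helper standing in for whatever function your extension provides, and the file name in the comment is only a suggestion. pytest collects functions whose names start with `test_` and treats a failing `assert` as a failed test.

```python
# test_example.py: a hypothetical, CKAN-independent test module


def normalise_title(title):
    """Made-up helper: collapse runs of whitespace and strip both ends."""
    return " ".join(title.split())


def test_normalise_title_collapses_whitespace():
    assert normalise_title("  Open   data  portal ") == "Open data portal"


def test_normalise_title_keeps_clean_input():
    assert normalise_title("CKAN") == "CKAN"
```

Tests for real extensions usually also rely on CKAN's own test helpers and fixtures, which the contribution guide linked above covers.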
My recommendation is to run tests from *within* the CKAN container rather than from the "outside" by issuing the execution command via docker, as this makes it easier to quickly inspect and re-run the tests after a failure and an attempted quick fix. If you prefer to run them from outside the container, I outline both ways here.
So, to run CKAN's own tests from within the container, let's first shell into it again (make sure you are in your respective `~/.../docker-ckan` folder first):
```bash=
$ docker-compose -f docker-compose.dev.yml exec ckan-dev bash
bash-5.1#
```
Then execute the following:
```bash=
$ cd /srv/app/src/ckan/ckan # to get into the CKAN source folder
```
Then, to run the CKAN core test suite:
```bash=
pytest --ckan-ini=../test-core.ini tests/
```
You should then see *pytest* firing up, collecting tests and running them. You should hope to see most of them reported as a `PASS`, or as green dots in the test progress status, indicating that the particular functionality tested was verified.
Now, to run a test suite you've created for one of your extensions, start by shelling into the container again. Then `cd` to your extension folder and execute `pytest` according to your test setup (pass your `test.ini` via `--ckan-ini` if you have one; otherwise leave the option out):
```bash=
$ cd /srv/app/src_extensions/ckanext-developerpage/ckanext/developerpage # if testing the developerpage extension
$ pytest tests/
2021-09-30 17:17:09,287 INFO [ckan.cli] Using configuration file /srv/app/ckan.ini
2021-09-30 17:17:09,287 INFO [ckan.config.environment] Loading static files from public
2021-09-30 17:17:09,388 INFO [ckan.config.environment] Loading templates from /srv/app/src_extensions/ckan/ckan/templates
2021-09-30 17:17:10,092 INFO [ckan.config.environment] Loading templates from /srv/app/src_extensions/ckan/ckan/templates
============================================================================================= test session starts ==============================================================================================
platform linux -- Python 3.8.10, pytest-4.6.5, py-1.10.0, pluggy-0.13.1
rootdir: /srv/app/src_extensions/ckanext-developerpage
plugins: ckan-0.0.12, Faker-8.13.2, split-tests-1.0.9, pyfakefs-3.2, freezegun-0.4.1, rerunfailures-8.0, cov-2.7.1
collected 1 item
tests/test_plugin.py . [100%]
=========================================================================================== 1 passed in 0.12 seconds ===========================================================================================
```