# Subscribe to New Datasets - Needs Analysis

## Summary

_A summary of the needs and the solution. Usually, starts with key needs followed with overview of the solution and key tasks._

https://gitlab.com/datopian/clients/national-grid/-/issues/263

The intention of this new feature is to give users the ability to get notifications about new datasets (CKAN packages).

## Needs

_These are distilled, preferably in the form of job epics._

When browsing through the National Grid data portal as a signed-in user, I want to choose to get notified when new datasets (CKAN packages) get published so I can be up-to-date with the latest information available.

When signing up to the National Grid data portal, I don't want to get notified when new datasets (CKAN packages) get published by default, so I can decide whether these notifications are relevant to me before I start receiving them.

When I am subscribed to notifications of new datasets (CKAN packages), I want to be able to unsubscribe from them so I can change my mind about how relevant it is for me to stay up-to-date with this information.

## Design

_This is the proposed design of the solution._

We expect this to happen:

```mermaid
graph LR

subgraph ckan
  ckancore["CKAN"]
  ckandb[(Database)]
end

curator((Data Curator))

curator --creates package--> ckancore
ckancore --inserts new line to activity_list table--> ckandb
```

By default, the Data Subscriptions service automatically collects everything in the `activity_list` table every x seconds (configurable). We just need to start reading this information from the local database and include it in every step of the notification delivery. The most important steps are:

1. Read the list of datasets that need information from the CKAN API (e.g., dataset title, dataset URL).
2. Collect the relevant activity for a user.
3. Build the email template, which should now include another possible paragraph.

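
The three steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the service's actual code: the `Activity` class, the function names, and the email wording are hypothetical, and it assumes that CKAN marks package creation with the activity type `new package`.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

# Assumption: CKAN records the creation of a package with this activity type.
NEW_PACKAGE = "new package"


@dataclass(frozen=True)
class Activity:
    """Hypothetical, simplified view of one locally stored activity row."""
    object_id: str      # id of the dataset (CKAN package)
    activity_type: str  # e.g. "new package" or "changed package"


def relevant_activity(
    activities: List[Activity],
    subscribed_ids: Set[str],
    wants_new_datasets: bool,
) -> Dict[str, List[Activity]]:
    """Step 2: keep changes to subscribed datasets, plus new datasets if opted in."""
    changed = [
        a for a in activities
        if a.activity_type != NEW_PACKAGE and a.object_id in subscribed_ids
    ]
    new = [
        a for a in activities
        if wants_new_datasets and a.activity_type == NEW_PACKAGE
    ]
    return {"changed": changed, "new": new}


def ids_needing_metadata(relevant: Dict[str, List[Activity]]) -> List[str]:
    """Step 1: dataset ids whose title and URL must be read from the CKAN API."""
    return sorted({a.object_id for group in relevant.values() for a in group})


def build_email_body(relevant: Dict[str, List[Activity]], titles: Dict[str, str]) -> str:
    """Step 3: render the body; the "new datasets" paragraph is optional."""
    paragraphs = []
    if relevant["changed"]:
        names = ", ".join(titles[a.object_id] for a in relevant["changed"])
        paragraphs.append(f"Updated datasets: {names}")
    if relevant["new"]:
        names = ", ".join(titles[a.object_id] for a in relevant["new"])
        paragraphs.append(f"New datasets: {names}")
    return "\n\n".join(paragraphs)
```

A user who has not opted in gets an empty `new` group, so the extra paragraph is simply left out of the email.
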
## Plan of Work

_One or more structured issues as per the issue/task template._

### Allow users to subscribe and unsubscribe to new datasets notifications

_https://gitlab.com/datopian/clients/national-grid/-/issues/273_

**For the frontend app.** Depends on "Create database migration and endpoints."

When browsing through the National Grid data portal as a signed-in user, I want to choose to get notified when new datasets (CKAN packages) get published so I can be up-to-date with the latest information available.

When I am subscribed to notifications of new datasets (CKAN packages), I want to be able to unsubscribe from them so I can change my mind about how relevant it is for me to stay up-to-date with this information.

#### Acceptance Criteria

* [ ] Users can go to their settings page (/settings#subscriptions) and choose to get notifications when new datasets are published.
* [ ] Users can go to their settings page (/settings#subscriptions) and choose to stop getting this type of notification.
* [ ] There are tests for every change in this project.

#### Tasks

* [ ] Write integration tests.
* [ ] Create a new section in the page.
* [ ] Connect the section form with the Data Subscriptions API.

### Create database migration and endpoints

_https://gitlab.com/datopian/clients/data-subscriptions/-/issues/17_

**For the Data Subscriptions service.**

When browsing through the National Grid data portal as a signed-in user, I want to choose to get notified when new datasets (CKAN packages) get published so I can be up-to-date with the latest information available.

When signing up to the National Grid data portal, I don't want to get notified when new datasets (CKAN packages) get published by default, so I can decide whether these notifications are relevant to me before I start receiving them.

When I am subscribed to notifications of new datasets (CKAN packages), I want to be able to unsubscribe from them so I can change my mind about how relevant it is for me to stay up-to-date with this information.

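
One possible way to model these stories in the database is a subscription `type` with a `dataset_id` that is required only for the `dataset` type, as planned for this issue. The sketch below uses an in-memory SQLite database purely so that it is self-contained; the real service may use a different database and migration tool, and the `user_id` column and the `subscribe` helper are hypothetical.

```python
import sqlite3

# Hypothetical schema: only `type` and `dataset_id` come from the plan.
SCHEMA = """
CREATE TABLE subscription (
    id INTEGER PRIMARY KEY,
    user_id TEXT NOT NULL,
    type TEXT NOT NULL DEFAULT 'dataset'
        CHECK (type IN ('dataset', 'new_datasets')),
    dataset_id TEXT,
    -- dataset_id is required only when the subscription targets one dataset
    CHECK (type <> 'dataset' OR dataset_id IS NOT NULL)
);
"""


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketched table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def subscribe(conn, user_id, type_, dataset_id=None) -> bool:
    """Insert a subscription; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO subscription (user_id, type, dataset_id) VALUES (?, ?, ?)",
                (user_id, type_, dataset_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this shape, a newly signed-up user has no `new_datasets` row and therefore receives nothing until they opt in; unsubscribing amounts to deleting that row.
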
#### Acceptance Criteria

* [ ] The database state reflects all the necessary changes for the feature.
* [ ] There's an endpoint for subscribing to notifications of every new dataset.
* [ ] There's an endpoint for unsubscribing from notifications of every new dataset.
* [ ] There are tests for the changes in the project.

#### Tasks

* [ ] Create a migration to change the `subscription` table.
  * Currently, `dataset_id` [is required](https://gitlab.com/datopian/clients/data-subscriptions/-/blob/master/data_subscriptions/models/subscription.py#L6).
  * [ ] Create `type` enum field. It can take values such as `dataset` or `new_datasets`.
  * [ ] Change nullable constraint to require `dataset_id` only when the type is `dataset`.
* [ ] Create tests for the new REST endpoints.
* [ ] Implement the endpoints.

### Send notifications of type "new_dataset" to users subscribing to this type of notifications

**For the Data Subscriptions service.**

When browsing through the National Grid data portal as a signed-in user, I want to choose to get notified when new datasets (CKAN packages) get published so I can be up-to-date with the latest information available.

When signing up to the National Grid data portal, I don't want to get notified when new datasets (CKAN packages) get published by default, so I can decide whether these notifications are relevant to me before I start receiving them.

#### Acceptance Criteria

* [ ] Users subscribing to new datasets get an email when there's a new `activity_list` row related to this event.
  * According to [[1]](https://docs.ckan.org/en/ckan-2.4.1/api/#ckan.logic.action.get.recently_changed_packages_activity_list)[[2]](https://github.com/ckan/ckan/blob/ef103f02292e0b50fbe7edc6a9e07f70d2fb9f45/ckan/model/activity.py#L407-L428), the CKAN API endpoint should return activity related to the creation of new packages.
* [ ] Users not subscribing to new datasets don't get an email when there's a new `activity_list` row related to this event.
* [ ] There are tests for every change in the project.

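
The two email-related criteria above could be checked against a per-user query along these lines. This is a hedged sketch, not the query currently in the service: the column names (`user_id`, `object_id`, `activity_type`), the demo data, and the assumption that CKAN labels package creation as `new package` are all illustrative, and SQLite is used only to keep the example self-contained.

```python
import sqlite3

# Hypothetical query: activity for subscribed datasets, plus every
# "new package" row if the user holds a `new_datasets` subscription.
ACTIVITY_FOR_USER = """
SELECT a.object_id, a.activity_type
FROM activity_list AS a
WHERE a.object_id IN (
        SELECT dataset_id FROM subscription
        WHERE user_id = :user_id AND type = 'dataset'
      )
   OR (
        a.activity_type = 'new package'
        AND EXISTS (
            SELECT 1 FROM subscription
            WHERE user_id = :user_id AND type = 'new_datasets'
        )
      )
ORDER BY a.id
"""


def demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE subscription (user_id TEXT, type TEXT, dataset_id TEXT);
        CREATE TABLE activity_list (
            id INTEGER PRIMARY KEY, object_id TEXT, activity_type TEXT
        );
        INSERT INTO subscription VALUES
            ('alice', 'new_datasets', NULL),
            ('bob', 'dataset', 'd1');
        INSERT INTO activity_list (object_id, activity_type) VALUES
            ('d1', 'changed package'),
            ('d2', 'new package');
    """)
    return conn


def activity_for_user(conn, user_id):
    """Return (object_id, activity_type) rows the user should be told about."""
    return conn.execute(ACTIVITY_FOR_USER, {"user_id": user_id}).fetchall()
```

In the demo data, `alice` (subscribed to new datasets) sees only the `new package` row, `bob` (subscribed to `d1`) sees only the change to `d1`, and a user with no subscriptions gets no rows and therefore no email.
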
#### Tasks

* [ ] Write tests.
* [ ] Change the SQL queries that collect changes for a specific user's notification. From now on, they should include not only what's already in master, but also all the activity related to new datasets (if the user has this type of subscription). They are:
  * [ ] [data_subscriptions/worker/activity_for_user.py#L4-17](https://gitlab.com/datopian/clients/data-subscriptions/-/blob/master/data_subscriptions/worker/activity_for_user.py#L4-17)
  * [ ] [data_subscriptions/notifications/batch_dispatcher.py#L14-24](https://gitlab.com/datopian/clients/data-subscriptions/-/blob/master/data_subscriptions/notifications/batch_dispatcher.py#L14-24)
* [ ] Add this notification type to the email template.
  * _Sync with Sagar on the progress of migrating the templates into SendGrid._ https://gitlab.com/datopian/clients/national-grid/-/issues/272

## Design Research

_This is preliminary research for how to build a solution._

There's no need for this section in this Needs Analysis.

## Needs Inbox

_This is where you collect all incoming needs related items prior to distilling them._

Not required, but recommended:

* [ ] Consider if it's worth prioritising https://gitlab.com/datopian/clients/data-subscriptions/-/issues/14.
  * It's a task that would change the same SQL queries and classes. It may be a good time to prioritise it, especially if we want to do it soon, since otherwise the same code will have to be changed a second time.
* [ ] Work on https://gitlab.com/datopian/clients/data-subscriptions/-/issues/14.