# How to create a new plugin

// TODO: open with some background on what a plugin can do, and so on.

## Create a simple plugin to suit custom needs

In this blog post, I will walk you through creating a plugin for Apache DevLake that collects data from HTTP APIs. Specifically, we will collect information about committers and contributors from the Apache API.

### Create some files to init plugin

Currently, Apache DevLake provides some templates to create a plugin conveniently.

**Note** You need to start and run Apache DevLake before creating this plugin.

1. Create a new folder in `/plugins` next to other folders like `github` or `jira`. We name it `icla`, and you can name it as you like.
2. Copy `/generator/template/plugin/plugin_main.go-template` to `/plugins/icla/plugin_main.go`
3. Copy `api_client.go-template` and `task_data.go-template` to `/plugins/icla/tasks/api_client.go` and `/plugins/icla/tasks/task_data.go`

Now we have three files in our plugin.

![](https://i.imgur.com/zon5waf.png)

Next, we should replace the following strings in these three files.

1. `{{ .PLUGIN_NAME }}` to `ICLA`
2. `{{ .PluginName }}` to `Icla`
3. `{{ .pluginName }}` to `icla`

Yes, it's horrid, and we need a code generator to help us finish this step. o(╥﹏╥)o Unfortunately, we don't have one yet; it is tracked in this issue: https://github.com/apache/incubator-devlake/issues/1822 . Maybe we can call `go run generator.go create plugin` and type the plugin name after the issue is solved.

Try running this plugin via the function `main` in `plugin_main.go`; the result may look as follows:

```
$ go run plugins/icla/plugin_main.go
[2022-06-02 18:07:30] INFO failed to create dir logs: mkdir logs: file exists
press `c` to send cancel signal
[2022-06-02 18:07:30] INFO [icla] start plugin
[2022-06-02 18:07:30] INFO [icla] total step: 0
```

How exciting. It works!

### Create a sub-task for data collection

Before we start, there is a small thing you need to know. This is how the collection task will be executed:

1. First, Apache DevLake calls `plugin_main.PrepareTaskData` to prepare the data needed before any sub-tasks run. We need to create an API client here.
2. Then Apache DevLake calls the sub-tasks returned by `plugin_main.SubTaskMetas`.

First, we have to define a collector to request data from HTTP or other data sources. So let's copy `/generator/template/plugin/api_collector.go-template` to `/plugins/icla/tasks/committer_collector.go` and replace the following vars.

1. `{{ .PluginName }}` to `Icla`
2. `{{ .pluginName }}` to `icla`
3. `{{ .COLLECTOR_DATA_NAME }}` to `COMMITTER`
4. `{{ .CollectorDataName }}` to `Committer`
5. `{{ .collector_data_name }}` to `committer`

Second, in order to activate the newly added sub-task, we need to register it in our plugin by adding a new entry named `tasks.CollectCommitterMeta,` to `plugin_main.go/SubTaskMetas`

![](https://i.imgur.com/tkDuofi.png)

Maybe you have noticed that `data.ApiClient` is missing. Good catch, let's uncomment `task_data.go/IclaTaskData.ApiClient` and `plugin_main.go/PrepareTaskData.ApiClient`. This code creates a new api_client.

In order to collect data from `https://people.apache.org/public/icla-info.json`, we have to fill `https://people.apache.org/` into `api_client.go/ENDPOINT`.

![](https://i.imgur.com/q8Zltnl.png)

Then fill `public/icla-info.json` into `UrlTemplate`, delete the unnecessary iterator, and add `println("receive data:", res)` in `ResponseParser` to see whether the collection was successful.

![](https://i.imgur.com/ToLMclH.png)

Ok, now we can kick it off by running `main` again.
The output of DevLake may look like this:

```bash
[2022-06-06 12:24:52] INFO [icla] start plugin
invalid ICLA_TOKEN, but ignore this error now
[2022-06-06 12:24:52] INFO [icla] scheduler for api https://people.apache.org/ worker: 25, request: 18000, duration: 1h0m0s
[2022-06-06 12:24:52] INFO [icla] total step: 1
[2022-06-06 12:24:52] INFO [icla] executing subtask CollectCommitter
[2022-06-06 12:24:52] INFO [icla] [CollectCommitter] start api collection
receive data: 0x140005763f0
[2022-06-06 12:24:55] INFO [icla] [CollectCommitter] finished records: 1
[2022-06-06 12:24:55] INFO [icla] [CollectCommitter] end api collection
[2022-06-06 12:24:55] INFO [icla] finished step: 1 / 1
```

Great! Now we can see data pulled from the server without problem. The last step is to decode the response body in `ResponseParser` and return it to the framework so it can be saved in the database.

```go
ResponseParser: func(res *http.Response) ([]json.RawMessage, error) {
	// Decode only the envelope; "committers" stays as raw JSON.
	body := &struct {
		LastUpdated string          `json:"last_updated"`
		Committers  json.RawMessage `json:"committers"`
	}{}
	err := helper.UnmarshalResponse(res, body)
	if err != nil {
		return nil, err
	}
	// len() of a json.RawMessage is its size in bytes.
	println("receive data:", len(body.Committers))
	// Return the whole "committers" object as a single raw record.
	return []json.RawMessage{body.Committers}, nil
},
```

Ok, run the function `main` once again. The output looks like this, and we should see some records show up in the table `_raw_icla_committer`.

```bash
……
receive data: 272956 /* <- the number means 272956 bytes of raw JSON were received */
[2022-06-06 13:46:57] INFO [icla] [CollectCommitter] finished records: 1
[2022-06-06 13:46:57] INFO [icla] [CollectCommitter] end api collection
[2022-06-06 13:46:57] INFO [icla] finished step: 1 / 1
```

![](https://i.imgur.com/aVYNMRr.png)

### Create a sub-task to extract data from the raw layer

We have already collected data from the HTTP API and saved it into the DB table `_raw_XXXX`. This section will extract the names of committers from the raw data.

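Before touching the extractor template, the collect-then-extract flow can be tried in isolation. The following is only a minimal stand-alone sketch using the Go standard library, not DevLake code: `samplePayload`, `Committer`, `collectRaw` and `extractCommitters` are hypothetical names, and the sample data is made up to mimic the shape assumed in this post (an envelope with `last_updated` and a `committers` object mapping user name to full name).

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// samplePayload is made-up data that only mimics the assumed shape of
// icla-info.json: an envelope whose "committers" field maps user name to full name.
const samplePayload = `{
  "last_updated": "2022-06-06",
  "committers": {"alice": "Alice Example", "bob": "Bob Example"}
}`

// Committer is a plain stand-in for the model an extractor would produce.
type Committer struct {
	UserName string
	Name     string
}

// collectRaw plays the role of the collector: it decodes the envelope
// and keeps the "committers" field as undecoded raw JSON.
func collectRaw(payload []byte) (json.RawMessage, error) {
	envelope := struct {
		LastUpdated string          `json:"last_updated"`
		Committers  json.RawMessage `json:"committers"`
	}{}
	if err := json.Unmarshal(payload, &envelope); err != nil {
		return nil, err
	}
	return envelope.Committers, nil
}

// extractCommitters plays the role of the extractor: it decodes the raw
// JSON object into one Committer per entry.
func extractCommitters(raw json.RawMessage) ([]Committer, error) {
	names := map[string]string{}
	if err := json.Unmarshal(raw, &names); err != nil {
		return nil, err
	}
	committers := make([]Committer, 0, len(names))
	for userName, name := range names {
		committers = append(committers, Committer{UserName: userName, Name: name})
	}
	// Map iteration order is random in Go, so sort for a stable result.
	sort.Slice(committers, func(i, j int) bool {
		return committers[i].UserName < committers[j].UserName
	})
	return committers, nil
}

func main() {
	raw, err := collectRaw([]byte(samplePayload))
	if err != nil {
		panic(err)
	}
	committers, err := extractCommitters(raw)
	if err != nil {
		panic(err)
	}
	for _, c := range committers {
		fmt.Printf("%s -> %s\n", c.UserName, c.Name)
	}
}
```

Keeping `committers` as a `json.RawMessage` in the first stage is what lets the raw layer store the response untouched, so the decoding logic can change later without collecting again.
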
Similarly, let's copy `/generator/template/plugin/extractor.go-template` to `/plugins/icla/tasks/committer_extractor.go` and replace the following vars.

1. `{{ .PluginName }}` to `Icla`
2. `{{ .COLLECTOR_DATA_NAME }}` to `COMMITTER`
3. `{{ .ExtractorDataName }}` to `Committer`

Let's look at the function `extract`; some code needs to be written here. Apache DevLake currently suggests saving data with [gorm](https://gorm.io/docs/index.html), so we will create a gorm model and add it to `plugin_main.go/AutoSchemas.Up()`.

plugins/icla/models/committer.go

```go
package models

import (
	"github.com/apache/incubator-devlake/models/common"
)

// IclaCommitter is one committer row; UserName and Name form a composite primary key.
type IclaCommitter struct {
	UserName string `gorm:"primaryKey;type:varchar(255)"`
	Name     string `gorm:"primaryKey;type:varchar(255)"`
	common.NoPKModel
}

// TableName tells gorm which table stores this model.
func (IclaCommitter) TableName() string {
	return "_tool_icla_committer"
}
```

plugins/icla/plugin_main.go

![](https://i.imgur.com/4f0zJty.png)

Ok, run the plugin and the table `_tool_icla_committer` will be created automatically, as in this snapshot:

![](https://i.imgur.com/7Z324IX.png)

Finally, because `resData.Data` is raw data, we can decode it with json and create new `IclaCommitter` records to save it.

```go
Extract: func(resData *helper.RawData) ([]interface{}, error) {
	// The raw record is a JSON object mapping user name to full name.
	names := &map[string]string{}
	err := json.Unmarshal(resData.Data, names)
	if err != nil {
		return nil, err
	}
	// Build one model per committer for the framework to save.
	extractedModels := make([]interface{}, 0)
	for userName, name := range *names {
		extractedModels = append(extractedModels, &models.IclaCommitter{
			UserName: userName,
			Name:     name,
		})
	}
	return extractedModels, nil
},
```

Ok, run it and we will get:

```
[2022-06-06 15:39:40] INFO [icla] start plugin
invalid ICLA_TOKEN, but ignore this error now
[2022-06-06 15:39:40] INFO [icla] scheduler for api https://people.apache.org/ worker: 25, request: 18000, duration: 1h0m0s
[2022-06-06 15:39:40] INFO [icla] total step: 2
[2022-06-06 15:39:40] INFO [icla] executing subtask CollectCommitter
[2022-06-06 15:39:40] INFO [icla] [CollectCommitter] start api collection
receive data: 272956
[2022-06-06 15:39:44] INFO [icla] [CollectCommitter] finished records: 1
[2022-06-06 15:39:44] INFO [icla] [CollectCommitter] end api collection
[2022-06-06 15:39:44] INFO [icla] finished step: 1 / 2
[2022-06-06 15:39:44] INFO [icla] executing subtask ExtractCommitter
[2022-06-06 15:39:46] INFO [icla] [ExtractCommitter] finished records: 1
[2022-06-06 15:39:46] INFO [icla] finished step: 2 / 2
```

Now the committer data has been saved in `_tool_icla_committer`.

![](https://i.imgur.com/6svX0N2.png)

### Collect data after login

Let's look at `api_client.go`. `NewIclaApiClient` loads the config `ICLA_TOKEN` from `.env`, so we can add `ICLA_TOKEN=XXXXXX` to `.env` and use it in `apiClient.SetHeaders()` to mock the login status. The code is as below:

![](https://i.imgur.com/dPxooAx.png)

Of course, we can also use `username/password` to get a token after a mock login. Just try and adjust according to the actual situation.

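To illustrate the idea of sending a token with every request, here is a minimal sketch that uses only the Go standard library rather than the DevLake API client. `buildHeaders` and `newRequest` are hypothetical helpers, and the `Authorization: Bearer <token>` scheme is an assumption; the real server may expect a different header. The sketch also reads `ICLA_TOKEN` from the process environment instead of parsing `.env`.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
)

// buildHeaders returns the headers to attach to each request.
// ASSUMPTION: the server expects "Authorization: Bearer <token>".
// With an empty token no header is added, so anonymous requests still work.
func buildHeaders(token string) map[string]string {
	headers := map[string]string{}
	if token != "" {
		headers["Authorization"] = "Bearer " + token
	}
	return headers
}

// newRequest is a hypothetical helper that builds a GET request and
// applies the given headers to it.
func newRequest(url string, headers map[string]string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	for key, value := range headers {
		req.Header.Set(key, value)
	}
	return req, nil
}

func main() {
	// Read the token from the process environment (not from a .env file).
	token := os.Getenv("ICLA_TOKEN")
	req, err := newRequest("https://people.apache.org/public/icla-info.json", buildHeaders(token))
	if err != nil {
		panic(err)
	}
	// Only report whether the header is present; never print the token itself.
	fmt.Println("Authorization header set:", req.Header.Get("Authorization") != "")
}
```

Building the header map in one place mirrors the approach described above, where the token is loaded once when the client is created and then applied to every request.
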
Look for more related details at https://github.com/apache/incubator-devlake

### Submit your code as open source code

Good idea~ But it might be a little challenging to write conforming code for those who lack knowledge of migration scripts and domain layers. More info at https://.........