# GT-Labs <-> Glossary analysis

## Challenge

1. We need to return whitelist information from the Glossary API.
2. We must be able to upload a file instead of plain text.
3. Create a Sandbox for the Glossary API.

## General process definition

1. Go to `http://localhost:6001/keyword_extractor?locale=en`.
2. Fill in some text, choose a language, and submit it to the Glossary API.

   ![](https://i.imgur.com/52HD72l.png)

3. After submission, a glossary is created asynchronously. GT-Labs actively polls the Glossary API until it returns a completely analyzed glossary.

   Complete:

   ![](https://i.imgur.com/IC9k8nc.png)

   ![](https://i.imgur.com/LZkhCIE.png)

## Process in detail

1. The text entered in `content` is converted to a file, so what we send to the Glossary API is actually a **file**. We do not send plain text to Glossary. See `GlossaryService - create(L11)`.
2. The glossary service call that creates the glossary in the Glossary API sends `extra_data: true`, which allows the response to return more information.

   ![](https://i.imgur.com/c7FZmdV.png)

3. After the glossary is completed, there is a call to `https://labs.geartranslations.com/keyword_extractor/206` (`Glossaries/Show`). It consumes the Glossary API's `/show` endpoint.
4. The API response looks like this:

   ![](https://i.imgur.com/YIDEOTo.png)

## Work Plan

- [x] Update Ruby version to 2.4.4 and fix mimemagic bug.
- [ ] Update the Glossary show response to include information about whitelists.
- [ ] Update the GT-Labs client response to include information about whitelists.
- [ ] Add a file input to upload files directly.
- [ ] Create a Postman collection for the Glossary API. Upload documentation to the Wiki.
- [ ] Sandbox for the Glossary API.

## Appendix: Notes and technical considerations about GT-Labs

1. Ruby version in use: 2.4.1. Suggested low-risk update: 2.4.4.
2. We can't access previously created reports. There is no index or show view for previous reports.
3. The GT-Labs machine doesn't have a Rails console to check information.
4. The provided database is `sqlite`, which is lightweight but may become problematic for querying in the future.
5. GT-Labs currently consumes Glossary in production mode, which exposes Glossary to some vulnerabilities. Ideally it should consume staging, as staging has updated information from production.
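
As a reference for the flow described in "Process in detail" (text converted to a file, submitted with `extra_data: true`, then polled until the glossary is complete), here is a minimal Ruby sketch. It is not the actual GT-Labs code: `FakeGlossaryClient`, `submit_text`, `poll_until_complete`, and the `"status"`, `"complete"`, and `"processing"` values are hypothetical names chosen for illustration, and the client is an in-memory stub so the example runs without network access.

```ruby
require "tempfile"

# Hypothetical in-memory stand-in for the Glossary API client.
# It is NOT the real GlossaryService; it only mimics the async behaviour.
class FakeGlossaryClient
  def initialize(polls_until_complete)
    @remaining = polls_until_complete
    @content = ""
  end

  # Accepts a file-like object (not plain text) and returns a glossary id.
  # The id 206 is borrowed from the example URL in this document.
  def create(file, locale:, extra_data: true)
    raise ArgumentError, "expected a file" unless file.respond_to?(:read)
    @content = file.read
    206
  end

  # Returns a hash with an assumed "status" field; the real /show
  # response shape may differ.
  def show(_id)
    @remaining -= 1
    status = @remaining <= 0 ? "complete" : "processing"
    { "status" => status, "size" => @content.bytesize }
  end
end

# Writes the plain text to a temporary file and submits that file,
# mirroring the "content is converted to a file" step.
def submit_text(client, content, locale:)
  Tempfile.create(["glossary", ".txt"]) do |file|
    file.write(content)
    file.rewind
    client.create(file, locale: locale, extra_data: true)
  end
end

# Polls the show endpoint until the glossary is complete,
# giving up after max_attempts to avoid looping forever.
def poll_until_complete(client, id, interval: 0.01, max_attempts: 10)
  max_attempts.times do
    glossary = client.show(id)
    return glossary if glossary["status"] == "complete"
    sleep(interval)
  end
  raise "glossary #{id} not complete after #{max_attempts} attempts"
end

client = FakeGlossaryClient.new(3)
id = submit_text(client, "some text to analyze", locale: "en")
result = poll_until_complete(client, id)
puts result["status"] # prints "complete"
```

Bounding the number of attempts is a design choice of this sketch: if the Glossary API never finishes the analysis, the caller gets an error instead of polling indefinitely.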