# Color Normalization Service API Document

## Overview

This service provides color normalization: it identifies approximations for colors given in many formats, and determines whether a field is a color at all.

## Resources

### Multi-language support

The `iscolor`, `shorttext` and `longtext` endpoints support multiple languages, including `en`, `es`, `fr` and `it`. Language codes can be found [here](https://cloud.google.com/translate/docs/languages). Any request can specify a language by adding a `lang` field. If it is not specified, it defaults to `en`.

To expand the current resources with a new language, use the module `toolkit/expand_language.py`.

### Normalize Color

All functionality related to normalization sits under the `/normalize/` root and is accessible via `POST` requests. All normalized colors are drawn from the list below.

```python
[
    'red',
    'white',
    'grey',
    'black',
    'neutral',
    'pink',
    'brown',
    'orange',
    'yellow',
    'green',
    'purple',
    'blue',
    'gold',
    'silver'
]
```

#### Normalize by Hex Code

Use this endpoint if the hex code of the color is already known and only the approximations are needed. This endpoint has the highest degree of accuracy.

**Input:**

```rest
POST /normalize/hexcode
{
    "code": "#55ADED"
}
```

Note that `code` must be either a 7-character string whose first character is `#` and whose remaining six characters are `0-9` or `a-f` (case insensitive), or a 6-character string in which all six characters are `0-9` or `a-f` (case insensitive). If this does not hold, a 400 response code is returned.

**Output:**

```rest
{
    "code": "#55aded",
    "approximations": ["blue"]
}
```

Note that `code` is the same code as entered in the `POST` request, normalized to lower case, and `approximations` is a list of lower-case approximation color names. This list always contains one or more color names.

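
The input rule for `/normalize/hexcode` can be checked on the client before sending a request. The following is a minimal sketch of that rule only, not the service's own implementation; the function name `normalize_hex_code` is hypothetical, and raising `ValueError` stands in for the 400 response described above.

```python
import re

# Accepts "#rrggbb" or "rrggbb"; hex digits are case insensitive.
_HEX_CODE = re.compile(r"#?[0-9a-fA-F]{6}")


def normalize_hex_code(code):
    """Return the code lower-cased, or raise ValueError if it is malformed.

    Hypothetical helper: it mirrors the documented input rule and the
    lower-casing of the returned `code`, nothing more.
    """
    if not isinstance(code, str) or _HEX_CODE.fullmatch(code) is None:
        raise ValueError(f"invalid hex code: {code!r}")
    return code.lower()


print(normalize_hex_code("#55ADED"))
```

Rejecting malformed codes locally avoids a round trip that would only produce a 400 response.
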
#### Normalize by Short Text

Use this endpoint if the field is known to be the name of a color, without a hex code, and both the approximations and the hex code are needed. This endpoint has a lower degree of accuracy than the hex code endpoint. It attempts to determine whether the name represents multiple colors.

**Input:**

```rest
POST /normalize/shorttext
{
    "text": "Soft_Cream / blue"
}
```

**Output:**

```rest
{
    "colors": [
        {
            "title": "Soft_Cream",
            "name": "soft cream",
            "code": "#fffdea",
            "approximations": ["white"]
        },
        {
            "title": "blue",
            "name": "blue",
            "code": "#0000ff",
            "approximations": ["blue"]
        }
    ]
}
```

The `shorttext` endpoint supports a batch mode, which handles a list of color names in a single request. This can speed up the extraction process. In batch mode, the payload uses `texts` instead of `text`, and its value is a list of texts.

**Input:**

```rest
POST /normalize/shorttext
{
    "texts": ["dark blue", "gold", "gold", "red", "Red", "afdas", "yellow", "navy", "blue/black"],
    "lang": "en"
}
```

**Output:**

```rest
{
    "colors": [
        {
            "text": "dark blue",
            "normalization": [
                { "title": "dark blue", "name": "dark blue", "code": "#2e317d", "approximations": ["purple", "blue"] }
            ]
        },
        {
            "text": "gold",
            "normalization": [
                { "title": "gold", "name": "gold", "code": "#daba41", "approximations": ["gold", "yellow"] }
            ]
        },
        {
            "text": "gold",
            "normalization": [
                { "title": "gold", "name": "gold", "code": "#daba41", "approximations": ["gold", "yellow"] }
            ]
        },
        {
            "text": "red",
            "normalization": [
                { "title": "red", "name": "red", "code": "#e0362c", "approximations": ["red"] }
            ]
        },
        {
            "text": "Red",
            "normalization": [
                { "title": "Red", "name": "red", "code": "#e0362c", "approximations": ["red"] }
            ]
        },
        {
            "text": "afdas",
            "normalization": []
        },
        {
            "text": "yellow",
            "normalization": [
                { "title": "yellow", "name": "yellow", "code": "#e9ee3f", "approximations": ["green", "yellow"] }
            ]
        },
        {
            "text": "navy",
            "normalization": [
                { "title": "navy", "name": "navy", "code": "#1b2a5b", "approximations": ["blue"] }
            ]
        },
        {
            "text": "blue/black",
            "normalization": [
                { "title": "blue", "name": "blue", "code": "#4e45ca", "approximations": ["blue"] },
                { "title": "black", "name": "black", "code": "#181717", "approximations": ["black"] }
            ]
        }
    ]
}
```

Note that `code` is the lower-case hex code corresponding to the color name, and `name` is the color name normalized to lower case, with punctuation and number-only "words" removed. If no results are found, an empty dictionary is returned with a 200 status code.

In batch mode, **the returned colors are in the same order as the texts in the request. A text with no match has an empty `normalization` list. If `texts` contains duplicate color names, the response contains duplicate results as well.**

#### Normalize by Long Text

Use this endpoint if there is no field for the color name or hex code. Color names are searched for in the block of text and then returned.

**Input:**

```rest
POST /normalize/longtext
{
    "text": "Koala Kids Blue and Grey Shirt with Faux Suspenders and Bowtie Detail"
}
```

**Output:**

```rest
{
    "colors": [
        {
            "name": "grey",
            "code": "#929591",
            "approximations": ["grey"]
        },
        {
            "name": "blue",
            "code": "#0343df",
            "approximations": ["blue"]
        }
    ]
}
```

Note that the result is a list of dictionaries representing all of the colors found in the text, along with their codes and approximation lists. If no colors are found, `colors` is an empty list.

The `longtext` endpoint supports batch mode as well.

**Input:**

```rest
POST /normalize/longtext
{
    "texts": [
        "Koala Kids Blue and Grey Shirt with Faux Suspenders and Bowtie Detail",
        "asdlifjalsdfjladjflasdjf"
    ],
    "lang": "fr"
}
```

If the `lang` field is omitted, it defaults to English.

**Output:**

```rest
{
    "colors": [
        {
            "text": "Koala Kids Blue and Grey Shirt with Faux Suspenders and Bowtie Detail",
            "normalization": [
                { "name": "grey", "code": "#929591", "approximations": ["grey"] },
                { "name": "blue", "code": "#0343df", "approximations": ["blue"] }
            ]
        },
        {
            "text": "asdlifjalsdfjladjflasdjf",
            "normalization": []
        }
    ]
}
```

### Classify Field

All functionality related to classifying fields sits under the `/classify/` root and is accessible via `POST` requests.

#### Classify as Color or Not Color

This endpoint attempts to determine whether a field is a color field, given a list of text entries from that field.

**Input:**

```rest
POST /classify/iscolor
{
    "fields": ["periwinkle blue", "light pink", "a walk in the park"],
    "min_ratio": 0.66
}
```

- `min_ratio` is the minimum ratio of color entries required for the field to be considered a color field. If it is not included, it defaults to 0.66.

**Output:**

```rest
{
    "is_color": true,
    "score": 0.66
}
```

The result contains two fields:

- `is_color`: a boolean value specifying whether the field is a color field.
- `score`: a number between 0 and 1 indicating what share of the list entries were colors.
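
The relationship between `score`, `min_ratio` and `is_color` can be sketched as below. This is an illustration under stated assumptions, not the service's code: the function `classify_is_color`, the per-entry predicate `looks_like_color`, the toy vocabulary, the use of `>=` for the threshold, and the handling of an empty list are all hypothetical. The service may also round or truncate `score` differently (the example response above shows 0.66 for two matches out of three).

```python
DEFAULT_MIN_RATIO = 0.66  # default value stated in this document


def classify_is_color(fields, looks_like_color, min_ratio=DEFAULT_MIN_RATIO):
    """Score a field as the share of its entries that look like colors."""
    entries = list(fields)
    if not entries:
        # Assumption: an empty field is not treated as a color field.
        return {"is_color": False, "score": 0.0}
    hits = sum(1 for entry in entries if looks_like_color(entry))
    score = hits / len(entries)
    # Assumption: the threshold comparison is inclusive.
    return {"is_color": score >= min_ratio, "score": score}


# Toy stand-in for the real color detector: a tiny hard-coded vocabulary.
KNOWN_COLOR_WORDS = {"blue", "pink"}


def looks_like_color(text):
    return any(word in KNOWN_COLOR_WORDS for word in text.lower().split())


result = classify_is_color(
    ["periwinkle blue", "light pink", "a walk in the park"],
    looks_like_color,
)
print(result)
```

With the three entries from the request example, two of the three match the toy vocabulary, so the unrounded score is 2/3 and the field passes the default threshold.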