# Job Descriptions NER
## Introduction
The objective of this exercise is to analyse the performance of a Named-Entity Recognition (NER) model trained on a list of `keywords` related to `job descriptions`.
In order to accomplish this task you will be supplied with the following files:
- Training set: a file containing automatically annotated entities within a list of job descriptions. It consists of lines (__separated by blank lines__) in the following format (a small parsing sketch is shown after this list):
```
<START:jobtitle> encargado <END> a de <START:jobtitle> limpieza <END> <START:location> campo <END> de <START:location> gibraltar <END> <START:location> ceuta <END>
2014 <START:temporal> summer <END> <START:jobtitle> intern <END> program corporate
<START:jobtitle> consultor <END> high street
```
- Test set: a smaller file with manually annotated entities in the same format as above.
- A folder containing the software needed for training and evaluating this type of model.
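As a starting point for the descriptive analysis requested in the Objectives section, a format like this is easy to inspect with a few lines of Python. The sketch below is not part of the supplied tooling and the script and file names are just placeholders; it only assumes the `<START:class> ... <END>` markup shown above and counts annotated entities per class:
```python
import re
import sys
from collections import Counter

# Entities are marked as: <START:classname> token ... token <END>
ENTITY_RE = re.compile(r"<START:(\w+)>\s+(.*?)\s+<END>")

def count_entities(path):
    """Count annotated entities per class in a training/test file."""
    counts = Counter()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:  # blank lines separate entries
                continue
            for entity_class, _tokens in ENTITY_RE.findall(line):
                counts[entity_class] += 1
    return counts

if __name__ == "__main__":
    # e.g. python count_entities.py ner_training_source
    for entity_class, total in count_entities(sys.argv[1]).most_common():
        print(f"{entity_class}\t{total}")
```
Running it over both the training set and the test set gives a quick picture of how the annotations are distributed across classes in each file.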
## Tagger Tools
Inside the `tagger-tools` folder there is an executable at `bin/tagger-tools`. With this program you can train a model and evaluate its performance.
If you invoke the program without parameters, its general usage is printed out. For example, `./bin/tagger-tools` will produce the following output:
```
Usage:
$RUN generate_training_set <input_dir> <output_file>
$RUN train_model <iterations> <cutoff> <language> <training_data_file> <model_file>
$RUN train_cv_model <iterations> <cutoff> <n_folds> <language> <training_data_file> <model_file>
$RUN evaluate_model <model_file> <input_file> <output_file>
$RUN extract_examples <elasticsearch_url> <elasticsearch_index> <country_code> <examples_output_file>
$RUN extract_job_titles <credential_file> <spreadsheet_id> <output_path>
```
Here the relevant options are:
#### train_model
To train a new model from a training set you can execute:
- `train_model 100 4 es ner_training_source output_model`
Following the usage above, `100` is the number of iterations, `4` the cutoff and `es` the language. A new model is trained from the input file `ner_training_source` and written to the `output_model` file.
#### evaluate_model
To evaluate the newly trained model you can execute:
- `evaluate_model output_model test-set evaluation.json`
In this case a new JSON file is created containing all the relevant evaluation metrics. An example output file looks like this:
```
{
"evaluation_stats": {
"proglang": {
"tp": 0,
"fp": 1,
"tn": 0,
"fn": 2,
"precision": 0.0,
"recall": 0.0,
"f_measure": 0.0
},
"location": {
"tp": 29,
"fp": 12,
"tn": 0,
"fn": 36,
"precision": 0.7073170731707317,
"recall": 0.4461538461538462,
"f_measure": 0.5471698113207548
},
"jobspec": {
"tp": 6,
"fp": 35,
"tn": 0,
"fn": 61,
"precision": 0.14634146341463414,
"recall": 0.08955223880597014,
"f_measure": 0.1111111111111111
},
"global": {
"tp": 170,
"fp": 183,
"tn": 0,
"fn": 332,
"precision": 0.48158640226628896,
"recall": 0.3386454183266932,
"f_measure": 0.39766081871345027
},
"temporal": {
"tp": 0,
"fp": 7,
"tn": 0,
"fn": 13,
"precision": 0.0,
"recall": 0.0,
"f_measure": 0.0
},
"jobtitle": {
"tp": 130,
"fp": 115,
"tn": 0,
"fn": 197,
"precision": 0.5306122448979592,
"recall": 0.39755351681957185,
"f_measure": 0.4545454545454546
},
"seniority": {
"tp": 4,
"fp": 11,
"tn": 0,
"fn": 6,
"precision": 0.26666666666666666,
"recall": 0.4,
"f_measure": 0.32
},
"lang": {
"tp": 1,
"fp": 2,
"tn": 0,
"fn": 17,
"precision": 0.3333333333333333,
"recall": 0.05555555555555555,
"f_measure": 0.09523809523809525
}
}
}
```
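To answer questions about per-class performance it helps to flatten this file into a small table. The following is a minimal sketch, not part of `tagger-tools`; it only assumes the JSON layout shown above (a top-level `evaluation_stats` object with one entry per class) and that the file is called `evaluation.json`:
```python
import json

# Assumes the structure shown above: {"evaluation_stats": {class: {tp, fp, fn, ...}}}
with open("evaluation.json", encoding="utf-8") as handle:
    stats = json.load(handle)["evaluation_stats"]

# Sort classes by f_measure so the weakest performers stand out first.
for entity_class, m in sorted(stats.items(), key=lambda kv: kv[1]["f_measure"]):
    print(f"{entity_class:>12}  tp={m['tp']:4d}  fp={m['fp']:4d}  fn={m['fn']:4d}  "
          f"P={m['precision']:.3f}  R={m['recall']:.3f}  F1={m['f_measure']:.3f}")
```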
## Objectives
The objective here is quite broad and there is no single right or wrong solution; we are expecting a descriptive analysis of the `training` and `test` sets. From this analysis we should be able to answer questions like:
- What entity classes are defined in the training-set and test-set?
- What is the annotation coverage across classes? Keep in mind that the annotations in the training-set have been created automatically.
- Should we increase the number of annotations for a specific class?
- Do all classes have equivalent coverage in both sets, training and test?
- Can you estimate a minimum number of annotations to obtain reliable results?
- What is the performance for all classes?
- Is there any class with poor performance?
- How can we try to improve the performance of these poorly performing classes?
- In your opinion, should we focus on `precision` or `recall`? What consequences could having good `precision` but low `recall` have? And in the opposite case? (A short worked example follows this list.)
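When weighing `precision` against `recall`, it helps to keep in mind how both are derived from the raw counts in the evaluation file: precision is tp / (tp + fp), recall is tp / (tp + fn), and the F-measure is their harmonic mean. As a quick illustration, the snippet below recomputes the `jobtitle` figures from the example output above:
```python
# Counts taken from the jobtitle entry in the example evaluation above.
tp, fp, fn = 130, 115, 197

precision = tp / (tp + fp)   # ~0.531: roughly half of the predicted job titles are correct
recall = tp / (tp + fn)      # ~0.398: many real job titles are never predicted
f_measure = 2 * precision * recall / (precision + recall)  # ~0.455

print(precision, recall, f_measure)
```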
Extra points:
- Can you think of a quick-win way to improve the performance of the model?
- Go for it!!!