[TOC]

# Report format

## Motivation

- **Person or organization developing the experiment**: *What person or organization developed the experiment?*
- **Abstract**: *A brief summary of the problem to be solved.*
- **Proposed solution**: *Why were this dataset, this model, and these evaluations chosen to solve the problem?*

## Dataset information

*General dataset information, which may include fields such as:*

- **Collection**: *Dataset collection.*
- **Version**: *Version of the dataset.*
- **URL**: *URL of the dataset.*

### Preprocessing information

- **Preprocessing operations**: *Preprocessing operations performed on the dataset.*

### Dataset distribution

*Sample distribution across the dataset categories.*

| Label 1 | Label 2 | ...     | Label N |
|---------|---------|---------|---------|
| Samples | Samples | ...     | Samples |

### Dataset partitions

#### Percentages

*Percentage of samples in each dataset partition.*

| Train    | Test     | Validation |
|----------|----------|------------|
| Perc (%) | Perc (%) | Perc (%)   |

#### Samples

*Distribution of samples from each category across the dataset partitions.*

|         | Train   | Test    | Validation |
|---------|---------|---------|------------|
| Label 1 | Samples | Samples | Samples    |
| Label 2 | Samples | Samples | Samples    |
| ...     | ...     | ...     | ...        |
| Label N | Samples | Samples | Samples    |

#### Additional criteria

*Additional criteria taken into consideration when partitioning the dataset, such as:*

- **Grouping criteria**: *Description of the grouping criteria used to partition the dataset.*

### Dataset variability

- **Variability**: *How were the measurements calculated? For example, k-fold resampling.*

## Model data

- **Model type**: *What type of model is it?*
- **Primary intended uses**: *Whether the model was developed with general or specific tasks in mind (e.g., plant recognition worldwide or only in the Pacific Northwest). The use cases may be as broadly or narrowly defined as the developers intend. For example, if the model was built simply to label images, then this task should be indicated as the primary intended use case.*
- **Primary intended users**: *This helps users gain insight into how robust the model may be to different kinds of inputs.*
- **Out-of-scope uses**: *Here, the report should highlight technology that the model might easily be confused with, or related contexts that users might try to apply the model to.*

## Evaluation data

- **Model performance measures**: *What measures of model performance are being reported, and why were they selected over other measures?*
- **Decision thresholds**: *If decision thresholds are used, what are they, and why were they chosen?*
- **Approaches to uncertainty and variability**: *How are the measurements and estimations of these metrics calculated? For example, this may include standard deviation, variance, confidence intervals, or KL divergence. Details of how these values are approximated should also be included (e.g., average of 5 runs, 10-fold cross-validation).*

## Results

*Summary that includes plots, classification examples, and an interpretation of the results.*

### Plots

*Graphs or other graphical aids, such as the confusion matrix, that help provide a quick interpretation of the results.*

![](https://i.imgur.com/40k8oyQ.png)

![](https://i.imgur.com/UijjTkk.png)

### Analysis of the results

*An analysis performed to help interpret the results and give an idea of the model's performance both in particular cases and in the general case. For example, you can include an interpretation of each precision-recall curve and how the model helps in the context of the general problem.*

### Classification examples

*Examples of the best- and worst-ranked samples can be included, as well as those that fell within the uncertainty threshold.*

### Results tables

*The results of the evaluation metrics shown above in the graphs are listed.*

#### Per class

|         | Measure 1 | Measure 2 | Measure 3 |
|---------|-----------|-----------|-----------|
| Label 1 | Result(s) | Result(s) | Result(s) |
| Label 2 | Result(s) | Result(s) | Result(s) |
| ...     | ...       | ...       | ...       |
| Label N | Result(s) | Result(s) | Result(s) |

#### For all classes

| Measure 1 | Measure 2 | Measure 3 |
|-----------|-----------|-----------|
| Result(s) | Result(s) | Result(s) |

#### Confusion matrix

|                   | Actual Label 1 | Actual Label 2 | Actual Label N |
|-------------------|----------------|----------------|----------------|
| Predicted Label 1 | Count          | Count          | Count          |
| Predicted Label 2 | Count          | Count          | Count          |
| Predicted Label N | Count          | Count          | Count          |
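The per-class table and confusion matrix above can be filled directly from the raw predictions. A minimal plain-Python sketch (the label names and sample data are hypothetical, used only for illustration; the matrix is laid out rows = predicted, columns = actual, matching the table above):

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, labels):
    """Return a matrix with rows = predicted label, columns = actual label."""
    counts = Counter(zip(y_pred, y_true))
    return [[counts[(pred, actual)] for actual in labels] for pred in labels]

def per_class_metrics(y_true, y_pred, labels):
    """Precision and recall for each label (0.0 when undefined)."""
    metrics = {}
    for lab in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == lab and p == lab)
        pred_pos = sum(1 for p in y_pred if p == lab)    # predicted as lab
        actual_pos = sum(1 for t in y_true if t == lab)  # truly lab
        metrics[lab] = {
            "precision": tp / pred_pos if pred_pos else 0.0,
            "recall": tp / actual_pos if actual_pos else 0.0,
        }
    return metrics

# Hypothetical example: three classes, ten samples.
labels = ["cat", "dog", "bird"]
y_true = ["cat", "cat", "dog", "dog", "dog", "bird", "bird", "cat", "dog", "bird"]
y_pred = ["cat", "dog", "dog", "dog", "cat", "bird", "bird", "cat", "dog", "cat"]

cm = confusion_matrix(y_true, y_pred, labels)
metrics = per_class_metrics(y_true, y_pred, labels)
```

Note that libraries such as scikit-learn provide equivalents (e.g., `sklearn.metrics.confusion_matrix`), but by default they place *actual* labels on the rows, so transpose or relabel as needed to match the table layout used here.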