# 3.3 Directory of visualisations

This section is a quick overview of the types of plots and charts typically used in data visualisation. We provide guidance on which types of plots are best suited to different types of data, and hints on what to be careful of when using a given chart.

## Amounts

- Interested in the magnitude of some set of numbers: we have a set of categories and a quantitative value for each category.

| Type of data | Type of chart | Beware of: | Examples |
| --- | --- | --- | --- |
| Numerical values for a set of categories | Bar plots | - Bars need to start at zero, so that the bar length is proportional to the amount shown.<br>- Labels identifying each bar can take up a lot of horizontal space; in this case it is better to use horizontal bars.<br>- Order in which the bars are arranged: if the bars represent unordered categories, order them by ascending or descending data values.<br>- Not good for very large datasets: the resulting figure can become too busy. | |
| Numerical values for a set of categories with smaller differences between categories | Dot plots | - Dots don't need to start at 0.<br>- Ordering is important (order them by ascending or descending data values).<br>- Not good for very large datasets: the resulting figure can become too busy. | |
| Numerical values for a set of categories for different groups | Grouped bars | - Grouped bar plots contain a lot of information at once and can be confusing. | |
| Numerical values for a set of categories for different groups, when the sum of the amounts represented by the individual groups is important for the message | Stacked bars | - Can be useful when the point is to show that a value is the sum of other values, but you’re only interested in comparing the totals.<br>- They get harder to read the more segments each bar has. | |
| Numerical values for a set of categories for different groups, when you are interested in highlighting a trend | Heatmaps | - Ordering of the categorical data values matters.<br>- Can be easy to misread: the colour scale can create a misperception of the magnitude of the data.<br>- Should primarily be used to illustrate patterns, not to replace tables. | |

## Distributions

- Understand how one or more variables are distributed in a dataset.

| Type of data | Type of chart | Beware of: | Examples |
| --- | --- | --- | --- |
| Frequencies of occurrences of a continuous variable | Histogram | - Histograms are generated by binning the data, so their visual appearance depends on the choice of bin width. It is good practice to always explore multiple bin widths. | |
| Curves of one or several continuous variables | Kernel density estimates | - The bandwidth parameter behaves similarly to the bin width in histograms and can affect the visual appearance of the figure.<br>- Kernel density estimates can produce the appearance of data where none exists (in particular in the tails) and can lead to figures that make nonsensical statements. | |
| Visualising several distributions along the vertical axis as a function of another variable | Box plots or violin plots | - Violin plots need enough data points in each group to justify showing the point densities as smooth lines.<br>- Violin plots use kernel density estimates, which can produce the appearance of data where none exists. | |
| Visualising several distributions along the horizontal axis as a function of another variable | Ridgeline plot | - There is no need for a separate explicit scale: the purpose of the plot is not to show specific density values but to allow for easy comparison of density shapes. | |

## Proportions

- Show how some group, entity, or amount breaks down into individual pieces that each represent a proportion of the whole.

From [Fundamentals of Data Visualization](https://clauswilke.com/dataviz/visualizing-proportions.html#a-case-for-pie-charts).

![](https://i.imgur.com/XhgdPvV.png)

(Note: I think we can use it because it is just a screenshot and not a derivative.)
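
As a minimal sketch of how a whole breaks down into proportions, the snippet below converts raw amounts into fractions and draws them both as a pie chart and as side-by-side bars. It assumes Python with matplotlib installed for the plotting step; the function names `to_proportions` and `plot_proportions` and the survey counts are invented for illustration and do not come from the referenced book.

```python
from typing import Dict


def to_proportions(counts: Dict[str, float]) -> Dict[str, float]:
    """Convert raw amounts per category into fractions of the whole."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("the amounts must sum to a positive total")
    return {label: value / total for label, value in counts.items()}


def plot_proportions(proportions: Dict[str, float]):
    """Draw the same proportions as a pie chart and as side-by-side bars."""
    # Imported here so the arithmetic above also works without matplotlib.
    import matplotlib.pyplot as plt

    labels = list(proportions)
    shares = list(proportions.values())

    fig, (ax_pie, ax_bar) = plt.subplots(1, 2, figsize=(8, 3))

    # Pie chart: emphasises that the pieces add up to one whole.
    ax_pie.pie(shares, labels=labels, autopct="%1.0f%%")
    ax_pie.set_title("Pie chart")

    # Bars: easier to compare individual pieces; the axis starts at zero.
    ax_bar.bar(labels, shares)
    ax_bar.set_ylim(0, 1)
    ax_bar.set_ylabel("Proportion of the whole")
    ax_bar.set_title("Side-by-side bars")

    fig.tight_layout()
    return fig


# Hypothetical survey counts, invented for this example.
survey_counts = {"Python": 45, "R": 30, "Julia": 15, "Other": 10}
survey_proportions = to_proportions(survey_counts)
```

Calling `plot_proportions(survey_proportions)` returns a matplotlib figure that can be shown or saved. Placing the two charts next to each other is only one possible design choice; it makes it easy to judge which form carries the message better for a given dataset.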
## Relationships between two or more variables

| Type of data | Type of chart | Beware of: | Examples |
| --- | --- | --- | --- |
| Associations between two quantitative variables | Scatter plots | | |
| Associations between two or more quantitative variables | Correlograms | | |
| Paired data | Slopegraphs | | |

## Uncertainty