
PUMFs

Public Use Microdata Files

Mathew Vis-Dunbar | Mathew.Vis-Dunbar@ubc.ca

ECON 225 | 2022-03-08


Statistics Canada Products

Note:


Summaries

  • Tables: summaries, subset of variables
  • Profiles: geographic region
  • Maps: interactive visuals

https://www150.statcan.gc.ca/n1/en/type/data

Note:

The numbers have been crunched for you. The information is readily digestible and easy to re-communicate. These products usually allow some filtering. They are great for a quick reference or for quick summaries, such as changes over time.

They link you to the survey tool that was used, which is great for further inquiry.

Examples:


Microdata

The microdata used by CRDCN researchers come primarily from Statistics Canada Survey Master files.

Increasingly, the Research Data Centres (RDCs) are repositories of administrative records from a variety of sources including tax, employment insurance, social assistance, and hospitalization records.

Note:

It would be nice to have something in between these two: easier to access than microdata, yet granular enough to support unique analyses.


Microdata

  • Extremely granular, risk of identifying individuals or individual businesses.
  • Highly restricted access.

https://www.statcan.gc.ca/en/microdata/data-centres/data



PUMFs

Public use microdata files contain anonymized, non-aggregated data.

Note:


PUMFs

  • Meticulously cleaned up to help prevent identification of participants.
  • May be a subset.
  • May be a sample.
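When a PUMF is a sample, each record typically stands in for many people in the population, so estimates are usually computed with a survey weight rather than by counting rows. The sketch below illustrates the idea only. The records, the `employed` flag, and the `weight` column are invented for illustration; the actual weight variable and its correct use are described in each file's documentation.

```python
# Hypothetical sample records: each carries a survey weight, i.e. the
# number of people in the population that the record stands in for.
# Column names and values are invented for illustration.
records = [
    {"employed": True, "weight": 120.0},
    {"employed": False, "weight": 80.0},
    {"employed": True, "weight": 100.0},
]


def weighted_share(rows, flag, weight="weight"):
    """Return the weighted share of rows where `flag` is true."""
    total = sum(r[weight] for r in rows)
    hits = sum(r[weight] for r in rows if r[flag])
    return hits / total


# Counting rows treats every record equally; weighting does not.
unweighted = sum(r["employed"] for r in records) / len(records)
weighted = weighted_share(records, "employed")

print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

The two shares differ whenever the weights are unequal, which is why the documentation for a sampled PUMF matters before any numbers are reported.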

Note:


Labour Force Survey

The Labour Force Survey provides estimates of employment and unemployment [...]. [T]he LFS estimates are the first of the major monthly economic data series to be released.

This [PUMF] contains non-aggregated data for users who prefer to do their own analysis by focusing on specific subgroups in the population or by cross-classifying variables that are not in our catalogued products.
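As a rough illustration of what "cross-classifying variables" can look like with non-aggregated records, the sketch below counts records for each combination of two variables. The records and the `province` and `status` names are invented; real LFS PUMF variable names and codes come from the file's own dictionary, and real estimates would normally also apply survey weights.

```python
from collections import Counter

# Invented records; not actual LFS PUMF variables or values.
rows = [
    {"province": "BC", "status": "employed"},
    {"province": "BC", "status": "unemployed"},
    {"province": "AB", "status": "employed"},
    {"province": "BC", "status": "employed"},
]


def cross_tab(records, row_var, col_var):
    """Count records for each (row_var, col_var) combination."""
    return Counter((r[row_var], r[col_var]) for r in records)


table = cross_tab(rows, "province", "status")
for (province, status), n in sorted(table.items()):
    print(province, status, n)
```

Because the input is record-level, any pair of variables in the file can be crossed this way, including pairs that no published table covers.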

LFS Microdata | LFS PUMF

Note:


2016 Census

This Hierarchical File, 2016 Census Public Use Microdata File (PUMF) product provides access to non-aggregated data covering a sample of 1% of the Canadian households.

This comprehensive file is an excellent tool for policy analysts, pollsters, social researchers [...]

2016 Census PUMF
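A hierarchical file links person-level records to the household they belong to. A minimal sketch of working with that kind of structure is shown below, assuming a linking variable on each person record. The `household_id` name and the records are hypothetical and are not taken from the Census PUMF itself.

```python
from collections import defaultdict

# Invented person-level records; `household_id` is a hypothetical
# linking variable, not a name taken from the Census PUMF.
persons = [
    {"household_id": 1, "age": 34},
    {"household_id": 1, "age": 36},
    {"household_id": 1, "age": 4},
    {"household_id": 2, "age": 71},
]


def group_by_household(rows, key="household_id"):
    """Collect person records under the household they belong to."""
    households = defaultdict(list)
    for row in rows:
        households[row[key]].append(row)
    return dict(households)


households = group_by_household(persons)
sizes = {hid: len(members) for hid, members in households.items()}
print(sizes)
```

Grouping first makes household-level questions, such as household size or the presence of children, straightforward to compute from person-level rows.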

Note:


Working with PUMFs


Releases

  • PUMFs are expensive to create and take time.
    • When will the 2021 Census PUMFs be released?
  • Some are released regularly (LFS).
  • Some are one-offs or only produced irregularly.

Licensing

Statistics Canada Open Licence governs the use of PUMFs.

Statistics Canada Open Licence

Note:


Documentation

  • PUMFs will have standalone documentation.
    • Usually a readme and a metadata file with a data dictionary.
  • PUMF records will be connected with:
    • The source survey.
    • Related products (surveys, summaries, analyses, etc.).
  • The survey documentation will go into greater detail.
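PUMF values are often stored as numeric codes, and the dictionary is what turns them back into readable labels. The sketch below shows one way that lookup might be applied. The codebook, variable names, and codes are invented for illustration; the real ones would be transcribed or parsed from the file's own dictionary.

```python
# Hypothetical codebook: variable name -> {code: label}.
# Not taken from any actual PUMF dictionary.
codebook = {
    "sex": {1: "Male", 2: "Female"},
    "status": {1: "Employed", 2: "Unemployed", 3: "Not in labour force"},
}


def decode(record, book):
    """Replace coded values with labels; leave unknown codes untouched."""
    return {
        var: book.get(var, {}).get(value, value)
        for var, value in record.items()
    }


raw = {"sex": 2, "status": 1, "age": 41}
labelled = decode(raw, codebook)
print(labelled)
```

Variables without an entry in the codebook, such as `age` here, pass through unchanged, which keeps continuous values intact.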

LFS PUMF | LFS survey

Note:


Accessing PUMFs

Two portals


Statistics Canada

Note:

  • The Data landing page omits census data when filtering.
  • The Statistical Program page requires knowing that a PUMF exists.

Abacus

The Data Liberation Initiative (DLI) is a partnership between post-secondary institutions and Statistics Canada with the goal of improving access to data resources.

DLI information

Note:


Abacus

Note:


Finding Data Generally


General Approaches

  • Who's responsible for
    • generating
    • collecting
    • aggregating
    • disseminating

Note:

  • Simply because it's generated doesn't mean it's clean or accessible.

Deaths at the Hands of Police

CBC Deadly Force Database (2000 - 2020)

Tracking (In)Justice (2000 - Present)


Tracking (In)Justice

The CBC data has been critiqued for lacking a transparent methodology. Additionally, policing scholars have noted that while certain practices in the collection of data by the CBC meet journalistic standards, they may not meet scholarly research standards.


Lunaris & FRDR

FRDR provides powerful functionality to search for Canadian research data. This federated search tool aggregates metadata from numerous repositories, including datasets deposited in FRDR’s repository platform.


Lunaris & FRDR


Other Sources
