The Current State of Applied Microeconometrics Research on Fedora

This is a rough first draft of the (hopefully) finished Fedora Magazine article of the same name, and (hopefully again) the first in a series of analyses of the viability of Fedora Linux as a daily driver OS in certain industries.

(we can just shove as much reference material as needed here)

(actual article starts below)

Introduction to the "The Current State of" Series

Applied Microeconometrics: What is it, and what will you need?

You're a municipal government in Kenya, and you're trying to improve school attendance. Do you provide free educational material, support student health to reduce health-related dropouts, or loop in the parents? You want to improve labour force outcomes: do you roll out a government-supported employment and training program? Do income tax credits provide welfare support while also increasing employment? These are the kinds of questions applied microeconometrics (often shortened to applied micro) deals with.

At its core, applied micro uses statistical techniques to answer public policy and social science questions of the form: how much does intervention or factor "x" affect outcome "y"? The challenge with such questions is to strip away other effects that might confound the relationship between the two. Anyone who has taken an introductory statistics class will be well aware of the maxim that correlation does not imply causation. With applied micro, we try to find what is causal.
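
To make the idea of confounding concrete, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption made up for illustration: the simulated households, the take-up probabilities, the effect size stored in TRUE_EFFECT, and the helper names. It is not a method taken from any particular study. Household wealth raises both program take-up and attendance, so a naive comparison of treated and untreated households overstates the program's effect, while comparing within wealth groups recovers something close to the simulated effect.

```python
import random
from statistics import mean

TRUE_EFFECT = 5.0  # hypothetical effect of the program on days attended


def simulate(n=20_000, seed=0):
    """Simulate households where wealth affects both take-up and attendance."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        wealthy = rng.random() < 0.5
        # Assumption: wealthier households are more likely to take up the program.
        treated = rng.random() < (0.8 if wealthy else 0.2)
        attendance = 100 + TRUE_EFFECT * treated + 20 * wealthy + rng.gauss(0, 5)
        rows.append((wealthy, treated, attendance))
    return rows


def diff_in_means(rows):
    """Mean outcome of the treated minus mean outcome of the untreated."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return mean(treated) - mean(control)


def stratified_estimate(rows):
    """Average the within-stratum differences, weighted by stratum size."""
    total = 0.0
    for level in (False, True):
        stratum = [r for r in rows if r[0] == level]
        total += diff_in_means(stratum) * len(stratum) / len(rows)
    return total


rows = simulate()
naive = diff_in_means(rows)            # ignores wealth, so it is confounded
adjusted = stratified_estimate(rows)   # compares like with like
print(f"naive estimate:    {naive:.2f}")
print(f"adjusted estimate: {adjusted:.2f}")
```

Under these made-up numbers, the naive comparison is expected to land near 5 + 20 * (0.8 - 0.2) = 17 days, more than three times the simulated effect of 5, whereas the stratified estimate should sit close to 5. Real applied micro work rarely observes every confounder this neatly, which is why the field leans on experiments and more elaborate research designs.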

In this article, we'll explore some of the commonly used tools for workflows in applied micro, and how well Fedora works in this context.

The applied micro workflow overview

Everyone's research process is different, so what follows may not be an exhaustive list of the tools everyone uses, but it should cover the broad use cases of most research teams. The parts of the workflow we'll cover are:

  1. Collaboration environment setup
  2. Literature review and tracking tools
  3. Statistical data analysis tools
  4. Qualitative data analysis tools
  5. Publication tools

Do let us know in the comments if there are any specific tools we haven't covered that you'd like to see included!

Collaboration environments

Cloud Storage Setup

Fedora supports a number of cloud syncing services, which are often needed to share datasets and files with collaborators.

If you or your collaborators use Google Drive, Fedora, through GNOME Online Accounts, can basically

Research

Research is the most important part of the social science research process (duh, it's part of the name). Organizing the research material can be just as important as the research itself. When it is just a couple of files and interviews, it seems simple enough, but when your research sample reaches hundreds or thousands of people, having a well-organized way to search the material you collected is half the battle. For that, the most widely used tool is Zotero.

Zotero is a freemium, open source program. It is available as a tarball from its official website (which is harder to install and get started with) or through an unofficial wrapper on Flathub. The program itself is free as in freedom (AGPLv3) and its source is available on GitHub, but the project offers paid plans for online storage beyond the 300 MB included in the free tier.

Data analysis

The current industry standard in this area is Stata, used mainly for data manipulation and visualization. Although Stata works on Fedora through the official website's RPM package, it is a proprietary, paid product. What we recommend instead is R, a free and open source (GPLv2) alternative that offers broadly comparable features, is available through Fedora's repositories, and has been steadily gaining traction in the industry.

What you can do with R

R's limitations

Scripting and documenting

Python is the undisputed king of scripting tools in terms of adoption, and one of the most widely used programming languages today. Several versions are available through Fedora's repositories, where thousands of Python-related packages can also be found.
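
As a flavour of what such scripting can look like, the sketch below uses only Python's standard library to summarise a tiny attendance dataset by treatment arm. The data, the column names and the summarise helper are all hypothetical and invented for illustration. A real project would read a file from disk and would probably reach for a dedicated data analysis library.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up data for illustration only; in practice this would be a file on disk.
RAW = """school,arm,attendance_rate
A,control,0.71
B,control,0.65
C,textbooks,0.74
D,textbooks,0.70
E,health,0.82
F,health,0.78
"""


def summarise(text):
    """Return the mean attendance rate for each arm of the (hypothetical) study."""
    by_arm = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        by_arm[row["arm"]].append(float(row["attendance_rate"]))
    return {arm: mean(rates) for arm, rates in by_arm.items()}


summary = summarise(RAW)
for arm, rate in summary.items():
    # One line per arm: the arm name, padded, then its mean attendance rate.
    print(f"{arm:<10} {rate:.2f}")
```

Keeping the summary logic in a small function like this makes it easy to rerun on updated data and to share with collaborators alongside the dataset.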

When it comes to presenting the data, a notebook interface is usually needed, and that's where Jupyter shines. It is also free and open source software that can be used through a browser on its official website, and it can also integrate directly with some IDEs.
