Summary of Interviews from 9/15 - 9/22

Background of individuals

  • PhD in Physics, Intro to Programming instructor at local community college
    • Several years of experience with Matlab, Julia, Fortran
  • Master's graduate in Computer Science, Software Engineer at T-Mobile
    • Used Matplotlib extensively as PoC for thesis
  • Data scientist at Microsoft, former TA at University of WA for CS
    • Matplotlib as exploratory data analysis phase for work

General questions initiated

  • Experience with Matplotlib
  • Confidence with the library
  • Troubleshooting methodology
  • Expectations of documentation

Examples

  • Inconsistency of how code samples are built
  • Use more well-known data sets and structures
    • Improved examples of data that captures context
  • Pro and con of multiple ways to do a single task
  • More thorough and clear explanation of basic Matplotlib objects

Foundational knowledge

  • Making plots more readable with object labels
  • Plot arrangement and manipulation

Information architecture

  • Superficially linear at first glance with ToC and landing page
  • Subject affinity and related topic links more apparent
  • External resources provided for additional support

Usability Testing

Questionnaire additions

  • Give academic focus as well for responses
  • Loaded questions for what kind of feedback would you suggest
  • Discoverability and tags for examples
  • Indian community, Europeans
    • Translation? Feedback from others
    • See what NumPy is doing

Questions with Joe

data structure important for new users
stack overflow code snippets

good use cases, plot a curve, make a scatter plot

create data/collect data
plot/figure

multiple ways to do one thing

explaining basic matplotlib objects
labels for objects, what is this thing called
empowered to do research

docs be less cluttered, straightforward, lots of examples

Joe specific things:
not standardized matplotlib docs
different for many things, some examples need source code, put on browser
changing major/minor tick marks
different layouts
plot within a plot
spacing between axes, any docs for control for this?
annotations, clear way to put things in places
color maps, perils of
advanced techniques hard to find docs for
dates are hard to locate
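
A minimal sketch of several of the tasks Joe listed (major/minor tick marks, spacing between axes, a plot within a plot, and placing an annotation). The data, labels, and spacing values are made up for illustration and are not taken from the interview; it assumes Matplotlib 3.0 or later for `Axes.inset_axes`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import MultipleLocator

# Illustrative data only: one sine curve.
x = np.linspace(0, 10, 200)
y = np.sin(x)

# Two side-by-side axes; "wspace" sets the horizontal gap between them.
fig, (ax_left, ax_right) = plt.subplots(
    1, 2, figsize=(8, 3), gridspec_kw={"wspace": 0.4}
)

# Major ticks every 2 units, minor ticks every 0.5 units on the x-axis.
ax_left.plot(x, y)
ax_left.xaxis.set_major_locator(MultipleLocator(2))
ax_left.xaxis.set_minor_locator(MultipleLocator(0.5))

# Annotation: xy is the point being labeled, xytext is where the text goes.
ax_left.annotate(
    "first peak",
    xy=(np.pi / 2, 1.0),
    xytext=(4, 0.5),
    arrowprops={"arrowstyle": "->"},
)

# Plot within a plot: bounds are [x0, y0, width, height] in axes fractions.
ax_right.plot(x, y)
inset = ax_right.inset_axes([0.6, 0.6, 0.35, 0.35])
inset.plot(x[:40], y[:40])
```

Calling `fig.savefig(...)` or `plt.show()` afterwards would render the result; the exact numbers above are placeholders to adjust.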

Good documentation

scikit-learn
consistency in models
separate links to examples

scipy

welcome page
make a thing in python? what would you want to plot?
good examples based on real-world numbers

Questions with Emily

computer science student perspective, "What can I do to achieve this goal?"

search engine, library, resource that helps to achieve

her projects:
machine learning, plotting for data visualization
final research topic
matlab for data analysis
python syntax helped with familiarity, quickly see plot
helped to visualize immediately, prototyping
doesn't feel the need to be an expert in this library in order to accomplish tasks
recommends for simple plots; hard to recommend for formal publication or front-facing uses, as complex plots and fine details are hard to get visually exact

documentation?
search keywords on google, linked to specific page
didn't follow tutorials
ex. bar chart, search, copy/paste, modify as needed
first example not helpful, then searching matplotlib docs

code examples sometimes don't generate what is intended

adjusting plot size, font of ticks, not intuitive
used matlab to do final work
didn't pay much attention if matplotlib wasn't perfect as it was a stopover
copytext manipulation not intuitive
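
A short sketch of the two adjustments Emily found unintuitive, plot size and tick font size. The bar values and category names are hypothetical placeholders; this is one possible approach, not the only way Matplotlib supports.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Figure size is given in inches as (width, height) when the figure is created.
fig, ax = plt.subplots(figsize=(6, 4))

# Hypothetical bar chart data, for illustration only.
ax.bar(["a", "b", "c"], [3, 5, 2])

# Tick label font size for both axes.
ax.tick_params(axis="both", labelsize=12)

# The size can also be changed after creation.
fig.set_size_inches(8, 5)
```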

good tool to help prototype at a glance

prefers plot examples as result of code

code general enough for people to use as template and copy/paste then adjust parameters
inconsistent in examples

matplotlib docs criticisms
has many things, not simple/straightforward to get info for tasks
overview page is not helpful, doesn't want a book and yet it feels like a book
depending on library, most people would prefer a quick-and-dirty guide as most helpful for most users
section titles more tool-oriented

bootstrap recommendation
docs have starter template, minimal text, more examples (short and straightforward, small footprint)
organized in a clearer manner
able to quickly find things

Questions with Smriti

How long have you used/known about Matplotlib? How often do you use the library?
Started using in 2016, actively started using since 2017 at work for performing Exploratory Data Analysis on data. I would say I use it in every project (mostly at the start of a project), so I use it heavily for a few weeks once every 3 months (1 project per quarter).

How would you describe your confidence with using Matplotlib to complete your tasks?
Confident with the visualizations that I use often, like boxplots, line charts, histograms, and density plots: basic functions, provide parameters, and plt.show(). But when it comes to researching new plots, or ways to combine multiple graphs together in one, it becomes challenging. I would give myself 3/5 on proficiency.

How have you troubleshot issues you’ve had with Matplotlib in the past?
Mostly through Stack Overflow, most examples online do not relate to my data, so it is quite challenging to fix some issues (not in terms of syntax errors, but more in terms of not producing what I was expecting).

Do you use the Matplotlib documentation? Why or why not?
I use documentation only to see what parameters I can set in every plot function, nothing else. The documentation mostly works with random created lists of numbers or make_blobs, never based on any context. So I find other googled sources more helpful that explain to use the plots in well-known contexts and data (and not just a randomly generated list of numbers).

What do you usually use Matplotlib for?
Usually in the Exploratory Data Analysis phase: distribution plots to check the distribution of the data, density plots to see frequency, and also plots to visualize model performance (ROC/AUC etc.).

What do you think a user at your level needs to know in order to use Matplotlib successfully?
How to pass data into the plotting functions (2D arrays/lists/data frames), how to combine different sets of data in a single plot (challenging for me), and how to arrange multiple plots in a single figure (e.g. 4 plots together), which is also a bit challenging.
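
A minimal sketch of the three things named in this answer: passing lists and a 2D array to plotting functions, combining two datasets on one plot, and arranging four plots together. The store names and monthly numbers are invented for illustration; a real example would ideally use a well-known dataset, as the interviewee suggests.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical monthly sales for two stores (made-up numbers).
months = ["Jan", "Feb", "Mar", "Apr"]
store_a = [120, 135, 150, 160]
store_b = [100, 110, 145, 170]

# Four plots arranged in a 2x2 grid; axes is a 2x2 NumPy array of Axes.
fig, axes = plt.subplots(2, 2, figsize=(8, 6), constrained_layout=True)

# Two datasets combined on a single plot, distinguished by a legend.
axes[0, 0].plot(months, store_a, label="Store A")
axes[0, 0].plot(months, store_b, label="Store B")
axes[0, 0].legend()
axes[0, 0].set_title("Both stores")

# Plain Python lists passed straight to a plotting function.
axes[0, 1].bar(months, store_a)
axes[0, 1].set_title("Store A by month")

# Two lists concatenated into one sample for a histogram.
axes[1, 0].hist(store_a + store_b, bins=4)
axes[1, 0].set_title("All monthly values")

# A 2D array: boxplot draws one box per column, so transpose to get one per store.
table = np.array([store_a, store_b])
axes[1, 1].boxplot(table.T)
axes[1, 1].set_title("Spread per store")
```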

What would you like to see improve in the Matplotlib documentation?
Definitely use datasets that are more well-known and in a format like data frames, and not just demonstrate on a randomly generated list of numbers. Having an explanation of a dataset gives us users more context on how to use it in a business scenario and not just for a small POC where we want to test whether the plotting function works or not.

tags: GSoD notes interview