Quansight Data Science Residency Jam Session

2019-11-13

Attendees

  • Tony Fast - @tonyfast
  • Adam Lewis - @balast

Magic of the day

2019-11-13

Attendees

  • Tyler Potts - @t-potts
  • Adam Lewis - @balast
  • Dillon Roach - @dillonroach
  • Abraham Maxfield - @utabe

Magic of the day

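Changes the working directory of your current notebook and pushes the previous directory onto the directory stack, so that %popd returns to it

%pushd <path>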
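
Outside IPython, roughly the same "go there, then come back" behaviour can be sketched in plain Python. The helper name `pushd` below is hypothetical (it only mirrors the magic's name) and is not part of any library.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def pushd(path):
    """Change into `path`, then return to the previous directory on exit."""
    previous = os.getcwd()      # remember where we were (the "push")
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)      # go back (the "pop")

start = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with pushd(tmp):
        inside = os.getcwd()    # now inside the temporary directory
after = os.getcwd()             # back where we started
```

The `try`/`finally` means the original directory is restored even if the code inside the `with` block raises.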

2019-10-30

Attendees

  • Tony Fast - @tonyfast
  • Tyler Potts - @t-potts
  • Abraham Maxfield - @utabe
  • Adam Lewis - @balast

Magic dir()

dir() lists the attributes of whatever object you pass to it.
Caveat: a class can customize dir() by defining __dir__, so it might not always return what you expect

[x for x in dir(pd.Series) if not x.startswith('_')] returns only the public attributes (methods and properties), skipping every name that starts with an underscore
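
A small sketch of both points, using the built-in `str` so it runs without pandas (the same filter applies to `pd.Series`). The class `Quiet` is hypothetical and exists only to show a customized `__dir__`.

```python
class Quiet:
    """Hypothetical class that customizes what dir() reports."""

    def __init__(self):
        self.value = 1
        self.hidden = 2

    def __dir__(self):
        # dir() reports whatever this method returns (as a sorted list)
        return ["value"]

# Public names only: drop everything that starts with an underscore
public = [name for name in dir(str) if not name.startswith("_")]

# Narrow further to callables if only methods are wanted
methods = [name for name in public if callable(getattr(str, name))]

quiet = Quiet()
reported = dir(quiet)   # 'hidden' exists on the object but is not reported
```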

2019-10-23

Attendees

  • Tyler Potts - @t-potts
  • Adam Lewis - @balast
  • Pam Wadhwa - @ppwadhwa

2019-10-16

Issues vs. Slack from data science communications (Saul)

Attendees

  • Tony Fast - @tonyfast
  • Tyler Potts - @t-potts
  • Pam Wadhwa - @ppwadhwa
  • Adam Lewis - @balast
  • Abraham Maxfield - @utabe
  • Dillon Roach - @dillonroach

2019-10-09

https://zoom.us/j/133471615

https://gist.github.com/sloria/7001839

https://github.com/simonw/datasette
https://github.com/simonw/sqlite-utils
https://gist.github.com/tonyfast/e638af10424de0284b36c0bf77fcd42a

https://github.com/ibis-project/ibis
https://datasette.readthedocs.io/en/stable/publish.html
https://cloud.google.com/community/tutorials/bigquery-ibis
https://docs.ibis-project.org/udf.html

text, music, visuals

@tonyfast- visual
@adam - audio books
@trent - music
@t-potts music
@pam - visual learner
@fatma - visual & text
@abe - visual & text
@dillon - visual learner / music til the afterlife

Attendees

  • Tony Fast - @tonyfast
  • Dillon Roach - @dillonroach
  • Abraham Maxfield - @utabe
  • Adam Lewis - @balast
  • Tyler Potts - @t-potts
  • Trent Oliphant - @trentoliphant
  • Pam Wadhwa - @ppwadhwa

audioeye.com

listing of image recognition services
https://www.g2.com/categories/image-recognition

Audioeye is currently using the following two endpoints:

https://api.projectoxford.ai/vision/v1.0/analyze?visualFeatures=Description,Tag
We're using that API for image recognition

https://api.projectoxford.ai/vision/v1.0/ocr?language=en
We're using that for OCR

Pandas formatting trick

Line up all of your methods for easy readability

df = (
    df.merge(other_df)
    .groupby(['thing1', 'thing2'])
    .sum()
    )
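
To see the chained style run end to end, here is a minimal sketch with made-up data. The frames and the `key` and `amount` columns are assumptions for illustration; only `thing1` and `thing2` come from the snippet above.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "thing1": ["a", "a", "b"],
    "thing2": ["x", "x", "y"],
    "key": [1, 2, 3],
})
other_df = pd.DataFrame({"key": [1, 2, 3], "amount": [10, 20, 30]})

result = (
    df.merge(other_df)                 # joins on the shared 'key' column
    .groupby(["thing1", "thing2"])     # one group per (thing1, thing2) pair
    .sum()                             # sums the remaining numeric columns
)
```

With one method per line, a step can be commented out or reordered without touching the rest of the chain.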

Expand a Series of dicts to individual columns for each key

df.column_name.apply(pd.Series)
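
A minimal sketch of the expansion, assuming a hypothetical column of small dicts. The second form is an alternative I am adding, not something from the session; it builds the frame from the list of dicts directly and may be faster on large data.

```python
import pandas as pd

# Hypothetical data: each cell of 'column_name' holds a dict
df = pd.DataFrame({
    "column_name": [{"a": 1, "b": 2}, {"a": 3, "b": 4}],
})

# One output column per dict key
expanded = df.column_name.apply(pd.Series)

# Assumed alternative: construct the frame from the list of dicts
direct = pd.DataFrame(df.column_name.tolist(), index=df.index)
```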

Kernel Shell

https://en.wikipedia.org/wiki/Shell_(computing)

2019-10-02

https://zoom.us/j/133471615

Attendees

  • Tony Fast - @tonyfast
  • Pam Wadhwa - @ppwadhwa
  • Adam Lewis - @balast
  • Abraham Maxfield - @utabe
  • Tyler Potts - @t-potts

Pam - Presentations in Jupyter Notebooks

Using RISE for presentations

Abe - GraphQL from GitHub

https://github.com/willingc/pyquery-ql
https://gist.github.com/tonyfast/00ae53f59c9340f71b9605eca1d07019

import requests, pandas
# Cache HTTP responses locally (in a cache named 'gh') to avoid repeated API calls
__import__('requests_cache').install_cache('gh')
# f-string so that {i} is substituted into each request URL
df = pandas.concat([pandas.DataFrame(requests.get(f"https://api.github.com/users?page={i}").json()) for i in range(10)])

Tyler - Aggravations from the Rock

https://github.com/xonsh/xonsh

Connect to EC2:
ssh -i ~/.ssh/north_ca_analytics.pem ubuntu@ec2-18-144-54-246.us-west-1.compute.amazonaws.com

Start Jupyter Lab from EC2:
jupyter lab --no-browser --port=8800

Forward port 8800 from EC2 to the local machine, which can then be accessed at localhost:8800 in a web browser:
ssh -i ~/.ssh/north_ca_analytics.pem -N -f -L localhost:8800:localhost:8800 ubuntu@ec2-18-144-54-246.us-west-1.compute.amazonaws.com

Copy a file from EC2 to local:
scp -i ~/.ssh/north_ca_analytics.pem ubuntu@ec2-18-144-54-246.us-west-1.compute.amazonaws.com:~/insert_test.ipynb /home/tyler/

2019-09-25

https://zoom.us/j/133471615

Attendees

  • Tony Fast - @tonyfast
  • Dillon Roach - @dillonroach
  • Adam Lewis - @balast

Dask Intro Presentation - @balast

Sphinx - @ppwadhwa

A Tale of Two 🐼s Dataframes.

Notebook Documents

2019-09-18

https://zoom.us/j/133471615

Attendees

  • Tony Fast - @tonyfast
  • Dillon Roach - @dillonroach
  • Pam Wadhwa - @ppwadhwa
  • Adam Lewis - @balast

Notes

https://gist.github.com/tonyfast/87562b3e76e20855aa076bf793c8208f
https://www.wired.com/story/artificial-intelligence-confronts-reproducibility-crisis/
https://pypi.org/project/requests-cache/
https://en.wikipedia.org/wiki/VisiCalc

When linking a PDF in notebooks, it can be helpful to link straight to the requested page by appending a fragment such as #page=2 to the URL

Duck typing - style of dynamic typing in which an object's current set of methods and properties determines the valid semantics, rather than its inheritance from a particular class or implementation of a specific interface. If it quacks like a duck, assume it's a duck.
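
A minimal sketch of duck typing in Python. The class and function names are hypothetical and chosen only to match the saying.

```python
class Duck:
    def quack(self):
        return "Quack"

class Robot:
    # Unrelated to Duck by inheritance, but it has the same method
    def quack(self):
        return "Beep-quack"

def make_it_quack(thing):
    # No isinstance check: anything with a quack() method is accepted
    return thing.quack()

sounds = [make_it_quack(obj) for obj in (Duck(), Robot())]

try:
    make_it_quack(42)      # ints have no quack(), so this fails at call time
    failed = False
except AttributeError:
    failed = True
```

The error for an unsuitable object appears only when the missing method is actually called, not when the object is passed in.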

globals().update(df) pushes the DataFrame's column names into the global namespace, so each column is available as a plain variable
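
A minimal sketch of this trick with a made-up DataFrame (the column names are assumptions). Note that the call goes through `globals()`, which returns the namespace dict.

```python
import pandas as pd

# Hypothetical frame; any string column names would do
df = pd.DataFrame({"height": [1.0, 2.0], "width": [3.0, 4.0]})

# dict.update accepts any object with keys() and item access, so each
# column name becomes a global name bound to that column's Series
globals().update(df)

area = height * width   # 'height' and 'width' are now plain names
```

This silently overwrites any existing globals with the same names, so it is probably best kept to interactive exploration.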

Going to try to include a 3-5 min per-person 'lightning round' where we each discuss/present something we learned during the week

2019-09-10

https://zoom.us/j/725442060

Attendees

Notes

Error bars for holoviews:
http://holoviews.org/reference/elements/matplotlib/ErrorBars.html

Holoviews implements fuzzy string interpretation. When trying to discover options try something that sounds right and see if there is a suggestion in the error message.
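
The general idea of suggesting near matches for a mistyped option can be sketched with the standard library's `difflib`. This is only an illustration and not how HoloViews implements it; the option names and the `suggest` helper are hypothetical.

```python
import difflib

# Hypothetical option names, chosen only for illustration
VALID_OPTIONS = ["color", "cmap", "alpha", "line_width"]

def suggest(option):
    """Return valid option names that closely resemble `option`."""
    return difflib.get_close_matches(option, VALID_OPTIONS, n=3, cutoff=0.6)

suggestions = suggest("colour")   # a plausible-sounding but invalid name
```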

Use tab completion when exploring options for objects (i.e. self.params)

DataFrames/hvplot allows linking of added views

hvplot is a high-level API built on holoviews: https://mybinder.org/v2/gh/pyviz/holoviews/master?filepath=examples

Anything with panel objects will serve as a panel object (panel serve)

http://holoviews.org/reference/containers/bokeh/DynamicMap.html

https://panel.pyviz.org

pyviz ecosystem

panel - widgets
holoviews - for plots
datashader - lotta point plots
hvplot - plotting dataframes

Make dataframes, not plots

Altair provides a higher-level plotting syntax for dataframes that is partially consistent with holoviews. It renders JSON following the Vega-Lite schema instead of drawing with bokeh or matplotlib.

Tyler Lambda presentation

https://aws.amazon.com/premiumsupport/knowledge-center/build-python-lambda-deployment-package/
https://blog.shikisoft.com/access-mongodb-instance-from-aws-lambda-python/

jupyter lab --no-browser --port=8800
ssh -i ~/.ssh/north_ca_analytics.pem ubuntu@ec2-18-144-54-246.us-west-1.compute.amazonaws.com
ssh -i ~/.ssh/north_ca_analytics.pem -N -f -L localhost:8800:localhost:8800 ubuntu@ec2-18-144-54-246.us-west-1.compute.amazonaws.com
scp -i ~/.ssh/north_ca_analytics.pem ubuntu@ec2-18-144-54-246.us-west-1.compute.amazonaws.com:~/insert_test.ipynb /home/tyler/quansight/saveday/lambda/