PyCon 2020
===
# Peter Wang from Anaconda
- Python for data
- Open source
- Why it matters
- Q&A
## Python for data
- data literacy
- Anaconda Distribution
- double disruption in 2012
- Big data
- Cloud computing
- Founding visions
- Key tool: Python
- Get paid for
- Analyst: insight
- Data developer: code that produces insight
- Programmer: code
- Python through the Years
- 1990 Scripting Era
- 2000 SciPy Era
- 2010 PyData Era
- What is the cause of data science
- more data
- power of cloud
- concrete expectation in data
- powerful open source
- Why Python for Data Science
- Accessible
- Performant
- Compatible
- Python's challenges (the result of success)
- Embarrassment of riches ... What do I use for my problem today
- Aging King ... Hard to refactor
- New-found importance in the business world
## Open Source
- Meaning of open
- free
- I can read the source
- open to PR (pull request)
- open to ideas
- Sustainable
- There are no open source problems
- There are business model problems
- OSS, two ways
- although both scratch their own itch
- commoditization vs innovation
- Business value in open source software
- Without the control from vendor
- Freedom to make change
- OSS community value
- sharing
- participation
- pride
- Not just code
- Software is a process and a relationship
- Software grows by consuming ideas that better adapt it to the new network
- OSS isn't property
- usufruct
- Code is becoming less important than APIs
- Beware of faux-OSS: some platforms are privileged
- Stable APIs are a commons for future innovation
- Open
- a value regarding contribution, utilization, and user freedom
- Source
- readable code can guarantee user sovereignty and freedom
- Software
- should be viewed as a process. Develop it *well*
- Community
- good software requires healthy communities
## Why it matters
- How can we build a good, resilient technological society
- Author: Cybernetics and Society - Norbert Wiener
- Author: The Technological Society - Jacques Ellul
- TED Talk: misplaced worries about AI - www.humanetech.com
- From overwhelming human weaknesses toward overwhelming human strengths
- Slate Star Codex
- Meditations on Moloch
- What does this have to do with Python
- Empower regular people
- Language is a human instinct and is a natural path to insight
- Computer languages are thoughtware, not software
- Precursor -> actuator: theory of action
- Data literacy is a prerequisite for future freedom (cybernetic era)
- Darker pattern
- Open source: libraries that only work on a particular platform
- TensorFlow: light grey pattern
- best practice to utilize the open source software, an
- Peter Wang: make good use of community resources.
### Q
---
- How do you make a contribution to OSS, even if you work in a company?
## Loki
---
### Natural Language Understanding (NLU) or semantic calculation
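- Goal: give software the ability to compute natural-language semantics
- What is commonly claimed on the market
- Keyword lists that can never be finished
- Example: 大聲 (loud), 夠大聲 (loud enough), 太大聲 (too loud), 不夠大聲 (not loud enough), 不太夠大聲 (not quite loud enough)
- Goal of semantic computation:
- BOW-based (bag-of-words) approach
- Structure is the key
- Microsoft LUIS (no entities defined)
- Can only catch keywords
- To have a real NLU
- Preserve structure
- Truly understand Chinese
- Vocabulary
- Functional linguistics
- Scenario, function, utterance
- Project name, intent, utterance
- Smart IoT (智慧物聯網) -> "do not connect" (勿連網, a homophone of 物聯網)
- AIoT operating offline
- Plain if-else programming is enough
- Privacy concerns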
- Articut
- droidtown
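
The keyword and bag-of-words limits noted above can be sketched in a few lines of Python. This is an illustration only and has nothing to do with how Loki or Articut work internally; the function names and the English stand-in utterances are assumptions made for the example.

```python
from collections import Counter


def keyword_intent(utterance, keywords=("loud",)):
    """Naive keyword spotting: fires whenever any keyword appears."""
    words = utterance.lower().split()
    return any(keyword in words for keyword in keywords)


def bag_of_words(utterance):
    """Bag-of-words representation: word counts only, word order discarded."""
    return Counter(utterance.lower().split())


# English stand-ins for the degree-modifier example in the notes.
UTTERANCES = [
    "loud",
    "loud enough",
    "too loud",
    "not loud enough",
    "not quite loud enough",
]

# Every utterance triggers the same keyword, although the meanings differ.
for utterance in UTTERANCES:
    print(utterance, "->", keyword_intent(utterance))

# Two sentences with opposite meanings share one bag of words.
print(bag_of_words("the dog bit the man") == bag_of_words("the man bit the dog"))
```

Both checks show why structure matters: once the modifiers and the word order are thrown away, the utterances can no longer be told apart.
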
### Intermittent computing
- IoT
- Battery
- Battery aging
- Liveness, correctness, consistency, efficiency
- pybind11
### FixIt
- Existing: Flake8, PyLint
- FixIt can apply the suggested fix automatically
- Customized rule
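
To make the idea of an auto-fixing lint rule concrete, here is a minimal sketch that uses only the standard library `ast` module (Python 3.9+ for `ast.unparse`). It is not the FixIt API: FixIt builds on LibCST so that formatting and comments survive the rewrite, whereas this stand-in re-generates the source. The rule itself (rewrite `== None` to `is None`) and the names `CompareNoneFixer` and `autofix` are assumptions chosen for illustration.

```python
import ast


class CompareNoneFixer(ast.NodeTransformer):
    """Hypothetical custom rule: compare to None with `is` / `is not`."""

    def visit_Compare(self, node):
        self.generic_visit(node)
        new_ops = []
        for op, comparator in zip(node.ops, node.comparators):
            is_none = isinstance(comparator, ast.Constant) and comparator.value is None
            if is_none and isinstance(op, ast.Eq):
                op = ast.Is()
            elif is_none and isinstance(op, ast.NotEq):
                op = ast.IsNot()
            new_ops.append(op)
        node.ops = new_ops
        return node


def autofix(source):
    """Parse the source, apply the rule, and return the rewritten source."""
    tree = CompareNoneFixer().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


print(autofix("y = x != None"))
```

A linter such as Flake8 would only report the comparison; the point of the sketch is that the rule also carries the replacement.
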
## MLflow Tracking
### Agenda
- Logging
- MLflow
- Life cycle
- MLOps
- Recommended tool: Kubeflow
- Setup -> train -> inference -> post-analysis (feedback loop)
## Airflow
- concept
- DAG (directed acyclic graph)
- Operator
- airflow architecture
- scheduler, worker, metadata database
- bitshift operators (`>>`, `<<`)
- define relationships between operators
- question
- Can I use the tool for non-ETL tasks?
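
The bitshift notation can be illustrated without installing Airflow. The toy `Task` class below is an assumption made for this sketch, not Airflow's operator class; it only mimics the idea that `a >> b` records "b runs after a", and it uses the standard library `graphlib` module (Python 3.9+) to derive an execution order from the resulting DAG.

```python
from graphlib import TopologicalSorter


class Task:
    """Toy stand-in for an operator; only tracks upstream dependencies."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.upstream = set()

    def __rshift__(self, other):
        # self >> other: `other` runs after `self`.
        other.upstream.add(self)
        return other  # returning `other` allows chaining: a >> b >> c

    def __lshift__(self, other):
        # self << other: `self` runs after `other`.
        self.upstream.add(other)
        return other


def execution_order(tasks):
    """Return task ids in an order that respects every dependency."""
    graph = {task.task_id: {up.task_id for up in task.upstream} for task in tasks}
    return list(TopologicalSorter(graph).static_order())


extract, transform, load = Task("extract"), Task("transform"), Task("load")
extract >> transform >> load

print(execution_order([extract, transform, load]))
```

Because the graph must be acyclic, `TopologicalSorter` raises `graphlib.CycleError` if the dependencies loop back on themselves. Nothing in the dependency model is specific to ETL; the tasks could be any callable steps.
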
## On-line teaching
- play to win
# 9/6
## I became an open source maintainer by Mariatta
### Motivation
- Lack of diversity
### Contribution
- ~~Modules~~
- Find a project that I care about, where my existing skills would be useful
- Repo
- Doc
- Issues in the bug tracker
- Mailing-list
### In progress
- Like a new job
- No onboarding session
- Takes time
- Not just about code
### Things I didn't expect
- Too many emails
- Handle code of conduct cases
- Moderate communication channels
### Lessons learned from mistakes
- Make my own decisions
- Communication
- Time management
### I did it again
- "In open source, the more you give, the more you get back." Peter Wang
### How you can do it better
- Clarify the expectations
- Have a maintenance guide
- Have a succession plan in place
- Organize an annual sprint
- Get Funding
### Advice: Repeat the question
### Q:
- Hi Mariatta, thanks for the talk, especially the part about the lessons you learned. I would like to ask for some advice. Time management is a challenge; we always have a crazy to-do list. For someone starting out in open source, do you use any strategy to keep contributing to the community?
- Say no if necessary
- How do you "find" time to contribute to the community?
- Set a goal
## Network automation with Python by Eric Chou
### Opening
- Dickens, *A Tale of Two Cities*
### Traditional
- Isolated
- CLI Monkey
- Management: Telnet -> SSH
### Reason for Change
- Software defined networking
- OpenFlow, OpenDaylight, SD-WAN
- Network virtualization
- NFV, Overlay VxLAN
- Hyper-scale datacenters
- (Amazon AWS, MS Azure, Facebook)
- Microsoft SONiC, Facebook FBOSS
- Why Python
- Relatively easy to learn
- Vendor support
- Common denominator
- Language popularity (large ecosystem)
### Device level management
- Open source: Paramiko, NAPALM, Netmiko
- API
- RESTful API
- XML/JSON
- ...
- Newcomer: YANG data model [PyYAML]
- Onboard management
- Python + Linux + Container
- Vendor provided SDK
### Controller-based management
- Cisco
- VMware
- Big Switch Networks
- OpenDaylight
### Network automation frameworks
- Ansible
- Hosts
- Host Variables
- Playbook
- Use YAML
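
The combination of hosts and host variables can be sketched in plain Python: one shared template, rendered once per device from that device's variables. This is not Ansible's format or API (Ansible keeps inventories and playbooks in YAML and templates with Jinja2); the host names, documentation-range addresses, and the template text below are all hypothetical.

```python
from string import Template

# Hypothetical inventory: host name -> host variables.
INVENTORY = {
    "sw1": {"mgmt_ip": "192.0.2.11", "vlan": 10},
    "sw2": {"mgmt_ip": "192.0.2.12", "vlan": 20},
}

# One template shared by every device; only the variables differ.
CONFIG_TEMPLATE = Template(
    "hostname $host\n"
    "interface Vlan$vlan\n"
    " ip address $mgmt_ip 255.255.255.0\n"
)


def render_configs(inventory):
    """Render the shared template once per host with that host's variables."""
    return {
        host: CONFIG_TEMPLATE.substitute(host=host, **variables)
        for host, variables in inventory.items()
    }


for host, config in render_configs(INVENTORY).items():
    print(f"--- {host} ---")
    print(config)
```

Keeping the per-device data apart from the shared logic is what lets the same task list run unchanged against one device or hundreds.
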
### Recommendations
- Start small, say device level management
- DevNet
- Controller-based solution if applicable
- Prefer open source
- Ansible
### Closing with the preface
## Who Knows the KOLs (誰識KOL) by 騰林 from Cathay
### Agenda
- Visualize the time spent
- Intro, SNA, NLP, Summary
### Social Network Analysis (SNA)
- Shopee: Facebook likes
### Collecting from Facebook
- conversation
- data selection logic
### How to evaluate the importance from a graph
- Neo4j
- Degree
- Closeness
- Betweenness
- PageRank
- Ranks web pages by importance
- Example: Taipei MRT
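
Two of the importance measures listed above, degree centrality and PageRank, are small enough to sketch in plain Python. The talk used Neo4j and a Taipei MRT example; the tiny undirected graph and its node names below are made up for illustration, and the damping factor of 0.85 is simply the commonly used default.

```python
def degree_centrality(graph):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(graph)
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank on an adjacency dict (node -> set of neighbors)."""
    n = len(graph)
    rank = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in graph}
        for node, neighbors in graph.items():
            if not neighbors:
                # Dangling node: spread its rank evenly over all nodes.
                for other in graph:
                    new_rank[other] += damping * rank[node] / n
            else:
                share = damping * rank[node] / len(neighbors)
                for neighbor in neighbors:
                    new_rank[neighbor] += share
        rank = new_rank
    return rank


# Hypothetical graph: "hub" touches every other node, "c" only touches "hub".
GRAPH = {
    "hub": {"a", "b", "c"},
    "a": {"hub", "b"},
    "b": {"hub", "a"},
    "c": {"hub"},
}

print(degree_centrality(GRAPH))
print(pagerank(GRAPH))
```

Degree only counts direct connections, while PageRank also weighs how important the neighbors are, which is why the two can disagree on larger graphs.
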
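### SNA visualization
- Characteristics of nodes that connect multiple KOLs
- Being close does not imply support
### LDA analysis tips
- Remove duplicates
- Convert into long texts
- Limit term frequency
### Actionable item
- Credit card usage: analyze how different cards are used
- Money flows handled by wealth management advisors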
## Corona-Net: Fighting COVID-19 with ML by Lam from HK
- Hospitals are overwhelmed
- Corona-Net
- Binary Classification
- Binary Segmentation
- 3-Class Segmentation
- Why NumPy
- Parallelism and vectorization
- Easy to prototype
- Image Processing
- Torchvision: tight integration with PyTorch
- Albumentations: Biomedical Imaging
- Why PyTorch
- More research support
- Customization
- Model architecture
- Classification detection segmentation
- Classification
- ResNets: shortcut connections
- Global and local information
- FPNs
- EfficientNet
- Compound Scaling Method
- Joint scaling of network 1) depth, 2) width, 3) input resolution
- Segmentation
- Fully convolutional networks
- Encoder-decoder
- U-Net
- symmetric contracting and expanding paths
- Data Augmentation
- Segmentation Evaluation
- Evaluation metrics
- Future development
- Segmentation
- Binary, multi-class (3 classes)
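
The notes do not record which segmentation metrics the talk used; the Dice coefficient and intersection over union (IoU) are two common choices for binary masks, sketched here as an assumption. Plain Python lists keep the example dependency-free; with NumPy the same sums would be vectorized. Scoring two empty masks as 1.0 is a convention chosen for this sketch.

```python
def _flatten(mask):
    """Flatten a 2-D binary mask (list of rows) into a list of 0/1 ints."""
    return [int(bool(value)) for row in mask for value in row]


def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks of the same shape."""
    p, t = _flatten(pred), _flatten(target)
    intersection = sum(a & b for a, b in zip(p, t))
    total = sum(p) + sum(t)
    union = total - intersection
    # Assumed convention: two empty masks count as a perfect match.
    dice = 2 * intersection / total if total else 1.0
    iou = intersection / union if union else 1.0
    return dice, iou


# Hypothetical 2x3 masks: prediction vs. ground truth.
PRED = [[1, 1, 0],
        [0, 1, 0]]
TARGET = [[1, 0, 0],
          [0, 1, 1]]

dice, iou = dice_and_iou(PRED, TARGET)
print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")
```

For a 3-class segmentation, one option is to compute the metrics per class on one-vs-rest masks and average them.
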