PyCon 2020
===
# Peter Wang from Anaconda
- Python for data
- Open source
- Why it matters
- Q&A
## Python for data
- data literacy
- Anaconda Distribution
- double disruption in 2012
- Big data
- Cloud computing
- Founding visions
- Key tool: Python
- Get paid for
- Analyst: insight
- Data developer: code that produces insight
- Programmer: code
- Python through the Years
- 1990 Scripting Era
- 2000 SciPy Era
- 2010 PyData Era
- What is the cause of data science
- more data
- power of cloud
- concrete expectation in data
- powerful open source
- Why Python for Data Science
- Accessible
- Performant
- Compatible
- Python's challenges (the result of success)
- Embarrassment of riches ... What do I use for my problem today
- Aging King ... Hard to refactor
- New-found importance in the business world
## Open Source
- Meaning of open
- free
- I can read the source
- open to PR (pull request)
- open to ideas
- Sustainable
- There are no open source problems
- There are business model problems
- OSS, two ways
- although both scratch their own itch
- commoditization vs innovation
- Business value in open source software
- Without the control from vendor
- Freedom to make change
- OSS community value
- sharing
- participation
- pride
- Not just code
- Software is a process and a relationship
- Software grows by consuming ideas that better adapt it to the new network
- OSS isn't property
- usufruct
- Code is becoming less important than APIs
- Be aware of faux-OSS: some platform privilege
- Stable APIs are a commons for future innovation
- Open
- a value regarding contribution, utilization, and user freedom
- Source
- being able to read the code can guarantee user sovereignty and freedom
- Software
- should be viewed as a process. Develop it *well*
- Community
- good software requires healthy communities
## Why it matters
- How can we build a good, resilient technological society
- Author: Cybernetics and Society - Norbert Wiener
- Author: The Technological Society - Jacques Ellul
- TED Talk: misplaced worries about AI - www.humanetech.com
- From overwhelming human weaknesses toward overwhelming human strengths
- Slate Star Codex
- Meditations on Moloch
- What does this have to do with Python?
- Empower regular people
- Language is a human instinct and is a natural path to insight
- Computer languages are thoughtware, not software
- Precursor -> actuator: theory of action
- Data literacy is a prerequisite for future freedom (cybernetic era)
- Darker pattern
- Open source: libraries that only work on a particular platform
- TensorFlow: light, light grey pattern
- best practice to utilize the open source software, an
- Peter Wang: make good use of community resources.
### Q
---
- How do you make a contribution to OSS, even if you work in a company?
## Loki
---
### Natural Language Understanding (NLU) or semantic calculation
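- Goal: software capable of natural-language semantic computation
- What products on the market claim
- An endless list of keywords to write
- Example: 大聲 (loud), 夠大聲 (loud enough), 太大聲 (too loud), 不夠大聲 (not loud enough), 不太夠大聲 (not quite loud enough)
- Goals of semantic computation:
- BOW-based (bag-of-words) approach
- Structure is the key
- Microsoft LUIS (without entities defined)
- Can only pick up keywords
- To have a real NLU
- Preserve structure
- Actually understand Chinese
- Vocabulary
- Functional linguistics
- Scenario, function, utterance
- Project name, intent, utterance
- AIoT (智慧物聯網) -> "do not connect to the network" (勿連網, a pun: both are pronounced the same)
- AIoT offline operation
- Plain if-else programming is enough
- Privacy concerns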
- Articut
- droidtown
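
The keyword problem above can be made concrete with the five example utterances. This is a minimal sketch using only the standard library; it is not how Loki or Articut work internally. The `STRUCTURE` pattern and its group names (`neg`, `degree`, `adj`) are hypothetical, chosen only to show that keeping structure separates utterances that a keyword check lumps together.

```python
import re

# The five example utterances from the notes, with English glosses.
UTTERANCES = {
    "大聲": "loud",
    "夠大聲": "loud enough",
    "太大聲": "too loud",
    "不夠大聲": "not loud enough",
    "不太夠大聲": "not quite loud enough",
}


def keyword_match(utterance: str, keyword: str = "大聲") -> bool:
    """Bag-of-words style check: is the keyword present at all?"""
    return keyword in utterance


# Hypothetical structure-preserving pattern: optional negation,
# optional degree adverb, then the adjective.
STRUCTURE = re.compile(r"(?P<neg>不)?(?P<degree>太夠|太|夠)?(?P<adj>大聲)")


def parse(utterance: str) -> dict:
    """Return the negation / degree / adjective slots, or {} if no match."""
    match = STRUCTURE.fullmatch(utterance)
    return match.groupdict() if match else {}


for text, gloss in UTTERANCES.items():
    # keyword_match is True for every utterance; parse differs for each one.
    print(text, gloss, keyword_match(text), parse(text))
```

The keyword check cannot tell "too loud" from "not loud enough", while the slot-based parse gives each utterance a different result, which is the point behind "structure is the key".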
### Intermittent computing
- IoT
- Battery
- Battery aging
- Liveness, correctness, consistency, efficiency
- pybind11
### FixIt
- Existing: Flake8, PyLint
- FixIt can replace the suggested syntax automatically
- Customized rule
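
To illustrate what "lint plus automatic fix" means, here is a minimal sketch built on the standard-library `ast` module (Python 3.9+ for `ast.unparse`). It is not FixIt's API: the class `NoneComparisonFixer` and the function `autofix` are hypothetical names, and the single rule (rewrite `== None` as `is None`) is just an example of a custom rule.

```python
import ast


class NoneComparisonFixer(ast.NodeTransformer):
    """Rewrite `x == None` / `x != None` as `x is None` / `x is not None`."""

    def __init__(self):
        self.violations = []  # line numbers where the rule fired

    def visit_Compare(self, node: ast.Compare) -> ast.Compare:
        self.generic_visit(node)
        new_ops = []
        for op, right in zip(node.ops, node.comparators):
            is_none = isinstance(right, ast.Constant) and right.value is None
            if is_none and isinstance(op, ast.Eq):
                new_ops.append(ast.Is())
                self.violations.append(node.lineno)
            elif is_none and isinstance(op, ast.NotEq):
                new_ops.append(ast.IsNot())
                self.violations.append(node.lineno)
            else:
                new_ops.append(op)
        node.ops = new_ops
        return node


def autofix(source: str):
    """Return (fixed_source, violation_line_numbers)."""
    fixer = NoneComparisonFixer()
    tree = fixer.visit(ast.parse(source))
    return ast.unparse(tree), fixer.violations


fixed, lines = autofix("if value == None:\n    handle_missing()\n")
print(lines)
print(fixed)
```

One caveat with this sketch: `ast.unparse` discards comments and the original formatting, so it is only suitable for showing the idea. A real autofixer needs a syntax tree that preserves the source layout.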
## MLflow Tracking
### Agenda
- Logging
- MLflow
- Life cycle
- MLOps
- Recommended tool: Kubeflow
- Setup, train, inference, post-analysis (feedback loop)
## Airflow
- concept
- DAG (directed acyclic graph)
- Operator
- airflow architecture
- scheduler, worker, metadata
- bitshift
- define operator relationships
- question
- Can I use the tool for non-ETL tasks?
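
In Airflow, `a >> b` marks `b` as downstream of `a`. The sketch below shows how that bitshift syntax can be built with plain Python operator overloading and how the resulting DAG yields a run order. It does not use Airflow itself: `Task` and `topological_order` are hypothetical stand-ins, not Airflow classes.

```python
class Task:
    """Toy stand-in for an operator; not Airflow's actual class."""

    def __init__(self, task_id: str):
        self.task_id = task_id
        self.downstream = []

    def __rshift__(self, other: "Task") -> "Task":
        # `a >> b` means: b runs after a.
        self.downstream.append(other)
        return other  # returning `other` allows chaining: a >> b >> c

    def __lshift__(self, other: "Task") -> "Task":
        # `a << b` means: a runs after b.
        other.downstream.append(self)
        return other


def topological_order(tasks):
    """Return task ids in dependency order (Kahn's algorithm)."""
    indegree = {t.task_id: 0 for t in tasks}
    for t in tasks:
        for d in t.downstream:
            indegree[d.task_id] += 1
    ready = [t for t in tasks if indegree[t.task_id] == 0]
    order = []
    while ready:
        task = ready.pop(0)
        order.append(task.task_id)
        for d in task.downstream:
            indegree[d.task_id] -= 1
            if indegree[d.task_id] == 0:
                ready.append(d)
    if len(order) != len(tasks):
        raise ValueError("cycle detected: not a DAG")
    return order


extract, transform, load = Task("extract"), Task("transform"), Task("load")
extract >> transform >> load
print(topological_order([load, extract, transform]))
```

The cycle check is what the "acyclic" in DAG buys: if two tasks depend on each other, no valid run order exists and the sketch raises an error instead of looping.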
## On-line teaching
- play to win
# 9/6
## I became an open source maintainer by Mariatta
### Motivation
- Lack of diversity
### Contribution
- ~~Modules~~
- Find a project that I care about, where my existing skills would be useful
- Repo
- Doc
- Issue in the bug tracker
- Mailing-list
### In progress
- Like a new job
- No onboarding session
- Takes time
- Not just about code
### Things I didn't expect
- Too many emails
- Handle code of conduct cases
- Moderate communication channels
### Lesson learned by mistake
- Make my own decisions
- Communication
- Time management
### I did it again
- "In open source, the more you give, the more you get back." Peter Wang
### How you can do it better
- Clarify the expectations
- Have a maintenance guide
- Have a succession plan in place
- Organize an annual sprint
- Get Funding
### Advice: Repeat the question
### Q:
- Hi Mariatta, thanks for the talk, especially the part about the lessons you learned. I would like to ask for some advice. Time management is a challenge, and we always have a crazy to-do list. For someone starting out in open source, do you have any strategy for contributing to the community continuously?
- Say no if necessary
- How do you "find" time to contribute to the community?
- Set a goal
## Network automation with Python by Eric Chou
### Opening
- Dickens, A Tale of Two Cities
### Traditional
- Isolated
- CLI Monkey
- Management: telnet -> SSH
### Reason for Change
- Software defined networking
- OpenFlow, OpenDaylight, SD-WAN
- Network virtualization
- NFV, overlay VXLAN
- Hyper-scale datacenters
- (Amazon AWS, MS Azure, Facebook)
- Microsoft SONiC, Facebook FBOSS
- Why Python
- Relatively easy to learn
- Vendor support
- Common denominator
- Language popularity (large ecosystem)
### Device level management
- Open source: Paramiko, NAPALM, Netmiko
- API
- RESTful API
- XML/JSON
- ...
- Newcomer: YANG data model [PyYAML]
- Onboard management
- Python + Linux + Container
- Vendor provided SDK
### Controller-based management
- Cisco
- VMware
- Big Switch Networks
- OpenDaylight
### Network automation frameworks
- Ansible
- Hosts
- Host Variables
- Playbook
- Use YAML
### Recommendations
- Start small, say device level management
- DevNet
- Controller-based solution if applicable
- Prefer open source
- Ansible
### Closing with the preface
## Who Knows the KOLs (誰識KOL) by 騰林 from Cathay
### Agenda
- Visualize the time spent
- Intro, SNA, NLP, Summary
### Social Network Analysis (SNA)
- Shopee: Facebook likes
### Collect from facebook
- conversation
- data selection logic
### How to evaluate the importance from a graph
- neo4j
- Degree
- Closeness
- Betweenness
- PageRank
- Ranks web pages by importance
- example by Taipei MRT
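
Two of the measures listed above are easy to compute by hand on a small graph. The sketch below uses plain Python rather than neo4j, and the graph is a hypothetical five-node network (not the Taipei MRT data from the talk) in which `hub` plays the role of an interchange station. The damping factor 0.85 is the commonly used default, assumed here.

```python
def degree_centrality(graph):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(graph)
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def pagerank(graph, damping=0.85, iterations=100):
    """Power-iteration PageRank; assumes every node has at least one link."""
    n = len(graph)
    rank = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in graph}
        for node, neighbours in graph.items():
            # Each node splits its damped rank evenly among its neighbours.
            share = damping * rank[node] / len(neighbours)
            for neighbour in neighbours:
                new_rank[neighbour] += share
        rank = new_rank
    return rank


# Hypothetical undirected network stored as adjacency lists.
GRAPH = {
    "hub": ["a", "b", "c", "d"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "d"],
    "d": ["hub", "c"],
}

print(degree_centrality(GRAPH))
print(pagerank(GRAPH))
```

Both measures rank `hub` first in this graph, but they answer different questions: degree counts direct links only, while PageRank also rewards being linked from well-connected nodes.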
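### SNA visualization
- Characteristics of nodes that connect multiple KOLs
- Proximity does not imply support
### LDA analysis techniques
- Remove duplicates
- Convert into long texts
- Limit term frequency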
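
The three preprocessing steps above can be sketched in plain Python. This is one possible reading of the notes, not the speaker's pipeline: the function `preprocess`, the grouping of short posts by author, and the `min_df` / `max_df_ratio` thresholds are all assumptions. Tokens are assumed to be whitespace-separated already; real Chinese text would need word segmentation first.

```python
from collections import Counter, defaultdict


def preprocess(posts, min_df=2, max_df_ratio=0.8):
    """posts: list of (author, text) pairs with whitespace-separated tokens."""
    # 1. Remove duplicates while keeping first-seen order.
    unique_posts = list(dict.fromkeys(posts))

    # 2. Merge each author's short posts into one long document.
    merged = defaultdict(list)
    for author, text in unique_posts:
        merged[author].extend(text.split())
    docs = dict(merged)

    # 3. Limit term frequency: keep terms that appear in at least min_df
    #    documents and in at most max_df_ratio of all documents.
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    max_df = max_df_ratio * len(docs)
    vocab = {term for term, count in df.items() if min_df <= count <= max_df}
    return {a: [t for t in tokens if t in vocab] for a, tokens in docs.items()}


# Hypothetical sample posts.
POSTS = [
    ("u1", "card cashback the"),
    ("u1", "card cashback the"),  # exact duplicate, dropped in step 1
    ("u1", "travel the"),
    ("u2", "card points the"),
    ("u3", "travel points the"),
    ("u4", "loan the"),
]
print(preprocess(POSTS))
```

Short social media posts give LDA very little co-occurrence signal, which is the usual motivation for merging them into longer documents before fitting topics.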
### Actionable item
- Credit card usage: analyze how different cards are used
- Cash flows handled by wealth management advisors
## Corona-Net Fighting COVID-19 with ML by Lam from HK
- Hospitals are overwhelmed
- Corona-Net
- Binary Classification
- Binary Segmentation
- 3-Class Segmentation
- Why Numpy
- Parallelism and vectorization
- Easy to prototype
- Image Processing
- Torchvision: tight integration with PyTorch
- Albumentations: Biomedical Imaging
- Why PyTorch
- More research support
- Customization
- Model architecture
- Classification, detection, segmentation
- Classification
- ResNets: shortcut connections
- Global and local information
- FPNs
- EfficientNet
- Compound Scaling Method
- Joint scaling of network 1) depth, 2) width, 3) input resolution
- Segmentation
- Fully convolutional networks
- Encoder-decoder
- U-Net
- symmetrical contracting
- Data Augmentation
- Segmentation Evaluation
- Evaluation metrics
- Future development
- Segmentation
- Binary, Multi-Class (3-Classes)
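
The notes do not record which segmentation metrics the talk used. Dice coefficient and IoU (intersection over union) are two common choices for binary masks, so the sketch below assumes those. It uses plain Python lists of 0/1 labels to stay dependency-free; the function name `dice_and_iou` is hypothetical.

```python
def dice_and_iou(pred, target):
    """pred / target: flat sequences of 0/1 labels of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    pred_sum, target_sum = sum(pred), sum(target)
    union = pred_sum + target_sum - intersection
    if union == 0:
        return 1.0, 1.0  # both masks empty: treat as perfect agreement
    dice = 2 * intersection / (pred_sum + target_sum)
    iou = intersection / union
    return dice, iou


# Hypothetical 4-pixel masks: one pixel overlaps out of three marked in total.
dice, iou = dice_and_iou([1, 1, 0, 0], [1, 0, 1, 0])
print(dice, iou)
```

For a 3-class segmentation, the same function can be applied once per class by turning the label map into a one-vs-rest binary mask.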