PyCon 2020
===

# Peter Wang from Anaconda
- Python for data
- Open source
- Why it matters
- Q&A

## Python for data
- Data literacy
- Anaconda Distribution
- Double disruption in 2012
    - Big data
    - Cloud computing
- Founding visions
    - Key tool: Python
- Get paid for
    - Analyst: insight
    - Data developer: code that produces insight
    - Programmer: code
- Python through the years
    - 1990 Scripting Era
    - 2000 SciPy Era
    - 2010 PyData Era
- What caused the rise of data science
    - More data
    - Power of the cloud
    - Concrete expectations of data
    - Powerful open source
- Why Python for data science
    - Accessible
    - Performant
    - Compatible
- Python's challenges (the result of success)
    - Embarrassment of riches ... What do I use for my problem today?
    - Aging king ... Hard to refactor
    - Newfound importance in the business world

## Open Source
- Meaning of open
    - Free
    - I can read the source
    - Open to PRs (pull requests)
    - Open to ideas
- Sustainable
    - There are no open source problems
    - There are business model problems
- OSS, two ways
    - Although both scratch their own itch
    - Commoditization vs. innovation
- Business value in open source software
    - No vendor control
    - Freedom to make changes
- OSS community value
    - Sharing
    - Participation
    - Pride
- Not just code
    - Software is a process and a relationship
    - Software grows by consuming ideas that better adapt it to the new network
- OSS isn't property
    - Usufruct
- Code is becoming less important than APIs
    - Beware of faux-OSS that privileges a particular platform
    - Stable APIs are a commons for future innovation
- Open
    - A value regarding contribution, utilization, and user freedom
- Source
    - Readable code can guarantee user sovereignty and freedom
- Software
    - Should be viewed as a process.
      Develop it *well*
- Community
    - Good software requires healthy communities

## Why it matters
- How can we build a good, resilient technological society?
    - Book: *Cybernetics and Society*, by Norbert Wiener
    - Book: *The Technological Society*, by Jacques Ellul
    - TED Talk: misplaced worries about AI
    - www.humanetech.com
        - From overwhelming human weaknesses toward overwhelming human strengths
    - Slate Star Codex
        - Meditations on Moloch
- What does this have to do with Python?
    - Empower regular people
    - Language is a human instinct and a natural path to insight
    - Computer languages are thoughtware, not software
    - Precursor -> actuator: theory of action
    - Data literacy is a prerequisite for future freedom (cybernetic era)
- Darker pattern
    - Open source: libraries that only work on a particular platform
    - TensorFlow: light, light grey pattern
- Best practice to utilize the open source software, an
    - Peter Wang: make good use of community resources.

### Q
---
- How do you contribute to OSS, even if you work at a company?

## Loki
---
### Natural Language Understanding (NLU) or semantic calculation
- Goal: software capable of natural-language semantic computation
- What the market claims
    - An endless list of keywords to write
    - Examples: 大聲 (loud), 夠大聲 (loud enough), 太大聲 (too loud), 不夠大聲 (not loud enough), 不太夠大聲 (not quite loud enough)
- Goals of semantic computation:
    - BOW-based approach
    - Structure is the key
    - Microsoft LUIS (no entities defined)
        - Can only catch keywords
- To have real NLU
    - Preserve structure
    - Understand Chinese
        - Vocabulary
        - Functional linguistics
    - Scenario, function, utterance
    - Project name, intent, utterance
- 智慧物聯網 (smart IoT) -> 勿連網 ("do not connect to the network", a homophone of 物聯網)
    - AIoT offline operation
    - Plain if-else programming is enough
    - Privacy concerns
- Articut
    - droidtown

### Intermittent computing
- IoT
- Battery
    - Battery aging
- Liveness, correctness, consistency, efficiency
- pybind11

### FixIt
- Existing: Flake8, PyLint
- FixIt can apply the suggested fixes automatically
- Customized rules

## MLflow Tracking
### Agenda
- Logging
- MLflow
- Life cycle
- MLOps
    - Recommended tool: Kubeflow
    - Setup, training, inference, post-analysis (feedback loop)

## Airflow
- Concept
    - DAG (directed acyclic graph)
    - Operator
- Airflow architecture
    - Scheduler, worker, metadata
- Bitshift operators (`>>`, `<<`)
    - Define relationships between operators
- Question
    - Can I use the tool for non-ETL tasks?
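
The Airflow notes above mention DAGs, operators, and the bitshift syntax for wiring operators together. The following is a minimal sketch of that idea in plain Python, not Airflow's own API: `Task` and `execution_order` are hypothetical names introduced only for illustration. It shows how overloading `>>` can record dependencies and how a topological sort yields a valid run order.

```python
from collections import deque


class Task:
    """Toy stand-in for an operator (hypothetical, not Airflow's class)."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []  # tasks that must run after this one

    def __rshift__(self, other):
        # a >> b: b runs after a. Returning `other` allows a >> b >> c.
        self.downstream.append(other)
        return other

    def __lshift__(self, other):
        # a << b: a runs after b.
        other.downstream.append(self)
        return other


def execution_order(tasks):
    """Topologically sort tasks (Kahn's algorithm); reject cyclic graphs."""
    by_id = {t.task_id: t for t in tasks}
    indegree = {t.task_id: 0 for t in tasks}
    for t in tasks:
        for d in t.downstream:
            indegree[d.task_id] += 1

    # Start with tasks that have no upstream dependency.
    ready = deque(sorted(tid for tid, n in indegree.items() if n == 0))
    order = []
    while ready:
        tid = ready.popleft()
        order.append(tid)
        for d in by_id[tid].downstream:
            indegree[d.task_id] -= 1
            if indegree[d.task_id] == 0:
                ready.append(d.task_id)

    if len(order) != len(tasks):
        raise ValueError("cycle detected: the graph is not a DAG")
    return order


extract, transform, load = Task("extract"), Task("transform"), Task("load")
extract >> transform >> load
print(execution_order([extract, transform, load]))
# ['extract', 'transform', 'load']
```

Nothing in this sketch is specific to ETL: the graph only encodes "run this after that", which is one way to think about the question of using the tool for non-ETL tasks.
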
## On-line teaching
- Play to win

# 9/6
## I became an open source maintainer by Mariatta
### Motivation
- Lack of diversity

### Contribution
- ~~Modules~~
- Find a project that I care about, where my existing skills would be useful
    - Repo
    - Doc
    - Issues in the bug tracker
    - Mailing list

### In progress
- Like a new job
- No onboarding session
- Takes time
- Not just about code

### Things I didn't expect
- Too many emails
- Handling code of conduct cases
- Moderating communication channels

### Lessons learned from mistakes
- Make my own decisions
- Communication
- Time management

### I did it again
- "In open source, the more you give, the more you get back." Peter Wang

### How you can do it better
- Clarify the expectations
- Have a maintenance guide
- Have a succession plan in place
- Organize an annual sprint
- Get funding

### Advice: Repeat the question

### Q:
- Hi Mariatta, thanks for the talk, especially the part about the lessons you learned. I would like to ask for some advice. Time management is the challenge; we always have a crazy to-do list. For someone starting out in open source, do you follow any strategy to keep contributing to the community?
    - Say no if necessary
- How do you "find" time to contribute to the community?
    - Set a goal

## Network automation with Python by Eric Chou
### Opening
- Dickens, *A Tale of Two Cities*

### Traditional
- Isolated
- CLI monkey
- Management: Telnet -> SSH

### Reason for Change
- Software-defined networking
    - OpenFlow, OpenDaylight, SD-WAN
- Network virtualization
    - NFV, overlay VXLAN
- Hyper-scale datacenters
    - (Amazon AWS, MS Azure, Facebook)
    - Microsoft SONiC, Facebook FBOSS
- Why Python
    - Relatively easy to learn
    - Vendor support
    - Common denominator
    - Language popularity (large ecosystem)

### Device level management
- Open source: Paramiko, NAPALM, Netmiko
- API
    - RESTful API
    - XML/JSON
    - ...
    - Newcomer: YANG Data Model [PyYAML]
- Onboard management
    - Python + Linux + Container
    - Vendor-provided SDK

### Controller-based management
- Cisco
- VMware
- Big Switch Networks
- OpenDaylight

### Network automation frameworks
- Ansible
    - Hosts
    - Host variables
    - Playbook
    - Uses YAML

### Recommendations
- Start small, say with device level management
- DevNet
- Controller-based solution if applicable
- Prefer open source
- Ansible

### Closing with the preface

## 誰識KOL (Who Knows the KOLs) by 騰林 from Cathay
### Agenda
- Visualize the time spent
- Intro, SNA, NLP, Summary

### Social Network Analysis (SNA)
- Shopee: Facebook likes

### Collect from Facebook
- Conversation
- Data selection logic

### How to evaluate the importance from a graph
- neo4j
- Degree
- Closeness
- Betweenness
- PageRank
    - Ranking of web page importance
    - Example using the Taipei MRT

### SNA visualization
- Characteristics of nodes that connect multiple KOLs
- Proximity does not imply support

### LDA analysis techniques
- Remove duplicates
- Convert into long-form text
- Limit term frequency

### Actionable items
- Credit card usage: analyze how different cards are used
- Cash-flow status of financial advisors

## Corona-Net: Fighting COVID-19 with ML by Lam from HK
- Hospitals are overwhelmed
- Corona-Net
    - Binary classification
    - Binary segmentation
    - 3-class segmentation
- Why NumPy
    - Parallelism and vectorization
    - Easy to prototype
- Image processing
    - Torchvision: tight integration with PyTorch
    - Albumentations: biomedical imaging
- Why PyTorch
    - More research support
    - Customization
- Model architecture
    - Classification, detection, segmentation
- Classification
    - ResNets: shortcut connections
    - Global and local information
        - FPNs
    - EfficientNet
        - Compound scaling method
        - Joint scaling of network 1) depth, 2) width, 3) input resolution
- Segmentation
    - Fully convolutional networks
        - Encoder-decoder
    - U-Net
        - Symmetric contracting and expanding paths
- Data augmentation
- Segmentation evaluation
    - Evaluation metrics
- Future development
    - Segmentation
        - Binary, multi-class (3 classes)
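
The Corona-Net notes list evaluation metrics for segmentation without naming them. As an assumption, the sketch below uses the Dice coefficient and IoU (Jaccard index), two metrics commonly used for binary segmentation masks; the talk may have used others. The function name `dice_and_iou` and the sample masks are hypothetical, and plain lists stand in for image arrays to keep the example dependency-free.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks given as nested lists of 0/1."""
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    # Pixels marked positive in both masks.
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    p_sum = sum(1 for a in p if a)
    t_sum = sum(1 for b in t if b)
    union = p_sum + t_sum - intersection

    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    dice = 2 * intersection / (p_sum + t_sum)
    iou = intersection / union
    return dice, iou


# Hypothetical 2x3 masks: predicted vs. ground-truth lesion pixels.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
print(f"Dice={dice:.3f}, IoU={iou:.3f}")  # Dice=0.667, IoU=0.500
```

For the 3-class case mentioned under future development, one option would be to compute these per class on one-vs-rest masks and average the results.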