Kevin Su

@pingsutw

Joined on Feb 24, 2019

  • Overview
    SkyPilot is a framework for running LLMs, AI, and batch jobs on any cloud, offering maximum cost savings, the highest GPU availability, and managed execution. Flyte can use SkyPilot to execute a Flyte task on any cloud and allocate a GPU for you.
    Goals
    - Run a Flyte task on any cloud on the cheapest available machine.
    - Run a Flyte task on spot instances with automatic recovery from preemptions.
    Example
  • Motivation
    Currently it is hard to implement backend plugins, especially for data scientists and ML engineers who do not have a working knowledge of Go. In addition, the performance requirements, maintenance, and development of such plugins are cumbersome. This document proposes a path to writing plugins rapidly while decoupling them from the core flytepropeller engine.
    Goals
    - Plugins should be easy to author: no code generation, and no tools that ML engineers and data scientists are not accustomed to using.
    - The most important plugins for Flyte today are those that communicate with external services. It should be possible to test these plugins independently and to deploy them privately.
    - It should be possible for users to use backend plugins for local development, especially in flytekit and UnionML.
  • PR: https://github.com/flyteorg/flytekit/pull/1782
    Prerequisite: install Airflow and Flytekit.
    ```
    pip install apache-airflow
    pip install google-cloud-orchestration-airflow==1.9.1
    pip install apache-airflow-providers-google
    pip install jsonpickle
    pip install git+https://github.com/flyteorg/flytekit.git@873c60e505bb3492b5e3fa09840836b4d6775f4b#subdirectory=plugins/flytekit-airflow
    ```
  • Issues Discussion
    Motivation: Why do you think this is important?
    Currently, flyteadmin notifications are delivered using the PagerDuty, GitHub, and Slack email APIs. On AWS deployments, FlyteAdmin uses SES to trigger emails; for all other deployments, the only alternative email implementation is the SendGrid integration. Setting up SES or SendGrid can be somewhat complicated, and asking your Flyte users to configure these services with email integrations adds even more overhead. A simpler alternative is to provide a webhook integration for notifications, so that users only have to configure existing API keys for PagerDuty/GitHub/Slack. Flyte currently only allows sending notifications by email, and requires users to explicitly define notification rules in their launch plans.
    FlyteAdmin Webhook
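The webhook path proposed above can be sketched in a few lines. The snippet below builds a simple notification payload and posts it with the standard library; the payload shape and the `build_payload`/`notify` helpers are illustrative assumptions, not FlyteAdmin's actual schema or code.

```python
import json
import urllib.request

def build_payload(execution_id: str, phase: str) -> dict:
    # Illustrative notification body; the field names are assumptions,
    # not FlyteAdmin's real notification schema.
    return {"text": f"Execution {execution_id} entered phase {phase}"}

def notify(webhook_url: str, payload: dict) -> None:
    # POST the JSON payload to a pre-configured webhook (e.g. a Slack
    # incoming-webhook URL). The user only supplies a URL/API key; no
    # SES or SendGrid setup is needed.
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)
```

The appeal of this design is that the admin service only needs an HTTP client and a stored URL per channel, instead of a per-provider email integration.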
  • Save a model to the Submarine model registry by adding submarine.save_model(model, "tensorflow") to your script:
    ```python
    import tensorflow_datasets as tfds
    import tensorflow as tf
    from tensorflow.keras import layers, models
    import submarine

    def make_datasets_unbatched():
        BUFFER_SIZE = 10000
    ```
  • Submarine SDK Overhaul
    - Local cache (0.8.0) [P1]: download S3/HDFS data to the local file system before training.
    - Dataset API (0.9.0) [P1]: we can leverage fsspec, e.g.
    ```python
    def exists(self, path: str) -> bool:
    ```
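As a sketch of the fsspec-backed Dataset API mentioned above, the class below implements `exists` against the local filesystem so it stays dependency-free. The `Dataset` class and its `base_path` parameter are illustrative assumptions; a real implementation would delegate to `fsspec.filesystem(...)` so the same call works for s3:// and hdfs:// paths.

```python
import os

class Dataset:
    # Illustrative sketch only: the Submarine SDK proposal would wrap an
    # fsspec filesystem so one interface covers S3, HDFS, and local files.
    def __init__(self, base_path: str):
        self.base_path = base_path

    def exists(self, path: str) -> bool:
        # With fsspec this would be fs.exists(path); a local-filesystem
        # check keeps the sketch self-contained.
        return os.path.exists(os.path.join(self.base_path, path))
```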
  • Build your Dockerfile with your Python script; here is an example.
    2. Launch a training job from the Submarine workbench.
    3. (optional) Launch a job from the command line:
    ```
    curl -X POST -H "Content-Type: application/json" -d '
    {
      "meta": {
        "name": "mlflow-example",
    ```
  • MISC
    Record commands: https://asciinema.org/
    Install MySQL Workbench: https://askubuntu.com/questions/1230752/how-can-i-install-mysql-workbench-on-ubuntu-20-04-lts
    Send requests to the Submarine hit-count server:
    ```
    for i in {1..1000000}; do curl -s https://hits.dwyl.com/apache/submarine.svg; done > /dev/null 2>&1
    ```
    Create a Python virtualenv:
    ```
    sudo apt-get install python3-distutils
    ```
  • Install the package
    ```
    pip install pysubmarine
    ```
    Submarine sandbox management
    ```
    submarine sandbox start                  # create a Submarine sandbox
    submarine sandbox start --version 0.6.0  # create a sandbox with a specific version
    submarine sandbox delete                 # delete the Submarine sandbox
    ```
    Get a Submarine resource
    ```
    submarine get experiment <id>
    ```
  • HackMD – LaTeX syntax and examples
    Papers
    - Clipper: A Low-Latency Online Prediction Serving System
    - TensorFlow-Serving: Flexible, High-Performance ML Serving
    - A Case for Serverless Machine Learning
    - TicTac: Accelerating Distributed Deep Learning with Communication Scheduling
    - PipeMare: Asynchronous Pipeline Parallel DNN Training
    - TFX: A TensorFlow-Based Production-Scale Machine Learning Platform
  • http://www.guide2research.com/conference/
    Conference 2021

    | Name | H5-index | Deadline |
    |------|----------|----------|
    | ICML | 135      |          |
  • Import dependencies
    ```python
    from __future__ import print_function
    import time
    import swagger_client
    from swagger_client.rest import ApiException
    from pprint import pprint
    from swagger_client import JobLibrarySpec
    from swagger_client import JobTaskSpec
    from swagger_client import JobSpec
    ```
  • MISC
    Activation functions
    ![activation functions](https://cdn-images-1.medium.com/max/1600/1*rIiBaH5IMVPaE5BM-n7VZw.png =80%x)
    Over/under-fitting
    - Fixing high variance / overfitting: enlarge the sample set. Variance can be understood as the sample set not being comprehensive enough, i.e. the distribution of the training samples differing from the distribution of the real data. Enlarging the sample set, even to the full data, keeps the training distribution as close as possible to the distribution the model will be applied to, reducing the impact of variance.
    - Fixing high bias / underfitting: increase the model's complexity. By introducing more features and more complex structure, the model can describe the probability distribution / decision boundary / rule logic more fully, and therefore perform better.
    The plot below gives us a clear picture: as the predicted probability of the true class gets closer to zero, the loss increases exponentially.
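The loss behavior described above is just the single-example cross-entropy (negative log-likelihood): the loss is minus the log of the probability assigned to the true class, which blows up as that probability approaches zero. A quick numeric check, purely illustrative:

```python
import math

def cross_entropy(p_true: float) -> float:
    # Single-example loss: -log of the probability the model assigns
    # to the true class. As p_true -> 0, -log(p_true) -> infinity.
    return -math.log(p_true)

# The loss grows sharply as p_true shrinks.
for p in (0.9, 0.5, 0.1, 0.01):
    print(f"p={p}: loss={cross_entropy(p):.3f}")
```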
  • contributed by < pingsutw >
    Revisiting the week 2 quiz
    Solution approach: xs_new decides, based on len, whether to store the new string on the heap or on the stack. When the string is longer than 16 bytes, the data lives on the heap, so AAA here is 16.
    ```c
    xs *xs_new(xs *x, const void *p) {
        *x = xs_literal_empty();
    ```
  • contributed by < pingsutw >
    Development environment
    ```
    $ uname -a
    Linux kevin 5.0.0-38-generic #41-Ubuntu SMP Tue Dec 3 00:27:35 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
    $ gcc --version
    gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0
    $ lsb_release -a
    ```