Hum Qing Ze
AI Day Research : Prototype : Learning
===
###### tags: `learning`

## "Swift for TensorFlow" & "Introduction to tf.keras and TensorFlow 2.0" - Paige Bailey - Google Brain

Paige Bailey is the product manager for TensorFlow core as well as Swift for TensorFlow. Prior to her role as a PM in Google's Research and Machine Intelligence org, Paige was developer advocate for TensorFlow core; a senior software engineer and machine learning engineer in the office of the Microsoft Azure CTO; and a data scientist at Chevron. Her academic research focused on lunar ultraviolet, at the Laboratory for Atmospheric and Space Physics (LASP) in Boulder, CO, as well as the Southwest Research Institute (SwRI) in San Antonio, TX.

## "TensorFlow Lite: On-Device ML and the Model Optimization Toolkit" - Jason Zaman - Light

Machine learning at the edge is important for everything from user privacy to battery consumption. This talk gives an overview of the different strategies for optimizing models for on-device inference - pruning and integer quantization with the Model Optimization Toolkit - followed by a demo combining these techniques to run a model on an EdgeTPU.

Jason is the community lead for TensorFlow SIG-Build and an ML GDE. He works as a machine learning engineer at Light, doing computational photography for mobile cameras. Along with speaking regularly, he is active in open source as a Gentoo Linux developer and a maintainer of the SELinux Project.

## "Which image should we show? Neural Linear Bandit for Image Selection" - Sirinart Tangruamsub - Agoda

Sirinart is a data scientist at Agoda. Before joining Agoda, she was a postdoctoral researcher at the University of Goettingen. She has extensive experience in computer vision and natural language processing at various startups and corporates. Her current interests include personalization and recommendation systems.
## "XLNet - The Latest in Language Models" - Martin Andrews - Red Dragon AI

Martin is a Google Developer Expert in Machine Learning based in Singapore, and was doing neural networks before the last AI winter. He is an active contributor to the Singapore data science community and is the co-host of the Singapore TensorFlow and Deep Learning MeetUp (now with 3,700+ members).

## "Deep Learning on Graphs for Conversational AI" - Sam Witteveen - Red Dragon AI

Sam is a Google Developer Expert for Machine Learning and a co-founder of Red Dragon AI, a deep tech company based in Singapore. He has extensive experience in startups and mobile applications and helps developers and companies create smarter applications with machine learning. Sam is especially passionate about deep learning and AI in the fields of natural language and conversational agents, regularly shares his knowledge at events and trainings across the world, and is the co-organiser of the Singapore TensorFlow and Deep Learning group.

## "TensorFlow Extended (TFX): Real World Machine Learning in Production" - Robert Crowe - Google Brain

A data scientist and TensorFlow addict, Robert has a passion for helping developers quickly learn what they need to be productive. He has used TensorFlow since the very early days and is excited about how quickly it is evolving to become even better than it already is. Before moving to data science, Robert led software engineering teams for both large and small companies, always focusing on clean, elegant solutions to well-defined needs.
You can find him on Twitter at @robert_crowe.

-------

## Paige Bailey: tf.keras

webpaige@google.com, Twitter: @DynamicWebPaige

**Internal Google training material for Google engineers.** AlphaFold paper.

## The Internals of tf.keras

### Architecture

* Engine
  * base layer
  * base network - a DAG of layers
  * model (network + training/eval loop)
  * Sequential

### Layers and Models

[Layers docs](https://keras.io/layers/about-keras-layers/)

Everything is a layer - models are just layers. A layer:

* computes from inputs -> outputs, batch-wise
* can't mix eager execution and static graphs
* manages state, with training and inference modes
* supports "type checking" via automatic compatibility checks
* can be frozen or unfrozen
* can be serialized and deserialized (mixed-precision layers coming soon)

A layer DOES NOT do:

* device placement
* datasets
* non-batch computation
* output-less or input-less processing

Paige gave an example of a canonical lazy layer - most layers you build look like that: you don't hardcode what the layer does up front. `GradientTape()` gives you something like a training loop, and `optimizers.[optimizer]()` applies the updates.

### Functional Models

[Models docs](https://keras.io/models/model/) - similar to how layers work too!
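The "canonical lazy layer" pattern mentioned above - don't hardcode shapes; create the weights on the first call, once the input is seen - can be sketched in plain NumPy (a hedged illustration of the idea, not the actual tf.keras base-layer code; `LazyDense` is a made-up name):

```python
import numpy as np

class LazyDense:
    """A dense layer that defers weight creation until it first
    sees an input, inferring the input dimension from the data."""

    def __init__(self, units):
        self.units = units
        self.built = False

    def build(self, input_dim):
        # Weights are created lazily, once the input shape is known.
        rng = np.random.default_rng(0)
        self.w = rng.normal(0, 0.1, size=(input_dim, self.units))
        self.b = np.zeros(self.units)
        self.built = True

    def __call__(self, x):
        if not self.built:
            self.build(x.shape[-1])
        return x @ self.w + self.b

layer = LazyDense(4)
out = layer(np.ones((2, 3)))   # input dim (3) inferred on first call
print(out.shape)               # (2, 4)
```

The real tf.keras equivalent splits the same responsibilities across `build()` (weight creation) and `call()` (the computation).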
They can also be nested. A Model is a layer, but one that also provides access to training, saving, and summary/model visualization. In the literature, a Layer is a "layer" or "block"; a Model is a "model" or "network".

`compile` takes a substantial performance hit for eager ops, because some parts of it are really not meant for eager execution. `compile` builds the spec; `fit` goes through the data.

The functional API creates DAGs of layers: you build a directed acyclic graph that shows the linkage of the model. The functional API is the connectivity of layers - it is declarative. Models contain no logic; all logic lives inside the layers, and all debugging is done at compile time - you merely define the thing. This gives you:

* static input compatibility checks
* model saving
* model plotting
* automatic masking
* good model debugging

You can also check the entire model history.

### Training and Inference

When you call `fit()`, it runs through an entire list of functions.

### Losses and Custom Metrics

A custom metric defines `__init__` and `update_state`; `add_metric` supports the endpoint pattern. (You can also write your own loops.)

## "Deep Learning on Graphs for Conversational AI" - Sam Witteveen - Red Dragon AI

DL is great for perception tasks and getting better at generative tasks - but how do we get _reasoning_? GPT-2 seems to have dumped all its knowledge into weights, but knowledge as weights is inefficient. Maybe use graphs? The nodal representation: undirected, digraph, weighted.

Knowledge graphs (the 'knowledge panel'): Freebase, Wikidata, Cyc, DBpedia, WordNet, Prospera, NELL, GeoNames, GDELT. The concept seems to be to train a model that does our Google searches for us. Symbolic AI - the adding of knowledge into conflict.

Node - edge - node: (object, property, value), or (subject, predicate, object) - RDF data.

Information retrieval:

* The right knowledge at the right time?
* Custom graphs?
* What happens if I have missing information?

DL for knowledge graphs: getting knowledge out is hard - extraction plus prediction on graphs. Node classification vs. edge classification: what's the right graph classification? A node regression mode?
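The (subject, predicate, object) triple representation above can be sketched as a minimal in-memory store (a toy illustration of RDF-style data, not any particular graph database; all names are made up):

```python
class TripleStore:
    """Minimal in-memory store of (subject, predicate, object) triples,
    queryable with None as a wildcard - a toy sketch of RDF-style data."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def query(self, subject=None, predicate=None, obj=None):
        # None acts as a wildcard, like a variable in a query pattern.
        return [
            t for t in self.triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        ]

kg = TripleStore()
kg.add("BarackObama", "spouse", "Michelle")
kg.add("Michelle", "child", "Malia")
kg.add("Michelle", "child", "Sasha")

print(kg.query(subject="Michelle", predicate="child"))
```

"The right knowledge at the right time" then amounts to issuing the right `query` pattern; missing information simply returns an empty list.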
Facebook Ego Network: node value and regression.

```graphviz
digraph hierarchy {
    nodesep=1.0 // increases the separation between nodes
    node [color=Red,fontname=Courier,shape=box] // all nodes get this shape and colour
    edge [color=Blue, style=dashed] // all edges look like this
    BarackObama -> {Michelle}
    Michelle -> {Malia Sasha}
}
```

So you can look up the Barack Obama node and check out the other properties.

### Why are graphs hard?

* Meaning and features are in the relationships, not the nodes
* No nice fixed position for each node
* Edges can be directed
* Non-Euclidean space

The inductive biases of DL assume Euclidean space - fixed spaces and sequences - hence the lack of breakthroughs for representing this space. How do we represent this input? Node + edge embeddings? Adjacency matrices are order n^2.

DeepWalk / [node2vec](https://cs.stanford.edu/~jure/pubs/node2vec-kdd16.pdf): take random walks along the graph for N steps, treat each walk as a sentence, and use skip-grams on those "sentences" - though we don't know if this is the best representation.

[Graph-CNN](https://github.com/tkipf/keras-gcn): graph convolution - subgraphing of the n nodes reachable from a given node. It assumes connected nodes imply likelihood of similarity, computes loss on known nodes only, and treats all nodes as undirected. Kegra; Kipf: keras-gcn.

"Relational Inductive Biases, Deep Learning": edge-node-global updating - look at edges, get embedding values, then nodes, and then a global update.

Transfer learning with graphs? Graph-in-graph-out? Predict a new graph.

## TF eXtended - Robert Crowe (@robert_crowe)

Real-world ML in production: configuration, data collection, serving infrastructure, process management tools, analysis tools, ...
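The DeepWalk/node2vec idea above - random walks treated as "sentences" for skip-gram training - can be sketched as follows (walk generation only; the skip-gram step is omitted, and the graph is a made-up toy):

```python
import random

def random_walks(adj, walk_length, walks_per_node, seed=0):
    """Generate DeepWalk-style uniform random walks over an adjacency
    dict. Each walk is a 'sentence' of node names for skip-gram training."""
    rng = random.Random(seed)
    walks = []
    for start in adj:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adj[walk[-1]]
                if not neighbours:
                    break          # dead end: stop this walk early
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks

# Toy undirected graph as an adjacency dict.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
walks = random_walks(graph, walk_length=5, walks_per_node=2)
print(walks[0])  # a length-5 walk starting at node 'a'
```

node2vec proper biases the neighbour choice with return/in-out parameters; uniform choice as shown here is the plain DeepWalk variant.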
TF Extended exists to do this. [Ranking Tweets with TF](https://medium.com/tensorflow/ranking-tweets-with-tensorflow-932d449b7c4)

### Production Pipelining: TFX

Production ML needs:

* labelled data
* feature space coverage
* minimal dimensionality
* maximum predictive data
* fairness
* rare conditions
* data lifecycle management

Classic problems don't go away - the software engineering problems still sit around! ["Hidden Technical Debt in ML Systems"](https://bit.ly/ml-techdebt)

The pipeline: data ingestion -> validation -> feature engineering -> train model -> validate model -> push if good -> serve model. [Apache Beam](https://beam.apache.org/documentation/)

Components: a Driver does job execution, an Executor does the work, and a Publisher updates ml-metadata (e.g. in a model validator). A configuration file pulls from and writes back to the metadata store, based on dependencies - a task-aware pipeline: transform-trainer (classic).

TFX flow: training data / input data -> transform -> transformed data -> trainer -> trained models.

The metadata store contains:

* artifacts and their properties
* execution history of runs
* metadata-powered functionality - remember what was previously run (and what data it was run on)
* carry-over state from previous model runs - caching of previously computed outputs

Beam: a unified batch and stream distributed processing API - SDKs in multiple languages plus sets of runners.

Your application lives for years; when you want to compare runs, you need the metadata to visualise what's happened. The Evaluator lets you check individual slices of the dataset: if one user isn't being served well, they're having a bad experience. The model objective is nearly always a proxy for your business objectives. The world doesn't stand still, and data is never what you wished you had.

The ML triangle: business realities - bad data - model needs improvement (demographics? insights? processes?). The What-If Tool lets you run inference on your model. TFX and Kubeflow Pipelines: the Kubeflow team takes TFX code and applies it to a Kubernetes environment.

## Sirinart: How do we pick photos for Agoda?
The multi-armed bandit: A/B testing, but... extreme. We know that each arm has some expected reward - so we try everything! It's the exploration-exploitation trade-off. Thompson sampling updates a posterior distribution; here, neural linear units approximate that posterior - Bayesian linear regression on the representation in the last layer of a neural network.

## XLNet - Martin Andrews, Red Dragon AI

### Transformer Architectures

Feed tokens in at the bottom, pass them through layers, and get the result at the top. Masked multi-head self-attention.

Sequential attention: turn the input into memory and "attend" to each portion. Each step's query takes a dot product to produce a score - checking the match of the query - which is fed into a softmax to create an attention distribution. q, k, v: queries (what you're looking for), keys (why you choose), values (what you get when you choose). The transformer modifies all of the input so that it's "more useful" for the stuff at the end: take the input, generate q/k/v, score them by dot product, softmax, then take the weighted sum (this is each column). In effect it learns the meaning of a word through its context in the sentence.

Token and position embeddings: take the words of the English language and zip-compress them into fragments - group words into each other to form a vocabulary - a great way to form an effectively infinite vocabulary. Positional embeddings: each position in the input stream gets a kind of "sine-wave position" phase, so positional differences can be compared by phase difference.

Unsupervised training where possible. BERT: introspection - don't go one word at a time; mask out some of the words and have the model play fill-in-the-blanks. It's non-predictive, more analytical - feedback in all directions rather than purely predictive. Reconfigurable output: no need to retrain from scratch; we can reuse it to understand text. None of the text is labelled - it's just live data.
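The q/k/v recipe above - dot-product scores, softmax, weighted sum of values - is a few lines of NumPy. This is a minimal single-head sketch with made-up toy inputs; real transformers add masking and multiple heads, and only the sqrt(d_k) scaling is shown here:

```python
import numpy as np

def attention(q, k, v):
    """Scaled dot-product attention: score queries against keys,
    softmax the scores, and return the weighted sum of values."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)          # (n_q, n_k) match scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v, weights

rng = np.random.default_rng(0)
q = rng.normal(size=(3, 4))   # 3 query positions, dim 4
k = rng.normal(size=(5, 4))   # 5 key positions
v = rng.normal(size=(5, 4))   # one value vector per key
out, attn = attention(q, k, v)
print(out.shape)   # (3, 4): one context vector per query
```

Each row of `attn` is the attention distribution the notes describe: it sums to 1 and says how much each key's value contributes to that query's output.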
Stop the sentence, then get the model to predict.

### What's new in XLNet

From the same team that did BERT:

* two streams of attention
* long memory, like Transformer-XL
* loads of compute => better results

Fixing the masking problem: multiple mask entries clash in BERT - words in sentences aren't independent, but MASK appearances are, and MASK never appears in real data, so actual real-world data differs from the training data. XLNet hides better: a permutation process omits certain tokens, relying on positional encoding to preserve order. The solution is to split the streams. With XL memory, you need to make sure the positional encoding 'joins up'. Train on whole words, not just tokens - whole-word masking gives BERT better results too. XLNet abandons the "next-sentence-or-not" task. XLNet-Large is similar in size to BERT-Large: these are heavy-compute word generators.

### 1-minute glosses

* Distil a model to a CNN: use the big model to train a small model to get the answer
* Adapter modules: don't update the original transformer; add in extra trainable layers to "fix up" the output (which is effective) - "Parameter-Efficient Transfer Learning for NLP"
* Last-layer "graph layer"
* Multimodal learning: the MASK technique to "fill in" text and photos - VideoBERT

## Swift for TensorFlow: Paige Bailey

A next-gen framework for ML: ML arXiv papers exceed Moore's Law, with rapid accuracy improvements. [The Swift Programming Language](https://swift.org) is Python-like but with typing - intuitive. Swift for TensorFlow allows you to differentiate any function just by adding the `@differentiable` attribute. Functional approach: declare, then use an optimizer to update. Syntactically similar to Kotlin, and useful anywhere C++ can go. Typed APIs give static detection of errors.

Interoperability: no wrappers - import and call C or C++ directly (a Python wrapper limits you to Python's single threading). This works for Python too: `import Python` and then use it. Differentiable programming: language-integrated autodiff - any function in S4TF is differentiable.

### Performance

Speedy low-level performance: parity with C++.
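The language-integrated autodiff idea behind `@differentiable` can be illustrated in Python with forward-mode dual numbers (a toy sketch of differentiable programming in the document's working language, not how S4TF is actually implemented):

```python
class Dual:
    """Forward-mode autodiff: carry a value and its derivative together,
    so ordinary arithmetic on Duals differentiates the expression."""

    def __init__(self, value, deriv=0.0):
        self.value, self.deriv = value, deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        # Product rule: (uv)' = u'v + uv'
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

def grad(f, x):
    # Seed the derivative with 1.0 and read it off the result.
    return f(Dual(x, 1.0)).deriv

f = lambda x: x * x + 3 * x      # f'(x) = 2x + 3
print(grad(f, 2.0))              # 7.0
```

S4TF does source-level reverse-mode differentiation in the compiler rather than this operator-overloading trick, but the "any function becomes differentiable" experience is the same.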
Thread-level scalability: no GIL, so no bottleneck in the data ingestion process. Automatic graph extraction.

## TFLite: On-Device ML and the Model Optimization Toolkit - Jason Zaman (@perfinion, jason@perfinion.com)

Use it on the phone because everyone has one, there's lots of data you can use but shouldn't send to a server, and you get an immediate reply. How it works: the converter turns a model into TFLite. Model Optimization Toolkit:
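Integer quantization - one of the optimization strategies named in the talk abstract - can be sketched in NumPy (a simplified affine int8 scheme with toy weights; TFLite's actual converter does this per-tensor or per-channel with calibration data):

```python
import numpy as np

def quantize_int8(x):
    """Affine quantization of a float tensor to int8: map [min, max]
    onto [-128, 127] with a scale and a zero point."""
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 255.0
    zero_point = int(round(-128 - lo / scale))
    q = np.clip(np.round(x / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map int8 codes back to approximate float values."""
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(0)
w = rng.normal(0, 1, size=(4, 4)).astype(np.float32)
q8, scale, zp = quantize_int8(w)
w_hat = dequantize(q8, scale, zp)
print(np.abs(w - w_hat).max())   # reconstruction error, on the order of one step
```

The model shrinks 4x (int8 vs float32) and integer arithmetic is what lets an EdgeTPU run the model at all, at the cost of this small per-weight rounding error.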
