# The 3rd Augmented Intelligence and Interaction (AII) Workshop
###### tags: `shared` `workshop`
[TOC]
## About this workshop
+ This workshop is hosted by Prof. Min Sun from June 30 to July 1
+ Detailed information can be found on this page: http://aliensunmin.github.io/aii_workshop/3rd/
## Keynote speech
+ K1: On Adversarial Learning
- Traditional Model Learning
- Maximum likelihood learning (with regularization): $\theta = \arg\max_\theta \mathbb{E}_q \log p \Rightarrow \arg\min_\theta \mathrm{KL}(q \Vert p)$
- Samples drawn from the model distribution are likely NOT to look realistic
- Nothing stops the model distribution from putting mass where the true distribution has no support
- What about swapping p (model) and q (true)? (q is not available, though)
- Samples are likely to look realistic, but may only represent a (very small) subset of the possible data
- Use the entropy of p as a regularizer so that p (model) spreads out
- $\theta = \arg\max_\theta \mathbb{E}_p \log \frac{q}{p} = \arg\min_\theta \mathrm{KL}(p \Vert q)$
- Problem: q is unknown => is there a way to estimate q?
- Implementation details
- Critic: $g(x) = \log \frac{q(x)}{p(x)}$
- Optimal classifier should be able to discriminate between real/fake samples
- i.e., GAN
- Q: What is the role of GANs in classification?
- GAN is a way to synthesize data
- Small labelled set plus large unlabelled set => semi-supervised learning
+ K2: Object-Preserving Cross-Domain Image Translation for Adaptive Object Detection
- Domain adaptation in object detection
- Paired/unpaired training images; the unpaired case is more practical
- Multimodal image translation
- GAN, conditional GAN (to different domain), cycle GAN, AugGAN
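
The contrast in K1 between $\mathrm{KL}(q \Vert p)$ and $\mathrm{KL}(p \Vert q)$ can be checked numerically. The following is a minimal sketch and is not taken from the talk: the distributions `q`, `p_cover`, and `p_seek` are made-up toy values, and all helper names are hypothetical. It also checks the standard GAN result that the optimal discriminator $D^*(x) = q(x) / (q(x) + p(x))$ has logit $\log (q/p)$, which is the critic $g(x)$ from the notes.

```python
import math

def kl(a, b):
    """KL(a || b) for two discrete distributions given as lists of probabilities."""
    return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)

# Toy "true" distribution q (assumption): two modes with almost nothing in between.
q = [0.49, 0.01, 0.01, 0.49]

# Two toy candidate models p (assumptions).
p_cover = [0.25, 0.25, 0.25, 0.25]  # spreads mass everywhere, covers both modes
p_seek = [0.97, 0.01, 0.01, 0.01]   # concentrates on a single mode

# Forward KL, the maximum-likelihood objective: KL(q || p).
forward = {"cover": kl(q, p_cover), "seek": kl(q, p_seek)}
# Reverse KL, with p and q swapped: KL(p || q).
reverse = {"cover": kl(p_cover, q), "seek": kl(p_seek, q)}

def critic(q_x, p_x):
    """Critic from the notes: g(x) = log(q(x) / p(x))."""
    return math.log(q_x / p_x)

def optimal_discriminator(q_x, p_x):
    """Optimal real-vs-fake classifier for known densities: D*(x) = q / (q + p)."""
    return q_x / (q_x + p_x)

def logit(d):
    return math.log(d / (1.0 - d))

print("forward KL:", forward)
print("reverse KL:", reverse)
for q_x, p_x in zip(q, p_seek):
    print(critic(q_x, p_x), logit(optimal_discriminator(q_x, p_x)))
```

With these toy numbers, the forward KL is lower for `p_cover` and the reverse KL is lower for `p_seek`, which matches the "unrealistic samples" versus "small subset of the data" trade-off described above. The last loop prints two matching columns, which is why a well-trained real/fake classifier can stand in for the unknown ratio $q/p$.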
## Invited speech
+ Meta Learning of Figure-Ground Segmentation
- Region of interest from user feedback (yes/no: is it in the region?)
- Feedback Segmentation using Transductive Learning (whether a point is in the region)
- SwipeCut (whether a line is in the region of interest)
- Tap&Shoot (tapping focus)
- Learning by Editing (unsupervised learning with GAN): Visual-Effect GAN (VEGAN)
+ On Manageable Visual Storytelling
- Given photos => text story
- A story is not simply multiple image captions
- Cohesion & coherence, creativity, visual grounding
- End-to-end (E2E) models are hard to manage
- Preliminary results showed:
- Model does NOT know how to describe things NOT in the training set
- Need more data (due to small datasets)
- Divide and conquer: image/scene understanding -> story generation
- Semantic layer in between these two steps: FrameNet Terms (verbs and nouns)
+ Network Representation Learning and its Applications
- Network embedding applications: user identification (sharing accounts) (SIGIR)
- Representing data as a heterogeneous network
- Nodes: items/metadata
- Find mappings from nodes to low-dimensional representations
- User ID as ground truth
- Hybrid account-user recommender
- MARINE (WWW)
+ Research at Taiwan AI Labs: Music AI
- Pipeline: audio in, audio out
- Source separation -> music transcription -> composition -> synthesis
+ Exploration via Flow-Based Intrinsic Rewards
- Challenges of curiosity-driven exploration
+ AutoML: Who is Designing Your Neural Net
- AutoML
- Humans should focus on problem formulation
- AutoML in industry: Google Cloud AutoML
- Neural Architecture Search (NAS)
- Automating architecture design
- Subfield of AutoML
- Reinforcement learning (RL)-based & evolutionary algorithm (EA)-based approaches
- NAS: recent trends
- Multi-objective NAS
- Distribution of architectures
- Accelerating NAS
- What’s the next wave?
- Federated learning + NAS
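
To make the "EA-based" bullet concrete, here is a toy evolutionary search loop. It is only a sketch under stated assumptions and is not the method presented in the talk: the search space (`LAYERS`, `WIDTHS`), the `proxy_score` function, and every other name are hypothetical. In a real NAS run the score would come from training and validating each candidate network.

```python
import random

# Hypothetical search space: an architecture is a (num_layers, width) pair.
LAYERS = [2, 4, 6, 8]
WIDTHS = [16, 32, 64, 128]

def proxy_score(arch):
    """Stand-in for validation accuracy (assumption); peaks at (6, 64)."""
    layers, width = arch
    return -((layers - 6) ** 2) - ((width - 64) / 16) ** 2

def mutate(arch, rng):
    """Resample either the depth or the width of a parent architecture."""
    layers, width = arch
    if rng.random() < 0.5:
        layers = rng.choice(LAYERS)
    else:
        width = rng.choice(WIDTHS)
    return (layers, width)

def evolve(generations=30, population_size=8, seed=0):
    rng = random.Random(seed)
    population = [
        (rng.choice(LAYERS), rng.choice(WIDTHS)) for _ in range(population_size)
    ]
    for _ in range(generations):
        # Selection: keep the better half unchanged (elitism).
        population.sort(key=proxy_score, reverse=True)
        parents = population[: population_size // 2]
        # Variation: refill the population with mutated copies of the parents.
        children = [mutate(rng.choice(parents), rng) for _ in parents]
        population = parents + children
    return max(population, key=proxy_score)

best = evolve()
print("best architecture found:", best, "score:", proxy_score(best))
```

Because the best half of the population always survives, the best score never decreases from one generation to the next. A multi-objective variant would replace `proxy_score` with a trade-off between, for example, accuracy and latency.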
+ Towards Unsupervised Speech Recognition
- Why unsupervised learning?
- More than 7000 languages
- Labelling is labour intensive
- Acoustic token discovery
- Problem: tokens are not human-readable
- To get to speech recognition: need a mapping table between tokens and text
- Introducing a GAN: iteratively find a better mapping network
- Through fully unsupervised learning
- Jointly learn token discovery and mapping table
- Learn from itself
- Pseudo labels
- Bootstrapping
- How about semi-supervised learning?
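
The "learn from itself" idea (pseudo labels plus bootstrapping) can be illustrated with a generic self-training loop. This is a minimal sketch and not the speaker's system: it uses a toy 1-D nearest-centroid classifier, two seed labels (an assumption, closer to the semi-supervised case raised in the last bullet), and hypothetical names and data throughout.

```python
def fit_centroids(xs, ys):
    """Return the mean of the points assigned to each label."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(centroids, x):
    """Assign x to the label of the nearest centroid."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))

def self_train(labelled, unlabelled, rounds=3):
    xs = [x for x, _ in labelled]
    ys = [y for _, y in labelled]
    centroids = fit_centroids(xs, ys)
    for _ in range(rounds):
        # The current model labels the unlabelled data (pseudo labels) ...
        pseudo = [predict(centroids, x) for x in unlabelled]
        # ... and is then refit on real plus pseudo labels (bootstrapping).
        centroids = fit_centroids(xs + list(unlabelled), ys + pseudo)
    return centroids

# Toy data (assumption): two seed labels and six unlabelled points.
labelled = [(0.0, "a"), (10.0, "b")]
unlabelled = [1.0, 2.0, 3.0, 8.0, 9.0, 11.0]

centroids = self_train(labelled, unlabelled)
print(centroids)
print(predict(centroids, 4.0), predict(centroids, 7.0))
```

In the fully unsupervised setting described in the notes there are no seed labels, so the initial model would have to come from elsewhere, for example the GAN-learned mapping between tokens and text.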
+ Goal-Driven-Based Speech Enhancement and its Applications to Assistive Hearing Device
- Speech enhancement
- Replace the original norm-based objective with other, task-specific goals
+ A Semantic Approach to Abstractive Summarization
- Extractive/Abstractive
- LCSTS Chinese text summarization dataset