## XENTROPY - End-to-end consulting on <font color="BB86FC">AI</font> and <font color="BB86FC">large-scale compute infrastructure</font>
Chan, Ka Hei
Founder
_chankahei@xentropy.co_
ex-NVIDIA data scientist / solution architect
---
## Background
End-to-end AI consulting
- Identify opportunities to apply AI technologies
- AI-aware project management and system architecture
- AI research and development
- Code optimisation for GPU hardware, in development and production
----
Trusted by
- Hong Kong Science and Technology Park
- Hong Kong Hospital Authority
- AIA Hong Kong
- Manulife Hong Kong
---
## More Background
System and infrastructure administration
- On-premises managed Jupyter notebooks (configuration sketched below)
- Batch job scheduling
- Web access portal
- Networking optimisation
- User management
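
As a hedged illustration only (the Slurm scheduler, partition name, and user lists are placeholder assumptions, not details of any client deployment), a managed on-premises Jupyter setup of this kind is commonly expressed as a JupyterHub configuration that submits each notebook session as a batch job:

```python
# jupyterhub_config.py -- minimal sketch of an on-premises managed Jupyter service.
# Assumes JupyterHub plus the batchspawner plugin in front of a Slurm cluster;
# all names below are illustrative placeholders.

c = get_config()  # injected by JupyterHub when it loads this file

# Web access portal: a single entry point for all users
c.JupyterHub.bind_url = 'http://0.0.0.0:8000'

# Batch job scheduling: each user's notebook server runs as a Slurm job
c.JupyterHub.spawner_class = 'batchspawner.SlurmSpawner'
c.SlurmSpawner.req_partition = 'gpu'    # hypothetical GPU partition
c.SlurmSpawner.req_runtime = '8:00:00'  # wall-time limit per notebook job

# User management: restrict access to known accounts
c.Authenticator.allowed_users = {'alice', 'bob'}
c.Authenticator.admin_users = {'admin'}
```

The same pattern extends to other schedulers (PBS, LSF, etc.) by swapping in the corresponding batchspawner class.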
----
Trusted by
- University of Hong Kong
- Hong Kong Polytechnic University
- Macau University of Science and Technology
---
## DEMO: word autocompletion for clinical notes
- <font color="BB86FC">3 million clinical notes</font> were sampled to customise a GPT-2 model
- a single experiment trained for 4 days on an <font color="03DAC6">8-GPU NVIDIA DGX</font> system
- customised inference logic to meet the performance target (simplified sketch below)
- full-stack, end-to-end development
Click [HERE](http://containers.xentropy.co:8082) to try it out!
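
A minimal sketch of the serving side, assuming the Hugging Face `transformers` API: the public `gpt2` checkpoint and a simple top-k ranking stand in for the fine-tuned clinical model and the customised production inference logic, which are not shown here.

```python
# Sketch of GPT-2 next-word autocompletion (illustrative, not the production logic).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# "gpt2" is a placeholder for the fine-tuned clinical-notes checkpoint.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def suggest_next_words(prefix: str, k: int = 5) -> list[str]:
    """Return the k most likely next tokens for a partially typed note."""
    inputs = tokenizer(prefix, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # distribution over the next token
    top = torch.topk(logits, k).indices
    return [tokenizer.decode(int(t)).strip() for t in top]

print(suggest_next_words("Patient presents with chest"))
```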
---
## Industry Trend
#### Centralised development of large AI models across multiple applications
__OpenAI__ - ChatGPT
__Google__ - PaLM
__META__ - MultiRay
----
### Emergence of large multimodal transformer models
Transformer models have been empirically shown to be effective across most data modalities, and even combinations of modalities.
- Tabular Medical Data [Hi-BEHRT](https://arxiv.org/pdf/2106.11360.pdf)
- Natural Language [ChatGPT](https://openai.com/blog/chatgpt/)
- Computer Vision + Natural language [DALL-E 2](https://openai.com/dall-e-2/)
- Reinforcement Learning [Decision Transformer](https://arxiv.org/pdf/2106.01345)
----
### Major Advantages
- Cost amortisation across many teams
- Simpler development and operations
- Faster research to production: Single-point acceleration
- Improved hardware utilisation
- Take advantage of deep neural networks' scaling properties
----
### Cost amortisation
- the large upfront cost of training large-scale deep neural networks makes all but the most important applications economically unviable
- for a single application, the risk-to-return ratio is hard to justify
- a shared large model acts as a catalyst: the activation cost of each AI project is vastly reduced
----
### Simplified development and operation
- maintaining _ONE_ model is already significant work
  - streamlined validation, feedback, and retraining
  - elastic hardware allocation
  - dependency management
  - ... etc
- maintaining _ONE HUNDRED_ models is impossible
----
### Faster go-to-market time
- reuse __knowledge__, __toolchain__, and __data__ acquired in previous projects
----
### Hardware utilisation
- sharing hardware resources always introduces overheads, which can be significant
- these overheads divert data scientists away from actually producing high-value models
- a centralised large model eliminates most of this overhead
----
### Scaling characteristics of DNNs
- it is empirically observed that, under most circumstances, deep neural network performance scales with compute capacity and data volume in an inverse-log (power-law) relationship (illustrated below)
- more teams sharing a model => more data and more compute capacity for that model => better performance
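
One common way to write down this relationship (an assumption drawn from published scaling-law studies such as Kaplan et al. 2020, not a formula from this deck) is a power law in compute and data, which appears linear on log-log axes:

```latex
% Illustrative scaling-law form (assumption): test loss L falls as a
% power law in compute C and dataset size D, with positive exponents.
L(C) \approx \left(\frac{C_0}{C}\right)^{\alpha_C},
\qquad
L(D) \approx \left(\frac{D_0}{D}\right)^{\alpha_D},
\qquad \alpha_C,\ \alpha_D > 0
```

Under this form, the extra data and compute pooled from additional teams buy a predictable, if diminishing, reduction in loss.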
---
## Collaboration
- Private consulting session
- Joint application development
- Join our community
---
## Discussion
{"metaMigratedAt":"2023-06-17T19:21:39.230Z","metaMigratedFrom":"YAML","title":"Untitled","breaks":true,"slideOptions":"{\"transition\":\"slide\"}","contributors":"[{\"id\":\"d11aef06-1770-4326-b857-a9420d6be67b\",\"add\":3735,\"del\":80}]"}