# LLM Examples

- Docs:
  - https://scicomp.aalto.fi/aalto/generative-ai-tools/
  - https://scicomp.aalto.fi/triton/apps/llms/
  - https://scicomp.aalto.fi/triton/discipline/machinelearning/
- Example repo:
  - https://github.com/AaltoSciComp/llm-examples

#### Demo Plan (40 min)

- General introduction to AI tools at Aalto (5 min)
  - https://scicomp.aalto.fi/aalto/generative-ai-tools/
- Focus on local LLM usage on Triton (15 min)
  - Where to find models
  - Ready-to-use environments on Triton and how to create a conda env (refer to the conda session of day 2)
  - Resources you need to request to run models
    - Partition (tricky sometimes)
      - GPU vRAM and how it relates to the number of parameters (rough sizing sketch below)
      - Some models use operators that rely on newer CUDA features, which only newer GPUs support
      - New frameworks need a higher GPU compute capability
    - System memory
    - Number of CPUs
  - Frameworks to run models
  - Where to find docs and example repos
- Run examples (15 min)
  - Generation via transformers (sketch below)
  - Batch inference via vLLM, if time allows (sketch below)
- Wrap-up (2 min): where to get help and what kind of help we provide
  - Creating your own conda env or figuring out the best practice with LLM frameworks can be a bit tricky, so don't troubleshoot alone: drop by our daily Garage session and ask. We're here to help you choose the right tools and get the most out of them.

:::info
## LLMs on Triton
- Docs:
  - https://scicomp.aalto.fi/triton/apps/llms/
- Example repo:
  - https://github.com/AaltoSciComp/llm-examples
:::
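
#### Sketch: GPU vRAM vs. number of parameters

For the "GPU vRAM vs. number of parameters" point, a rough rule of thumb is: weight memory ≈ number of parameters × bytes per parameter (2 bytes for fp16/bf16, 4 for fp32), plus headroom for the KV cache and activations. A minimal sketch of that arithmetic (the function and the 7B example are illustrative, not taken from the Triton docs):

```python
def estimate_weight_memory_gib(n_params_billion: float, bytes_per_param: int = 2) -> float:
    """Rough memory needed just for the model weights, in GiB.

    bytes_per_param: 2 for fp16/bf16, 4 for fp32, roughly 0.5-1 for 4/8-bit quantization.
    """
    return n_params_billion * 1e9 * bytes_per_param / 2**30


# A 7B-parameter model in bf16 needs roughly 13 GiB for the weights alone.
print(f"{estimate_weight_memory_gib(7):.1f} GiB")
```

In practice you also need a margin for the framework itself and for longer contexts, so a GPU whose vRAM only just matches the weight size is usually not enough.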
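
#### Sketch: generation via transformers

A minimal sketch of the "generation via transformers" step, assuming a conda env with `torch` and `transformers` and a GPU already allocated; the model name is only an illustrative small instruct model, not necessarily what the example repo uses:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative small instruct model; swap in whatever the demo actually uses.
model_name = "HuggingFaceTB/SmolLM2-135M-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # half the memory of fp32 weights
).to("cuda")

messages = [{"role": "user", "content": "In one sentence, what is batch inference?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=100, do_sample=False)
# Strip the prompt tokens and decode only the generated continuation.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

On a shared cluster you may also want to point the Hugging Face cache (the `HF_HOME` environment variable) at work/scratch storage instead of your home directory, so model downloads don't eat your home quota; check the Triton LLM docs for the locally recommended setup.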
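
#### Sketch: batch inference via vLLM

If time allows, the batch-inference step could look roughly like this with vLLM; again the model name is just a placeholder, and this assumes an env with `vllm` installed and a GPU allocated:

```python
from vllm import LLM, SamplingParams

# Illustrative model; pick one that fits the vRAM of the GPU you requested.
llm = LLM(model="HuggingFaceTB/SmolLM2-135M-Instruct")
sampling_params = SamplingParams(temperature=0.0, max_tokens=100)

prompts = [
    "Summarize why batch inference is faster than one-by-one generation.",
    "List three things to check before requesting a GPU on a cluster.",
]

# vLLM batches and schedules the prompts internally against the loaded model.
outputs = llm.generate(prompts, sampling_params)
for out in outputs:
    print(out.prompt)
    print(out.outputs[0].text)
```

The point of the comparison in the demo is that vLLM handles batching and KV-cache management for many prompts at once, which is usually much faster for large prompt lists than looping over `model.generate` in transformers.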