<img src="https://github.com/clizarraga-UAD7/DataScienceLab/raw/main/images/UADLSquareLogo.png?raw=true" width=150>
::: info
# UArizona Data Lab Workshops - Spring 2025
:::
:construction: :construction: :construction: :construction: :construction:
***
## Content
## General Information
**Room Reservations:** Sci & Eng Library Room 212. Tue 1-3pm, Thu 12-3pm (Confirmed)
**Unique Zoom Link for all DataLab Workshops:** [https://arizona.zoom.us/j/89667081542](https://arizona.zoom.us/j/89667081542)
**Qualtrics registration for each Workshop:** ...
| DataLab Workshop| Registration |
| :-- | :-- |
| Research Productivity (3 - online) | [Q Edit link](https://uarizona.co1.qualtrics.com/survey-builder/SV_cw3FdoEFy1SSp26/edit) - [Q Reg Link](https://uarizona.co1.qualtrics.com/jfe/form/SV_cw3FdoEFy1SSp26) |
| AI Makerspace Meet Up | [Q Edit link](https://uarizona.co1.qualtrics.com/jfe/form/SV_5mRIgo8t54wO3Ii/edit) - [Q Reg Link](https://uarizona.co1.qualtrics.com/jfe/form/SV_5mRIgo8t54wO3Ii) |
| Bioinformatics | [Q Edit link](https://uarizona.co1.qualtrics.com/survey-builder/SV_eUHXcEqBSFo44d0/edit) - [Q Reg Link](https://uarizona.co1.qualtrics.com/jfe/form/SV_eUHXcEqBSFo44d0) |
| Classical Machine Learning | [Q Edit link](https://uarizona.co1.qualtrics.com/survey-builder/SV_0CyWx6D43C7ZsmG/edit) - [Q Reg Link](https://uarizona.co1.qualtrics.com/jfe/form/SV_0CyWx6D43C7ZsmG) |
| Data Science Tapas | [Q Edit link](https://uarizona.co1.qualtrics.com/survey-builder/SV_brM5XGZHc4AhHgO/edit) - [Q Reg Link](https://uarizona.co1.qualtrics.com/jfe/form/SV_brM5XGZHc4AhHgO) |
| Functional Open Science Skills for AI/ML Applications | [Q Edit link](https://uarizona.co1.qualtrics.com/survey-builder/SV_9M475S7x4HkUWNg/edit) - [Q Reg Link](https://uarizona.co1.qualtrics.com/jfe/form/SV_9M475S7x4HkUWNg) |
| Mastering GenAI Applications | [Q Edit link](https://uarizona.co1.qualtrics.com/survey-builder/SV_0wWiJ946ta9ExzE/edit) - [Q Reg Link](https://uarizona.co1.qualtrics.com/jfe/form/SV_0wWiJ946ta9ExzE) |
| NLP for All (online) | [Q Edit link](https://uarizona.co1.qualtrics.com/survey-builder/SV_3pEBKSiN4ejcY86/edit) - [Q Reg Link](https://uarizona.co1.qualtrics.com/jfe/form/SV_3pEBKSiN4ejcY86) |
| CyVerse Office Hours | [Q Edit link](https://uarizona.co1.qualtrics.com/survey-builder/SV_d0F8WzR8CjuF6Qe/edit) - [Q Reg Link](https://uarizona.co1.qualtrics.com/jfe/form/SV_d0F8WzR8CjuF6Qe) |
| CyVerse Webinars | [Q Reg Link 1](https://uarizona.co1.qualtrics.com/jfe/form/SV_cMggVcnCLwAWL6m) <br> [Q Reg Link 2] <br> [Q Reg Link 3] <br> [Q Reg Link 4] <br> **Links 2-4 pending** |
| Container Camp | [Q Link]() **Pending TBU** |
| Getting started with Soteria | [Q Reg Link](https://uarizona.co1.qualtrics.com/jfe/form/SV_8l9nHhIzCyk2DVs) |
### List of known Workshops
1. [Functional Open Science Skills for AI/ML Applications](https://github.com/ua-datalab/FOSS_AI-ML/wiki) (Michele / ) (hybrid)
2. [Bioinformatics](https://github.com/ua-datalab/Bioinformatics/wiki) (Michele / Clement / Francesca / Simona) (hybrid)
3. [NLP for All](https://github.com/ua-datalab/NLP-Speech/blob/main/README.md) (Megh / ) (online)
4. [LLM Frontiers](https://github.com/ua-datalab/Generative-AI) (Enrique / Nick / Carlos )
<!-- 5. [Build a LLM from Scratch](https://github.com/ua-datalab/LLMs-from-scratch/wiki) (Carlos / Enrique / ) ([Greg's program plan](https://gchism.notion.site/Reading-group-LLMs-11cd24ce394d80f0b2a6d37302dd714e))
-->
6. [Research Productivity](https://github.com/ua-datalab/ResearchProductivity/blob/main/README.md) (Rudy) (online)
7. Soteria (Rudy / Michele) (online)
8. CyVerse Office Hours (Michele) (hybrid)
9. AI Applications in CyVerse Webinar (Michele / ) (online)
10. Container Camp (Michele) (in-person, Spring break)
***
### Schedule
| Time | Mon | Tue | Wed |Thu | Fri |
| :--: | :-- | :-- | :-- | :-- | :-- |
| 9:00 |This day is reserved for **Consultations** | This morning is reserved for research on emerging technologies | | This day is reserved for research on emerging technologies | This day is reserved for research on emerging technologies |
| 10:00 | | [**Research Productivity Workshop**](https://github.com/ua-datalab/ResearchProductivity/blob/main/README.md) (Jan 28, Mar 18) | | [**Research Productivity Workshop**](https://github.com/ua-datalab/ResearchProductivity/blob/main/README.md) (Feb 20) | [**CyVerse Office Hours**](https://learning.cyverse.org/) (every other week) + [**CyVerse webinar**]() (every other week) |
| 11:00 | | | [**DataLab Meetings**](https://docs.google.com/document/d/1c4tf8dd055bwiFs13mE9F3TU-rEooxiKft61dpDL-DM/edit?usp=sharing) - every other week | | |
| 12:00 | | | | [**NLP for All**](https://github.com/ua-datalab/NLP-Speech) | |
| 13:00 | | [**Classical Machine Learning**](https://github.com/ua-datalab/MLWorkshops/blob/main/README.md) | [**Data Science Tapas**](https://github.com/ua-datalab/DataScience-Tapas/blob/main/README.md) | [**Mastering Generative AI Foundation Models for Research**](https://github.com/ua-datalab/Generative-AI/blob/main/README.md) | |
| 14:00 | | [**Functional Open Science Skills for AI/ML Applications**](https://github.com/ua-datalab/FunctionalOpenSourceSkills/wiki) | | [**Bioinformatics**](https://github.com/ua-datalab/Bioinformatics/wiki) | |
| 15:30 | | [**AI Makerspace Meet Up**](https://github.com/ua-datalab/AI-Makerspace) (Snakes & Lattes)| | | |
| 16:00 | | | | | |
<br/>
<br/>
| Activity | Dates | Instructors|
| :-- | :-- |:--|
|**DataLab Workshops** | **Week of Jan 28 - Week of Mar. 28 (8 weeks)** | |
| Research Productivity (3 - online) | Jan 28, Feb 20, Mar 18 @10-11AM / Apr 9, 10 & 11 @10AM-11:30 | Rudy S. |
| AI Makerspace (in-person consultations - Snakes & Lattes) | Tue @3:30PM | Enrique N., Carlos L. |
| Bioinformatics | Thu @2PM | Michele C., Clement R., Francesca V., Simona M. |
| Classical Machine Learning | Tue @1PM | Carlos L. |
| Data Science Tapas | alternate Wed @1PM | _Various_ |
| Fundamental Skills for Open Science | Tue @2PM | Michele C., Carlos L., Enrique N., Leonardo S. |
| Mastering Generative AI Foundation Models for Research | Thu @1PM | Enrique N., Nick E., Carlos L. |
| NLP for All (online) | Thu @12PM | Megh K., Mithun P. |
| CyVerse Office Hours/Webinars | alternate Fri @10AM | Michele C. + |
| Container Camp | (Spring break) | |
****
<!--
**Notes**:
**Tina Lee**:
* CyVerse seminars S2025 (every 3-4 weeks) (30-60min)
2/3 Apps development/workflows using CyVerse
1/3 How to use CyVerse / beginners-intermediate
YouTube (8 years ) 100 Webinars
S2025: Ilyoung (2), ...
* Rethink Webinars: HowTo, Podcasts, ...
* DSI + CyVerse Events Calendar
***
**Greg**:
* JetStream 2 Fellowship Pilot (AI in Healthcare for Researchers) - 9 months - Health Science researchers (CB2, ) - own dynamics Spring + Fall 2025
* Raschka's LLM from Scratch seminar with IS students - Spring 2025
***
**Rudy**:
* **Research Productivity Workshop**
I. (Jan 28, Feb 20, March 18: 10:00 - 11:30)
II. (Apr 9, 10, 11. 8:45-10:00)
* Soteria Marketing
* **CyVerse Health - Solution for Data Privacy and Compliance: Getting Started with Soteria**
I. Wed Feb 5 (9:00-9:30am)
II. Wed Mar 19 (2:00-2:30pm)
***
**Michele**:
* Bioinformatics (with Clement Goubert, Simona Merlini, Francesca Vitali)
* Add new sessions: QTL (Feb 27), GWAS (Mar 6)
* Guest sessions: Advanced RNA-Seq (Simona/Francesca, Feb 20-28), Transposable Elements (Clement, Mar 20), Current AI/ML implementations in Biosciences (Wheeler Lab)
* Notes to be updated upon confirmation from guests.
* [Foundational Open Source Skills for AI/ML Applications](https://github.com/ua-datalab/FunctionalOpenSourceSkills/wiki) Jan 28-Mar 25, 8 sessions.
* "FOSS for AI/ML" is not a final name.
* FOSS for AI/ML additional notes [here](https://github.com/ua-datalab/mlpaths/wiki/Computer-Vision-and-Image%E2%80%90based-Learning)
* To add: list of covered software (Gradio, YOLO, ultralytics); image annotation notes (Label Studio, Roboflow).
* For Tapas: I'd like to cover Git/GitHub and Reproducibility.
***
**Megh**:
* NLP for All Workshop: Add sessions on annotations (human + automatic, organize, use AI tools), reorder topics, w/Mithun (8-9 sessions). Find infrastructure (CyVerse) for running an LLM, extra GPU clusters in CyVerse (PyTorch GPU), 1) Michele onboard session. 2) Handholding session, 3) streamline,...
***
**Enrique & Carlos**
_~~Build a LLM from Scratch~~_
Replicate Sebastian Raschka [book and code](https://github.com/rasbt/LLMs-from-scratch)

| Topic | Date | Description |
| :-- | :--: | :-- |
| PyTorch refresh and setup | Jan 28 | |
| Understanding large language models | Feb 4 | |
| Working with text data |Feb 11 | |
|Coding attention mechanisms | Feb 18 | |
| Implementing a GPT from scratch to generate text | Feb 25 | |
| Pretraining on unlabeled data | Mar 4 | |
| Spring break - NO session | Mar 8 - 16 | |
| Fine-tuning for classification | Mar 18 | |
| Fine-tuning to follow instructions | Mar 25 | |
***
**Carlos, ...**
Add additional sessions to the _LLM Frontiers Workshop_
**(Pending)**
(Replicate [Google/Kaggle Generative AI workshop](https://github.com/clizarraga-UAD7/Notebooks/blob/main/Google-GenAI/readme.md))
* Prompt Engineering ([Notebook](https://github.com/clizarraga-UAD7/Notebooks/blob/main/Google-GenAI/Day_1_Prompting.ipynb))
* Document Q&A with RAG using [Chroma](https://docs.trychroma.com/) ([Notebook](https://github.com/clizarraga-UAD7/Notebooks/blob/main/Google-GenAI/Day_2_Document_Q%26A_with_RAG.ipynb))
* Embeddings and Similarity Scores ([Notebook](https://github.com/clizarraga-UAD7/Notebooks/blob/main/Google-GenAI/Day_2_Embeddings_and_similarity_scores.ipynb))
* Classifying embeddings ([Notebook](https://github.com/clizarraga-UAD7/Notebooks/blob/main/Google-GenAI/Day_2_Classifying_embeddings_with_Keras.ipynb))
* **Function calling using Gemini API** ([Notebook](https://github.com/clizarraga-UAD7/Notebooks/blob/main/Google-GenAI/Day_3_Function_calling_with_the_Gemini_API.ipynb))
* **Building agents using [LangGraph](https://www.langchain.com/langgraph)** ([Notebook](https://github.com/clizarraga-UAD7/Notebooks/blob/main/Google-GenAI/Day_3_Building_an_agent_with_LangGraph.ipynb))
* Google Search grounding with Gemini API using [Google AI Studio](https://aistudio.google.com/prompts/new_chat) ([Notebook](https://github.com/clizarraga-UAD7/Notebooks/blob/main/Google-GenAI/Day_4_Google_Search_grounding.ipynb))
* Fine tuning a custom model using [Google AI Studio](https://aistudio.google.com/app/tune) ([Notebook](https://github.com/clizarraga-UAD7/Notebooks/blob/main/Google-GenAI/Day_4_Fine_tuning_a_custom_model.ipynb))
* Multimodal prompting using Gemini API ([Notebook](https://github.com/clizarraga-UAD7/Notebooks/blob/main/Google-GenAI/Bonus_Day_Extra_API_features_to_try.ipynb))
* [Google End-to-End Gen AI App Starter Pack](https://github.com/GoogleCloudPlatform/generative-ai/tree/main/gemini/sample-apps/e2e-gen-ai-app-starter-pack)
* Prototype chain of steps
* Integrate into the App
* Playground testing
* Deploy with CI/CD pipelines
* Monitor in production
***
LLM Applications & Related Topics
**(Pending)**
* Ollama
* Llama + OCR
* Multimodal RAG
* Code Generation for Scientific Computing (generating scientific computing code, particularly for data analysis and simulation)
* Scientific Literature Summarization (complex scientific papers)
* Parsing data from documents
* **Scientific Image Analysis (multimodal LLMs to interpret and annotate scientific images)**
* Structuring all language model responses
* Question answering for Docs
* Virtual assistants
* Classification (labeling)
* Data Analysis and Interpretation (interpret complex scientific datasets, generate insights, and suggest analysis strategies)
* **Data Labeling** (Labelbox, Label Studio, )
* **Leonardo's experiments** (YOLO, Roboflow, ) End-to-end Jetson project
* **Jeff/Tyson's drone data**
* **POSE Estimation**
* Use of NVIDIA GPU in HPC/CyVerse
* CUDA/Numba
* RAPIDS, Polars, CUDA
***
### S2025 Workshop topic ideas
1. Ollama scale-up: Local, CyVerse, HPC (Use NVIDIA GPUs)
1. Use AI Verde in some example (OpenAI API)
1. Best practices for prompt engineering using AI Verde
1. Quick RAG application using AI Verde / HPC
1. Multimodal Q&A+OCR in AI Verde (?) (Docling, Llama 3.2, ...)
* Multimodal Embedding Models RAG
1. SQL specialized query code generation
1. Function calling with LLMs
1. Code generation assistants
***
Scratch space:
**Function calling using Gemini API** (Agents, structured output: SQL generation,.. )
**Building agents using [LangGraph](https://www.langchain.com/langgraph)** (LlamaIndex)
**Llama multimodal** (Ollama local) - OCR
**U of A use cases:** (Real examples - Research objective) access to a V100 on Jetstream - CyVerse
* **Use Ollama across platforms** (run locally and scale up)
* **OCR example**
* **SQL** Jake Harwood (Communications)
* **LLM**
* **Fine Tuning. Use cases Amazon review**
*
**Things you wanted to do with LLMs but you were afraid to ask.**
-->