# NLP with AI tools
---
# Getting to know the tools
- **Hugging Face**
- Vector Database
- Agent
---
# Hugging Face 🤗
----
## Using the Hugging Face Hub
<iframe width="560" height="315" src="https://www.youtube.com/embed/rkCly_cbMBk?si=9E8VrS_Mqg1faB00" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
- Models/Spaces/Datasets: [each one is a Git repo](https://huggingface.co/docs/hub/repositories-getting-started)
- Keeping GitHub and the Hub in sync: [GitHub Action](https://github.com/marketplace/actions/sync-with-hugging-face-hub)
- [`git remote add`](https://w3c.hexschool.com/git/fd426d5a)
----
## Using a model from Hugging Face
```
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="mistralai/Mistral-7B-v0.1")
```
```
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
```
----
## Using Hugging Face Spaces
- [Python web app hosting](https://huggingface.co/spaces/launch)
<img src="https://huggingface.co/front/assets/spaces-launch-page/streamlit-logo.svg" style="background:white;"></img>

---
# Getting to know the tools
- Hugging Face
- **Vector Database**
- Agent
---
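# Vector databases
- Designed to store data as vectors
- Provide fast vector queries (distance computation)
- Distributed architecture, high availability (HA), and so on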
- [REF](https://courses.edx.org/asset-v1:Databricks+LLM101x+2T2023+type@asset+block@Module_2_slides.pdf)
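
To make "vector query as distance computation" concrete, here is a minimal pure-Python sketch of the core idea: store vectors under ids, then return the ones closest to a query. The class `TinyVectorStore` and its methods are hypothetical names invented for this slide, and the linear scan is only a stand-in for the approximate indexes that real vector databases use.

```python
import math


class TinyVectorStore:
    """Hypothetical in-memory store; illustrative only, not a real product's API."""

    def __init__(self):
        self._vectors = {}  # item id -> list of floats

    def add(self, item_id, vector):
        self._vectors[item_id] = [float(x) for x in vector]

    def query(self, vector, top_k=3):
        """Return (id, distance) pairs for the top_k closest vectors (Euclidean)."""
        scored = [
            (item_id, math.dist(vector, stored))
            for item_id, stored in self._vectors.items()
        ]
        scored.sort(key=lambda pair: pair[1])  # smallest distance first
        return scored[:top_k]


store = TinyVectorStore()
store.add("a", [1, 0, 0])
store.add("b", [0, 1, 0])
store.add("c", [0.9, 0.1, 0])
nearest = store.query([1, 0, 0], top_k=2)
print(nearest)
```

The scan costs one distance computation per stored vector, which is why dedicated systems add indexing and distribution on top of this idea.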
----
## Pinecone
- [Article Recommender in Typescript](https://docs.pinecone.io/docs/personalized-content-recommendations-typescript)
- [Full code](https://github.com/pinecone-io/recommender-example-typescript)
<img src="https://cdn.sanity.io/images/vr8gru94/production/e88ebbacb848b09e477d11eedf4209d10ea4ac0a-1399x537.png" style="background:white;"></img>
----
## Milvus
- [Run Milvus using Python](https://milvus.io/docs/example_code.md)

----
## Weaviate
- [Quickstart Tutorial](https://weaviate.io/developers/weaviate/quickstart)

----
## PostgreSQL
```
CREATE EXTENSION vector;
```
- [PostgreSQL extension: `pgvector`](https://github.com/pgvector/pgvector)
- [`pgvector` on Azure Database for PostgreSQL](https://learn.microsoft.com/zh-tw/azure/postgresql/flexible-server/how-to-use-pgvector)
```
-- create table
CREATE TABLE tblvector(
id bigserial PRIMARY KEY,
embedding vector(3)
);
-- query the 5 nearest rows; <-> is pgvector's L2 (Euclidean) distance operator
SELECT * FROM tblvector
ORDER BY embedding <-> '[3,1,2]'
LIMIT 5;
```
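
In pgvector, `<->` orders rows by L2 (Euclidean) distance, so the query above returns the five rows nearest to `[3,1,2]`. The sketch below mimics that ordering in plain Python and shows how a Python list maps to the `'[3,1,2]'` text literal. The helper names and sample rows are hypothetical, and no database is contacted.

```python
import math


def to_pgvector_literal(values):
    """Format a Python sequence as a pgvector text literal such as '[3,1,2]'."""
    return "[" + ",".join(str(v) for v in values) + "]"


def top_k_by_l2(rows, query, k=5):
    """Mimic `ORDER BY embedding <-> query LIMIT k` on (id, embedding) tuples."""
    return sorted(rows, key=lambda row: math.dist(row[1], query))[:k]


# Hypothetical sample rows standing in for the contents of tblvector.
rows = [
    (1, [1, 1, 1]),
    (2, [3, 1, 2]),
    (3, [2, 2, 2]),
    (4, [9, 9, 9]),
]
query = [3, 1, 2]

print(to_pgvector_literal(query))
print([row_id for row_id, _ in top_k_by_l2(rows, query, k=2)])
```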
----
## Azure Cognitive Search
- [Quickstart: Vector search using REST APIs](https://learn.microsoft.com/en-us/azure/search/search-get-started-vector)

---
# Getting to know the tools
- Hugging Face
- Vector Database
- **Agent**
---
# Agents
<iframe width="560" height="315" src="https://www.youtube.com/embed/fqVLjtvWgq8?si=fUzcNAQ8YDr8bMb2" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
[Video source: Twitter](https://x.com/swyx/status/1672686090589990912?s=20)
----
## Agent
- [Why You Need To Know About Autonomous AI Agents](https://www.kdnuggets.com/2023/06/need-know-autonomous-ai-agents.html)
- RAG: Retrieval Augmented Generation
- ReAct: Reason + Act
- [REF](https://courses.edx.org/asset-v1:Databricks+LLM101x+2T2023+type@asset+block@Module_3_slides.pdf)
----
## RAG

- https://zhuanlan.zhihu.com/p/655363719
- https://www.promptingguide.ai/techniques/rag
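
As a rough illustration of the retrieve-then-generate pattern, the sketch below retrieves the most relevant document for a question and places it in the prompt. Everything here is an assumption made for the slide: the three sample documents, the bag-of-words `embed` function (a real system would use a neural encoder and a vector database), and the prompt wording. The final call to a language model is left out.

```python
import math
import re
from collections import Counter

# Hypothetical mini knowledge base; a real system would hold many documents.
DOCUMENTS = [
    "pgvector is a PostgreSQL extension for vector similarity search.",
    "Hugging Face Spaces hosts Python web apps such as Streamlit demos.",
    "ReAct prompts a model to interleave reasoning steps and tool calls.",
]


def embed(text):
    """Toy embedding: bag-of-words counts over lowercase ASCII words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, top_k=1):
    """Retrieval step: rank documents by similarity to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, documents, top_k=1):
    """Augmentation step: put the retrieved text in front of the question."""
    context = "\n".join(retrieve(question, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


prompt = build_prompt("What is pgvector?", DOCUMENTS)
print(prompt)
```

The resulting `prompt` is what would be sent to the model in the generation step.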
----
## Chain

[ref](https://ai.plainenglish.io/using-langchain-chains-and-agents-for-llm-application-development-d538f6c70bc6)
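
The idea behind a chain can be sketched without any framework: each step's output becomes the next step's input. This is not LangChain's API; `make_chain`, `prompt_template`, `fake_llm`, and `parse_output` are hypothetical names, and `fake_llm` is a stand-in that only echoes its prompt.

```python
def make_chain(*steps):
    """Compose steps left to right: each step's output feeds the next."""
    def run(value):
        for step in steps:
            value = step(value)
        return value
    return run


def prompt_template(topic):
    """Step 1: turn the raw input into a prompt."""
    return f"Write one sentence about {topic}."


def fake_llm(prompt):
    """Step 2: stand-in for a model call (assumption); echoes the prompt."""
    return f"LLM OUTPUT: {prompt}"


def parse_output(text):
    """Step 3: strip the fixed prefix that fake_llm adds."""
    return text.split(": ", 1)[1].strip()


chain = make_chain(prompt_template, fake_llm, parse_output)
result = chain("vector databases")
print(result)
```

Swapping `fake_llm` for a real model call would leave the other steps unchanged, which is the main appeal of the pattern.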
----
## ReAct

[ref](https://tsmatz.wordpress.com/2023/03/07/react-with-openai-gpt-and-langchain/)
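
A ReAct loop alternates between model output (Thought and Action) and tool results (Observation) until the model gives a final answer. The sketch below shows only that control flow. The `scripted_llm` function is a hard-coded stand-in for a real model, and the `Action: tool[input]` text format and the `calculator` tool are assumptions chosen for illustration.

```python
import re


def calculator(expression):
    """Tool: evaluate a simple 'a op b' expression on non-negative integers."""
    a, op, b = re.fullmatch(r"\s*(\d+)\s*([+*-])\s*(\d+)\s*", expression).groups()
    a, b = int(a), int(b)
    return str({"+": a + b, "-": a - b, "*": a * b}[op])


TOOLS = {"calculator": calculator}


def scripted_llm(transcript):
    """Stand-in for a model (assumption): picks the next step from the transcript."""
    if "Observation:" not in transcript:
        return "Thought: I need to compute this.\nAction: calculator[6 * 7]"
    observation = transcript.rsplit("Observation: ", 1)[1].strip()
    return f"Thought: I have the result.\nFinal Answer: {observation}"


def react(question, llm, tools, max_steps=5):
    """Run the Thought/Action/Observation loop until a final answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += reply + "\n"
        final = re.search(r"Final Answer: (.*)", reply)
        if final:
            return final.group(1), transcript
        action = re.search(r"Action: (\w+)\[(.*)\]", reply)
        if action is None:
            break  # the model produced neither an action nor an answer
        tool_name, tool_input = action.groups()
        transcript += f"Observation: {tools[tool_name](tool_input)}\n"
    return None, transcript


answer, trace = react("What is 6 * 7?", scripted_llm, TOOLS)
print(trace)
```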
----
## Agent
---
# Wrap it up
- NLP with ~~AI~~ LLM tools
- LLMOps
---
## Recommended course
- [Large Language Models: Application through Production](https://learning.edx.org/course/course-v1:Databricks+LLM101x+2T2023/home)
- [Lab repo](https://github.com/databricks-academy/large-language-models)