reference: https://www.youtube.com/watch?v=AVIKFXLCPY8&list=PLJV_el3uVTsPz6CTopeRp2L2t4aL_KgiI&pp=iAQB
# LLM (Large Language Model)
[TOC]
Large Language Models (LLMs) are a type of deep learning model with a very large number of parameters.
The more parameters the model has, the more complex and powerful it becomes.

Paper source: arXiv
## Generative AI
Definition:
Generative AI refers to systems that can generate complex, structured objects (e.g., text, images, audio).

In order to train a model to discover the underlying pattern or function, vast amounts of data are required!
## Language Model (LM)

Examples of Language Models:
- ChatGPT
- Gemini
## Prompt Engineering
Prompt engineering is the process of crafting prompts to guide LLMs in generating the desired responses.

### Magic Prompt

Key steps for creating effective prompts (a combined sketch follows the references below):
1. Let’s think step by step | Breaking down the thought process.
2. Explain the answer (Chain of Thought).
3. This is important to my life | Emotional manipulation as a prompt strategy.
4. It’s more effective to tell LLMs what to do rather than what not to do.
5. Rewards and penalties can be helpful in optimizing responses.
http://arxiv.org/abs/2312.16171
https://arxiv.org/pdf/2312.16171
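
A minimal sketch of how these "magic prompt" ingredients might be combined into a single prompt string; `build_magic_prompt` is a hypothetical helper, not part of any library, and the exact wording is only illustrative.

```python
def build_magic_prompt(question: str) -> str:
    """Combine the 'magic prompt' ingredients listed above.

    Key ideas: step-by-step reasoning, asking for an explanation
    (chain of thought), emotional stakes, positive instructions,
    and a reward framing.
    """
    return (
        f"{question}\n"
        "Let's think step by step and explain the reasoning behind the answer.\n"
        "This is very important to my career, so please be careful.\n"
        "Focus on what to do rather than on what to avoid.\n"
        "A correct, well-explained answer will be rewarded."
    )

if __name__ == "__main__":
    print(build_magic_prompt("What is 17 * 24?"))
```
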
### Providing Premises & Examples (In-context Learning)
In-context learning involves giving examples within the prompt to improve the accuracy of LLM responses.

#### In-context Learning
Providing examples helps improve the correctness of responses.
For example, reminding GPT-4 to "read the examples" can lead to more accurate answers.

Gemini 1.5 has strong in-context learning capabilities.
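
A minimal sketch of an in-context (few-shot) prompt for a toy sentiment task; the examples and labels below are made up for illustration, and the resulting string would be sent to whatever LLM API you use.

```python
# In-context learning: the examples live inside the prompt; no model
# parameters are updated.
EXAMPLES = [
    ("I loved this movie!", "positive"),
    ("The plot was boring and slow.", "negative"),
]

def few_shot_prompt(query: str) -> str:
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for text, label in EXAMPLES:
        lines += [f"Review: {text}", f"Sentiment: {label}", ""]
    lines += [f"Review: {query}", "Sentiment:"]
    return "\n".join(lines)

print(few_shot_prompt("Great acting, terrible ending."))
```
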

### Breaking Down a Task
Ref: https://arxiv.org/abs/2210.06774
Recursive Reprompting & Revision
Breaking down complex tasks can be achieved through techniques like recursive reprompting and revision.

Similar to the chain-of-thought technique.

GPT-3.5 applies the chain-of-thought technique by default.
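
A rough sketch of the decomposition idea (plan, expand each step, then revise), assuming a hypothetical `llm(prompt) -> str` callable rather than any particular API.

```python
from typing import Callable

def solve_by_decomposition(task: str, llm: Callable[[str], str]) -> str:
    """Break a task into steps, solve each step, then revise the draft.

    Loosely mirrors recursive reprompting & revision:
    plan -> expand each step -> ask the model to revise its own output.
    """
    plan = llm(f"Break this task into 3 short steps, one per line:\n{task}")
    draft = ""
    for step in plan.splitlines():
        if step.strip():
            draft += llm(
                f"Task: {task}\nDraft so far:\n{draft}\nNow do this step: {step}"
            ) + "\n"
    return llm(f"Task: {task}\nDraft:\n{draft}\nRevise the draft and fix any problems.")
```
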
## Self-Reflection by LLMs
Self-reflection is valuable because it allows the model to identify its own errors.
The LLM gets a chance to discover that its answer is wrong.
This works because generating a correct answer is harder than verifying one (just like for humans!).
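
A small sketch of the self-reflection loop, again assuming a hypothetical `llm(prompt) -> str` callable: generate an answer, then give the model a separate chance to verify it.

```python
from typing import Callable

def answer_with_reflection(question: str, llm: Callable[[str], str]) -> str:
    """Generate an answer, then ask the model to check its own work.

    Verifying an answer is usually easier than producing it, so the
    reflection pass gives the model a chance to catch its own mistakes.
    """
    answer = llm(question)
    verdict = llm(
        f"Question: {question}\nProposed answer: {answer}\n"
        "Check the answer carefully. Reply exactly 'CORRECT' if it is right, "
        "otherwise give a corrected answer."
    )
    return answer if verdict.strip().upper().startswith("CORRECT") else verdict
```
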

### Constitutional AI (Harmlessness from AI Feedback)
This refers to AI ensuring its responses remain harmless, following predefined safety policies.
-> Policy Protection


## Self-Consistency
Since LLM responses vary due to randomness,
multiple queries can be used to extract the best (most consistent) response, improving the accuracy rate.

https://arxiv.org/abs/2203.11171
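
A minimal sketch of self-consistency: sample several answers (the randomness comes from temperature-based decoding) and keep the most common one. `sample_llm` is a hypothetical callable that returns one stochastic sample per call.

```python
from collections import Counter
from typing import Callable

def self_consistent_answer(question: str,
                           sample_llm: Callable[[str], str],
                           n: int = 5) -> str:
    """Query the model several times and return the majority answer."""
    answers = [sample_llm(question).strip() for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]
```
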
### Tree of Thoughts
Combines self-reflection and self-consistency to refine the reasoning process.

## Letting GPT Use Tools (DALL-E, etc.)
LLMs can use different tools to answer complex questions.

A single question may require multiple tools to answer.

https://youtu.be/ZID220t_MpI?feature=shared
### Retrieval Augmented Generation (RAG)

In this method, GPT-4 determines when to use external tools, such as the internet, to provide accurate and relevant answers.
For instance, you could append “please use the internet to answer this question” to a prompt.
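
A toy sketch of the RAG flow: retrieve relevant text, then stuff it into the prompt. Real systems use embedding-based vector search; the keyword scorer, the `DOCS` snippets, and the `llm` callable below are illustrative assumptions.

```python
from typing import Callable, List

DOCS = [
    "The Eiffel Tower is 330 metres tall.",
    "Mount Everest is 8,849 metres high.",
]

def retrieve(query: str, docs: List[str], k: int = 1) -> List[str]:
    """Toy keyword-overlap retriever (stand-in for vector search)."""
    def overlap(d: str) -> int:
        return len(set(query.lower().split()) & set(d.lower().split()))
    return sorted(docs, key=overlap, reverse=True)[:k]

def rag_answer(query: str, llm: Callable[[str], str]) -> str:
    context = "\n".join(retrieve(query, DOCS))
    return llm(f"Use the context to answer.\nContext:\n{context}\n\nQuestion: {query}")
```
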
### Program of Thought
Letting GPT write code, execute the program, and return the result.
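
A bare-bones sketch of program of thought with a hypothetical `llm` callable: the model writes Python, we run it, and return the computed result. Executing model-generated code is unsafe outside a sandbox; this is illustration only.

```python
from typing import Callable

def program_of_thought(question: str, llm: Callable[[str], str]) -> str:
    """Ask the model for Python code that computes the answer, then run it."""
    code = llm(
        f"Write plain Python code (no markdown) that computes the answer to:\n"
        f"{question}\nStore the final answer in a variable named result."
    )
    scope: dict = {}
    exec(code, scope)  # WARNING: only ever do this inside a sandbox
    return str(scope.get("result"))
```
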

### GPT Plug-in
GPT plug-ins extend the model's functionality and enable tool use.
## Model Cooperation
Combining models leads to better results (1 + 1 > 2). Models can complement each other's strengths.

e.g., FrugalGPT
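
A sketch of a FrugalGPT-style cascade: answer with the cheap model first and only escalate to the expensive one when a confidence score is low. `cheap_llm`, `expensive_llm`, and `confidence` are all hypothetical callables supplied by the caller.

```python
from typing import Callable

def cascade_answer(question: str,
                   cheap_llm: Callable[[str], str],
                   expensive_llm: Callable[[str], str],
                   confidence: Callable[[str, str], float],
                   threshold: float = 0.8) -> str:
    """Try the cheap model first; fall back to the expensive model if needed."""
    answer = cheap_llm(question)
    if confidence(question, answer) >= threshold:
        return answer
    return expensive_llm(question)
```
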
### Self-Reflection Across Models
Different models can reflect on each other's outputs and collaborate to improve their results.


### Multi-Agent Debate
Using multiple AI agents to hold debates improves problem-solving. Each agent offers insights, and the debate continues until the best conclusion is reached.
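
A simple sketch of a two-agent debate loop with a fixed number of rounds and a final judge prompt; the `llm` callable and the prompt wording are assumptions, not a prescribed protocol.

```python
from typing import Callable

def two_agent_debate(question: str, llm: Callable[[str], str], rounds: int = 3) -> str:
    """Two agents exchange arguments, then a judge prompt picks the answer."""
    transcript = f"Question: {question}\n"
    for _ in range(rounds):
        for name in ("Agent A", "Agent B"):
            reply = llm(
                f"{transcript}\n{name}, state your answer and rebut the other agent. "
                "Do not concede too quickly."
            )
            transcript += f"{name}: {reply}\n"
    return llm(f"{transcript}\nAs a judge, give the final, best-supported answer.")
```
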

### Debate Methods
Avoid ending a debate too quickly.

How do we determine that the debate has finished?

### Debate Prompt
The debate prompt should tell agents not to concede or end the debate too quickly.

# LLM Training

## Step 1: Self-Supervised Learning (Pre-Training)

- Language knowledge
- World knowledge
- Expert knowledge
- Domain knowledge
### Training Method
Crawl data from the internet -> self-supervised learning.

### Cleaning Data
- Content filtering
  - Filter harmful data
- Text extraction
  - Remove HTML tags
- Quality filtering
  - Remove low-quality data
- Repetition removal
  - Remove duplicate data
- Test-set filtering
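
A toy sketch of the cleaning steps above (HTML removal, quality filtering, duplicate removal) applied to a list of crawled pages; real pipelines use far more sophisticated filters.

```python
import re

def clean_corpus(pages: list[str]) -> list[str]:
    """Apply simplified versions of the cleaning steps listed above."""
    cleaned, seen = [], set()
    for page in pages:
        text = re.sub(r"<[^>]+>", " ", page)       # text extraction: strip HTML tags
        text = re.sub(r"\s+", " ", text).strip()
        if len(text.split()) < 5:                  # quality filtering: drop very short docs
            continue
        if text in seen:                           # repetition removal: drop exact duplicates
            continue
        seen.add(text)
        cleaned.append(text)
    return cleaned

print(clean_corpus(["<p>Hello   world, this is a test page.</p>", "<p>Hi</p>"]))
```
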
### LLM Capability
Parameters -> innate aptitude ("nature")
Data amount -> acquired learning ("nurture")

| Model | Parameters | Training Data |
| ----- | ---------- | ------------- |
| GPT-1 | 117 M      | 1 GB          |
| GPT-2 | 1542 M     | 40 GB         |
| GPT-3 | 175 B      | 580 GB        |
## Step 2: Instruction Fine-Tuning (Supervised Learning)
-> Requires labeled data (USER: / AI: pairs)
e.g.
USER: Who is the most handsome person?
AI: 侯
USER: Who is the most handsome person? 侯
AI: 智
USER: Who is the most handsome person? 侯智
AI: 晟
(The model produces the answer one token at a time, each time conditioned on the question plus the tokens generated so far; see the sketch below.)
Conclusion: we can use the parameters from self-supervised learning as the initial parameters.
Next, use human-labeled data to fine-tune the LLM parameters.
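
A small sketch of how one labeled (USER, AI) pair can be expanded into next-token prediction examples, mirroring the character-by-character example above (each character stands in for one token; real tokenizers differ).

```python
def next_token_examples(user: str, ai: str) -> list[tuple[str, str]]:
    """Expand one labeled pair into (context, next token) training examples."""
    examples = []
    for i, ch in enumerate(ai):
        context = f"USER: {user} AI: {ai[:i]}"
        examples.append((context, ch))  # predict the next character/token
    return examples

for ctx, target in next_token_examples("Who is the most handsome person?", "侯智晟"):
    print(repr(ctx), "->", repr(target))
```
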
### Adapter
An adapter is a technique that does not adjust the original parameters; instead, it appends new parameters that modify the original function.
- Reduces training time
- Keeps the new parameters from drifting far away from the original parameters

Pre-training -> learns the complex rules
Adapter:
- LoRA -> F(x) + G(x)
Key point: if we have a great pre-trained model, we will get great results after fine-tuning.
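
A tiny numpy sketch of the LoRA idea F(x) + G(x): the pre-trained weight W stays frozen while a low-rank correction B·A is trained on top. The dimensions and initialization here are illustrative assumptions.

```python
import numpy as np

d, r = 8, 2                       # model dimension and LoRA rank (r << d)
W = np.random.randn(d, d)         # pre-trained weight, kept frozen
A = np.random.randn(r, d) * 0.01  # trainable low-rank factor
B = np.zeros((d, r))              # zero-init so the adapter starts as a no-op

def lora_forward(x: np.ndarray) -> np.ndarray:
    """F(x) + G(x): frozen pre-trained path plus the low-rank adapter path."""
    return W @ x + B @ (A @ x)

print(lora_forward(np.random.randn(d)))
```
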


General fine-tuning

Review of methods:
https://arxiv.org/abs/1909.03329v2
InstructGPT:

https://arxiv.org/abs/2203.02155
Instruction fine-tuning is the finishing touch.

-> In instruction fine-tuning, data quality is important.
### Reverse-Engineering ChatGPT to Obtain Instruction Fine-Tuning Data
-> Ask ChatGPT: what kinds of tasks can you do?
-> For each task, generate a plausible user input (question).
-> For each user input, generate the corresponding answer.
We end up with '{Question} and {Answer}' pairs (generated by ChatGPT).
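
A sketch of that three-step pipeline with a hypothetical `chatgpt(prompt) -> str` callable: ask for tasks, generate a user input per task, then generate the answer.

```python
from typing import Callable

def generate_instruction_data(chatgpt: Callable[[str], str],
                              n_tasks: int = 3) -> list[dict]:
    """Tasks -> user inputs -> answers, as described in the steps above."""
    tasks = chatgpt(f"List {n_tasks} kinds of tasks you can do, one per line.").splitlines()
    data = []
    for task in (t.strip() for t in tasks):
        if not task:
            continue
        question = chatgpt(f"Write a realistic user request for this task: {task}")
        answer = chatgpt(f"Answer this user request:\n{question}")
        data.append({"question": question, "answer": answer})
    return data
```
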
### Initial Parameters
LLaMA -> an open-source pre-trained model.
### Fine-Tuning Your Own Model
- Pre-training: LLaMA
- Instruction fine-tuning: data obtained by reverse-engineering ChatGPT (see above)
## Step 3: RLHF
Reinforcement Learning from Human Feedback!
------------------------------------------
# [Paper - Research](/OWtbE0z5Qk-pe7UE_2HuCQ)