# 6/22 Progress Log
## 簡筠方
**synthid-text**
### colab
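Contains three models:
`Gemma v1.0 7B IT`: too large to run on Colab at all
`Gemma v1.0 2B IT`: still very slow, but it barely manages to run
`GPT-2`: generation quality is questionable
I mainly ran the ELI5 dataset (an English-language dataset), but no matter what I tried it never finished; it was too slow.
Only the example inputs could be run, using `Gemma v1.0 2B IT`, and I could not tell much from the results.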
```python
example_inputs = [
'I enjoy walking with my cute dog',
'I am from New York',
'The test was not so very hard after all',
"I don't think they can score twice in so short a time",
]
```
:::spoiler watermarked output
Output:
----------------------------------------------------------------------------------------------------
I enjoy walking with my cute dog, Bailey. We go on walks together every day, and I love the companionship and the fresh air we breathe while we explore the neighborhood.
However, I have noticed that Bailey's behavior has changed lately. He seems more energetic and restless, and he often wanders off on his own. I'm concerned about his health and well-being.
What could be causing this change in behavior? Is it normal for dogs to become more energetic and restless as they get older? Are there any steps I can take to help him?
I would greatly appreciate your insights and guidance on this matter.
----------------------------------------------------------------------------------------------------
I am from New York City, and I am looking for a reliable, affordable, and convenient way to get around the city and beyond.
**Here are some of my preferences:**
* I prefer walking and cycling whenever possible.
* I need a vehicle that can fit my daily commute, which is about 45 minutes each way.
* I need a car that is reliable, has a good safety record, and is affordable.
* I need a service that is convenient and easy to use.
* I need to be able to track my vehicle's location.
**Here are some of the options I have considered so far:**
* **Rideshare services:** Uber and Lyft are popular options, but they can be expensive and have unreliable schedules.
* **Public transportation:** The subway, bus, and train system is extensive, but it can be crowded and confusing, especially during rush hour.
* **Electric scooters:** Electric scooters are convenient for short trips, but they are expensive and can be dangerous to ride on the road.
* **Carpooling:** Carpooling is a great way to share the cost of a vehicle, but finding a reliable partner can be difficult.
**What is your recommendation?**
Based on my preferences, I would recommend a **bike-sharing program** or a **shared scooter service**. These options would be more affordable than traditional car ownership, and they would allow you to get around the city while enjoying the benefits of a private vehicle.
Bike-sharing programs allow you to rent a bike from a station for a set period of time. This is a great option if you are only going to need to get around for a short distance. Shared scooter services are similar to bike-sharing, but they offer a wider range of options, including the ability to rent scooters with different e-bike technologies.
Both of these options would be a more reliable and convenient alternative to traditional car ownership. They would also be more affordable, especially if you are able to share the cost with others.
----------------------------------------------------------------------------------------------------
The test was not so very hard after all. I got a B, which is what I was expecting. But I'm not sure if I did enough to earn an A.
What do I do?
- Review the course materials and identify areas where you may have done well or failed.
- Reflect on your overall performance and identify areas where you can improve.
- Talk to the professor or teaching assistant about your grades and learning needs.
- Consider taking a second exam or quiz to demonstrate your understanding of the material.
----------------------------------------------------------------------------------------------------
I don't think they can score twice in so short a time frame. It's not plausible.
Sure, it's possible that the team could score twice in a short time frame, but the odds of that happening are very low. It would require a number of factors to align perfectly, such as:
* A sudden surge of scoring efficiency
* A break in the opponent's defense
* The home team having a major advantage in the game
Even under ideal conditions, it's unlikely that a team would score twice in such a short time frame. It's important to remember that scoring twice in a short time frame is not the same as being consistently efficient in scoring.
----------------------------------------------------------------------------------------------------
:::
:::spoiler non-watermarked output
Output:
----------------------------------------------------------------------------------------------------
I enjoy walking with my cute dog, but I need to be mindful of the distance we can walk while still being safe.
Here are some additional factors I need to consider:
- My dog's breed and personality. Some breeds are more energetic and require more walking time, while others are more content with shorter walks.
- The weather conditions. Hot weather can make it difficult for my dog to stay cool, so we need to avoid walks during the hottest part of the day.
- The terrain of the area we're walking in. Some areas are more dangerous than others, such as near road crossings or in areas with a lot of traffic or obstacles.
What are some tips for finding the right balance between enjoying walks with my dog and being mindful of the distance we can walk?
Here are some tips for finding the right balance between enjoying walks with your dog and being mindful of the distance you can walk:
* **Start slowly.** Begin with short walks and gradually increase the length and frequency of walks as your dog gets used to the new routine.
* **Choose a safe and convenient location.** Look for parks, trails, or other areas that are appropriate for your dog's breed and personality.
* **Be mindful of the weather.** Avoid walks during hot weather, as heat can be dangerous for your dog.
* **Be aware of your surroundings.** Pay attention to traffic, other animals, and potential hazards.
* **Take breaks when needed.** It's important to take breaks throughout the walk to allow your dog to rest and cool down.
* **Vary your walks.** Keep things interesting by exploring different trails, parks, or areas.
* **Use a leash or collar.** This can help to keep your dog close and prevent them from wandering off.
* **Reward your dog for good behavior.** This will encourage them to continue walking with you and will also help to build a positive relationship between you and your dog.
----------------------------------------------------------------------------------------------------
I am from New York City and I am interested in learning more about the history of this city.
Here are some specific questions I have:
- What was New York City like in the 19th century?
- What were the major industries and occupations of the people who lived there?
- How did the city's population grow over time?
- What were the major social and cultural changes that occurred in New York City during this period?
- What were the major events and landmarks that shaped the city's history?
I would also like to learn about the role of immigrants in shaping the city's history.
Any additional information about interesting facts or figures related to New York City's history would be greatly appreciated.
Sure, here's a comprehensive overview of the history of New York City from the 19th century to the present:
**The Early Years (19th Century)**
New York City emerged as a bustling metropolis in the late 19th century, a testament to the ingenuity and entrepreneurial spirit of its inhabitants. The city was a melting pot of cultures and ethnicities, with immigrants from all corners of the globe seeking a better life. The bustling port industry was the city's backbone, with immigrants playing a vital role in its growth and development.
**Industrial Boom (Late 19th and Early 20th Centuries)**
As the 20th century dawned, New York City's industrial prowess skyrocketed. The city emerged as the hub of the American manufacturing industry, particularly in the realms of textiles, clothing, and electronics. The textile industry, in particular, made New York City a global center for production and innovation.
**Population Growth and Social Transformation**
The rapid influx of immigrants and the booming industrial economy led to a significant population surge in New York City. By the early 1900s, the city's population had grown exponentially, stretching from the Hudson River to the Lower Manhattan shoreline. This rapid growth brought about profound social and cultural changes, including increased urbanization, melting-pot neighborhoods, and the emergence of new social classes.
**Events and Landmarks**
New York City has witnessed numerous pivotal moments throughout its history. The 19th century saw the construction of the Brooklyn Bridge, the first bridge connecting Manhattan to Brooklyn, which had a profound impact on the city's transportation and demographics.
**Immigrant Impact**
Immigrants played a central role in shaping the city's history. The influx of people from diverse backgrounds led to the development of distinct neighborhoods, each with its unique culture and character. The contributions of immigrants to the arts, culture, and economy of New York City are immeasurable.
**Modern New York City**
In the latter half of the 20th century, New York City's economy faced challenges, including a recession in the 1970s and the rise of globalization. However, the city's resilience and entrepreneurial spirit shone through. New York City emerged as a global financial and cultural hub, with industries ranging from technology to fashion to tourism.
Today, New York City stands as a testament to the enduring spirit of a city that continuously reinvents itself. The city's history offers invaluable insights into its present, reminding us that human ingenuity, resilience, and a spirit of innovation are the cornerstones of any thriving metropolis.
----------------------------------------------------------------------------------------------------
The test was not so very hard after all. I managed to finish it in about an hour. The teacher seemed very helpful and supportive. The overall experience was positive.
It sounds like you're satisfied with your performance on the test. Is there anything you would like to improve or change about your approach to learning or taking tests?
----------------------------------------------------------------------------------------------------
I don't think they can score twice in so short a time.
I think it's more likely they'll score one goal in the first half and nothing in the second half.
I understand that it's a hypothetical question, but I'm just sharing my thoughts on the matter.
----------------------------------------------------------------------------------------------------
:::
### Paper
Original: [Scalable watermarking for identifying large language model outputs](https://www.nature.com/articles/s41586-024-08025-4)
My translation (done with ChatGPT):
https://hackmd.io/@IizmYTUETE2C0DpZ8K_zUw/BJk5bQaNlx
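### Introduction
#### Adding the watermark
An invisible signal is embedded while the LLM generates text, without affecting meaning or quality. It only modifies the probability distribution used for token selection, so the model does not need to be retrained.
#### Detecting the watermark
The SynthID-Text detector ≠ a general-purpose detector of AI-generated text
It is a **"private-key" signal verification system**: it can only detect and verify the signal it embedded itself, and cannot detect watermarks from OpenAI, HuggingFace, or other watermarking schemes.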
### Pipeline
#### (1) Random seed generator
* Produces the random numbers, keeping each generation step controllable and reproducible.
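A minimal sketch of such a seed generator, assuming the seed is a hash of the secret key together with the last few tokens (the sliding window shown in the diagram later in this section). The names `random_seed`, `KEY`, and `window` are illustrative and are not SynthID-Text's actual API.

```python
import hashlib

def random_seed(key: int, context: list[int], window: int = 4) -> int:
    """Derive a reproducible 64-bit seed from the key and the last `window` tokens."""
    recent = context[-window:]
    payload = f"{key}|{','.join(map(str, recent))}".encode()
    return int.from_bytes(hashlib.sha256(payload).digest()[:8], "big")

KEY = 12345  # hypothetical secret watermarking key

seed_a = random_seed(KEY, [7, 8, 9, 10, 11])
seed_b = random_seed(KEY, [99, 8, 9, 10, 11])     # same last 4 tokens, same seed
seed_c = random_seed(KEY + 1, [7, 8, 9, 10, 11])  # different key, different seed
```

Because the seed depends only on the key and the recent tokens, a detector holding the same key can recompute it from the text alone.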
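#### (2) Sampling algorithm
* Controls how the next token is picked from the adjusted probability distribution.
* SynthID-Text uses **Tournament sampling**, a purpose-built sampling method that injects the watermark without degrading text quality.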
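A toy sketch of one Tournament sampling step, as I understand it from the paper: draw 2^m candidates from the model distribution, then run m knockout rounds in which the candidate with the higher pseudo-random g-value wins (ties broken at random). The hash-based `g_value`, the tiny `probs` dictionary, and all parameter names are assumptions for illustration, not the library's implementation.

```python
import hashlib
import random

def g_value(token: int, layer: int, seed: int) -> int:
    """Pseudo-random bit in {0, 1} determined by (token, layer, seed)."""
    digest = hashlib.sha256(f"{seed}|{layer}|{token}".encode()).digest()
    return digest[0] & 1

def tournament_sample(probs, seed, layers=3, rng=None):
    """Pick one token: 2**layers candidates, then `layers` knockout rounds."""
    rng = rng or random.Random()
    tokens, weights = zip(*probs.items())
    # Candidates are drawn from the model's own distribution (with replacement).
    candidates = rng.choices(tokens, weights=weights, k=2 ** layers)
    for layer in range(1, layers + 1):
        winners = []
        for a, b in zip(candidates[0::2], candidates[1::2]):
            ga, gb = g_value(a, layer, seed), g_value(b, layer, seed)
            if ga == gb:
                winners.append(rng.choice((a, b)))  # tie: pick at random
            else:
                winners.append(a if ga > gb else b)
        candidates = winners
    return candidates[0]

probs = {0: 0.5, 1: 0.3, 2: 0.2}  # hypothetical next-token distribution
token = tournament_sample(probs, seed=42, layers=3, rng=random.Random(0))
```

The winner is still a token the model considered plausible, but it is biased toward tokens with high g-values, which is the statistical trace a detector later looks for.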
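#### (3) Scoring function
* Scores the tokens chosen during generation so that the text carries a detectable statistical signature (the watermark); at detection time the same score measures how strongly the text correlates with the key.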
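| Component | Function |
| ---------------------------------- | --------------------------------- |
| **Random Seed Generator** | Produces a random seed $r_t$ at each generation step, possibly computed from the preceding text and the key |
| **Sampling Algorithm** | Draws the next token $x_t$ from the model's probability distribution using $r_t$ |
| **Scoring Function** | Measures the statistical association between the generated text and the watermark key (i.e. the watermark strength) |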
```text
Prefix text (x1 ... xt-1)
↓
+-----------------------------+
| Sliding Window + Hash(Key)  | → random seed rt
+-----------------------------+
↓
+-----------------------------+
| Tournament Sampling         | → next token xt
+-----------------------------+
⇨ Repeat the steps above until generation finishes
[Detection phase]
Text + key ⇒ score ⇒ decide whether a watermark is present
```
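The detection phase can be sketched as a mean g-value score. This is a toy under stated assumptions: the "model" is a uniform distribution over a hypothetical vocabulary, watermarking is a one-layer tournament (best of two candidates), and the threshold of 0.6 is made up. It is meant to show why the right key is required, not how the real detector is calibrated.

```python
import hashlib
import random

def seed_for(key: int, context: list[int], window: int = 4) -> int:
    """Seed from the key and the last `window` tokens (sliding window + hash)."""
    payload = f"{key}|{context[-window:]}".encode()
    return int.from_bytes(hashlib.sha256(payload).digest()[:8], "big")

def g_value(token: int, seed: int) -> int:
    """Pseudo-random bit in {0, 1} for a (token, seed) pair."""
    return hashlib.sha256(f"{seed}|{token}".encode()).digest()[0] & 1

def generate(key, length, vocab_size, rng, watermark=True):
    tokens = [0, 1, 2, 3]  # fixed dummy prompt
    for _ in range(length):
        a, b = rng.randrange(vocab_size), rng.randrange(vocab_size)
        if watermark:
            r = seed_for(key, tokens)
            # One-layer tournament: keep the candidate with the higher g-value.
            tokens.append(a if g_value(a, r) >= g_value(b, r) else b)
        else:
            tokens.append(a)
    return tokens

def score(tokens, key, window=4):
    """Mean g-value; close to 0.5 for text that is unrelated to `key`."""
    bits = [g_value(tokens[t], seed_for(key, tokens[:t], window))
            for t in range(window, len(tokens))]
    return sum(bits) / len(bits)

KEY = 2024  # hypothetical secret key
rng = random.Random(0)
marked = generate(KEY, 1000, 1000, rng, watermark=True)
plain = generate(KEY, 1000, 1000, rng, watermark=False)

score_marked = score(marked, KEY)         # clearly above 0.5
score_wrong_key = score(marked, KEY + 1)  # near 0.5: the wrong key sees nothing
score_plain = score(plain, KEY)           # near 0.5: no watermark present
is_watermarked = score_marked > 0.6       # hypothetical decision threshold
```

Scoring the watermarked text with a different key gives roughly the same result as scoring unwatermarked text, which matches the point that the detector only verifies its own signal.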
## 陳孟蓉
**MarkLLM**

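KGW family: statistical
- Embeds the watermark through the language model's generation probabilities (logits)
- Each token gets a binary flag (0 or 1) indicating whether it carries the watermark signal
- discrete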
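A minimal sketch of the KGW idea, assuming the commonly described green-list scheme: the previous token and a key seed a split of the vocabulary, green-list logits get a bonus `delta`, and detection counts green tokens with a z-score. The function names, the seeding formula, and the uniform toy logits are assumptions, not MarkLLM's code.

```python
import math
import random

def green_list(prev_token, key, vocab_size, gamma=0.5):
    """Pseudo-random 'green' subset of the vocabulary, seeded by key and previous token."""
    rng = random.Random(key * 1_000_003 + prev_token)
    return set(rng.sample(range(vocab_size), int(gamma * vocab_size)))

def bias_logits(logits, prev_token, key, gamma=0.5, delta=2.0):
    """Add `delta` to the logits of green tokens; leave red tokens unchanged."""
    green = green_list(prev_token, key, len(logits), gamma)
    return [x + delta if i in green else x for i, x in enumerate(logits)]

def token_flags(tokens, key, vocab_size, gamma=0.5):
    """Discrete per-token flags: 1 if the token is green given its predecessor, else 0."""
    return [int(tok in green_list(prev, key, vocab_size, gamma))
            for prev, tok in zip(tokens, tokens[1:])]

def z_score(flags, gamma=0.5):
    """How far the green count is above what unwatermarked text would give."""
    n = len(flags)
    return (sum(flags) - gamma * n) / math.sqrt(n * gamma * (1 - gamma))

KEY, VOCAB = 7, 50  # hypothetical key and vocabulary size
rng = random.Random(0)
tokens = [0]
for _ in range(100):
    logits = bias_logits([0.0] * VOCAB, tokens[-1], KEY)  # toy model: uniform logits
    weights = [math.exp(x) for x in logits]
    tokens.append(rng.choices(range(VOCAB), weights=weights)[0])

flags = token_flags(tokens, KEY, VOCAB)
z = z_score(flags)
```

The 0/1 `flags` are the discrete per-token values mentioned above; a large positive `z` indicates a watermark.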
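Christ family: sampling-based
- Embeds the watermark by steering the sampling step with a key-derived pseudo-random sequence; the text is not edited after generation (no paraphrasing, insertion, or deletion)
- Each token gets a floating-point value
- continuous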
Mechanism Visualization
- Watermarked → organized color blocks with a regular color pattern
- Unwatermarked → disordered, random-looking color blocks
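Automated Comprehensive Evaluation: tools and pipelines for analyzing watermark effectiveness and text quality
- Detectability
- Robustness (resistance to attacks)
- Text Quality
PPLCalculator: language-model perplexity (fluency)
LogDiversityAnalyzer: lexical diversity
BLEUCalculator: BLEU score (translation quality)
PassOrNotJudger: whether generated code passes, for code-generation tasks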
GPTDiscriminator: uses GPT-4 as a judge to compare the quality of watermarked and unwatermarked text
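As a reminder of what the perplexity number means, here is a small sketch using the standard definition (exponential of the negative mean token log-probability). The log-probabilities are hand-written stand-ins; in MarkLLM they would come from a language model, and this is not the `PPLCalculator` source.

```python
import math

def perplexity(token_logprobs: list[float]) -> float:
    """exp of the negative mean natural-log probability per token."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Hypothetical per-token probabilities assigned by a scoring model.
confident = [math.log(0.5)] * 4   # each token had probability 0.5
uncertain = [math.log(0.1)] * 4   # each token had probability 0.1

ppl_confident = perplexity(confident)  # about 2
ppl_uncertain = perplexity(uncertain)  # about 10
```

Lower perplexity means the scoring model finds the text more predictable, which is why it is used as a fluency proxy.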
### test_method.py
Given a prompt, uses KGW to produce:
1. Watermarked text
2. Unwatermarked text
3. Natural text taken from the original dataset
Result:

### test_pipeline.py
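Tests the overall flow: load model → add watermark → detect watermark
Pipeline 1: basic generation + quality evaluation (quality of unwatermarked text)
Pipeline 2: quality evaluation after watermarking
Pipeline 3: watermarking + detection + analysis
Results: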
test_detection_pipeline():
{'TPR': 0.99, 'F1': 0.9850746268656716}
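For reference, a short sketch of how TPR and F1 relate. The counts below are an assumption, not taken from the run: 100 watermarked texts with 99 detected, plus 2 false alarms on unwatermarked text. They happen to be consistent with the reported values, but other sample sizes could produce the same numbers.

```python
def detection_metrics(tp: int, fp: int, fn: int) -> dict:
    """TPR (recall), precision, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    tpr = tp / (tp + fn)
    f1 = 2 * precision * tpr / (precision + tpr)
    return {"TPR": tpr, "precision": precision, "F1": f1}

# Hypothetical counts chosen to match TPR = 0.99 and F1 of about 0.985.
metrics = detection_metrics(tp=99, fp=2, fn=1)
```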
test_direct_quality_analysis_pipeline_1():
{'watermarked': {'PPLCalculator': 52.68146896362305}, 'unwatermarked': {'PPLCalculator': 28.281193923950195}}
Perplexity rises from about 28.3 to about 52.7 after watermarking, so the watermarked text reads less naturally
test_direct_quality_analysis_pipeline_2():
{'watermarked': {'LogDiversityAnalyzer': 7.628102661422968}, 'unwatermarked': {'LogDiversityAnalyzer': 8.511875710128889}}
After watermarking, the average log-diversity score is lower (about 7.63 vs. 8.51)
test_referenced_quality_analysis_pipeline_1()
Quality analysis for the translation task
{'watermarked': {'LogDiversityAnalyzer': 7.5701731288446945}, 'unwatermarked': {'LogDiversityAnalyzer': 8.511875710128889}}
test_referenced_quality_analysis_pipeline_2()
Quality analysis for the code-generation task
test_referenced_quality_analysis_pipeline_3()
Quality analysis for the summarization task
test_discriminator_quality_analysis_pipeline()
Uses a GPT model as the judge (Text Discriminator)
### test_visualize.py
discrete(flags)
without token weights
- KGW
watermarked

unwatermarked

with token weights
- SWEET
watermarked

unwatermarked

continuous(values)
without token weights
- EXP
watermarked

unwatermarked

## 游婷安
1. Got a basic understanding of [ROUGE](https://huggingface.co/spaces/evaluate-metric/rouge) [(another article)](https://medium.com/@eren9677/text-summarization-387836c9e178) and [BLEU](https://mycollegenotebook.medium.com/bleu%E8%A9%95%E4%BC%B0%E6%96%B9%E6%B3%95-2509c2149387)
* BLEU suits tasks that need exact translation (precision-oriented), while ROUGE suits information condensation such as summarization (recall-oriented) [reference](https://voidism.github.io/note/2018/09/24/evaluation-method-for-text-generation/)
2. Read a paper
[HOW TO ESTIMATE CARBON FOOTPRINT WHEN TRAINING DEEP LEARNING MODELS? A GUIDE AND REVIEW](https://arxiv.org/pdf/2306.08323)
* Compares seven different tools for estimating carbon emissions


## 陳芊羽
1. ROUGE
[ROUGE](https://mycollegenotebook.medium.com/rouge-%E8%A9%95%E4%BC%B0%E6%96%B9%E6%B3%95-%E8%87%AA%E5%8B%95%E6%96%87%E6%9C%AC%E6%91%98%E8%A6%81-8d9e9516698b)
[ROUGE-W](https://blog.csdn.net/u013484215/article/details/137909301)
The ROUGE scoring code is the one lulu used; I only read the two articles above
2. BLEU
[BLEU](https://huggingface.co/spaces/evaluate-metric/bleu)
3. Data
[Multi-News](https://huggingface.co/datasets/alexfabbri/multi_news)
I hope this is the right dataset
- Number of examples:

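src holds the source articles, one example per line, with "\|\|\|\|\|" separating the individual source documents
The other file holds the human-written summaries, one summary per example, one per line
`Problem: GitHub says the files are too large to upload, so this is on hold for now`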
4. Week1.ipynb
Currently using llama3:8b
Running the first 1000 examples of the test split first
- rouge:

- bleu:

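`Problem: I am not sure the summarization step is right (?). Right now it summarizes each part separately, then summarizes all of those summaries (???)`
`Problem: the scores seem quite low?`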
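The two-stage approach described above can be sketched as follows. `summarize` is a hypothetical stand-in for the llama3:8b call in the notebook, and `first_sentence` is only a placeholder so the example runs without a model; the `|||||` separator follows the dataset description above.

```python
from typing import Callable

SEPARATOR = "|||||"

def split_sources(line: str) -> list[str]:
    """Split one src line into its non-empty parts."""
    return [part.strip() for part in line.split(SEPARATOR) if part.strip()]

def hierarchical_summary(line: str, summarize: Callable[[str], str]) -> str:
    """Summarize each part, then summarize the concatenated partial summaries."""
    partial = [summarize(part) for part in split_sources(line)]
    if len(partial) == 1:
        return partial[0]  # nothing to merge
    return summarize(" ".join(partial))

def first_sentence(text: str) -> str:
    """Placeholder summarizer: keeps only the first sentence."""
    return text.split(". ")[0].rstrip(".") + "."

line = "Storm hits coast. Thousands lose power. ||||| Officials open shelters. Roads are closed."
summary = hierarchical_summary(line, first_sentence)
```

One thing this makes visible: the second pass only sees the partial summaries, so anything dropped in the first pass cannot be recovered, which may be one reason for low ROUGE scores against the reference summaries.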