Backbone model

LLaMA

LLaMA
lit-llama
LLaMA-Adapter
LLaMA: Open and Efficient Foundation Language Models (paper)
LLaMA 2

ChatGLM

ChatGLM-6B code
ChatGLM-6B_v2_huggingface
GLM: General Language Model Pretraining with Autoregressive Blank Infilling (paper)


ChatGLM-LoRA-RLHF-PyTorch code

T5 model

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Github

Parameter-efficient Adapters

Towards a Unified View of Parameter-Efficient Transfer Learning
Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners
Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning
Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning
Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning

Progress direction V1

Understanding of Noise \ HyperPrompt

Some methods

Towards a Better Understanding of Noise in Natural Language Processing

HyperPrompt: Prompt-based Task-Conditioning of Transformers

CTG

HMMs

Prediction-Constrained Hidden Markov Models for Semi-Supervised Classification

Tractable Control for Autoregressive Language Generation

  • Diversity and fluency: in an HMM, different hidden states can represent multiple possible generation outcomes, which yields diverse generated text. This can be achieved by introducing more variability into the HMM's state transitions.
  • Semantic and referential consistency: the HMM's state transitions and emission probabilities can be designed so that semantic and referential consistency are preserved during generation. This requires model parameters that capture the meaning of the context.
  • Restricting the generation range: constraining the state transitions and emission probabilities limits what the HMM can generate and avoids meaningless or absurd text (see the sketch after this list).
  • The only paper (so far) stating that CTG can be realized by combining it with PCs (specifically HMMs).
  • Datasets: CommonGen
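A minimal sketch of the HMM idea in the bullets above, under toy assumptions: emission probabilities are masked by an allowed-word set so the generation range is restricted. The vocabulary, state count, and all probabilities are invented for illustration; this is not the GeLaTo/PC implementation.

```python
import numpy as np

# Toy HMM: 2 hidden states over a 4-word vocabulary (all values are made up).
vocab = ["dog", "runs", "sleeps", "banana"]
pi = np.array([0.7, 0.3])                   # initial state distribution
A = np.array([[0.2, 0.8],                   # state-transition probabilities
              [0.6, 0.4]])
B = np.array([[0.6, 0.1, 0.1, 0.2],         # emission probabilities per state
              [0.1, 0.5, 0.3, 0.1]])

# Constraint: only allow words from this set (restricts the generation range).
allowed = {"dog", "runs", "sleeps"}
mask = np.array([1.0 if w in allowed else 0.0 for w in vocab])

def constrained_sample(length, seed=0):
    """Sample a word sequence from the HMM with emissions masked by the constraint."""
    rng = np.random.default_rng(seed)
    words, state_dist = [], pi
    for _ in range(length):
        state = rng.choice(len(state_dist), p=state_dist)   # pick a hidden state
        probs = B[state] * mask                              # zero out disallowed words
        probs = probs / probs.sum()                          # renormalize emissions
        words.append(vocab[rng.choice(len(vocab), p=probs)])
        state_dist = A[state]                                 # next-state distribution
    return " ".join(words)

print(constrained_sample(5))   # never contains "banana"
```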

CTG

Why is constrained neural language generation particularly challenging?

COLD Decoding: Energy-based Constrained Text Generation with Langevin Dynamics

  • The sequence produced by Langevin Dynamics is top-masked against the sequence produced by an ordinary LM (a sketch of this step follows below).
  • Handles conflicts or contradictions between constraints better, e.g., requiring the generated text to satisfy both a style constraint and a semantic constraint at the same time.
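A rough sketch of the top-mask step mentioned above, under simplifying assumptions: at one position, the relaxed logits coming from the Langevin updates are only allowed to pick tokens that the base LM itself ranks in its top-k. Tensor shapes and values are invented; this is not the official COLD code.

```python
import torch

vocab_size, k = 10, 3
soft_logits = torch.randn(vocab_size)   # relaxed logits at one position (from Langevin updates)
lm_logits = torch.randn(vocab_size)     # base LM's logits at the same position

topk_ids = lm_logits.topk(k).indices    # tokens the plain LM finds plausible
mask = torch.full_like(soft_logits, float("-inf"))
mask[topk_ids] = 0.0                    # allow only the LM's top-k tokens

token_id = (soft_logits + mask).argmax().item()   # discretize to a fluent token
print(token_id)
```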

Effective Unsupervised Constrained Text Generation based on Perturbed Masking

COLLIE: Systematic Construction of Constrained Text Generation Tasks

Parallel Refinements for Lexically Constrained Text Generation with BART

Constrained Beam Search

Controllable Text Generation with Language Constraints

  • Balance the specified constraints and fluency

MultiControl_github

Inference GeLaTo

# presumably adds a local Julia install to PATH for the GeLaTo inference setup (path is machine-specific)
export PATH=$PATH:/home/Work/julia-1.9.3/bin/
Current idea:
Problem / challenge: constrained text generation struggles with polysemy and ambiguity, so the generated result can be uncertain or fail to match expectations.
Ambiguity modeling: use probabilistic circuits to model the ambiguity in the generation process efficiently.
Constraint integration: integrate the constraint information into the probabilistic circuit so that the generated text satisfies the given constraints. These can be additional conditions such as grammar rules or contextual requirements (a sketch follows below).
Uncertainty quantification: represent the multiple possible choices at a given generation step explicitly as a probability distribution, so the generated text carries probabilistic information.
Dynamic adjustment: dynamically adjust the model's attention or weights during generation to better fit a specific constraint or context.
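A small sketch of the constraint-integration and uncertainty-quantification points, assuming a toy sum-product circuit: one sum node over two product nodes, each factorizing two token positions X1 and X2. The vocabulary, weights, and the "X2 must be a verb" constraint are all invented; a real system would use a PC library and an actual LM.

```python
import numpy as np

# Tiny circuit: mixture (sum node) over two product nodes, each factorizing X1 and X2.
vocab = ["cat", "dog", "runs", "eats"]
w = np.array([0.6, 0.4])                        # sum-node (mixture) weights
p_x1 = np.array([[0.5, 0.3, 0.1, 0.1],          # P(X1 | component)
                 [0.1, 0.2, 0.4, 0.3]])
p_x2 = np.array([[0.1, 0.1, 0.5, 0.3],          # P(X2 | component)
                 [0.3, 0.3, 0.2, 0.2]])

# Constraint integration: X2 must be a verb, encoded as an evidence mask over the vocabulary.
constraint = np.array([0.0, 0.0, 1.0, 1.0])

# Marginalize X2 under the constraint inside each component, then combine at the sum node.
comp_evidence = p_x2 @ constraint               # P(constraint | component)
joint_x1 = (w * comp_evidence) @ p_x1           # unnormalized P(X1, constraint)
posterior_x1 = joint_x1 / joint_x1.sum()        # P(X1 | constraint): uncertainty over choices

for word, p in zip(vocab, posterior_x1):
    print(f"P(X1={word!r} | X2 is a verb) = {p:.3f}")
```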

Comparison to traditional task-specific approaches

  • Flexibility:
    • Traditional Task-Specific Approaches: Typically involve designing a task-specific model or system from scratch. This can be resource-intensive and less flexible when adapting to new tasks or changing requirements.
    • Constraint-Based Text Generation: Offers more flexibility as the same underlying model can be adapted to various tasks by adjusting the constraints. This versatility can be particularly advantageous in dynamic environments or when dealing with a range of tasks.
  • Resource Efficiency:
    • Traditional Task-Specific Approaches: Require collecting and annotating large amounts of task-specific data. Building and training models for each task can be resource-intensive.
    • Constraint-Based Text Generation: May leverage pre-trained language models, which have been trained on vast amounts of general data. This can significantly reduce the need for extensive task-specific datasets and training resources.
  • Scalability:
    • Traditional Task-Specific Approaches: Building and maintaining models for multiple tasks can be challenging to scale, especially when dealing with diverse tasks or domains.
    • Constraint-Based Text Generation: Offers scalability because the same model architecture can be reused across different tasks with adjustments made to the constraints. This can simplify the process of scaling to new tasks.
  • Rapid Prototyping:
    • Traditional Task-Specific Approaches: Developing a new task-specific model can be time-consuming, especially in cases where the task is not well-defined or evolving.
    • Constraint-Based Text Generation: Enables rapid prototyping and experimentation. Since the base model is pre-trained, adapting it to a new task involves defining the constraints, which can expedite the development process.
  • Transfer Learning:
    • Traditional Task-Specific Approaches: May not easily transfer knowledge learned from one task to another.
    • Constraint-Based Text Generation: Capitalizes on transfer learning as the pre-trained model brings general language understanding. This knowledge can be fine-tuned for specific tasks with the introduction of task-specific constraints.
  • Consistency and Compliance:
    • Traditional Task-Specific Approaches: Ensuring consistency and compliance with specific constraints or guidelines may require extensive manual effort.
    • Constraint-Based Text Generation: Offers a systematic way to enforce constraints, ensuring generated content aligns with predefined rules, standards, or domain-specific requirements.
Probabilistic circuits

Probabilistic circuits slide from UCLA
Probabilistic Circuits: Representation and Inference (paper)
Tractable Control for Autoregressive Language Generation

10/24 \ 11/27

10/24

Why is constrained neural language generation particularly challenging?

11/27

Diverse and Faithful Knowledge-Grounded Dialogue Generation via Sequential Posterior Inference

  • We propose a probabilistic dialogue system for KGD that can be learned by approximate MLE with sequential posterior inference (SPI).

Plug in the Safety Chip: Enforcing Constraints for LLM-driven Robot Agents
Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models
Both emphasize constraining the model so that the robot's navigation stays safe.

Constrained Decoding for Neural NLG from Compositional Representations in Task-Oriented Dialogue


Others

REPLUG: Retrieval-Augmented Black-Box Language Models

The contents above are previous ideas.