# PLUG AND PLAY LANGUAGE MODELS: A SIMPLE APPROACH TO CONTROLLED TEXT GENERATION
###### tags: `RL Group meeting` 112/3/28
## Outline
- Abstract
- Introduction
- Related Work
- Plug and Play Language Models
- Experiments, Results, and Evaluation
- Conclusion
## Abstract
- Controlling attributes of generated text is difficult without modifying the model architecture or fine-tuning on attribute-specific data, which entails the significant cost of retraining.
- PPLM combines a **pretrained LM** with one or more simple **attribute classifiers** that guide text generation without any further training of the LM.
- Instead of retraining the LM, **PPLM corrects the output of the pretrained LM with an additional attribute model** so that it satisfies the desired attribute.
- The attribute model can be either a bag of words representing a topic or a small trained model that scores an attribute.
- The gradient from this attribute model is passed back to the hidden states of the pretrained LM, so that the hidden states can be corrected.
- The LM output is then skewed toward the topic we want.
## Introduction
- We demonstrate the PPLM approach using a GPT-2 345M model as the general-purpose LM $p(x)$.
- The method applies in any representation space from any transformer-based text generator and allows combination with any attribute model $p(a|x)$.

- We introduce the Plug and Play LM for controlled language generation, discuss its relation to existing work, and how sampling from a PPLM works.
- We quantify effectiveness using both automated evaluation as well as human evaluation.
- We show that the PPLM approach can be used to **detoxify** instances where generation of toxic content is likely, by following the negative gradient of a model trained to detect toxicity.
## Related Work
- Controlled generation
- Current models need to be separately fine-tuned for each specific attribute.
- Our method does not require retraining any conditional generative model, and both the language model and the conditional model can be flexibly assembled.
- Noisy Channel Modeling
- Their approach translates a source language sentence $y$ into a target language sentence $x$ by first sampling from a forward model proposal distribution $p_{forward}(x|y)$ and then reranking samples based on probabilities given by $p_{backward}(x|y) \propto p(x)p(y|x)$.
- PPLM scores samples using the same basic equation, but we have no forward or proposal model $p_{forward}(x|a)$; we rely on the latent-space updates instead.
- Weighted decoding
- Control with weighted decoding (WD) is difficult and often leads to sacrificing fluency and coherence.
- Sophisticated sampling methods can be used to constrain the model generation to certain keywords and topics.
- Text Style Transfer
- A key difference between the above and our approach is that we use an offline discriminator and perform optimization based on this discriminator.
## Plug and Play Language Models
**1. Language Modeling with Transformers**
- Given a sequence of tokens $X = \{x_0, \ldots, x_n\}$
- $p(X)=\prod_{i=1}^n p\left(x_i \mid x_0, \cdots, x_{i-1}\right)$
- $H_t=\left[\left(K_t^{(1)}, V_t^{(1)}\right), \cdots,\left(K_t^{(l)}, V_t^{(l)}\right)\right]$ ,where $\left(K_t^{(i)}, V_t^{(i)}\right)$ corresponds to the key-value pairs from the $i$-th layer generated at all time-steps from 0 to $t$.
- $o_{t+1}, H_{t+1}=\operatorname{LM}\left(x_t, H_t\right)$
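A minimal sketch of the recurrence $o_{t+1}, H_{t+1}=\operatorname{LM}\left(x_t, H_t\right)$, assuming the Hugging Face `transformers` GPT-2 interface (an assumption of this note, not the paper's code); `gpt2-medium` stands in for the 345M model used in the paper:

```python
# Recurrent decoding with a cached history H_t (the per-layer key-value pairs),
# sketched with the Hugging Face transformers GPT-2 API.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2-medium")
model = GPT2LMHeadModel.from_pretrained("gpt2-medium").eval()

generated = tokenizer("The potato", return_tensors="pt").input_ids
past = None  # H_t: key-value pairs from all layers for time steps 0..t

with torch.no_grad():
    for _ in range(20):
        # o_{t+1}, H_{t+1} = LM(x_t, H_t): only the newest token is fed once a cache exists
        out = model(input_ids=generated if past is None else generated[:, -1:],
                    past_key_values=past, use_cache=True)
        past = out.past_key_values                       # updated history H_{t+1}
        probs = torch.softmax(out.logits[:, -1, :], -1)  # p(x_{t+1} | x_0, ..., x_t)
        next_token = torch.multinomial(probs, num_samples=1)
        generated = torch.cat([generated, next_token], dim=-1)

print(tokenizer.decode(generated[0]))
```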
**2. Steering Generation: Ascending log $p(a|x)$**
- We shift the history $H_t$ in the direction of the sum of two gradients:
- one toward higher log-likelihood (LL) of the attribute $a$ under the conditional attribute model $p(a|x)$.
- one toward higher LL of the unmodified language model $p(x)$.
- Combining these factors with a variable multiplier provides us with a controllable “knob” to guide generation in a given direction with a specified strength.
- $∆H_t$ is initialized at zero and updated with gradients from an attribute model that measures the extent to which the generated text possesses the desired attribute.
- We rewrite the attribute model $p(a|x)$ as $p(a|H_t + ∆H_t)$.

- Ascending $\log p(a|x)$ alone quickly results in unrealistic adversarial or fooling examples, as the text moves into low-probability regions.
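A schematic sketch of the $∆H_t$ update above (normalized gradient ascent on $\log p(a|H_t + ∆H_t)$, repeated a few times per token); `attribute_log_prob` is a hypothetical placeholder for running the LM one step from the perturbed history and scoring the result with the attribute model, and the key-value structure of $H_t$ is flattened to a plain list of tensors for brevity:

```python
# Schematic ΔH_t update: gradient ascent on log p(a | H_t + ΔH_t) with a normalized step.
# attribute_log_prob(perturbed_history, x_t) is a placeholder, not an API from the paper's code.
import torch

def perturb_history(H_t, x_t, attribute_log_prob, num_steps=3, step_size=0.03, gamma=1.5):
    # ΔH_t is initialized at zero, one tensor per entry of the history H_t
    delta = [torch.zeros_like(h, requires_grad=True) for h in H_t]
    for _ in range(num_steps):
        perturbed = [h + d for h, d in zip(H_t, delta)]
        loss = -attribute_log_prob(perturbed, x_t)   # -log p(a | H_t + ΔH_t)
        loss.backward()
        with torch.no_grad():
            for d in delta:
                grad = -d.grad                        # ascent direction on log p(a|x)
                d += step_size * grad / (grad.norm() ** gamma + 1e-10)
                d.grad.zero_()
    # The perturbed history H_t + ΔH_t is then used for the forward pass that generates the next token
    return [h + d.detach() for h, d in zip(H_t, delta)]
```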
**3. Ensuring Fluency: Ascending log $p(x)$**
- Kullback–Leibler (KL) Divergence
- In addition to ascending $\log p(a|x)$, the update on $∆H_t$ also minimizes the KL divergence between the output distributions of the modified and unmodified LMs, which discourages the perturbed hidden states from drifting too far from the original model.

- Post-norm Geometric Mean Fusion
- It serves to constantly tie the generated text to the unconditional $p(x)$ LM distribution.
- $x_{t+1} \sim \frac{1}{\beta}\left(\widetilde{p}_{t+1}^{\,\gamma_{gm}} \, p_{t+1}^{1-\gamma_{gm}}\right)$, where $p_{t+1}$ and $\widetilde{p}_{t+1}$ are the unmodified and modified output distributions, $\beta$ is a normalizing factor, and $\gamma_{gm}$ interpolates between them (a short sampling sketch follows).
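A short sketch of the post-norm geometric mean fusion above: the next token is drawn from a normalized geometric interpolation of $\widetilde{p}_{t+1}$ and $p_{t+1}$ (as $\gamma_{gm} \to 1$ this recovers the updated distribution, and as $\gamma_{gm} \to 0$ the unconditional LM):

```python
# Post-norm geometric mean fusion: sample x_{t+1} from (p̃^γ · p^(1-γ)) / β,
# where β is simply the normalizing constant of the fused distribution.
import torch

def fuse_and_sample(p_modified, p_unmodified, gamma_gm=0.95):
    fused = p_modified.pow(gamma_gm) * p_unmodified.pow(1.0 - gamma_gm)
    fused = fused / fused.sum(dim=-1, keepdim=True)   # divide by β
    return torch.multinomial(fused, num_samples=1)
```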
**4. PPLM provides two functionalities**
- A score that can be used to rank samples based on the LL of the desired attribute.
- A gradient ascent direction to perform an update in the latent space.

## Experiments, Results, and Evaluation
- We conduct an ablation study with four variants:
- **B**: the baseline, an unchanged GPT-2 LM, sampled once.
- **BR**: B, but sampled $r$ times, with the best sample chosen based on the attribute log-likelihood (LL) ranking and filtering based on the Dist (diversity) score.
- **BC**: update the latent representations ($\tilde{H}_t$) and then sample once.
- **BCR**: update the latent representations ($\tilde{H}_t$), generate $r$ samples, and choose the best sample based on the LL score (see the ranking sketch after this list).
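A rough sketch of the "R" (re-ranking) step shared by BR and BCR; `attribute_log_likelihood` and `dist_score` are hypothetical placeholders for the attribute model's LL and the Dist (distinct n-gram) diversity score:

```python
# Draw r samples elsewhere, then filter out repetitive ones and keep the sample
# with the highest attribute log-likelihood. Placeholder scoring functions are assumed.
def rank_and_select(samples, attribute_log_likelihood, dist_score, dist_threshold=0.9):
    kept = [s for s in samples if dist_score(s) >= dist_threshold] or samples
    return max(kept, key=attribute_log_likelihood)
```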
- As baseline approaches we consider:
- **CTRL**: a recent conditional language model trained with control codes.
- **GPT2-FT-RL**: a GPT-2 LM fine-tuned for human-evaluated positivity with RL.
- **WD**: a weighted decoding baseline in which the B LM’s outputs are weighted directly toward maximizing $p(a|x)$.
- BoW Attribute Models
- The simplest attribute model we use gives the log of the sum of output probabilities of the words in a predefined Bag of Words (BoW) $\{w_1, \ldots, w_k\}$: $\log p(a|x) = \log\left(\sum_{i} p_{t+1}[w_i]\right)$ (a minimal sketch follows below).
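A minimal sketch of the BoW attribute score; `bow_ids` is assumed to hold the vocabulary indices of the bag-of-words keywords:

```python
# BoW attribute model: log of the summed output probabilities of the keyword tokens.
import torch

def bow_log_prob(next_token_logits, bow_ids):
    probs = torch.softmax(next_token_logits, dim=-1)      # p_{t+1} over the vocabulary
    return torch.log(probs[..., bow_ids].sum(dim=-1))     # log p(a|x)
```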
- Discriminator Attribute Models
- We optimize for a higher probability of the sequence having a specific attribute by considering changes only to the next token to be generated (a sketch of such a discriminator follows below).
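A sketch of a discriminator attribute model in the spirit of the paper: a single linear layer that predicts the attribute from the mean of the LM's output representations across time (layer sizes and names here are illustrative assumptions):

```python
# Single-layer discriminator on top of the (frozen) LM's representations, averaged over time.
import torch
import torch.nn as nn

class AttributeDiscriminator(nn.Module):
    def __init__(self, embed_dim=1024, num_classes=2):
        super().__init__()
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, hidden_states):               # (batch, seq_len, embed_dim)
        mean_repr = hidden_states.mean(dim=1)        # average representation over time steps
        return torch.log_softmax(self.classifier(mean_repr), dim=-1)  # log p(a|x)
```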
## Conclusion
- PPLM flexibly combines a large, pre-trained LM and a BoW or a small, easy-to-train discriminator.
- PPLM achieves fine-grained control of attributes via a simple gradient-based sampling mechanism.
## Appendix

[Reference](https://arxiv.org/pdf/1912.02164.pdf)