# fmri2text
- mind retreat
- https://gitlab.inria.fr/parietal/non_continuous_decoding
- design matrix of LM features, BOLD encoding
- generative model, missing predicted / actual bold comparison
## 24 October 2022
- init github repo
- [x] refactor utils file @athual
- [x] fix fir util @ccaucheteux
- [x] run decoding on multiple TRs @ccaucheteux
- setup narratives on dragostore
- [x] get mean subject left and right
- [ ] script datalad configuration
- certificate error preventing download
- project narrative subjects on FUGW barycenter
- alignment data
- can it be the same as the encoding data?
- idea
- PCA on LM features, trained on text that is independent from our transcripts
- align subjects using beta coefficients of encoding model fitted on PCs
- this would allow aligning subjects who listened to different stories?
- modules
- barycenter
- inputs
- list of subjects
- v0: subjects who were scanned listening to the same story
- v1: subjects who have been scanned listening to different stories
- story name
- outputs
- OT plan from each subject to barycentric subject
- subject's features transported to barycentric subject
- PCA transformer fitted on LM features
- alignment
- inputs
- subject who was scanned listening to the story used to compute barycenter
- barycentric subject (BOLD)
- PCA transformer fitted on LM features
- outputs
- OT plan from subject to barycentric subject
- subject's BOLD transported to barycentric subject
- encoding
- inputs
- list of aligned BOLD matrices (n_tr, n_voxels)
- transcript for each matrix
- outputs
- encoding transformer
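A possible shape for these three modules (a sketch only; all function names and signatures below are assumptions, not settled API):
```python
# Sketch of the module interfaces described above.
# All names are hypothetical placeholders.
def compute_barycenter(subjects_bold, story, pca_lm):
    """v0: subjects scanned while listening to the same story.
    Returns, per subject: the OT plan to the barycentric subject
    and the subject's features transported to it."""
    ...

def align_subject(subject_bold, barycenter_bold, pca_lm):
    """Returns the OT plan from the subject to the barycentric
    subject, and the subject's BOLD transported to it."""
    ...

def fit_encoder(aligned_bolds, transcripts):
    """aligned_bolds: list of (n_tr, n_voxels) matrices, one per
    transcript. Returns a fitted encoding transformer."""
    ...
```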
- Alexis:
- avg (exp 1): {story: bold}. bold of shape `[tr, voxels]`
- single (exp 3-5): Fugw().transform() : `[tr, voxels] -> [tr, voxels]`
- Charlotte:
- debug features in decode/encode (notebook)
- go all to scripts
- `utils.py` (`def build_design_matrix()`)
- `encode.py`
- `decode.py` (decode, decode_one_step)
- `0_run_exp_avg_subject.py`
- encoder trained on voxel-wise average of multiple subjects, eval on new story
- `1_run_exp_fugw_avg_subject.py`
- encoder trained on voxel-wise average of multiple subjects after they have been aligned with FUGW, eval on new story
- `2_run_exp_one_subject.py`
- encoder trained on one subject, eval on same or different subject
- `3_run_exp_one_subject_fugw.py`
- encoder trained on one subject, eval on different subject aligned with FUGW
- `4_run_exp_concat_subject.py`
- encoder trained on concatenated subjects, eval on left-out subject
- `5_run_exp_concat_fugw_subject.py`
- encoder trained on concatenated subjects after they have been aligned with FUGW, eval on left-out subject
- multiple stories
- out-of-story generalisation
- other generative LMs
- Tuesday 25 / Wednesday 26 October:
- discuss `paths.py`
- `eval.py`
```python
# eval.py
def eval(true_texts, decoded_texts):
    """
    true_texts: list of str of shape [n]
    decoded_texts: list of str of shape [n]
    """
    # compute the metrics here (library choices not settled yet)
    return bleu, meteor, wer, bert
```
## 25/26 October 2022 (Charlotte)
**Questions for the 26th:**
- define the problem statement / contribution + experiments + outputs (e.g. plots)
- what exactly would we like to evaluate on (how many TRs)?
- Questions about the Huth paper:
- are we sure that its model is not multi-subject?
- how many TRs for the eval?
- how many candidates K?
- which generative language model?
TODO:
- [x] start FIR at TR=1 or 2
- [x] add WER in eval
- [x] avg, out of story. Test on a ~1,000-word story.
- [x] Fix eval (predictions should be a long string, or at least several TRs.)
- [x] single subject
- [x] concat subject
- [ ] code Beam decoder
- [ ] add encoder eval
## Huth paper
**Important points to discuss:**
- code/data availability soon
- cross-subjects story perception. They *do* align subjects (supp table 4).
- eval:
- they show short segments, but say they evaluate by generating 1,800 words in a row??
- empty starting point + no context at all? Strange, no?
- written: *"The language model is provided with the words that occur in the last 8 seconds of the candidate"*
**Eval Setups (always out of story)**:
- story perception
- story imagination
- movie watching
- cross-subjects story perception: Sup Table 4
**Methodological points to include**:
- [x] starting at t=2s (start_tr=1 or 2)
- [x] add WER in eval
- [x] add Beam in decoder
## 28 October 2022 (Charlotte)
Handling of infinite/nan values in decoding:
- how to proceed when scores contain nan / infinite / negative values: how to score? how to normalize?
- z_score / softmax (see the sketch below)
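A minimal sketch of the two candidate normalizations, assuming higher scores are better (e.g. negated NLL) and that non-finite values are replaced by the worst finite score — that replacement rule is an assumption, not a settled choice:
```python
import numpy as np

def normalize_scores(scores, method="softmax"):
    # Replace nan/inf with the worst finite score (assumed penalty rule)
    scores = np.asarray(scores, dtype=float)
    finite = np.isfinite(scores)
    scores[~finite] = scores[finite].min()
    if method == "softmax":  # e.g. for (negated) NLL
        exp = np.exp(scores - scores.max())
        return exp / exp.sum()
    if method == "min_max":  # e.g. for brainscores
        span = scores.max() - scores.min()
        return (scores - scores.min()) / (span + 1e-8)
    raise ValueError(f"Unknown method: {method}")
```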
## 29 October (Charlotte)
- TODO:
- [x] check NLL in generate / check infinite values in NLL / decide how to normalize (softmax for NLL, min_max for brainscore)
- [x] Multi-story
- [ ] Check partial_fit sklearn
- [ ] Organise repo.
- `models/`
- `feature_extractor.py` : from array of text to design matrix. .fit() .transform()
- `bayse_decoder.py`: from (context, bold) to text
- `end_to_end_decoder.py`: from (context, bold) to text
- `align/`
- `fugws.py`
- `baseline_aligners.py`: from X,y,subjects to X,y. .fit(), .transform()
- `data/`
- `narratives.py`: Dataset with `__getitem__(session)` and .sessions, .metadata, .subjects, .stories / .tasks
- `loader.py`
- `concat_datasets.py`: from per-session datasets to multi-session, multi-subject datasets
- `experiments/` (a sketch of the dataset interface follows)
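A hypothetical sketch of the `narratives.py` dataset following the spec above (the constructor argument and everything not listed in the spec are assumptions):
```python
class NarrativesDataset:
    # Attribute names mirror the spec above; the rest is hypothetical.
    def __init__(self, root):
        self.sessions = ...  # one (subject, story/task) pair per session
        self.metadata = ...  # per-session metadata
        self.subjects = ...
        self.stories = ...   # aka tasks

    def __getitem__(self, session):
        """Return the (bold, text) pair for one session."""
        ...
```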
### 31 October (Charlotte)
TODO:
- [x] Look at avg_sub results
- [x] Launch single_sub
- [ ] Launch multi-sub. Check how far off we are in RAM for multi-subject in a single model
- [ ] Clean pipeline? (with a models/)
- [ ] In models, introduce partial_fit_estimator
- [x] La Recherche article
- [ ] NeurIPS poster
### 3 November (Charlotte)
- [x] launch with start_tr=2 to have a fair baseline model
- [x] Add error bars in eval to select the best predicted sentences
- [ ] Launch on all subjects with SGDRegressor (runs out of RAM => need to switch to partial_fit and a loader; see the sketch at the end of this section)
- models: add subject embedding layer
- sklearn-based with custom SGDRegressor? OK for SGDRegressor
- torch-based
- [ ] TorchDataset and concat_collator
- [ ] Factorize Models
- [ ] Add linear torch model
- [ ] HYDRA
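A sketch of the partial_fit route mentioned above — `batch_loader` is a hypothetical iterable of (features, bold) batches, and wrapping SGDRegressor in MultiOutputRegressor is only one way to get voxel-wise outputs:
```python
from sklearn.linear_model import SGDRegressor
from sklearn.multioutput import MultiOutputRegressor

def fit_incremental(batch_loader):
    # Fit a voxel-wise linear encoder without holding all subjects in RAM.
    encoder = MultiOutputRegressor(SGDRegressor(penalty="l2"))
    for features, bold in batch_loader:  # bold: [batch_trs, n_voxels]
        encoder.partial_fit(features, bold)
    return encoder
```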
### 7 November:
**scale exp**
- [ ] torch dataset and batch loader
- [ ] partial_fit sklearn estimator
- [ ] nn.Linear model with torch lightning
- [ ] subject embedding layer (sketched below)
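A minimal torch sketch of the linear model with a subject embedding layer (the concatenation design and the embedding size are assumptions):
```python
import torch
import torch.nn as nn

class LinearEncoderWithSubjects(nn.Module):
    # Hypothetical design: concatenate a learned subject embedding to the
    # LM features before a single linear readout to voxels.
    def __init__(self, n_features, n_voxels, n_subjects, subject_dim=16):
        super().__init__()
        self.subject_embedding = nn.Embedding(n_subjects, subject_dim)
        self.linear = nn.Linear(n_features + subject_dim, n_voxels)

    def forward(self, features, subject_ids):
        # features: [batch, n_features]; subject_ids: [batch]
        emb = self.subject_embedding(subject_ids)
        return self.linear(torch.cat([features, emb], dim=-1))
```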
**check current**
see what is best as a metric to score language generations
- what voxel_mask / weights? (var mask, R mask, R2 mask)
- MSE or other? what could be used?
## 31 October (Common)
**Summary**
Next meeting: Thursday the 10th
Next steps for the 10th:
- Alexis + Alex: alignment (avg subject then beta values)
- Alex:
- [ ] integrate IBC LPP into data
- [ ] check that the data are clean
- Charlotte:
- [x] *pipeline* without alignment: avg_sub + single_sub (single subject)
- [ ] *results* without alignment: avg_sub + single_sub
- [x] look at how to adapt to a large *multisub* dataset (torch / sklearn partial_fit) (Wednesday)
- [x] add *Huth data* (Thursday)
- [ ] *clean* pipeline with `models/` `data/` in a new branch (Friday)
- [ ] Thursday: look at the results + debug Huth dataset (with Alexis)
- [ ] Friday: clean pipeline with `models/`
**First results on average subjects**
**Methods**
- For each story, average bolds across subjects
- Fit on n-1 stories, test on the last story
- Evaluate encoding and decoding on the left-out story
- Encoding eval: R score for each voxel
- Decoding procedure (sketched below):
- start with an empty context.
- given the context, generate n_gen=30 possible continuations using gpt2 (for now, we use the *true* number of words to generate),
- compute the brainscore of each possible generation (MSE this time, because we want one measure per sample)
- keep the n_beam=10 best in terms of brainscore ("lm_and_brain", blue), or perplexity ("lm_only", red).
- update the context with the kept segments
- Decoding eval: bleu / rouge / bertscore / meteor
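The loop above, as a sketch. `generate_continuations` and `brainscore` are hypothetical helpers (the repo's actual names may differ), and `brainscore` is assumed to return higher-is-better values (e.g. negated MSE between predicted and actual BOLD), with one TR decoded per step:
```python
import numpy as np

def beam_decode(bolds, encoder, lm, tokenizer, n_steps=100, n_gen=30, n_beam=10):
    beams = [""]  # start with an empty context
    for step in range(n_steps):
        # expand each beam with n_gen possible continuations from GPT-2
        candidates = [
            cont
            for context in beams
            for cont in generate_continuations(lm, tokenizer, context, n_gen)
        ]
        # score each candidate against the BOLD observed so far
        scores = [brainscore(encoder, text, bolds[: step + 1]) for text in candidates]
        # keep the n_beam best candidates as the new contexts
        best = np.argsort(scores)[-n_beam:]
        beams = [candidates[i] for i in best]
    return beams[-1]
```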
**Encoding results**
<img src="https://i.imgur.com/4sDM4YE.png" alt="drawing" width="200"/>

**Decoding results**

**Decoding grid**
default params: {'n_beam': 10, 'n_gen': 30, 'start_tr': 0, 'decode': 'beam', 'n_steps': 100}

### Experiment example
```python=
def run_exp():
    # Data
    X, y, groups = NarrativeDataset().dataset  # X=bolds, y=texts, groups=subjects/stories
    train, test = train_test_split(X, y=y, groups=groups)
    # Model
    model = Model(**model_params)
    # -- Train --
    model.fit(X[train], y[train], groups[train], **fit_params)
    # -- Eval --
    # eval fitting metrics
    run_eval_encoder(model.encoder, X[test], y[test])
    # eval decoded texts
    predicted_texts = model.predict(X[test])
    metrics, df = run_eval_texts(predicted_texts, y[test])
    return model, metrics, df
```
### Feature extractor class
```python=
class FeatureExtractor(BaseTransformer):
    def __init__(self, **params):
        self.params = params
        self.fir = FirTransformer()
        self.nlp_tokenizer = AutoTokenizer.from_pretrained(conf.nlp_model_name)
        self.nlp_model = AutoModelForCausalLM.from_pretrained(conf.nlp_model_name)

    def fit(self, X, y=None):
        return self

    def transform(self, texts, y=None):
        features = build_design_matrix(
            texts, model=self.nlp_model, tokenizer=self.nlp_tokenizer, **self.params
        )
        return features

    def fit_transform(self, X, y=None):
        self.fit(X, y=y)
        return self.transform(X, y=y)
```
### Model class
```python=
class SklearnEncoder(object):
    def __init__(self, conf):
        self.y_pipe = self.build_y_pipeline(conf)
        self.pipe = self.build_pipeline(conf)

    def build_pipeline(self, conf):
        steps = [
            ("scaler", StandardScaler()),
            ("ridge", RidgeCV(np.logspace(-3, 6, 10))),
        ]
        return Pipeline(steps)

    def build_y_pipeline(self, conf):
        y_steps = [
            ("scaler", RobustScaler(quantile_range=(0.1, 99.9))),
        ]
        return Pipeline(y_steps)

    def fit(self, X, y):
        y = self.y_pipe.fit_transform(y)
        self.pipe.fit(X, y)
        return self

    def predict(self, X):
        y_pred = self.pipe.predict(X)
        y_pred = self.y_pipe.inverse_transform(y_pred)
        return y_pred


class BayseModel(object):
    def __init__(self, conf):
        # Feature extractor (text -> design matrix)
        self.feature_extractor = FeatureExtractor(**conf.feature_params, **conf.fir_params)
        # Encoder
        self.encoder = SklearnEncoder(**conf.encoder_params)
        # Decoder
        self.decoder = BayseDecoder(**conf.decoder_params)

    def fit(self, bolds, texts):
        # Learn the encoding matrix
        features = self.feature_extractor.fit_transform(texts)
        self.encoder.fit(features, bolds)

    def decode(self, bolds, prev_texts, **decoder_kwargs):
        """
        Generate new text given a series of bolds and one context
        """
        decoded = _decode(self.encoder, bolds, prev_texts, **decoder_kwargs)
        return decoded

    def evaluate(self, bolds, texts, start_tr=1):
        """
        Eval encoder (R)
        Generate sequences given start_tr
        Eval generations
        """
        encoder_metrics = _run_eval_encoder(self.encoder, bolds, texts)
        decoded_texts = self.decode(bolds[start_tr:], texts[:start_tr])
        decoder_metrics = _run_eval_texts(decoded_texts, texts[start_tr:])
        decoder_metrics["context"] = texts[:start_tr]
        return encoder_metrics, decoder_metrics

    def save(self):
        # Only need to save the sklearn encoder params (and the nlp conf)
        save_model(self.encoder)
        save_conf(nlp_conf)

    def load(self, file):
        # TODO: rebuild the model from the saved encoder params and conf
        ...
```
## Code base
`utils.py` can be split into:
- `features.py` (get_features, generate_continuations)
- `fir.py` (design_matrix, apply_fir)
- `data.py` (get_data)

`fmri/narratives.py` should be renamed to `data/narratives.py`; the other files in `fmri/` can be deleted, except for `exclude_scans`. Eventually, `fmri/narratives.py` should disappear and be merged into the data utils.
## Datasets
Huth: https://openneuro.org/datasets/ds003020/versions/1.0.2
## From events to design_matrix @athual
```python
import pandas as pd
import numpy as np


def get_texts_from_events(events, extra_trs=5, text_col="word_raw", tr=2):
    """
    events: a dataframe of shape [n_words], with columns:
    - onset
    - word_raw (or text_col)
    """
    # Load events
    events = events.dropna(subset=[text_col])
    # Aggregate words by fMRI scan
    events["scan"] = (events["onset"].interpolate().astype(float) // tr).astype(int)
    assert not events["scan"].isna().any()
    events = events.groupby("scan")[text_col].agg(lambda x: " ".join(x).strip())
    # Subselect non-empty scans
    min_tr = int(events.index.min())
    max_tr = int(events.index.max() + extra_trs)
    events = events.loc[min_tr:max_tr]
    # Re-align scans and text (scans without words get an empty string)
    text = np.full(max_tr, "", dtype=f"<U{events.apply(len).max() + 1}")
    for time_frame in events.index:
        text[time_frame] = events.loc[time_frame]
    text = text[min_tr:]
    return text


if __name__ == "__main__":
    from transformers import AutoTokenizer, AutoModelForCausalLM
    from src.utils.utils import get_features, build_design_matrix

    events = pd.DataFrame(
        {"onset": np.arange(0, 100, 0.5), "word_raw": np.arange(200).astype(str)}
    )
    # Aggregate text by TR
    texts = get_texts_from_events(events)
    # Load model
    model_name = "gpt2"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    # Get GPT-2 features
    features = build_design_matrix(texts, model=model, tokenizer=tokenizer)
    print(features.shape)
```
## Example to load beta coef for one subject @athual
```python=
import numpy as np
import torch
from sklearn.model_selection import KFold
from transformers import AutoModelForCausalLM, AutoTokenizer

from src.encode import encode
from src.utils.data import get_bolds, prepare_data


def get_beta_coef(
    subject,
    model_name="gpt2",
    device="cpu",
    tr=1.5,
):
    # Get data
    print("Loading data")
    bolds = get_bolds(subject)
    texts, bolds, stories = prepare_data(
        list(bolds.keys()), list(bolds.values()), tr=tr
    )
    # Define splits
    cv = KFold(shuffle=False)
    train, test = list(cv.split(texts, bolds))[0]
    # Init nlp model
    print(f"Loading {model_name} model")
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    if torch.cuda.is_available():
        model.to(device)
    # Encoding
    print(
        f"Fitting encoder on {len(train)} samples,"
        f" {len(np.unique(stories[train]))} stories."
    )
    encoder = encode(
        texts[train],
        bolds[train],
        nlp_model=model,
        nlp_tokenizer=tokenizer,
        bold_pca=0,
    )
    # Extract weights
    weights = encoder["encoding_model"]["ridge"].coef_
    return weights


if __name__ == "__main__":
    weights = get_beta_coef("sub-004")
    print("Weights of shape [voxels, dim_gpt x n_fir_delays]: ", weights.shape)
```
## FUGW
- SRM to get the average subject
- beta coefficients to get a single subject
## Useful links @athual
- LM Generation
https://huggingface.co/blog/how-to-generate
- Eternal Terminal
https://eternalterminal.dev/usermanual/
## Results (02/11/2022)
### Average subject
**Encoding**
<img src="https://i.imgur.com/6WImHha.png" alt="drawing" width="300"/>

**Decoding**

### Single-subject
**Encoding**
<img src="https://i.imgur.com/GJFNkn8.png" alt="drawing" width="300"/>
**Decoding**

### Multi-subject (train), left-out subject same story (test)
**Encoding**
<img src="https://i.imgur.com/h9PP6QB.png" alt="drawing" width="300"/>
<img src="https://i.imgur.com/CMHeKVA.png" alt="drawing" width="300"/>
**Decoding**

**Conclusion**
On Narratives:
* *Encoding*: good for average and multi. Relatively bad for single.
Rmk: scores relatively bad for PrettyMouth and Lucy
* *Decoding* issue: seems to always yield the same results. Error in the code? Is the word length fixed?
On Lebel: issue with alignment / projected raw data.
# Decoding end-to-end
## Options
* **Option 1 - Replace** (see the sketch after this list)
+ Linearly predict `text_embeddings` given fMRI using sklearn
+ Replace the LM activations with the predicted `text_embeddings`
+ Issues: how to predict different words? Shouldn't we directly predict sentence embeddings?
* **Option 2 - CrossAttention**
+ Add a cross-attention layer to a generative model (the model has to be generative).
+ e.g. with GPT-2. Start with a context.
* **Option 3 - Guidance**
+ Optimize the activations to best predict the condition.
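Option 1 could look like this (a sketch; the function name and arguments are placeholders, and RidgeCV is an assumed choice of linear model):
```python
import numpy as np
from sklearn.linear_model import RidgeCV

def fit_embedding_decoder(bold_train, embeds_train):
    # Linearly map fMRI [trs, voxels] to LM text embeddings [trs, dim];
    # the predictions would then replace the LM activations (Option 1).
    reg = RidgeCV(alphas=np.logspace(-3, 6, 10))
    reg.fit(bold_train, embeds_train)
    return reg

# usage: fit_embedding_decoder(bold_train, embeds_train).predict(bold_test)
```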
## Option 2: Cross-Attention
```python
class BoldSpatialReducer(nn.Module):
    """
    Project fMRI into a smaller dimensionality (voxels -> dim)
    """

    def __init__(
        self,
        bold_dim_in=40962,
        bold_dim_out=768,
        bold_dim_hidden=64,
    ):
        super().__init__()
        self.bold_dim_in = bold_dim_in
        self.bold_dim_out = bold_dim_out
        self.spatial_reducer = nn.Sequential(
            nn.Linear(bold_dim_in, bold_dim_hidden),
            nn.Linear(bold_dim_hidden, bold_dim_out),
        )

    def forward(self, bold):
        """
        from [T, V] to [T, D]
        """
        out = self.spatial_reducer(bold)  # [TR, V] -> [TR, D]
        return out


class fMRIConditionalGPTDecoder(ConditionalGPTDecoder):
    def __init__(
        self,
        config,
        cross_attention_layers=(),
        freeze_mlp=True,
        add_fmri=False,
        bold_dim=40962,
        bold_dim_hidden=64,
        init_fmri_std=None,
        max_fmri_position_embeddings=40,
        add_fmri_position_embedding=True,
        sinusoidal_fmri_embeddings=True,
    ):
        super().__init__(
            config,
            cross_attention_layers=cross_attention_layers,
            freeze_mlp=freeze_mlp,
        )
        # Add the fMRI projection layer
        self.add_fmri = add_fmri
        if self.add_fmri:
            self.bold_dim = bold_dim
            self.fmri_layer = BoldSpatialReducer(
                bold_dim_in=self.bold_dim,
                bold_dim_out=self.config.n_embd,
                bold_dim_hidden=bold_dim_hidden,
            )
            init_weights(self.fmri_layer, std=self.config.initializer_range)
        # Add a positional embedding for fMRI
        self.add_fmri_position_embedding = add_fmri_position_embedding
        if self.add_fmri_position_embedding:
            if sinusoidal_fmri_embeddings:  # fixed, gradient set to False
                self.fmri_pe = SinusoidalEmbedding(self.config.n_embd)
            else:
                self.fmri_pe = nn.Embedding(
                    max_fmri_position_embeddings, self.config.n_embd
                )
                init_weights(self.fmri_pe, std=self.config.initializer_range)

    def forward(self, *args, fmri=None, fmri_positions=None, **kwargs):
        assert (
            kwargs.get("encoder_hidden_states") is None
        ), "Encoder hidden states should be None"
        if fmri is not None:
            assert self.add_fmri
        if self.add_fmri and (fmri is not None):
            # Project fMRI onto the GPT embedding space
            B, T, V = fmri.shape
            assert V == self.bold_dim
            fmri_embeds = self.fmri_layer(fmri.reshape(B * T, V))
            fmri_embeds = fmri_embeds.reshape(B, T, self.config.n_embd)
            if self.add_fmri_position_embedding:
                if fmri_positions is None:
                    fmri_positions = torch.arange(T).long().to(fmri_embeds.device)
                fmri_embeds += self.fmri_pe(fmri_positions)[None]  # [B, T, D]
        else:
            fmri_embeds = None
        kwargs["encoder_hidden_states"] = fmri_embeds
        out = super().forward(*args, **kwargs)
        return out

    def prepare_inputs_for_generation(self, *args, fmri=None, fmri_positions=None, **kwargs):
        """
        This is needed for generation
        """
        output = super().prepare_inputs_for_generation(*args, **kwargs)
        output.update({"fmri": fmri, "fmri_positions": fmri_positions})
        return output
```
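With a Hugging Face-style `generate`, the fMRI conditioning would presumably be passed as extra keyword arguments, e.g. `model.generate(input_ids, fmri=fmri)`; `prepare_inputs_for_generation` above then forwards `fmri` and `fmri_positions` into every decoding step (this assumes `ConditionalGPTDecoder` follows the Hugging Face generation API).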