# Data from: Intermediate acoustic-to-semantic representations link behavioural and neural responses to natural sounds
---
Bruno L. Giordano1*, Michele Esposito2, Giancarlo Valente2 and Elia Formisano2,3,4*
1 Institut des Neurosciences de La Timone, UMR 7289,
CNRS and Université Aix-Marseille, Marseille, France.
2 Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience,
Maastricht University, Maastricht, Netherlands.
3 Maastricht Centre for Systems Biology (MaCSBio), Faculty of Science and Engineering,
Maastricht University
4 Brightlands Institute for Smart Society (BISS), Maastricht University
*Corresponding authors.
E-mails: bruno dot giordano at univ-amu dot fr;
e dot formisano at maastrichtuniversity dot nl
This repository is distributed as is.
Requests for clarification should be addressed to:
Bruno L. Giordano, bruno dot giordano at univ-amu dot fr
In this paper, we re-analyse behavioural data from Giordano et al. (2010; perceived dissimilarity of natural sounds and words) and 7T fMRI responses to natural sounds from Santoro et al. (2017).
References:
- Giordano, B. L., McDonnell, J. & McAdams, S. Hearing living symbols and nonliving icons: Category-specificities in the cognitive processing of environmental sounds. Brain Cogn 73, 7--19 (2010).
- Santoro, R. et al. Reconstructing the spectrotemporal modulations of real-life sounds from fMRI response patterns. PNAS 114, 4799--4804 (2017).
## Repo structure
* Install.m: Matlab script called inside the analysis code to install toolboxes and declare relevant paths.
* README_1st.txt: installation information (also included in this README.md)
* /code/: code used to fit the models to the stimuli and analyze the data.
Main analysis code:
* analyze_01_acoustic_models_distances.m (Matlab): fits acoustics models to sound stimuli and computes between-stimulus distances
* analyze_02_nlp_models.py (Python): computes natural language processing embeddings for the labels describing the sound stimuli, and the categories model (Santoro et al., 2017, data only).
* analyze_03_semantic_distances.m (Matlab): computes semantic between-stimulus distances using the natural language processing embeddings or the categories model.
* analyze_04a_dnns_vggish.py (Python): fits the VGGish model to the sound stimuli;
* analyze_04b_dnns_yamnet.py (Python): fits the Yamnet model to the sound stimuli;
* analyze_04c_dnns_kell.py (Python): fits Kell's network to the sound stimuli;
* analyze_04d_dnn_distances.m (Matlab): computes between-stimulus distances considering the representations in the deep neural network (DNN) models VGGish, Yamnet and Kell.
* analyze_05_behaviour_fmri.m (Matlab): analyzes model representations in behavioural data and in fMRI data.
* analyze_06_fmri_models_of_behaviour.m (Matlab): computes the DNN-based models of fMRI data. These models are used to predict behaviour with the code in analyze_05_behaviour_fmri.m
* The rest of the code inside this folder is called by the main analysis scripts described above.
* note about the analyze_*.m scripts: change the variable "rootmain" at the beginning of each code section so that it points to the local path of the repository.
* /data/: analysed data, including model representations. The names of the /data/ subdirectories follow this convention: dataset_datatype, where dataset is either giordano or formisano for data from Giordano et al. (2010) and Santoro et al. (2017), respectively.
* /data/dataset_acoustics/ (e.g., formisano_acoustics) includes several .mat (Matlab) files:
* dataset_acousticmodel.mat files (e.g., formisano_cochleagram.mat) include the stimulus representations in a specific acoustic model (e.g., cochleagram), for a specific dataset (e.g., formisano).
* dataset_acousticmodel_dist_whichdistance.mat files contain between-stimulus distances for a specific dataset, according to a specific acoustic model, and based on a specific distance metric (whichdistance = cos for cosine; whichdistance = euc for Euclidean). E.g., formisano_cochleagram_dist_cos.mat includes the cosine distance between the stimuli in the formisano dataset, according to the cochleagram model. Each of the distance files includes four variables:
* Components: cell containing strings that identify the components of the model;
* D: distance matrix in vectorized format (rows = stimulus pairs; columns = models);
* Model: cell containing a string that identifies the model;
* ndims: vector specifying the number of model parameters for each component of the model.
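As an illustration of the vectorized distance format described above, the following sketch (synthetic data, not taken from the repository files; assumes NumPy and SciPy are available) shows how a vectorized distance variable relates to the full between-stimulus distance matrix:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Synthetic stand-in for stimulus representations: 4 stimuli x 10 features.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 10))

# Vectorized distances, one row per stimulus pair,
# ordered (0,1), (0,2), (0,3), (1,2), (1,3), (2,3).
d_cos = pdist(features, metric="cosine")     # whichdistance = cos
d_euc = pdist(features, metric="euclidean")  # whichdistance = euc
assert d_cos.shape == (4 * 3 // 2,)          # n_pairs = n * (n - 1) / 2

# squareform recovers the full n_stimuli x n_stimuli matrix.
D_square = squareform(d_euc)
print(D_square.shape)  # (4, 4)
```

The same pair ordering applies to any variable with an n_pairs dimension in this repository's distance files.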
* /data/dataset_dnns/ (e.g., giordano_dnns) contains four subdirectories:
* /data/dataset_dnns/kell/: stimulus representations in the Kell network (hdf5 files, one file per stimulus);
* /data/dataset_dnns/vggish/: stimulus representations in the VGGish network (hdf5 files, one file per stimulus);
* /data/dataset_dnns/vggishrandom/: stimulus representations in the untrained VGGish network initialized with random weights (one .mat file containing a structure with the representation of each stimulus in each layer; stimuli along the first dimension of each variable);
* /data/dataset_dnns/yamnet/: stimulus representations in the Yamnet network (hdf5 files, one file per stimulus).
* all .mat files inside /data/dataset_dnns/ (e.g., giordano_dnns) contain between-stimulus distances according to the different DNN models (naming conventions and contents as specified for the acoustic models, above).
* /data/dataset_semantics/ (e.g., formisano_semantics) contains data considered for the natural language processing embeddings and for the categories model (only data from Santoro et al., 2017).
* /data/dataset_semantics/dataset_labels.csv and dataset_labels.xlsx (e.g., formisano_labels.csv) include the strings describing the sound source for each of the sound stimuli;
* dataset_semanticmodel.mat files (e.g., formisano_glove.mat) and dataset_semanticmodel.csv files (e.g., formisano_glove.csv) contain the natural language processing embeddings for each of the stimuli according to a specific semantic model (e.g., the GloVe model).
* dataset_semanticmodel_dist_whichdistance.mat files contain between-stimulus distances according to the different semantic models (see acoustic models for naming convention, and contents).
* /data/dataset_stimuli/ (e.g., giordano_stimuli) contains the sound stimuli.
* each of the subdirectories contains the wav files (one per sound stimulus) at the sampling rate specified by the directory name (e.g., wav_16kHz includes wav files at 16 kHz sampling rate).
* stimuli_list.csv/mat/xlsx (e.g., stimuli_list.mat) contain the filename information for each of the stimuli saved in csv, mat, or xlsx format.
* /data/formisano_fmri/ contains the fMRI-distance data.
* fmridist_nospmean.mat = mat (Matlab) file including:
* between-stimulus distances for the test and training sets (fmridist_test and fmridist_train, respectively), variables of size [n_pairs, n_participants, n_stimulus_folds, n_rois];
* numerical identifiers for the stimuli in each of the stimulus folds (idx_test and idx_train), variables of size [n_stimuli, n_stimulus_folds];
* name of each of the six regions of interest (ROIs) considered in the analyses (roi_names);
* name of the sound stimuli in each of the stimulus folds (stimuli_test and stimuli_train), cell of size [n_stimuli, n_stimulus_folds].
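A minimal sketch of how arrays with these dimensions could be indexed (synthetic data; the dimension sizes below are illustrative, not the actual counts in fmridist_nospmean.mat):

```python
import numpy as np

# Illustrative sizes only: 10 pairs, 5 participants, 2 folds, 6 ROIs.
n_pairs, n_participants, n_folds, n_rois = 10, 5, 2, 6
rng = np.random.default_rng(1)
fmridist_train = rng.normal(size=(n_pairs, n_participants, n_folds, n_rois))

# Group-level distances for one ROI and one stimulus fold:
# average over the participant dimension (axis 1).
roi, fold = 0, 0
group_dist = fmridist_train[:, :, fold, roi].mean(axis=1)
print(group_dist.shape)  # (10,)
```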
* /data/giordano_behaviour/ contains the behavioural data.
* behavdist.mat = mat (Matlab) file including:
* behavioural distances in the sound dissimilarity condition (behavdist), variable of size [n_pairs, n_participants, n_stimulus_groups];
* numerical identifiers for the stimuli in each of the two stimulus groups, variable of size [n_stimuli, n_stimulus_groups];
* name of the sound stimuli in each of the stimulus groups (stimuli), cell of size [n_stimuli, n_stimulus_groups].
* behavdist_sem.mat = mat (Matlab) file including:
* behavioural distances in the word dissimilarity condition (behavdist), variable of size [n_pairs, n_participants, n_stimulus_groups];
* numerical identifiers for the stimuli in each of the two stimulus groups, variable of size [n_stimuli, n_stimulus_groups];
* name of the sound stimuli in each of the stimulus groups (stimuli), cell of size [n_stimuli, n_stimulus_groups].
* /data/giordano_fmri_prediction/ contains the data considered to predict behavioural data from the DNN-mapped fMRI data.
* formisano_*.mat = mat (Matlab) files containing the betas of the GLM models used to predict fMRI-data distances from DNN distances (filenames ending in speech or nospeech indicate models fitted on all stimuli, including speech, or after removing the speech stimuli, respectively);
* giordano_fmri_whichroi*.mat = mat (Matlab) files containing the DNN-based estimate of the between-stimulus distances in the different fMRI ROIs (see acoustic models, for contents).
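The GLM mapping described above can be sketched as an ordinary least-squares fit (a simplified illustration with synthetic data, not the repository's actual fitting code, which lives in analyze_06_fmri_models_of_behaviour.m):

```python
import numpy as np

rng = np.random.default_rng(2)
n_pairs, n_models = 50, 3

# Synthetic design matrix: intercept plus DNN-based distance predictors.
X = np.column_stack([np.ones(n_pairs),
                     rng.normal(size=(n_pairs, n_models))])
betas_true = np.array([0.5, 1.0, -0.3, 0.8])
fmri_dist = X @ betas_true + 0.01 * rng.normal(size=n_pairs)

# Least-squares betas (conceptually what the formisano_*.mat files store).
betas, *_ = np.linalg.lstsq(X, fmri_dist, rcond=None)

# The fitted betas map DNN distances to predicted fMRI distances,
# which are then used as predictors of behaviour.
predicted = X @ betas
```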
* /results/: analysis results, including permutations
* each of the xls files inside /results/ contains a statistics table output by analyze_05_behaviour_fmri.m and used to generate the LaTeX code for the Supplementary Tables in the manuscript.
* /results/matfiles/ includes several .mat files (Matlab) containing the results of the statistical tests. Each of them includes the following variables:
* analysis_opt: struct defining the analysis options
* ndims: cell defining n parameters for each of the models considered in the analysis
* outfilename: string identifying the output file
* varpart_info: cell variable defining the information of variance-partitioning analyses
* out: struct containing the results of the analysis (see CV_GLM_fit.m, for further details)
* /Toolboxes/ (Matlab): various pieces of code written to aid the acoustic analyses and to analyze the data.
### Installation instructions
The following toolboxes need to be installed to run different portions of the code.
#### Acoustic models
- install cochleagram and MTF in /Toolboxes/, add to Matlab path
code:
http://nsl.isr.umd.edu/downloads.html
- install SAI in /Toolboxes/, add to Matlab path
code:
https://code.soundsoftware.ac.uk/projects/aim
https://www.acousticscale.org/wiki/index.php/Category:Auditory_Image.html
- install Texture in /Toolboxes/, add to Matlab path
code:
https://mcdermottlab.mit.edu/Sound_Texture_Synthesis_Toolbox_v1.7.zip
- install MIR toolbox (roughness model) in /Toolboxes/, add to Matlab path
code:
https://www.jyu.fi/hytk/fi/laitokset/mutku/en/research/materials/mirtoolbox
- install Yin model (pitch/periodicity) in /Toolboxes/, add to Matlab path
code:
http://audition.ens.fr/adc/
- time-varying loudness and spectral centroid have been computed using the
LoudnessToolbox by Genesis Acoustics. This toolbox should be installed
in /Toolboxes/ and added to the Matlab path.
code:
The toolbox was announced on the auditory list, but the link is not valid anymore (http://www.auditory.org/mhonarc/2010/msg00135.html). For a copy of this toolbox contact Bruno L. Giordano (bruno dot giordano at univ-amu dot fr).
#### DNN models
- install yamnet and vggish in /code/nlp_dnn_models/audioset/
code:
https://github.com/tensorflow/models
weights:
https://storage.googleapis.com/audioset/vggish_model.ckpt
https://storage.googleapis.com/audioset/vggish_pca_params.npz
https://storage.googleapis.com/audioset/yamnet.h5
- install kelletal2018 in /code/nlp_dnn_models/
https://github.com/mcdermottLab/kelletal2018
- install pycochleagram in /code/nlp_dnn_models/
https://github.com/mcdermottLab/pycochleagram
#### NLP models
- install universal-sentence-encoder_4 in /code/nlp_dnn_models
weights:
https://tfhub.dev/google/universal-sentence-encoder/4
- install GNewsW2V in /code/nlp_dnn_models/
weights:
https://www.kaggle.com/datasets/leadbest/googlenewsvectorsnegative300
- install GloVe (6B, 300d) in /code/nlp_dnn_models/
weights:
https://nlp.stanford.edu/data/glove.6B.zip
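The glove.6B files are plain text, one token per line followed by its vector components separated by spaces. A minimal parsing sketch (synthetic two-line excerpt, 3-dimensional here for brevity; the vectors used in this repository are 300-dimensional):

```python
import numpy as np

# Two synthetic lines in the glove.6B.*.txt format (token, then components).
excerpt = "dog 0.1 0.2 0.3\ncat 0.2 0.1 0.4"

embeddings = {}
for line in excerpt.splitlines():
    token, *values = line.split()
    embeddings[token] = np.array(values, dtype=float)

# Cosine similarity between two label embeddings.
a, b = embeddings["dog"], embeddings["cat"]
cos_sim = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
```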
#### Matlab tools
- install mtimesx, add to Matlab path
code:
https://www.mathworks.com/matlabcentral/fileexchange/25977-mtimesx-fast-matrix-multiply-with-multi-dimensional-support
- install distribution plot, add to Matlab path
code:
https://www.mathworks.com/matlabcentral/fileexchange/23661-violin-plots-for-plotting-multiple-distributions-distributionplot-m