Course website: https://nbisweden.github.io/workshop_omicsint_ISMBECCB/
Labs: https://nbisweden.github.io/workshop_omicsint_ISMBECCB/labs
HackMD: https://hackmd.io/LI_HCxeRT8-Ty5qjikeFpQ?both
Rui Benfeitas - Course leader, NBIS bioinformatician, Stockholm University. Work in integrative omics projects, mostly employing transcriptomic, metabolomic, proteomic, epigenomic data. Favorite programming language is Python.
Nikolay Oskolkov - Course co-leader, NBIS bioinformatician, Lund University Sweden. I have my background in genomics and medical genetics. Interested in evolution and ancient DNA research. Working on different data types.
Ashfaq Ali - Course co-leader, NBIS bioinformatician, Lund University Sweden. Work with omics data, using statistical and systems biology approaches to analyze and integrate them for biomarker/target discovery.
If you have trouble connecting to the Zoom room, or need to reach us quickly about anything else, write here:
Refer to the tutorial homepage for detailed information.
For bugs and installation instructions refer to the homepage. For any other questions feel free to write them here and we will try to answer them as soon as possible.
Will Docker be necessary? Can I work locally on my computer?
Where can we find the docker images?
Will the recordings of the talks be available later?
Slide 6 of ML view of integration: are the data distributions coming from biological or simulated data?
As I work mostly in translational research, I get a lot of small studies that I need to integrate. You stated that this type of analysis will not be covered by the tutorial, but can you advise on which tools are recommended in this case? The data are transcriptomics from both arrays and RNA-seq, with 3-12 samples each depending on the experiment.
Does MOFA have limitations in terms of the number of samples or features?
Is clustering of clusters the same as ensemble clustering?
What input is fed into UMAP when integrating two omics datasets? Just a concatenation of the two matrices?
When you say feature selection do you mean preclustering? Or do you mean choosing genes?
Why use autoencoders compared to NN since the integration is effectively accomplished at the hidden layers?
Can you suggest an R library for autoencoders for single-cell data?
Where should I read more about Similarity Network Fusion?
How was the single-cell data prepared for integration? Is the input normalized? Is it counts?
For integration of scRNAseq, how is the data preprocessed? Should the data be normalized before integration?
Can these methods be used for integration of metagenomics with host omics?
In unsupervised integration using MOFA, how do we decide on the maximum number of iterations required?
What are the units of the input for the unsupervised datasets? What units are used for the RNA-seq or the methylomics data? Or the drug data?
Could you comment on the interpretability of the illustrated methods? Biologists are sometimes confused by transformed variables and lose the link between the analysis and biological concepts (genes, proteins, etc.). How can we keep or rebuild this link?
Any special considerations to combine binary (mutations) with non-binary data?
What was the old way of using Anaconda to run the Rmd files from yesterday? The MultiOmics gives me problems on both Windows and Linux.
For mixOmics, perhaps the easiest way to install everything is to create an environment with only the basic needed packages, as below:
channels:
- conda-forge
- bioconda
- anaconda
- defaults
dependencies:
## languages
- conda-forge::r-base
## R packages
- bioconductor-mixOmics=6.16.0
- r-reticulate
- rstudio
Can you suggest papers or libraries that allow a combined analysis of both positive (+) and negative (-) edges?
You mentioned that, due to differences in dynamic range between omics (e.g. proteomics and transcriptomics), building a common network may not make biological sense… For which questions do you think it would still be justified? And would it make sense to give the omics different 'weights' in terms of edge confidence, or to treat them differently somehow?
Regarding the discussion on positive and negative correlations as weights of the network edges, could you use the absolute value? Would the weight then capture the magnitude but not the direction of the effect?
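To make the question above concrete, here is a minimal sketch (not the instructors' answer) of signed versus absolute-value correlation edge weights. The feature names and values are hypothetical toy data, and `pearson` and `edge_weights` are illustrative helpers, not functions from any course material or library.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical toy profiles: three features measured across five samples.
profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with gene_a
    "gene_c": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as gene_a rises
}

def edge_weights(profiles, use_abs):
    """Return {(u, v): weight} for every feature pair.

    With use_abs=True the sign of the correlation is discarded,
    so only the strength of the association remains.
    """
    names = sorted(profiles)
    edges = {}
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            r = pearson(profiles[u], profiles[v])
            edges[(u, v)] = abs(r) if use_abs else r
    return edges

signed = edge_weights(profiles, use_abs=False)
unsigned = edge_weights(profiles, use_abs=True)

for pair in signed:
    print(pair, round(signed[pair], 3), round(unsigned[pair], 3))
```

In this toy case the gene_a/gene_c edge has the same absolute weight as the gene_a/gene_b edge, so the two become indistinguishable once the sign is dropped. Whether that loss of direction is acceptable depends on the downstream analysis.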
In a protein-protein interaction (PPI) network, are there any topological parameters (centrality, degree, PageRank, betweenness, …) used to identify key proteins? Which parameter is the most "widely" used in biological terms to identify key proteins?
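As a small illustration of two of the parameters named in the question, the sketch below computes degree and unnormalized betweenness (Brandes' algorithm for unweighted graphs) on a hypothetical five-protein toy network. The protein labels and edges are made up, and the sketch does not say which measure is preferable for real PPI data.

```python
from collections import deque

# Hypothetical toy PPI edge list (undirected, unweighted).
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P4", "P5")]

# Build a symmetric adjacency list.
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

# Degree: number of direct interaction partners.
degree = {v: len(nbrs) for v, nbrs in adj.items()}

def betweenness(adj):
    """Unnormalized betweenness: shortest paths passing through each node."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # nodes in order of BFS visit
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}

bc = betweenness(adj)
for v in sorted(adj):
    print(v, degree[v], bc[v])
```

Here P1 has the highest degree and betweenness, while P4 has a low degree but still bridges P5 to the rest of the network, which is the kind of difference between the two measures the question is asking about.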
Is there a module analysis that allows me to identify new pharmacological targets in a protein-protein interaction network? (all proteins being related in a specific pathology)
Which tools would you recommend for network analysis?
Can you suggest a library devoted to function prediction or analysis for hypothetical proteins, especially with respect to networks?
Are there any good reading materials/pipelines that try to integrate both non-cell specific information such as PPIs and Gene Ontology hierarchies, together with cell specific multi-omics data and drug responses, to construct networks and reveal important genes relating to drug responses or disease mechanisms? Thanks.
If you have any feedback during the course, feel free to add it here:
thank you for the loaded, wonderful workshop!
Thank you very much. Please fill in the feedback form here https://forms.gle/y5s1Lm5yjn5pGxHJ8
Please share our longer integrative omics course with your contacts here https://www.scilifelab.se/event/elixir-omics-integration-and-systems-biology-online/
If you have any questions please email us at edu.omics-integration[ at ]nbis.se