**NBIS Workshop Nov 2021**

# Introduction to Bioinformatics using NGS data

**Course website**: https://uppsala.instructure.com/courses/47037

**HackMD (this document):** https://hackmd.io/OaPnEzqqRhe1p4s7olOshA

**NEW ZOOM LINK**
Time: Nov 26, 2021 08:30 - 17:30 Stockholm
https://uu-se.zoom.us/j/67976146601?pwd=RWFRQno4QjJFczZ6YytYNCtSa2I5QT09
Meeting ID: 679 7614 6601
Passcode: 942661

## Please ask for help here.

# Notes from Friday

## RNAseq/scRNASeq

- Number of replicates/reads? [Liu et al., 2014: RNA-seq differential expression studies: more sequence or more replication?](https://pubmed.ncbi.nlm.nih.gov/24319002/)

NBIS Bioinformatics advisory program (for PhD students): https://www.scilifelab.se/training/the-swedish-bioinformatics-advisory-program

NBIS bioinformatics drop-in: If you have questions about bioinformatics, feel free to join us over Zoom, Tuesdays 14:00-15:00, by following this link: https://uu-se.zoom.us/j/65398963465
The drop-ins are open to all researchers in Sweden.

# Notes from Thursday

**What is the inbreeding coefficient?**
"The coefficient of inbreeding of an individual is the probability that two alleles at any locus in an individual are identical by descent from the common ancestor of the two parents."

- Path to reference data on UPPMAX: `/sw/data/igenomes/Homo_sapiens/`
- Link to RNA-Seq slides as PDF: `https://nbisweden.github.io/workshop-ngsintro/2111/slide_rnaseq.pdf`

# Notes from Tuesday

Are CRAM files common, and are they supported by most tools?
- A lot of tools support CRAM, and if you create the CRAM in lossless mode you can always convert it back to BAM without losing any information. You just need to keep the genome reference as well. If you already know which tools you will use to process your data later on, you can check whether they accept CRAM as input.
    - In that case, I wonder how long it takes to convert, let's say, a human WGS BAM <--> CRAM? Or do you do it "on the fly" and pipe the data to the next step?
        - I converted ~100 human WGS samples from BAM -> CRAM last year, and I recall that it took about 24 hours or less when I did it in lossless mode.

Is it possible to copy and paste from a web browser into MobaXterm?
- We asked our colleagues and got the following advice, which may solve some of the problems:
    - Does the copying work at all? Sometimes using ctrl-c and ctrl-v has been an issue. If right-click behaves weirdly, then right-clicking and choosing paste should work. Shift-ctrl-insert might as well.
    - You could check for "hidden" characters that may be the cause of the problem. That is usually the issue with PDFs. I use octal dump in the terminal to check. It is hard to read if it is a long command, but perhaps worth trying:
      `echo "ssh -Y username@rackham.uppmax.uu.se"| od -cb`
      You cannot see that there is a tab instead of a space in the ssh command above, but it is visible using od.

Here is a screenshot from IGV from the morning lab, in case someone had problems looking at it on UPPMAX:
![](https://i.imgur.com/GKXnsFD.png)

# Notes from Monday

- So, repeatedly, MobaXterm suddenly decides that the backspace input means remove EVERYTHING to the left of the cursor. Any ideas why? Both me and my roommate have had this issue. Logging out works but is nothing I want to do every hour or so...
  Room 5:
    - I got a possible solution for this. Will see if it works. Thanks!
- How to cancel unwanted jobs: `scancel <job id>`
- How to estimate the time that my program will take?
    - No way to know before trying. Book a node for a week and make a couple of test runs, then you'll see how long it can take.
- One follow-up question: which exercise(s) are for the morning section? Just Linux 1? Thanks! Hongkai
    - Yes, "Linux 1: Introduction" and "Linux: File permissions" are the only ones we should do before lunch.
        - Okay!
- Just a note: I use the regular Windows terminal for ssh and it works fine for me.
  If anyone has problems with MobaXterm :)
    - I use WSL2 and it works very well and can even run GUI apps, they are slow though
        - You need to install an external X server for that, right?
- In the example of word count,
  `$ grep "CATCATCAT" sample_1.sam | wc`
  `60 957 15074`
  what is the biological meaning of these numbers?
    - None really, or it depends on what you are analyzing. The number 60 means that there were 60 lines in the file that contained CATCATCAT. How that is interpreted biologically depends on what data you have. The idea in this example is just to show you how to do it with a computer.
        - Then what is a line? The string finished by `\n`?
- In:
  `$ drwxrwsr-x 10 username snic2021-23-591 4096 Nov 22 10:48 linux_tutorial`
  what is the meaning of "s" in "drwxrwsr-x"?

# Please ask for help in a room in the top of the HackMD document.
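
Regarding the two open Monday questions above (what counts as a line for `wc`, and the `s` in `drwxrwsr-x`), here is a minimal sketch you could try yourself. The file `demo.txt` and the directory `shared` are made up for illustration and are not part of the course material. `wc` prints lines, words and bytes in that order, and it counts a line as text terminated by a newline character. The `s` in the group position is the setgid bit: on a directory it means that new files created inside inherit the directory's group.

```shell
# Hypothetical three-line file, only for illustration (not course data).
tmpdir=$(mktemp -d)
printf 'read1 CATCATCAT\nread2 GGGTTT\nread3 CATCATCAT AAA\n' > "$tmpdir/demo.txt"

# wc prints three numbers: lines (newline characters), words, bytes.
grep "CATCATCAT" "$tmpdir/demo.txt" | wc

# grep -c counts matching lines directly, same as piping to wc -l.
n_lines=$(grep -c "CATCATCAT" "$tmpdir/demo.txt")
echo "matching lines: $n_lines"

# Set the setgid bit (the leading 2 in 2775) on a directory and look at
# the first ten characters of the ls -ld listing.
mkdir "$tmpdir/shared"
chmod 2775 "$tmpdir/shared"
perms=$(ls -ld "$tmpdir/shared" | cut -c1-10)
echo "permissions: $perms"
```

On a shared project directory such as `linux_tutorial`, the setgid bit is probably there so that files created by any member stay in the project group, but that is an assumption about how the course directory was set up.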