# Compulsory Assignment 1 - BINF200 H21

This is a shared Q&A document for the work on Compulsory Assignment 1 of BINF200 at the University of Bergen in autumn 2021.

__Ask questions at the bottom__ of the document. You can ask anonymously or add your name.

Please feel free to add your own comments or answers anywhere in this doc! 😉

## Links and contact

* [Information at MittUiB](https://mitt.uib.no/courses/29516/assignments/49236)
* Assignment instructions (in [DOCX](https://mitt.uib.no/courses/29516/files/3401768/download?wrap=1) or [PDF](https://mitt.uib.no/courses/29516/files/3401769/download?wrap=1))
* [The Norwegian Galaxy server (**usegalaxy.no**)](https://usegalaxy.no/)
* **Interactive sessions physically and on Zoom**, Fri 10 Sep 08:30-12:00 and Mon 13 Sep 08:30-10:00, where you can work on the assignment and get immediate answers to your questions (physically, or using Zoom breakout rooms and chat).
  * For Zoom, please [see connection details at MittUiB](https://mitt.uib.no/courses/29516/assignments/49236)
  * For joining physically, please bring a laptop and a charger! No need to install anything. There will be one spare laptop available, in case it's needed. We'll also have a couple of bigger screens to connect to.
* [Recorded video introduction](https://mitt.uib.no/courses/29516/files/3401723/download?wrap=1) **from last year(!)** (19 min, 160 MB).
* This HackMD document for shared Q&A, which will be available until "forever". [name=Matus] will keep answering your questions here, and feel free to "+1" others' questions or answer other questions yourself if you have an answer or a comment. This should be the place for lively async discussions online!
😸👾
* If you want to ask privately, you can do it during the Zoom sessions (in your breakout room or private chat), or anytime afterwards by messaging [name=Matus Kalas] on MittUiB or emailing matus.kalas@uib.no

## How to edit in HackMD

If you see this page in **view mode**, which is probably the case right after opening it, click on *Edit* at the top. Later you can always do either of the following to enable editing:

1. **Edit mode**: Ctrl+Alt+E, or click the button with a pencil on the top left to switch to edit mode
2. **Both mode**: Ctrl+Alt+B, or click the button with a square with a vertical line in the middle (among the 3 buttons on the top left). Nice with a big-enough screen.

## Immediate feedback

Please add any feedback here, for immediate improvements.

* [name=Matus]: **I mean it!** 😉 With immediate feedback, things can be improved immediately!
*
*
*
*
*

## Questions and answers

Please **ask new questions at the bottom** of this document. You can ask anonymously, or write your name.

* writing e.g. `[name=Lisa]` will show up as [name=Lisa]

-------------

* A sample question: How to write a question using **markdown** in HackMD?
  * You can just add a new line or a couple, and you can add a bullet point (`*`), use **bold** or *italics*, [links](https://mitt.uib.no), etc.
    * You can nest and indent bullet points under each other like this (using tab or at least 2 spaces)
* [name=Matus] Another question: What to ask first?
  * ...
    * ..
      * blahblah
* [name=Matus] **Some FAQs:**
  * **Can I close my tab / browser? Can I turn off my computer?** *etc*...
    * Another advantage of using Galaxy (or another server) is that you can safely turn off / reset / lose your laptop, and your analysis will continue running.
    * You can even check your work in Galaxy from another device, such as your phone!
📱
  * **How long to wait until Step 8a returns some results?**
    * In the end, the **blastn** jobs (Steps 8a and 8b) behaved differently on this new Galaxy Norway server compared with the previous Galaxy UiB. The results are the same, but the running times were very different. It also looked like they no longer output the data (BLAST hits) one by one, but probably wait for the whole result to be ready before showing the output(?)
    * So in the end it took a bit more than 4 hours for **Step 8a (default params)** to fully finish for me, after I restarted it in the afternoon (maybe fewer jobs on Friday evening made the difference??)
    * Anyway, if it hasn't finished for you yet, please feel free to just kill it with the ❌ 'Delete' button, **or skip Step 8a completely**, even if electricity in Norway has one of the lowest carbon footprints globally. 😉
    * I uploaded the result to NeLS, project UiB_BINF200 (outside of Raw_Reads). Feel free to check it out (using 'Get files from NeLS storage' inside Galaxy Norway). N.B.: It's 7.2 GB! 🙀
    * **Step 8b** took less time, but still hours. And you will see a major difference with 8c 👾
  * **How can I find the number of sequences?**
    * Figuring it out is the challenge for you 😉
    * Some options to think about (no final answers):
      * in Galaxy, you can count the number of lines in a data set (there are tools for that), and then divide by the number of lines per sequence record to get the # of seqs
      * for the FASTQ-to-FASTA data sets, you should be able to see it
      * for the raw reads, the SRA website shows "# of Spots", which actually might be the number of sequences?
  * **It sounds like the runtimes of BLAST are somewhat unpredictable on this new server** 🤔
    * Some runs during the weekend were taking way too long
      * 8c ran the whole night for 1 student, although it should normally finish within 10-30 minutes
      * For another student, 8b ran even longer than 8a 🙁
    * One student could see the first results from blastn while it was still running; [name=Matus] could not (just the whole 7.2 GB result when it was done)
    * Could it be that there was still some maintenance happening? Or is BLAST just this unpredictable on the new Galaxy Norway server?
  * **Comparison with Simon et al.**
    * No need to dive into it really deeply (unless you feel like it), because many details in Simon et al. might be hard to understand for a non-expert. The question is what you can spot by "scratching the surface"... (there is not just one correct answer)
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
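
A minimal sketch for the FAQ above on finding the number of sequences, in case you want to double-check a line count from Galaxy on a small file of your own. It rests on assumptions that are not stated in the assignment: that each FASTQ record is stored as exactly 4 lines (no wrapped sequences), and that each FASTA record has exactly one header line starting with `>`. The function names and the tiny example data are made up for illustration.

```python
import io


def count_fastq_records(handle):
    """Count FASTQ records, ASSUMING exactly 4 lines per record.

    Counting lines that start with '@' is not reliable, because a
    quality line may also start with '@'. Dividing the line count by 4
    avoids that problem.
    """
    n_lines = sum(1 for line in handle if line.strip())
    if n_lines % 4 != 0:
        raise ValueError(
            f"{n_lines} non-empty lines is not a multiple of 4; "
            "the file may be truncated or use wrapped sequences"
        )
    return n_lines // 4


def count_fasta_records(handle):
    """Count FASTA records by counting header lines (starting with '>')."""
    return sum(1 for line in handle if line.startswith(">"))


# Hypothetical toy data, only for illustration.
# Note that the quality line of read2 starts with '@'.
EXAMPLE_FASTQ = (
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nGGCC\n+\n@III\n"
)
EXAMPLE_FASTA = ">read1\nACGT\n>read2\nGGCC\n"

print(count_fastq_records(io.StringIO(EXAMPLE_FASTQ)))
print(count_fasta_records(io.StringIO(EXAMPLE_FASTA)))
```

For a real file you would pass an opened file instead of `io.StringIO(...)`, e.g. `with open("reads.fastq") as handle:` (the file name here is just a placeholder). Whether these counts match the "# of Spots" shown on the SRA website is still for you to check.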