Read quality control
2024-05-21
Olivier Rué
Christophe Klopp
EBAII 2024 - Genome assembly school
The truth about bioinformatics
https://training.galaxyproject.org/training-material/topics/assembly/tutorials/get-started-genome-assembly/slides.html#4
QC is the first step of any sequence analysis
https://training.galaxyproject.org/training-material/topics/sequence-analysis/tutorials/quality-control/slides.html#7
QC is the first step for all sequence analyses
Read characteristics
Reads are not perfect (error rate profile)
https://doi.org/10.1093/nargab/lqab019
First contact with your sequences
FASTQ format
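Four lines per sequence: an identifier line starting with `@`, the sequence, a separator line starting with `+`, and the quality string (one character per base).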
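A FASTQ record spans four lines (identifier, sequence, separator, quality), which can be read with a short parser. This is a minimal sketch, not a replacement for an established library: the names `FastqRecord` and `read_fastq` are hypothetical, and it assumes unwrapped records, that is, exactly one line for the sequence and one for the quality string.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    """One FASTQ entry (hypothetical helper type for this sketch)."""
    name: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a text handle, assuming four lines per record."""
    while True:
        header = handle.readline()
        if not header:          # end of file
            return
        if not header.strip():  # tolerate blank lines between records
            continue
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline()
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"Malformed FASTQ record near: {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"Sequence/quality length mismatch in: {header!r}")
        yield FastqRecord(header[1:].strip(), sequence, quality)


# Two made-up reads, for illustration only.
example = "@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\n!!!!\n"
records = list(read_fastq(io.StringIO(example)))
```

The length check catches truncated files early, which is a common reason for downstream tools to fail.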
Quality score encoding
Quality score
A measure of how reliably each nucleobase was identified by automated DNA sequencing
https://training.galaxyproject.org/training-material/topics/sequence-analysis/tutorials/quality-control/slides.html#12
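To make the quality score concrete, the sketch below decodes a quality string and converts a score to an error probability. It assumes the Phred definition, Q = -10 log10(P), and an ASCII offset of 33 (Phred+33), which older datasets may not use. The function names are hypothetical.

```python
def decode_quality(quality_string: str, offset: int = 33) -> list[int]:
    """Convert each quality character to its Phred score (assumes Phred+33)."""
    return [ord(char) - offset for char in quality_string]


def phred_to_error_probability(score: float) -> float:
    """Return the probability that a base call with this score is wrong."""
    return 10 ** (-score / 10)


# '!' is ASCII 33, '5' is ASCII 53, 'I' is ASCII 73.
scores = decode_quality("!5I")
probability_q20 = phred_to_error_probability(20)
```

Under these assumptions, a score of 20 corresponds to 1 error in 100 base calls and a score of 30 to 1 in 1000.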
FASTQ compression
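Compressed FASTQ files can usually be read without decompressing them to disk first. The sketch below assumes gzip compression (an assumption, since the slide does not name a format) and four-line records; `count_reads_gz` is a hypothetical helper.

```python
import gzip
import io
from typing import BinaryIO


def count_reads_gz(handle: BinaryIO) -> int:
    """Count reads in a gzip-compressed FASTQ stream (four lines per read)."""
    with gzip.open(handle, "rt") as text:
        return sum(1 for _ in text) // 4


# Made-up data compressed in memory so the example is self-contained.
raw = b"@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\n!!!!\n"
compressed = gzip.compress(raw)
n_reads = count_reads_gz(io.BytesIO(compressed))
```

`gzip.open` also accepts a file path, so the same function body works on a file on disk.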
Answers to (not always) simple questions:
Data for learning to assemble reads
Sequencing data
FastQC
https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
Hands-on session
Shared data
Basic statistics
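As an illustration of the kind of numbers a basic statistics summary contains, the sketch below computes read count, total bases and length range from a list of sequences. It is not FastQC's implementation; `basic_statistics` and its output keys are hypothetical.

```python
def basic_statistics(sequences: list[str]) -> dict[str, float]:
    """Summarise a set of reads: count, total bases, length range and mean."""
    lengths = [len(sequence) for sequence in sequences]
    if not lengths:
        raise ValueError("No sequences provided")
    return {
        "reads": len(lengths),
        "bases": sum(lengths),
        "min_length": min(lengths),
        "max_length": max(lengths),
        "mean_length": sum(lengths) / len(lengths),
    }


# Made-up reads, for illustration only.
stats = basic_statistics(["ACGT", "ACGTAC", "AC"])
```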
Per base sequence quality
Per base sequence quality - Illumina
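A per base quality profile aggregates the scores observed at each read position. The sketch below computes only the mean per position, assuming Phred+33 encoding; QC tools typically report a fuller distribution (for example medians and quartiles), which is not reproduced here. `mean_quality_per_position` is a hypothetical name.

```python
def mean_quality_per_position(quality_strings: list[str], offset: int = 33) -> list[float]:
    """Return the mean Phred score at each position across all reads."""
    totals: list[int] = []
    counts: list[int] = []
    for quality in quality_strings:
        for position, char in enumerate(quality):
            if position == len(totals):  # first read reaching this position
                totals.append(0)
                counts.append(0)
            totals[position] += ord(char) - offset
            counts[position] += 1
    return [total / count for total, count in zip(totals, counts)]


# Two made-up reads of different lengths.
profile = mean_quality_per_position(["II!", "I5"])
```

Reads of different lengths are handled by counting, at each position, only the reads long enough to reach it.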
GC content
GC content / contamination
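GC content is the fraction of bases that are G or C. The sketch below computes it per read; a distribution of these values over all reads is what lets an unexpected second peak hint at contamination. `gc_content` is a hypothetical helper, and counting ambiguous bases such as N in the denominator is a choice made here for simplicity.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases in a sequence (case-insensitive)."""
    if not sequence:
        raise ValueError("Empty sequence")
    upper = sequence.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


# Made-up reads, for illustration only.
per_read_gc = [gc_content(read) for read in ["GGCC", "ACGT", "aatt"]]
```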
Per base sequence content
Other QC tools
Meta QC tools
Nanopore QC tools
K-mer-based QC tools
Other K-mer-based QC tools
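K-mer-based approaches start from counting every substring of length k in the reads. The sketch below shows that counting step only, in plain Python; it does not describe how any particular tool works, and dedicated k-mer counters are far more memory-efficient. `count_kmers` is a hypothetical name, and reverse complements are counted separately here (no canonical k-mers).

```python
from collections import Counter


def count_kmers(sequences: list[str], k: int) -> Counter:
    """Count all overlapping k-mers across a list of sequences."""
    if k <= 0:
        raise ValueError("k must be positive")
    counts: Counter = Counter()
    for sequence in sequences:
        for start in range(len(sequence) - k + 1):
            counts[sequence[start:start + k]] += 1
    return counts


# One made-up read: ACGT occurs twice, the other 4-mers once.
kmer_counts = count_kmers(["ACGTACGT"], k=4)
```

A histogram of these counts (how many distinct k-mers occur once, twice, and so on) is the usual starting point for k-mer-based quality checks.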
KAT: The K-mer Analysis Toolkit
Tools for cleaning reads: trimming & co
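The simplest form of quality trimming removes low-quality bases from the 3' end of a read. The sketch below does exactly that, assuming Phred+33 encoding and a threshold of 20 chosen for illustration; real trimming tools generally use more elaborate strategies (sliding windows, adapter removal). `trim_low_quality_tail` is a hypothetical name.

```python
def trim_low_quality_tail(
    sequence: str, quality: str, min_quality: int = 20, offset: int = 33
) -> tuple[str, str]:
    """Drop trailing bases whose Phred score is below min_quality."""
    if len(sequence) != len(quality):
        raise ValueError("Sequence and quality must have the same length")
    end = len(quality)
    while end > 0 and ord(quality[end - 1]) - offset < min_quality:
        end -= 1
    return sequence[:end], quality[:end]


# 'I' is Q40 and '!' is Q0 under Phred+33.
trimmed = trim_low_quality_tail("ACGTAC", "IIII!!")
untouched = trim_low_quality_tail("ACGT", "II!I")
```

Only the tail is inspected, so a low-quality base in the middle of a read is kept as long as the bases after it pass the threshold.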
Tools for cleaning reads: decontamination
Take home messages