Galaxy / Custom Analysis Tools
===
###### tags: `Bioinformatics`, `Bioinformatics Computing Platform`, `Galaxy`, `Genomics`

<br>

Table of contents:
[TOC]

<br>

:::info
:bulb: **Key documents**
- **[Adding custom tools to Galaxy](https://galaxyproject.org/admin/tools/add-tool-tutorial/)**
- **[Galaxy Tool XML File](https://docs.galaxyproject.org/en/latest/dev/schema.html)**
:::

<br>

## Adding a custom tool
### Step1 - Write and test the custom tool
- Create toolExample.pl
  ```perl=
  #!/usr/bin/perl -w

  # usage : perl toolExample.pl <FASTA file> <output file>
  # Writes the GC fraction of each sequence in the FASTA file, one value per line.

  open (IN, "<$ARGV[0]") or die "Cannot open $ARGV[0]: $!";
  open (OUT, ">$ARGV[1]") or die "Cannot open $ARGV[1]: $!";
  while (<IN>) {
      chomp;    # strip only the trailing newline (chop would drop any last character)
      if (m/^>/) {
          # header line: emit the GC fraction of the previous sequence, then reset
          s/^>//;
          if ($. > 1) {
              print OUT sprintf("%.3f", $gc/$length) . "\n";
          }
          $gc = 0;
          $length = 0;
      } else {
          ++$gc while m/[gc]/ig;
          $length += length $_;
      }
  }
  # emit the GC fraction of the last sequence
  print OUT sprintf("%.3f", $gc/$length) . "\n";
  close( IN );
  close( OUT );
  ```
- Download the 2019-nCoV sequences to use as a test case
  [Libraries](https://usegalaxy.org/library/list#) / [2019_nCoV](https://usegalaxy.org/library/list#/folders/Fe878daae442969ff) / [Assembled genomes](https://usegalaxy.org/library/list#folders/Fbd0d83390997b5b1) / nCoV_Jan31.fa

- Test locally
  ```bash
  perl toolExample.pl nCoV_Jan31.fa out.txt
  ```
- Output
  ```
  0.380
  0.380
  0.380
  0.380
  0.380
  0.380
  0.380
  0.380
  0.380
  0.380
  0.380
  0.380
  0.380
  0.380
  0.380
  0.380
  0.380
  ```

### Step2 - Upload the custom tool to Galaxy
- Place toolExample.pl under the ```${Galaxy install directory}/tools/``` directory,
  e.g. ```galaxy/tools/__tj_tools/toolExample/toolExample.pl```

- Define the custom tool's XML file, toolExample.xml
  ```xml
  <tool id="fa_gc_content_1" name="Compute GC content" version="0.1.0">
    <description>for each sequence in a file</description>
    <command interpreter="perl">toolExample.pl $input $output</command>
    <inputs>
      <param format="fasta" name="input" type="data" label="Source file"/>
    </inputs>
    <outputs>
      <data format="tabular" name="output" />
    </outputs>

    <help>
  This tool computes GC content from a FASTA file.
    </help>
  </tool>
  ```
- How the XML is rendered
  [![](https://i.imgur.com/V0pV7gO.png)](https://i.imgur.com/V0pV7gO.png)
  - The XML defines the tool's description, command usage (command), input parameters (inputs), and output parameters (outputs)

<br>

- Register the tool's XML file in Galaxy's tool configuration, galaxy/config/tool_conf.xml
  ```xml
  <toolbox>
    ...
    <section name="TJTools" id="tjTools">
      <tool file="__tj_tools/toolExample/toolExample.xml" />
    </section>
  </toolbox>
  ```
  - `name` defines the category name shown in the Tools list

<br>

### Step3 - Restart Galaxy
- After the restart, the custom tool appears in the Tools list
  ![](https://i.imgur.com/cwqiGrk.png)
  <br>
  - ```TJTools``` comes from the name attribute of ```<section>```
  - ```Compute GC content``` comes from the name attribute of ```<tool>```
  - ```for each sequence in a file``` comes from ```<description>```

<br>

### Step4 - Select the custom tool and test it
1. Provide the 2019-nCoV fa file as input and click Execute
   ![](https://i.imgur.com/V0pV7gO.png)
2. Output
   ![](https://i.imgur.com/auxKCYm.png)
3. View the contents of the output file
   ![](https://i.imgur.com/4lkIGeo.png)

<br>

## GATK workflow
- References
  - [Genome / NTUH Project](https://hackmd.io/XtsPHvS1RC25IlS6K2AcNA)
  - [[github] broadinstitute / GATK](https://github.com/broadinstitute/gatk)
  - [[github] ohsu-comp-bio / compbio-galaxy-wrappers](https://github.com/ohsu-comp-bio/compbio-galaxy-wrappers/tree/master/gatk4)
- Workflow
  - bwa
    - [Source of the command usage](https://hackmd.io/XtsPHvS1RC25IlS6K2AcNA#runBWA)
      ```bash
      # Usage: bwa mem [options] <idxbase> <in1.fq> [in2.fq]
      #   -M            mark shorter split hits as secondary
      #   -R STR        read group header line such as '@RG\tID:foo\tSM:bar' [null]
      ./bwa mem -M -R '@RG\tID:D15780_S13_L001\tSM:D15780_S13_L001\tPL:Illumina' -t 2 \
          /Everythings/misc/bundle/b37/human_g1k_v37_decoy.fasta \
          /Everythings/dataset/D15780_S13_L001_R2.fastq.gz \
          | /Everythings/misc/samtools/samtools view -@ 2 -1 -o D15780_S13_L001.bam
      ```
    - XML wrapper
      ```xml
      <tool id="bwa_mem" name="Execute the command: 'bwa mem'" version="0.1.0">
        <description>map medium and long reads (> 100 bp) against reference genome (Galaxy Version 0.7.17.1)</description>
        <command>/Everythings/galaxy/tools/__tj_tools/misc/bwa/bwa mem -M -R
        '@RG\tID:D15780_S13_L001\tSM:D15780_S13_L001\tPL:Illumina' -t 2 /Everythings/misc/bundle/b37/human_g1k_v37_decoy.fasta $input -o $output 2>&amp;1</command>
        <inputs>
          <param name="input" format="fastq" type="data" label="Source file of 'fastq'" />
        </inputs>
        <outputs>
          <data name="output" format="sam" />
        </outputs>
        <help>
          the wrapper of 'bwa mem' (path='__tj_tools/misc/bwa/bwa.xml')
        </help>
      </tool>
      ```
    - ```$input``` and ```$output``` are variables
      - [input parameter reference (tool > inputs > param)](https://docs.galaxyproject.org/en/latest/dev/schema.html#tool-inputs-param)
        A parameter defined with ```type="data"``` lets the user pick a matching file from the form.
        Internally, Galaxy presumably looks for matching files under ``` /Everythings/galaxy/database/files/000/dataset_??.dat``` and lists them in the form
      - The output file is defined with ```<data>```
        and is written to ```/Everythings/galaxy/database/files/000/dataset_??.dat```
        ![](https://i.imgur.com/KdfbV0S.png)
- Error handling
  - Even on a normal run, bwa writes its informational messages to standard error (stderr), which causes Galaxy to conclude that the program failed.
  - Messages written to stderr
    ```
    [M::bwa_idx_load_from_disk] read 0 ALT contigs
    [M::process] read 66530 sequences (20000278 bp)...
    [M::process] read 66530 sequences (20000076 bp)...
    [M::mem_process_seqs] Processed 66530 reads in 81.090 CPU sec, 40.345 real sec
    [M::process] read 66530 sequences (20000048 bp)...
    [M::mem_process_seqs] Processed 66530 reads in 73.791 CPU sec, 36.629 real sec
    [M::process] read 47540 sequences (14291513 bp)...
    [M::mem_process_seqs] Processed 66530 reads in 77.920 CPU sec, 38.771 real sec
    [M::mem_process_seqs] Processed 47540 reads in 55.027 CPU sec, 27.465 real sec
    [main] Version: 0.7.17-r1188
    [main] CMD: /Everythings/galaxy/tools/__tj_tools/misc/bwa/bwa mem -M -R @RG\tID:D15780_S13_L001\tSM:D15780_S13_L001\tPL:Illumina -t 2 -o bwa_mem_output.bam /Everythings/misc/bundle/b37/human_g1k_v37_decoy.fasta /Everythings/dataset/D15780_S13_L001_R2.fastq.gz
    [main] Real time: 148.109 sec; CPU: 292.716 sec
    ```
  - bwa's stderr therefore needs to be redirected to standard output (stdout)
    ```
    2>&1
    ```
  - Perhaps there is an option that suppresses these messages? (bwa mem's `-v` verbosity level may be worth checking.)
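
Whether a tool that writes to stderr has really failed is usually better judged by its exit code than by the mere presence of stderr text; this is the idea behind the `detect_errors="exit_code"` attribute that appears on the SortSamSpark `<command>` later in this document. The following is a minimal Python sketch of that idea, not Galaxy's actual implementation. `run_and_classify` and the stand-in child process (a Python one-liner that imitates bwa by logging to stderr and exiting with 0) are hypothetical names introduced only for illustration.

```python
import subprocess
import sys


def run_and_classify(cmd):
    """Run cmd and decide success from the exit code, ignoring stderr content."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode captured output as text
    )
    return {
        "ok": proc.returncode == 0,
        "exit_code": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


# Hypothetical stand-in for bwa: logs progress to stderr, writes data to stdout, exits 0.
chatty_cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('[M::process] read 10 sequences\\n'); print('@HD')",
]

result = run_and_classify(chatty_cmd)
print("ok:", result["ok"])
print("stderr was not empty:", bool(result["stderr"]))
```

With this rule the chatty run counts as successful even though stderr is not empty, so no `2>&1` redirection is needed just to keep the job from being flagged.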
  - Results before and after the fix
    ![](https://i.imgur.com/mmzw4BV.png)
  - [Complex interface design for BWA-MEM](#Complex-interface-design-for-BWA-MEM) (see the section below)

<br>

## Complex interface design for BWA-MEM
- ### [Source of the command usage](https://hackmd.io/XtsPHvS1RC25IlS6K2AcNA#runBWA)
  ```bash
  # Usage: bwa mem [options] <idxbase> <in1.fq> [in2.fq]
  #   -M            mark shorter split hits as secondary
  #   -R STR        read group header line such as '@RG\tID:foo\tSM:bar' [null]
  /Everythings/galaxy/tools/__tj_tools/misc/bwa/bwa mem -M -R '@RG\tID:D15780_S13_L001\tSM:D15780_S13_L001\tPL:Illumina' -t 2 /Everythings/misc/bundle/b37/human_g1k_v37_decoy.fasta $input -o $output 2>&1
  ```

- ### Preset options vs. custom options
  - Form
    [![](https://i.imgur.com/SoT8onR.png)](https://i.imgur.com/SoT8onR.png)
  - Form / options
    [![](https://i.imgur.com/aErpzcn.png)](https://i.imgur.com/aErpzcn.png)
  - XML for the form
    ```xml
    <inputs>
      ...
      <conditional name="params">
        <param name="source_select" type="select" label="BWA settings to use" help="For most mapping needs use Commonly Used settings. If you want full control use Full Parameter List">
          <option value="pre_set">Commonly Used</option>
          <option value="full">Full Parameter List</option>
        </param>
        <when value="pre_set" />
        <when value="full">
          ...
        </when>
      </conditional>
    </inputs>
    ```

<br>

- ### [Hello World] Option that enables redirecting stderr to stdout
  - #### Actual parameter
    ```2>&1```
  - #### Form
    - Form / options
      [![](https://i.imgur.com/xcwgrCP.png)](https://i.imgur.com/xcwgrCP.png)
    - XML for the form
      ```xml
      <inputs>
        ...
        <conditional name="params">
          ...
          <when value="full">
            <param name='hide_stderr' type="boolean" checked="false" label="redirect stderr to stdout" help="avoid program failure" />
          </when>
        </conditional>
      </inputs>
      ```
  - #### Command
    - XML for the command
      ```xml
      <command>/Everythings/galaxy/tools/__tj_tools/misc/bwa/bwa mem -M -R '@RG\tID:D15780_S13_L001\tSM:D15780_S13_L001\tPL:Illumina' -t 2 /Everythings/misc/bundle/b37/human_g1k_v37_decoy.fasta $input_seq -o $output
      #if $params.source_select != "pre_set"
        #if $params.hide_stderr
          2>&amp;1
        #end if
      #end if
      </command>
      ```
    - ```#if``` and ```#end if``` must be paired;
      a missing ```#end if``` causes a runtime error
      ```
      Some #directives are missing their corresponding #end ___ tag: if, if
      ```
      ![](https://i.imgur.com/TnCJogW.png)
    - Variable properties
      - $variable_name denotes a variable
      - Variables are hierarchical:
        the ```hide_stderr``` variable under ```$params```
        is written as ```$params.hide_stderr```
    - Difference between enabled and disabled
      Yes -> 34
      No -> 33
      ![](https://i.imgur.com/m5JMp2D.png)

<br>

- ### The -M parameter
  - #### Actual parameter
    ```-M```
  - #### Form
    - Form / options
      [![](https://i.imgur.com/WwayPPL.png)](https://i.imgur.com/WwayPPL.png)
    - XML for the form
      ```xml
      <inputs>
        ...
        <conditional name="params">
          ...
          <when value="full">
            <param name="mark" type="boolean" checked="true" label="Mark shorter split hits as secondary (-M)" help="For Picard/GATK compatibility" />
          </when>
        </conditional>
      </inputs>
      ```
  - #### Command
    - XML for the command
      ```xml
      <command>/Everythings/galaxy/tools/__tj_tools/misc/bwa/bwa mem
      #if $params.source_select != "pre_set"
        #if $params.mark
          -M
        #end if
      #end if
      -R '@RG\tID:D15780_S13_L001\tSM:D15780_S13_L001\tPL:Illumina' -t 2 /Everythings/misc/bundle/b37/human_g1k_v37_decoy.fasta $input_seq -o $output 2>&amp;1
      </command>
      ```
  - #### Other references
    [bwa_mem.xml # mark](https://github.com/ohsu-comp-bio/compbio-galaxy-wrappers/blob/master/bwa/bwa_mem.xml#L164)

<br>

- ### The -R parameter
  - #### Actual parameter
    ```-R '@RG\tID:D15780_S13_L001\tSM:D15780_S13_L001\tPL:Illumina'```
  - #### Form 1: ID
    - Form / options
      ![](https://i.imgur.com/qTbtKTI.png)
    - XML for the form
      ```xml
      ...
      <when value="full">
        ...
        <conditional name="readGroup">
          <param name="read_group" type="select">
            <option value="yes" selected="true">Yes</option>
            <option value="no">No</option>
          </param>
          <when value="no"/>
          <when value="yes">
            <param name="read_group_id" type="text" label="Read group identifier (ID). Each @RG line must have a unique ID. The value of ID is used in the RG tags of alignment records. Must be unique among all read groups in header section." help="Required if RG specified. Read group IDs may be modified when merging SAM files in order to handle collisions.">
            </param>
          </when>
        </conditional>
      </when>
      ```
  - #### Form 2: ID / validator
    - Form / options
      [![](https://i.imgur.com/JYCoUqP.png)](https://i.imgur.com/JYCoUqP.png)
    - XML for the form: ```<validator>```
      ```xml
      ...
      <param name="read_group_id" type="text" ... >
        <validator type="empty_field" />
      </param>
      ```
  - #### Form 3: complete
    - Form / options
      [![](https://i.imgur.com/LYYraJm.png)](https://i.imgur.com/LYYraJm.png)
    - XML for the form
      ```xml
      <conditional name="read_group">
        <param name="read_group_enabled" type="select" label='Enabled Read Group(@RG) (-R)'>
          <option value="yes" selected="true">Yes</option>
          <option value="no">No</option>
        </param>
        <when value="no"/>
        <when value="yes">
          <param name="read_group_id" type="text" value="D15780_S13_L001" label="Read group identifier (ID). Each @RG line must have a unique ID. The value of ID is used in the RG tags of alignment records. Must be unique among all read groups in header section." help="Required if RG specified. Read group IDs may be modified when merging SAM files in order to handle collisions.">
            <validator type="empty_field" />
          </param>
          <param name="read_group_sm" type="text" from_dataset="input_seq" value="D15780_S13_L001" label="Sample (SM)." help="Required if RG specified.
      Use pool name where a pool is being sequenced">
            <validator type="empty_field" />
          </param>
          <param name="read_group_pl" type="select" label="Platform/technology used to produce the reads (PL)" help="Optional">
            <option value=""></option>
            <option value="CAPILLARY">CAPILLARY</option>
            <option value="LS454">LS454</option>
            <option value="ILLUMINA" selected='true'>ILLUMINA</option>
            <option value="SOLID">SOLID</option>
            <option value="HELICOS">HELICOS</option>
            <option value="IONTORRENT">IONTORRENT</option>
            <option value="PACBIO">PACBIO</option>
          </param>
        </when>
      </conditional>
      ```
    - Result
      [![](https://i.imgur.com/2BYqICv.png)](https://i.imgur.com/2BYqICv.png)
  - #### Complete bwa_mem.xml
    ```xml
    <tool id="bwa_mem" name="Execute the command: 'bwa mem'" version="0.1.0">
      <description>map medium and long reads (> 100 bp) against reference genome (Galaxy Version 0.7.17.1)</description>
      <command>
    ## /Everythings/galaxy/tools/__tj_tools/misc/bwa/bwa mem
    ##      -M -R '@RG\tID:D15780_S13_L001\tSM:D15780_S13_L001\tPL:Illumina'
    ##      -t 2 /Everythings/misc/bundle/b37/human_g1k_v37_decoy.fasta
    ##      /Everythings/dataset/D15780_S13_L001_R2.fastq.gz
    ##      -o bwa_mem_output.bam

    /Everythings/galaxy/tools/__tj_tools/misc/bwa/bwa mem
    #if $params.source_select == "pre_set"
        -M
    #else
        #if $params.mark
            -M
        #end if

        #if $params.read_group.read_group_enabled == 'no'
            ## no param: -R STR read group header line such as '@RG\tID:foo\tSM:bar' [null]
            #pass
        #else
            ## -R '@RG\tID:D15780_S13_L001\tSM:D15780_S13_L001\tPL:Illumina'
            #set $rg_id = $params.read_group.read_group_id
            #set $rg_sm = $params.read_group.read_group_sm
            #set $rg_pl = $params.read_group.read_group_pl
            #if $rg_sm
                #set $rg_sm = '\\tSM:%s' % $rg_sm
            #end if
            #if $rg_pl
                #set $rg_pl = '\\tPL:%s' % $rg_pl
            #end if
            -R '@RG\tID:${rg_id}${rg_sm}${rg_pl}'
        #end if
    #end if
    -t 2 /Everythings/misc/bundle/b37/human_g1k_v37_decoy.fasta $input_seq -o $output
    #if $params.source_select == "pre_set"
        2>&amp;1
    #else
        #if $params.hide_stderr
            2>&amp;1
        #end if
    #end if
      </command>
      <inputs>
        <param
          name="input_seq" format="fastq" type="data" label="Source file of 'fastq'" />
        <conditional name="params">
          <param name="source_select" type="select" label="BWA settings to use" help="For most mapping needs use Commonly Used settings. If you want full control use Full Parameter List">
            <option value="pre_set">Commonly Used</option>
            <option value="full">Full Parameter List</option>
          </param>
          <when value="pre_set" />
          <when value="full">
            <param name='hide_stderr' type="boolean" checked="true" label="redirect stderr to stdout" help="avoid program failure" />
            <param name="mark" type="boolean" checked="true" label="Mark shorter split hits as secondary (-M)" help="For Picard/GATK compatibility" />
            <conditional name="read_group">
              <param name="read_group_enabled" type="select" label='Enabled Read Group(@RG) (-R)'>
                <option value="yes" selected="true">Yes</option>
                <option value="no">No</option>
              </param>
              <when value="no"/>
              <when value="yes">
                <param name="read_group_id" type="text" value="D15780_S13_L001" label="Read group identifier (ID). Each @RG line must have a unique ID. The value of ID is used in the RG tags of alignment records. Must be unique among all read groups in header section." help="Required if RG specified. Read group IDs may be modified when merging SAM files in order to handle collisions.">
                  <validator type="empty_field" />
                </param>
                <param name="read_group_sm" type="text" value="D15780_S13_L001" label="Sample (SM)." help="Required if RG specified.
    Use pool name where a pool is being sequenced">
                  <validator type="empty_field" />
                </param>
                <param name="read_group_pl" type="select" label="Platform/technology used to produce the reads (PL)" help="Optional">
                  <option value=""></option>
                  <option value="CAPILLARY">CAPILLARY</option>
                  <option value="LS454">LS454</option>
                  <option value="ILLUMINA" selected='true'>ILLUMINA</option>
                  <option value="SOLID">SOLID</option>
                  <option value="HELICOS">HELICOS</option>
                  <option value="IONTORRENT">IONTORRENT</option>
                  <option value="PACBIO">PACBIO</option>
                </param>
              </when>
            </conditional>
          </when>
        </conditional>
      </inputs>
      <outputs>
        <data name="output" format="sam" />
      </outputs>
      <help>
        the wrapper of 'bwa mem' (path='__tj_tools/misc/bwa/bwa_mem.xml')
      </help>
    </tool>
    ```
  - #### Other references
    [bwa_mem.xml # readGroup](https://github.com/ohsu-comp-bio/compbio-galaxy-wrappers/blob/master/bwa/bwa_mem.xml#L165)

<br>

## BWA-MEM wrapper
- ### Wrapping BWA-MEM with Python
  - [BWA-MEM wrapper](https://github.com/ohsu-comp-bio/compbio-galaxy-wrappers/tree/master/bwa)
    - [bwa_mem.xml](https://github.com/ohsu-comp-bio/compbio-galaxy-wrappers/blob/master/bwa/bwa_mem.xml)
    - [bwa_mem.py](https://github.com/ohsu-comp-bio/compbio-galaxy-wrappers/blob/master/bwa/bwa_mem.py)
  - Deploying the reference genome
    - Option shown in the UI
      ![](https://i.imgur.com/LS24Ev3.png)
    - [tool_data_table_config_path (official documentation)](https://docs.galaxyproject.org/en/master/admin/config.html#tool-data-table-config-path)
      > XML config file that contains data table entries for the ToolDataTableManager. This file is manually maintained by the Galaxy administrator (.sample used if default does not exist).
    - More information
      - [Data Preparation documentation](https://galaxyproject.org/admin/data-preparation/)
    - Download the reference genome indexes built by the Galaxy team (Galaxy Datacache)
      - http://datacache.galaxyproject.org/
      - http://datacache.galaxyproject.org/indexes/hg19/
      - http://datacache.galaxyproject.org/indexes/hg19/hg19full/bwa_index/
    - Add a data-table entry and define the table's columns
      - Append the following to config/tool_data_table_conf.xml (or config/tool_data_table_conf.xml.sample)
        ```xml
        <table name="bwa_mem_indexes" comment_char="#">
            <columns>value, dbkey, name, path</columns>
            <file path="/Everythings/galaxy/tools/__tj_tools/misc/bwa/tool-data/bwa_index.loc" />
        </table>
        ```
    - In bwa_index.loc, list the reference genomes to use, with fields separated by tabs
      - bwa_index.loc
        ```
        human_g1k_v37	b37	human_g1k_v37_decoy	/Everythings/misc/bundle/b37/human_g1k_v37_decoy.fasta
        ```
      - Column 1: `<unique_build_id>`
        ID of the reference genome
      - Column 2: `<dbkey>`
        Commonly used reference genome code, e.g. b37(hg19), b38(hg38)
      - Column 3: `<display_name>`
        Option name shown in the UI
      - Column 4: `<file_path>`
        Actual location of the reference genome

<br>

## Complete bwa mem configuration (official version)
- File path: galaxy/database/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/bwa/01ac0a5fedc3/bwa/
  - bwa.xml
  - bwa-mem.xml
  - bwa_macros.xml
  - read_group_macros

<br>

## SAM-to-BAM
- ### Install the sam_to_bam package
  > Convert SAM format to BAM format.
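
The `.loc` files behind data tables such as `bwa_mem_indexes` above (and `fasta_indexes` in this section) are plain tab-separated text whose field order follows the `<columns>` element. The sketch below shows one way to read such a file into dictionaries so that a malformed line is caught before Galaxy is restarted. It is an illustration of the file layout only, not Galaxy's own loader; `parse_loc` and `COLUMNS` are hypothetical names, and the sample row is the one from bwa_index.loc.

```python
COLUMNS = ["value", "dbkey", "name", "path"]  # same order as the <columns> element


def parse_loc(text, comment_char="#"):
    """Parse tab-separated .loc content into a list of dicts keyed by COLUMNS."""
    rows = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        # Skip blank lines and comment lines (comment_char="#" in the <table> element).
        if not line.strip() or line.startswith(comment_char):
            continue
        fields = line.split("\t")
        if len(fields) != len(COLUMNS):
            raise ValueError(
                "line %d: expected %d tab-separated fields, got %d"
                % (line_no, len(COLUMNS), len(fields))
            )
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


sample = (
    "# value\tdbkey\tname\tpath\n"
    "human_g1k_v37\tb37\thuman_g1k_v37_decoy\t"
    "/Everythings/misc/bundle/b37/human_g1k_v37_decoy.fasta\n"
)

entries = parse_loc(sample)
print(entries[0]["dbkey"], entries[0]["path"])
```

A line separated by spaces instead of tabs is rejected with a `ValueError`, which is the mistake this kind of check is meant to surface.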
- ### A reference genome is required
  - The required data_table is named fasta_indexes
    ```xml
    <param name="index" type="select" label="Using reference genome">
      <options from_data_table="fasta_indexes">
        <filter column="dbkey" key="dbkey" ref="input1" type="data_meta" />
        <validator message="No reference genome is available for the build associated with the selected input dataset" type="no_options" />
      </options>
    </param>
    ```
  - Add a data-table entry and define the table's columns
    - Append the following to config/tool_data_table_conf.xml (or config/tool_data_table_conf.xml.sample)
      ```xml
      <table name="fasta_indexes" comment_char="#">
          <columns>value, dbkey, name, path</columns>
          <file path="tool-data/fasta_indexes.loc" />
      </table>
      ```
  - In fasta_indexes.loc, list the reference genomes to use, with fields separated by tabs
    - fasta_indexes.loc
      ```
      human_g1k_v37	hg_g1k_v37	human_g1k_v37_decoy	/Everythings/misc/bundle/b37/human_g1k_v37_decoy.fasta
      ```
    - Column 1: `<unique_build_id>`
      ID of the reference genome
    - Column 2: `<dbkey>`
      Commonly used reference genome code, e.g. b37(hg19), b38(hg38)
    - Column 3: `<display_name>`
      Option name shown in the UI
    - Column 4: `<file_path>`
      Actual location of the reference genome
  - Notes
    - The dbkey must be changed from b37 to hg_g1k_v37 so that it matches the built-in list; only then does the entry pass the ```<filter>```
    - Alternatively, remove the ```<filter>``` first while testing

## Command test notes
- ### [BWA-MEM](https://hackmd.io/XtsPHvS1RC25IlS6K2AcNA#runBWA)
  ```bash
  ./bwa mem -M -R '@RG\tID:D15780_S13_L001\tSM:D15780_S13_L001\tPL:Illumina' -t 2 \
      /Everythings/misc/bundle/b37/human_g1k_v37_decoy.fasta \
      /Everythings/dataset/D15780_S13_L001_R2.fastq.gz \
      -o D15780_S13_L001.sam
  ```
- ### [SAM-to-BAM](https://hackmd.io/XtsPHvS1RC25IlS6K2AcNA#runBWA)
  - Install the built-in package directly
  - sam_to_bam.xml
    ```database/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/sam_to_bam/cf1ffd88f895/sam_to_bam/sam_to_bam.xml```

- ### [SortSamSpark](https://hackmd.io/XtsPHvS1RC25IlS6K2AcNA#runSortSam)
  ```bash
  ./gatk SortSamSpark \
      --input D15780_S13_L001.bam \
      --output D15780_S13_L001.sorted.bam \
      --sort-order coordinate \
      --java-options "-XX:+UseNUMA -Xmx16G" \
      --tmp-dir . \
      -- \
      --spark-runner LOCAL --spark-master local[4] --conf spark.local.dir=./tmp

  # Note 1: the simpler form below also runs; the purpose of the trailing parameters above is unclear
  ./gatk SortSamSpark \
      --input D15780_S13_L001.bam \
      --output D15780_S13_L001.sorted.bam \
      --sort-order coordinate \
      --java-options "-XX:+UseNUMA -Xmx16G" \
      --tmp-dir .

  # Note 2:
  #   - Whether spark-runner is placed before or after the "--", the log shows it is detected either way
  #   - In testing, "--" had no observable effect; it appears to act as an empty argument
  #     that only separates the parameters for human readers
  ```
  - Troubleshooting
    - The tool expects its input to be a sam/bam file and identifies the format by the file extension
    - However, Galaxy names every input/output file with the extension .dat
    - As a result, the tool throws an exception
      ```
      A USER ERROR has occurred: Failed to read bam header from /Everythings/galaxy/database/files/000/dataset_142.dat
       Caused by:Cannot find format extension for /Everythings/galaxy/database/files/000/dataset_142.dat
      ```
    - Temporary workaround
      ```xml
      <command detect_errors="exit_code"><![CDATA[
      cp ${input} ${input}.bam;  ## copy it to *.bam so the format can be detected
      @CMD_BEGIN@ SortSamSpark
      ##include source=$bam_req_opts#
      -I ${input}.bam
      -O ${output}
      --sort-order "${sort_order}"
      ## #include source=$bam_opt_opts#
      --tmp-dir .
      --
      --spark-runner LOCAL --spark-master local[4] --conf spark.local.dir=./tmp
      ;
      rm -f '${input}.bam'
      ]]></command>
      ```
      - ```cp ${input} ${input}.bam; ## rename it to *.bam```

<br>
<hr>
<br>

## Cheetah
- ### Introduction
  - A free, open-source template engine
  - Also a code generation tool
  - Powered by Python 2/3
  - PyPI page: https://pypi.org/project/Cheetah3/
  - Cheetah User’s Guide: https://cheetahtemplate.org/users_guide/index.html

- ### Installation and usage
  - #### Installing the package
    - Install command
      ```pip install Cheetah3```
    - python2
      - [```sudo apt install python-pip```](https://blog.csdn.net/Mr_Cat123/article/details/79221012)
    - Handling a failed python3 installation
      - Error message
        >ERROR: Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/usr/lib/python3.5/site-packages' Consider using the `--user` option or check the permissions.
      - How to fix it (alternatively, as the error message suggests, install with the `--user` option)
        ```bash
        sudo python3 -m pip install Cheetah3
        ```
        - [[Errno 13] Permission denied How i solve this problem #236](https://github.com/googlesamples/assistant-sdk-python/issues/236)

  - #### Example 1 ([Quickstart tutorial](https://cheetahtemplate.org/users_guide/gettingStarted.html#quickstart-tutorial))
    ```python
    from Cheetah.Template import Template
    templateDef = """
    <HTML>
    <HEAD><TITLE>$title</TITLE></HEAD>
    <BODY>
    $contents
    ## this is a single-line Cheetah comment and won't appear in the output
    #* This is a multi-line comment and won't appear in the output
       blah, blah, blah
    *#
    </BODY>
    </HTML>"""
    nameSpace = {'title': 'Hello World Example', 'contents': 'Hello World!'}
    t = Template(templateDef, searchList=[nameSpace])
    print(t)
    ```
    Output:
    ```
    <HTML>
    <HEAD><TITLE>Hello World Example</TITLE></HEAD>
    <BODY>
    Hello World!
    </BODY>
    </HTML>
    ```
  - #### Example 2
    ```python
    from Cheetah.Template import Template
    templateDef = """
    #set $people = [{'name' : 'Tom', 'mood' : 'Happy'}, {'name' : 'Dick', 'mood' : 'Sad'}, {'name' : 'Harry', 'mood' : 'Hairy'}]

    <strong>How are you feeling?</strong>
    <ul>
      #for $person in $people
        <li> $person['name'] is $person['mood'] </li>
      #end for
    </ul>
    """
    print(Template(templateDef))
    ```
    Output:
    ```
    <strong>How are you feeling?</strong>
    <ul>
        <li> Tom is Happy </li>
        <li> Dick is Sad </li>
        <li> Harry is Hairy </li>
    </ul>
    ```
  - #### Example 3 ([https://cheetahtemplate.org/](https://cheetahtemplate.org/))
    ```cheetah
    #from Cheetah.Template import Template
    #extends Template

    #set $people = [{'name' : 'Tom', 'mood' : 'Happy'}, {'name' : 'Dick', 'mood' : 'Sad'}, {'name' : 'Harry', 'mood' : 'Hairy'}]

    <strong>How are you feeling?</strong>
    <ul>
      #for $person in $people
        <li> $person['name'] is $person['mood'] </li>
      #end for
    </ul>
    ```
    Fill the template
    ```bash
    $ cheetah fill test.py
    Filling test.py -> test.py.html
    ```
    Open test.py.html
    ![](https://i.imgur.com/kERluzP.png)

<br>

## References
- [Installing Tools into Galaxy](https://galaxyproject.org/admin/tools/add-tool-from-toolshed-tutorial/)
- [Adding custom tools to Galaxy](https://galaxyproject.org/admin/tools/add-tool-tutorial/)
- [Galaxy Tool XML File](https://docs.galaxyproject.org/en/latest/dev/schema.html)

<br>

## [On-Going] Tab list
- ### ESC4000
  - http://10.78.26.241:9696/
- ### Github
  - [compbio-galaxy-wrappers/gatk4/gatk4_markduplicates.xml](https://github.com/ohsu-comp-bio/compbio-galaxy-wrappers/blob/master/gatk4/gatk4_markduplicates.xml)
- ### Galaxy
  - [Galaxy Tool XML File](https://docs.galaxyproject.org/en/latest/dev/schema.html#tool-outputs-collection)
  - [Creating a histogram tool tutorial.](https://galaxyproject.org/admin/tools/adding-tools/)