# Giant weta assembly updates
## Running fmlrc on Mahuika
**Script for jobid 14270030**
```bash
# build the msbwt from the Illumina reads (skip if it already exists)
if [ ! -f ${asmdir}/weta_msbwt.npy ]; then
    # keep only the sequence lines, sort them, and build the BWT with ropebwt2;
    # the tr N<->T swaps make ropebwt2's symbol order match the msbwt format expected by fmlrc
    gunzip -c ${illuminadir}/*.fastq.gz | awk "NR % 4 == 2" | sort -T $TMPDIR | tr NT TN | ropebwt2 -LR | tr NT TN | fmlrc-convert ${asmdir}/weta_msbwt.npy
fi
# run fmlrc to correct the long reads (-e 400 stops after read index 400, i.e. only the first 400 reads are corrected)
NUM_PROCS=24
fmlrc -p $NUM_PROCS -e 400 ${asmdir}/weta_msbwt.npy ${datadir}/weta_?.fasta ${asmdir}/corrected_final.fa
```
**Chris' code that worked**
```bash
awk 'NR % 4 == 2' Trimmed_Hericium_all.fastq | sort | tr NT TN | ropebwt2 -LR | tr NT TN | fmlrc-convert all.npy
```
**fmlrc command line executed**
```bash
fmlrc -p 20 all.npy HKGP-3.6.1-alldata.fasta HKGP-3.6.1-alldata-FMLRC-corrected.fasta
```
**End of Chris' code that worked**
* **Script as it currently stands**
```bash
if [ ! -f ${asmdir}/weta_msbwt.npy ]; then
    gunzip -c ${illuminadir}/*.fastq.gz | awk "NR % 4 == 2" | sort -T $TMPDIR | tr NT TN | ropebwt2 -LR | tr NT TN | fmlrc-convert ${asmdir}/weta_msbwt.npy
fi
```
**Annabel: Split FMLRC**
```bash
# stream the gzipped fastq, keep the sequence lines, sort, and re-compress
zcat /home/awhi701/nobackup_02613/MYNA_SRA/TrimGalore_out/13099_extra.R2_val_2.fq.gz | awk 'NR % 4 == 2' | sort | gzip > FMLRC_13099_extra_trimmed_R2.sorted.txt.gz
```
**MDs attempt at split fmlrc**
```bash
if [ ! -f ${asmdir}/weta_msbwt.npy ]; then
awk "NR % 4 == 2" ${illuminadir}/H07456-L1_S1_L001_R1_001.fastq | sort |gzip > H07456-L1_S1_L001_R1_001_sorted.txt.gz
fi
tr NT TN H07456-L1_S1_L001_R1_001_sorted.txt.gz | ropebwt2 -LR | tr NT TN | fmlrc-convert ${amsdir}/weta_msbwt.npy
```
**conda location**
```
/nesi/project/landcare00070/mahuika_project/modules
```
```
vi ~/.condarc
```
```
channels:
- bioconda
- conda-forge
- defaults
- etetoolkit
pkgs_dirs:
- /nesi/project/landcare00070/mahuika_project/modules/pkgs
create_default_packages:
- setuptools
envs_dirs:
- /nesi/project/landcare00070/mahuika_project/modules
```
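With this config, packages and environments resolve to the shared project directory; a quick check (a sketch, assuming the Miniconda3 module used elsewhere in these notes):
```bash
module load Miniconda3
# confirm conda picked up the shared envs/pkgs locations from ~/.condarc
conda config --show envs_dirs pkgs_dirs
# environments created from now on should appear under the shared modules directory
conda env list
```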
***
**Updates since 18 August 2020**
**FMLRC**
* Job running, waiting for output
* 14270673 manpreet ga03048 fmlrc-correct R None 2020-08-18T15:07:15 1:42:36 2-22:17:24 1 24
* This job failed with the following error:
```
tr: extra operand ‘H07456-L1_S1_L001_R1_001_sorted.txt.gz’
Try 'tr --help' for more information.
[M::main_ropebwt2] inserted 1 symbols in 0.001 sec, 0.000 CPU sec
[M::main_ropebwt2] constructed FM-index in 0.002 sec, 0.000 CPU sec
[M::main_ropebwt2] symbol counts: ($, A, C, G, T, N) = (1, 0, 0, 0, 0, 0)
[M::main] Version: r187
[M::main] CMD: ropebwt2 -LR
[M::main] Real time: 0.002 sec; CPU: 0.009 sec
/var/spool/slurm/job14270673/slurm_script: line 37: 54178 Exit 1 tr NT TN H07456-L1_S1_L001_R1_001_sorted.txt.gz
54179 Done | ropebwt2 -LR
54180 Done | tr NT TN
54181 Segmentation fault (core dumped) | fmlrc-convert ${amsdir}/weta_msbwt.npy
```
**Updates 19 Aug 2020**
* Looks like the first half of the script worked. Output > sorted.txt.gz
* But the second step failed at the `tr` call, thought at the time to be a path error.
* Error fixed and the job has been resubmitted.
* New error (the `tr` extra-operand problem persists, so the msbwt again contains only the single end-of-string symbol):
```
tr: extra operand ‘H07456-L1_S1_L001_R1_001_sorted.txt.gz’
Try 'tr --help' for more information.
[M::main_ropebwt2] inserted 1 symbols in 0.001 sec, 0.000 CPU sec
[M::main_ropebwt2] constructed FM-index in 0.001 sec, 0.000 CPU sec
[M::main_ropebwt2] symbol counts: ($, A, C, G, T, N) = (1, 0, 0, 0, 0, 0)
[M::main] Version: r187
[M::main] CMD: ropebwt2 -LR
[M::main] Real time: 0.001 sec; CPU: 0.003 sec
[fmlrc-convert] Reading from stdin
[fmlrc-convert] symbol counts ($, A, C, G, N, T) = (1, 0, 0, 0, 0, 0)
[fmlrc-convert] RLE-BWT byte length: 1
[fmlrc-convert] RLE-BWT conversion complete.
ERROR: Fasta file does not exist
```
* Error fixed; job resubmitted.
* New issue:
* The script was run in three successive steps:
1. Illumina reads -> sorted.txt.gz
```bash
awk "NR % 4 == 2" ${illuminadir}/H07456-L1_S1_L001_R1_001.fastq | sort |gzip > H07456-L1_S1_L001_R1_001_sorted.txt.gz
```
2. tr step > msbwt.npy
```bash
tr NT TN H07456-L1_S1_L001_R1_001_sorted.txt.gz | ropebwt2 -LR | tr NT TN | fmlrc-convert ${asmdir}/weta_msbwt.npy
```
3. correction via fmlrc
```bash
fmlrc -p $NUM_PROCS -e 400 ${asmdir}/weta_msbwt.npy ${datadir}/*.fasta ${asmdir}/corrected_final.fa
```
stdout:
```
loaded bwt with 1 compressed values
Finished processing reads [0, 400)
```
This did not produce a corrected_final.fa; instead one of the input long-read fasta files is now a fraction of its original size. WHAT? Likely explanation: the msbwt held only a single symbol (the step-2 `tr` call never read the sorted file), and because `${datadir}/*.fasta` expands to several files, one of the input fasta files most likely ended up in the output-file argument position and was overwritten with the tiny corrected output.
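For reference, a corrected version of the split pipeline (a sketch: `zcat` streams the gzipped sorted reads because `tr` only reads stdin, and fmlrc is given exactly one long-read fasta; file names other than those already used above are illustrative):
```bash
# step 2, corrected: stream the sorted reads into tr instead of passing a filename
zcat H07456-L1_S1_L001_R1_001_sorted.txt.gz | tr NT TN | ropebwt2 -LR | tr NT TN | fmlrc-convert ${asmdir}/weta_msbwt.npy

# step 3: one input fasta plus an explicit output path, so no input file sits in the output-argument position
fmlrc -p $NUM_PROCS ${asmdir}/weta_msbwt.npy ${datadir}/weta_1.fasta ${asmdir}/corrected_final.fa
```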
***Update on 24 August 2020***
Job 14270030 (full dataset) was killed as below: ropebwt2 was OOM-killed while building the BWT, so the full-dataset build likely needs a much larger memory allocation (or the split approach above).
```bash
[M::main_ropebwt2] inserted 10415295816 symbols in 889.926 sec, 2454.212 CPU sec
[M::main_ropebwt2] inserted 10415295816 symbols in 908.420 sec, 2480.353 CPU sec
[M::main_ropebwt2] inserted 10415295816 symbols in 934.089 sec, 2484.000 CPU sec
/var/spool/slurm/job14270030/slurm_script: line 35: 70496 Done
gunzip -c ${illuminadir}/*.fastq.gz
70497 | awk "NR % 4 == 2"
70498 Broken pipe | sort -T $TMPDIR
70499 Broken pipe | tr NT TN
70500 Killed | ropebwt2 -LR
70501 | tr NT TN
70502 Segmentation fault (core dumped) | fmlrc-convert ${amsdir}/weta_msbwt.npy
slurmstepd: error: Detected 1 oom-kill event(s) in step 14270030.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.
```
## Raven
* Job running with mecat-corrected data
* 14271250 manpreet landcar pb-assemble-ra R None 2020-08-18T16:47:52 1:59 1-23:58:01 1 48
* End of slurm out:
```
[raven::Graph::Polish] reached checkpoint 10.279901s
[raven::] 7610.063556s
```
* output = mecat-raven-asm1.fasta
* sacct: 02:07:05 2-18:48:54 48 4620535+ COMPLETED
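For reference, the raven invocation behind these runs would have been roughly of this form (a sketch: the input file name is illustrative, the thread count matches the 48 CPUs above, and raven writes the assembly as FASTA to stdout):
```bash
# assemble the mecat-corrected PacBio reads with 48 threads
raven -t 48 mecat_corrected_subreads.fasta > mecat-raven-asm1.fasta
```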
*Updates 19 Aug 2020*
* Running the quast report for this assembly (a sketch of the invocation follows below)
* Assembly length: 310817757 (~0.31 Gb)
* Resubmitted the assembly job with the full (uncorrected) dataset: 14285453
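The quast step mentioned above would have been along these lines (a sketch; the thread count and output directory name are illustrative):
```bash
# basic contiguity report for the mecat+raven assembly
quast.py -t 8 -o quast_mecat_raven mecat-raven-asm1.fasta
```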
***Updates from 24-08-2020***
* Job exited with a timeout. Resubmitted (14335221) with the '--resume' flag.
```bash
[manpreet.dhami@mahuika02 PacBIO]$ less slurm-14335221.out
[raven::] loaded previous run 5.006878s
[raven::] 5.038449s
```
* Job failed almost immediately after this message. sacct says the following:
```bash
JobID JobName Elapsed TotalCPU Alloc MaxRSS State
-------------- -------------- ----------- ------------ ----- -------- ----------
14335221 pb-assemble-r+ 00:00:13 00:07.165 48 COMPLETED
14335221.batch batch 00:00:13 00:07.163 48 875K COMPLETED
14335221.exte+ extern 00:00:13 00:00.001 48 771K COMPLETED
```
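Tables like the one above can be reproduced with a sacct query of roughly this form (a sketch; the format fields are inferred from the column headers, not necessarily the exact command used):
```bash
sacct -j 14335221 --format=JobID,JobName,Elapsed,TotalCPU,AllocCPUS,MaxRSS,State
```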
* Job resubmitted with a different output file name, in case it's a write issue.
```bash
[raven::] loaded previous run 5.015174s
[raven::] 5.047051s
```
* The job ran when resubmitted without the --resume flag: 14335243
*Updates from 25 Aug 2020*
* Completed
```bash
[racon::Polisher::Polish] called consensus for 267160 / 267160 windows [================] 1274.289830s
[raven::Graph::Polish] reached checkpoint 8.304683s
[raven::] 6357.015190s
JobID JobName Elapsed TotalCPU Alloc MaxRSS State
-------------- -------------- ----------- ------------ ----- -------- ----------
14335243 pb-assemble-r+ 00:47:10 1-03:54:05 48 COMPLETED
14335243.batch batch 00:47:10 1-03:54:05 48 4503528+ COMPLETED
14335243.exte+ extern 00:47:11 00:00.001 48 291K COMPLETED
```
* Another job submitted with the same uncorrected pb dataset, but without the --weaken flag: 14335500
* COMPLETED
```bash
[racon::Polisher::Polish] called consensus for 13522 / 13522 windows [================] 1295.214860s
[raven::Graph::Polish] reached checkpoint 7.215402s
[raven::] 2806.383468s
JobID JobName Elapsed TotalCPU Alloc MaxRSS State
-------------- -------------- ----------- ------------ ----- -------- ----------
14335500 pb-assemble-r+ 01:46:20 2-08:45:39 48 COMPLETED
14335500.batch batch 01:46:20 2-08:45:39 48 6288458+ COMPLETED
14335500.exte+ extern 01:46:21 00:00.002 48 825K COMPLETED
```
* Assembly size (without --weaken flag, full pb dataset) = 0.13 GB
## Mecat correction + assembly
* resubmitted with coverage = 10
* 14272160 manpreet ga03048 mecat-correct R None 2020-08-18T17:23:36 0:05 23:59:55 1 16
* job timed out.
*Updates 19 Aug 2020*
* Resubmitted with 72 hours: 14285230
* Job completed in 68 h. But assembly is 0.02 GB. Poor. Mecat ruled out.
*****
## Ratatosk
* Submitted a trial job with 1 pb read file, 1 lane of Illumina, 16 CPUs, 40 GB mem, and 24 hours: 14285852
* Job killed with a bus error:
```bash
CompactedDBG::filter(): Number of blocks in Bloom filter is 36724071
/var/spool/slurm/job14285852/slurm_script: line 31: 145625 Bus error
```
* Resubmitted with 80 GB mem x 24 cores: 14286903 (a sketch of the implied job script follows)
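A sketch of what the resubmitted job script implies (the SBATCH directives reflect the 80 GB / 24-core request above, the job name and account are illustrative, and the Ratatosk call is the one that appears in the error log further down):
```bash
#!/bin/bash -e
#SBATCH --job-name=ratatosk-correct
#SBATCH --account=ga03048
#SBATCH --time=24:00:00
#SBATCH --mem=80G
#SBATCH --cpus-per-task=24

# hybrid correction of one PacBio read file with one lane of Illumina reads
$Ratatosk -v -c 24 -s ${scriptdir}/illumina_readlist_1lane.txt -l ${scriptdir}/pb_readlist_1file.txt -o pb-ratatosk-1.fa
```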
*Updates from 24 Aug 2020*
* Exited with stupid mkdir error. Resubmitted with correction: 14335227
* Job exited with a bus error as follows:
```bash
CompactedDBG::construct(): Joining unitigs
CompactedDBG::construct(): After join: 201427129 unitigs
CompactedDBG::construct(): Joined 144829996 unitigs
Ratatosk::Ratatosk(): Adding coverage to vertices and edges (1/2).
Ratatosk::addCoverage(): Anchoring reads on graph.
/var/spool/slurm/job14335227/slurm_script: line 31: 57513 Bus error (core dumped) $Ratatosk -v -c 24 -s ${scriptdir}/illumina_readlist_1lane.txt -l ${scriptdir}/pb_readlist_1file.txt -o pb-ratatosk-1.fa
```
*****
## FALCON PIPELINE
We have Hi-C data along with the PacBio data, so this pipeline is potentially worth testing out.
https://github.com/audreystott/dunnart
Falcon Nextflow pipeline installed and tested by Joseph; ran into a permissions issue. [03/02/2021]
Install dir:
```bash=
/nesi/project/landcare00070/mahuika_project/scripts/genome_assembly/PacBIO/josephtest
nextflow falcon/main.nf
```
Manpreet tested the nextflow pipeline as below:
[18 Feb 2021]
```bash=
module load Nextflow/21.02.0
module load Miniconda3
nextflow falcon/main.nf
```
Log output (nextflow.log):
```bash=
N E X T F L O W ~ version 21.02.0-edge
Launching `falcon/main.nf` [sick_mclean] - revision: 57cb2d38cf
executor > slurm (1)
[b7/2ba2a0] process > fc_run [ 0%] 0 of 1
[- ] process > fc_unzip -
[- ] process > fc_phase -
Error executing process > 'fc_run'
Caused by:
Process `fc_run` terminated with an error exit status (1)
Command executed:
sed -i "s/outs.write('/#outs.write('/" /scale_wlg_persistent/filesets/project/landcare00070/mahuika_project/scripts/genome_assembly/PacBIO/josephtest/work/conda/*/lib/python3.7/site-packages/falcon_kit/mains/ovlp_filter.py
fc_run /scale_wlg_persistent/filesets/project/landcare00070/mahuika_project/scripts/genome_assembly/PacBIO/josephtest/fc_run.cfg
Command exit status:
1
Command output:
(empty)
Command error:
"job.step.dust": {},
"job.step.la": {
"MB": "32768",
"NPROC": "4",
"njobs": "240"
},
"job.step.pda": {
"MB": "32768",
"NPROC": "4",
"njobs": "240"
},
"job.step.pla": {
"MB": "32768",
"NPROC": "4",
"njobs": "240"
}
}
[INFO]In simple_pwatcher_bridge, pwatcher_impl=<module 'pwatcher.blocking' from '/home/manpreet.dhami/.local/lib/python3.7/site-packages/pypeflow-2.1.1+git.d63b0e79f5a7b2d370b7de84a890f88271afa476-py3.7.egg/pwatcher/blocking.py'>
[INFO]job_type='slurm', (default)job_defaults={'job_type': 'slurm', 'pwatcher_type': 'blocking', 'JOB_QUEUE': 'default', 'MB': '102400', 'NPROC': '6', 'njobs': '32', 'submit': 'srun \\\n-J ${JOB_NAME} \\\n--mem=${MB}M \\\n--cpus-per-task=${NPROC} \\\n"${JOB_SCRIPT}"', 'use_tmpdir': False}, use_tmpdir=False, squash=False, job_name_style=0
[INFO]Setting max_jobs to 32; was None
[INFO]Num unsatisfied: 2, graph: 2
[INFO]About to submit: Node(0-rawreads/build)
[INFO]Popen: 'srun \
-J P445a2258be0a69 \
--mem=4000M \
--cpus-per-task=1 \
"/home/manpreet.dhami/.local/lib/python3.7/site-packages/pypeflow-2.1.1+git.d63b0e79f5a7b2d370b7de84a890f88271afa476-py3.7.egg/pwatcher/mains/job_start.sh"'
[INFO](slept for another 0.0s -- another 1 loop iterations)
srun: error: Unable to create step for job 18100314: Memory required by task is not available
[ERROR]Task Node(0-rawreads/build) failed with exit-code=1
[ERROR]Some tasks are recently_done but not satisfied: {Node(0-rawreads/build)}
[ERROR]ready: set()
submitted: set()
[ERROR]Noop. We cannot kill blocked threads. Hopefully, everything will die on SIGTERM.
Traceback (most recent call last):
File "/scale_wlg_persistent/filesets/project/landcare00070/mahuika_project/scripts/genome_assembly/PacBIO/josephtest/work/conda/pb-assembly-71bfdfbd43464806dce898a62108a83c/bin/fc_run", line 11, in <module>
load_entry_point('falcon-kit==1.8.1', 'console_scripts', 'fc_run')()
File "/scale_wlg_persistent/filesets/project/landcare00070/mahuika_project/scripts/genome_assembly/PacBIO/josephtest/work/conda/pb-assembly-71bfdfbd43464806dce898a62108a83c/lib/python3.7/site-packages/falcon_kit/mains/run1.py", line 706, in main
main1(argv[0], args.config, args.logger)
File "/scale_wlg_persistent/filesets/project/landcare00070/mahuika_project/scripts/genome_assembly/PacBIO/josephtest/work/conda/pb-assembly-71bfdfbd43464806dce898a62108a83c/lib/python3.7/site-packages/falcon_kit/mains/run1.py", line 73, in main1
input_fofn_fn=input_fofn_fn,
File "/scale_wlg_persistent/filesets/project/landcare00070/mahuika_project/scripts/genome_assembly/PacBIO/josephtest/work/conda/pb-assembly-71bfdfbd43464806dce898a62108a83c/lib/python3.7/site-packages/falcon_kit/mains/run1.py", line 235, in run
dist=Dist(NPROC=4, MB=4000, job_dict=config['job.step.da']),
File "/scale_wlg_persistent/filesets/project/landcare00070/mahuika_project/scripts/genome_assembly/PacBIO/josephtest/work/conda/pb-assembly-71bfdfbd43464806dce898a62108a83c/lib/python3.7/site-packages/falcon_kit/pype.py", line 106, in gen_parallel_tasks
wf.refreshTargets()
File "/home/manpreet.dhami/.local/lib/python3.7/site-packages/pypeflow-2.1.1+git.d63b0e79f5a7b2d370b7de84a890f88271afa476-py3.7.egg/pypeflow/simple_pwatcher_bridge.py", line 278, in refreshTargets
self._refreshTargets(updateFreq, exitOnFailure)
File "/home/manpreet.dhami/.local/lib/python3.7/site-packages/pypeflow-2.1.1+git.d63b0e79f5a7b2d370b7de84a890f88271afa476-py3.7.egg/pypeflow/simple_pwatcher_bridge.py", line 362, in _refreshTargets
raise Exception(msg)
Exception: Some tasks are recently_done but not satisfied: {Node(0-rawreads/build)}
Work dir:
/scale_wlg_persistent/filesets/project/landcare00070/mahuika_project/scripts/genome_assembly/PacBIO/josephtest/work/96/6ec0f232064acd68fe7ee260e92bb0
Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`
```
There seems to be an issue with memory allocation. In fc_run.cfg every step requests more than 512 MB, but the actual allocation appears stuck at 2 CPUs x 512 MB mem, which is Slurm's default when a job carries no allocation info. So possibly an issue with the srun submit blocks?
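For context, the Slurm submit template lives in fc_run.cfg; based on the job_defaults dump in the log above (and the account/time fixes described below), the relevant block looks roughly like this. This is a sketch, not the exact file:
```
[job.defaults]
job_type = slurm
pwatcher_type = blocking
JOB_QUEUE = default
MB = 102400
NPROC = 6
njobs = 32
submit = srun -J ${JOB_NAME} \
    --account=landcare00070 \
    --time=7-00:00:00 \
    --mem=${MB}M \
    --cpus-per-task=${NPROC} \
    "${JOB_SCRIPT}"
```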
[19/02/2021]
Troubleshooting with Joseph & Dini.
It seems that Slurm is not allocating the requested memory correctly. We have tried adding account=landcare00070, but still get the same error.
Perhaps an environment variable set up by Pypeflow is confusing Slurm. Trying to get Falcon to submit jobs locally rather than back to Slurm. New error message (stderr file):
```bash=
/scale_wlg_persistent/filesets/project/landcare00070/mahuika_project/scripts/genome_assembly/PacBIO/josephtest/work/fe/e2bcae55212683e03290e0ee5d6c53/0-rawreads/build/run-P837d7e819ddee0.bash.stderr
```
This made Falcon work, but the allocations are now the same for every step. To avoid a forced kill due to insufficient time, srun has been given a hard allocation of a 7-day run time for each step.
Have also moved the falcon run directory and cfg files to a new location:
```bash=
/nesi/nobackup/ga03048/assemblies/falcon/
```
And running nextflow -> falcon from there. The nobackup filesystem shouldn't run out of space.
The run has started and the logs for the first step (0-rawreads) are here:
```bash=
/nesi/nobackup/ga03048/assemblies/falcon/work/91/4de6d4162742c3235cb93b0d5116c3/0-rawreads/build
```
The first step of 0-rawreads (fasta2DB conversion) is currently running!!! Haven't seen this since June last year :)
***Update 26-05-2021***
Error: Not enough reads for desired coverage.
```
Adding 'weta_m54214_200421_180821.subreads.fasta.gz.fasta' ...
INFO:root: #7 count= 91,479
INFO:root: #12 count= 179,589
INFO:root: #25 count= 341,593
INFO:root: #60 count= 670,460
INFO:root: #109 count= 1,312,000
INFO:root: #211 count= 2,560,965
INFO:root: #435 count= 5,068,225
INFO:root: #849 count= 10,060,264
INFO:root: #1,735 count= 20,040,510
INFO:root: #3,484 count= 39,984,665
INFO:root: #6,827 count= 79,859,225
INFO:root: #13,472 count= 159,609,520
INFO:root: #26,969 count= 319,115,009
INFO:root: #54,247 count= 638,117,229
INFO:root: #109,371 count= 1,276,127,202
INFO:root: #219,004 count= 2,552,122,329
INFO:root: #436,041 count= 5,104,112,108
INFO:root: #871,997 count= 10,208,061,125
+ read fn
#cat fc.fofn | xargs rm -f
DBdust raw_reads
+ DBdust raw_reads
DBsplit -f -x500 -s400 -a raw_reads
+ DBsplit -f -x500 -s400 -a raw_reads
#LB=$(cat raw_reads.db | LD_LIBRARY_PATH= awk '$1 == "blocks" {print $3}')
#echo -n $LB >| db_block_count
CUTOFF=$(python3 -m falcon_kit.mains.calc_cutoff --coverage 10.0 6500000000 <(DBstats -b1 raw_reads))
++ python3 -m falcon_kit.mains.calc_cutoff --coverage 10.0 6500000000 /dev/fd/63
+++ DBstats -b1 raw_reads
falcon-kit 1.8.1 (pip thinks "falcon-kit 1.8.1")
pypeflow 2.1.1+git.d63b0e79f5a7b2d370b7de84a890f88271afa476
Traceback (most recent call last):
File "/scale_wlg_nobackup/filesets/nobackup/ga03048/assemblies/falcon/work/conda/pb-assembly-71bfdfbd43464806dce898a62108a83c/lib/python3.7/site-packages/falcon_kit/mains/calc_:
+ CUTOFF=
[WARNING]Call 'bash -vex build_db.sh' returned 256.
Traceback (most recent call last):
File "/scale_wlg_nobackup/filesets/nobackup/ga03048/assemblies/falcon/work/conda/pb-assembly-71bfdfbd43464806dce898a62108a83c/lib/python3.7/runpy.py", line 193, in _run_module_
as_main
"__main__", mod_spec)
File "/scale_wlg_nobackup/filesets/nobackup/ga03048/assemblies/falcon/work/conda/pb-assembly-71bfdfbd43464806dce898a62108a83c/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/scale_wlg_nobackup/filesets/nobackup/ga03048/assemblies/falcon/work/conda/pb-assembly-71bfdfbd43464806dce898a62108a83c/lib/python3.7/site-packages/falcon_kit/mains/dazzl
er.py", line 1532, in <module>
main()
File "/scale_wlg_nobackup/filesets/nobackup/ga03048/assemblies/falcon/work/conda/pb-assembly-71bfdfbd43464806dce898a62108a83c/lib/python3.7/site-packages/falcon_kit/mains/dazzler.py", line 1528, in main
args.func(args)
File "/scale_wlg_nobackup/filesets/nobackup/ga03048/assemblies/falcon/work/conda/pb-assembly-71bfdfbd43464806dce898a62108a83c/lib/python3.7/site-packages/falcon_kit/mains/dazzler.py", line 1050, in cmd_build
build_db(ours, args.input_fofn_fn, args.db_fn, args.length_cutoff_fn)
File "/scale_wlg_nobackup/filesets/nobackup/ga03048/assemblies/falcon/work/conda/pb-assembly-71bfdfbd43464806dce898a62108a83c/lib/python3.7/site-packages/falcon_kit/mains/dazzler.py", line 169, in build_db
io.syscall('bash -vex {}'.format(script_fn))
File "/home/manpreet.dhami/.local/lib/python3.7/site-packages/pypeflow-2.1.1+git.d63b0e79f5a7b2d370b7de84a890f88271afa476-py3.7.egg/pypeflow/io.py", line 29, in syscall
raise Exception(msg)
Exception: Call 'bash -vex build_db.sh' returned 256.
2021-02-19 14:21:40,190 - root - WARNING - Call '/bin/bash user_script.sh' returned 256.
2021-02-19 14:21:40,234 - root - WARNING - CD: '/scale_wlg_nobackup/filesets/nobackup/ga03048/assemblies/falcon/work/91/4de6d4162742c3235cb93b0d5116c3/0-rawreads/build' -> '/scale_wlg_nobackup/filesets/nobackup/ga03048/assemblies/falcon/work/91/4de6d4162742c3235cb93b0d5116c3/0-rawreads/build'
2021-02-19 14:21:40,235 - root - WARNING - CD: '/scale_wlg_nobackup/filesets/nobackup/ga03048/assemblies/falcon/work/91/4de6d4162742c3235cb93b0d5116c3/0-rawreads/build' -> '/scale_wlg_nobackup/filesets/nobackup/ga03048/assemblies/falcon/work/91/4de6d4162742c3235cb93b0d5116c3/0-rawreads/build'
2021-02-19 14:21:40,264 - root - CRITICAL - Error in /home/manpreet.dhami/.local/lib/python3.7/site-packages/pypeflow-2.1.1+git.d63b0e79f5a7b2d370b7de84a890f88271afa476-py3.7.egg/pypeflow/do_task.py with args="{'json_fn': '/scale_wlg_nobackup/filesets/nobackup/ga03048/assemblies/falcon/work/91/4de6d4162742c3235cb93b0d5116c3/0-rawreads/build/task.json',\n 'timeout': 30,\n 'tmpdir': None}"
Traceback (most recent call last):
File "/scale_wlg_nobackup/filesets/nobackup/ga03048/assemblies/falcon/work/conda/pb-assembly-71bfdfbd43464806dce898a62108a83c/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/scale_wlg_nobackup/filesets/nobackup/ga03048/assemblies/falcon/work/conda/pb-assembly-71bfdfbd43464806dce898a62108a83c/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/manpreet.dhami/.local/lib/python3.7/site-packages/pypeflow-2.1.1+git.d63b0e79f5a7b2d370b7de84a890f88271afa476-py3.7.egg/pypeflow/do_task.py", line 267, in <module>
main()
File "/home/manpreet.dhami/.local/lib/python3.7/site-packages/pypeflow-2.1.1+git.d63b0e79f5a7b2d370b7de84a890f88271afa476-py3.7.egg/pypeflow/do_task.py", line 259, in main
run(**vars(parsed_args))
File "/home/manpreet.dhami/.local/lib/python3.7/site-packages/pypeflow-2.1.1+git.d63b0e79f5a7b2d370b7de84a890f88271afa476-py3.7.egg/pypeflow/do_task.py", line 253, in run
run_cfg_in_tmpdir(cfg, tmpdir, '.')
File "/home/manpreet.dhami/.local/lib/python3.7/site-packages/pypeflow-2.1.1+git.d63b0e79f5a7b2d370b7de84a890f88271afa476-py3.7.egg/pypeflow/do_task.py", line 228, in run_cfg_in_tmpdir
run_bash(bash_template, myinputs, myoutputs, parameters)
File "/home/manpreet.dhami/.local/lib/python3.7/site-packages/pypeflow-2.1.1+git.d63b0e79f5a7b2d370b7de84a890f88271afa476-py3.7.egg/pypeflow/do_task.py", line 187, in run_bash
util.system(cmd)
File "/home/manpreet.dhami/.local/lib/python3.7/site-packages/pypeflow-2.1.1+git.d63b0e79f5a7b2d370b7de84a890f88271afa476-py3.7.egg/pypeflow/io.py", line 29, in syscall
raise Exception(msg)
Exception: Call '/bin/bash user_script.sh' returned 256.
+++ pwd
++ echo 'FAILURE. Running top in /scale_wlg_nobackup/filesets/nobackup/ga03048/assemblies/falcon/work/91/4de6d4162742c3235cb93b0d5116c3/0-rawreads/build (If you see -terminal database is inaccessible- you are using the python bin-wrapper, so you will not get diagnostic info. No big deal. This process is crashing anyway.)'
++ rm -f top.txt
++ which python
++ which top
++ env -u LD_LIBRARY_PATH top -b -n 1
++ env -u LD_LIBRARY_PATH top -b -n 1
++ pstree -apl
real 105m27.818s
user 97m12.563s
sys 2m33.992s
+ finish
+ echo 'finish code: 1'
```
fc_run.cfg seed_coverage option changed to 5. (calc_cutoff needs roughly seed_coverage x genome_size bases of raw reads, i.e. 10 x 6.5 Gb = 65 Gb at the previous setting, so lowering the coverage lowers the required input.)
Job resubmitted.
Job id: 20177507
ERROR:
```
fasta2DB: raw_reads.db is corrupted, read failed
```
Restarted the job.
Job finished with the following sacct output (note the FAILED state):
```
JobID JobName Alloc Elapsed TotalCPU ReqMem MaxRSS State
--------------- ---------------- ----- ----------- ------------ ------- -------- ----------
20188290 Pd5b516a622eb5e 2 01:43:46 01:43:14 4000Mc FAILED
20188290.extern extern 2 01:43:46 00:00.001 4000Mc 0 COMPLETED
20188290.0 Pd5b516a622eb5e 2 01:43:44 01:43:14 4000Mc 239252K COMPLETED
```
Location of .stderr output:
```
/nesi/nobackup/ga03048/assemblies/falcon/work/e4/7440b9ac7ba3660659ca9ee0fcdc24/0-rawreads/build
```
Looks like one of the input files may have an issue. The following file was removed from subreads.fasta.fofn:
```
/nesi/project/ga03048/weta/weta_m54219_190817_073801.subreads.fasta.gz
```
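A quick way to check whether that file is actually damaged (gzip -t only verifies the compression stream, not the FASTA content):
```bash
# exit status 0 means the gzip stream is intact
gzip -t /nesi/project/ga03048/weta/weta_m54219_190817_073801.subreads.fasta.gz && echo "gzip stream OK"
```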
Job resubmitted.
- Still having issues with "raw_reads.db is corrupted", but the reads keep being added and the job moves past this error.
- New error this time:
```
slurmstepd: error: *** STEP 20191425.0 ON wbl001 CANCELLED AT 2021-05-27T09:33:46 ***
```
Not sure what this is about.
Location of error file:
```
/nesi/nobackup/ga03048/assemblies/falcon/work/17/8441484c815d57fcfaf2ffea685587/0-rawreads/build
```
Another file was giving the same raw_reads.db corrupted error, so it was removed as well.
Re-running now.
- Same issue. A bit of googling suggests this may be the cause:
- https://github.com/PacificBiosciences/pbbioconda/issues/111
- Multiple jobs being submitted concurrently leads to the raw_reads.db corrupted error, rather than an actual problem with the read files.
*****
## Canu correction
```bash=
CRASH: Last 50 lines of the relevant log file (correction/weta-asm4.ovlStore.BUILDING/scripts/1-bucketize.jobSubmit-01.out):
CRASH:
CRASH: sbatch: error: Please specify one of your project codes as the Slurm account for this job.
CRASH: sbatch: error: AssocMaxSubmitJobLimit
CRASH: sbatch: error: Batch job submission failed: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)
CRASH:
srun: error: wbh001: task 0: Exited with exit code 1
```
### Potential solution
Added the grid option to include the Slurm account, so Canu's own sbatch submissions carry a project code (a sketch follows below).
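A minimal sketch of how that could look, assuming Canu's `gridOptions` parameter is used to pass extra sbatch flags (the prefix and directory come from the crash log above; the genome size, account, and read paths are illustrative, not the exact command that was run):
```bash
# pass a Slurm account to every job Canu submits to the grid
canu -correct -p weta-asm4 -d correction \
    genomeSize=6.5g \
    useGrid=true \
    gridOptions="--account=landcare00070" \
    -pacbio-raw /nesi/project/ga03048/weta/*.subreads.fasta.gz
```
Note: the long-read flag is `-pacbio-raw` for Canu 1.x; Canu 2.x uses `-pacbio`.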