# HLA-on-aws

Upload your fastq, and we will run the HLA pipeline for you on AWS.

Github: https://github.com/linnil1/hla-on-aws

## Architecture

### API

API logic

1. I deploy my Nuxt app on Cloudflare
2. API Gateway + Lambda serve as the API server
3. Lambda gives the user an ID and a temporary s3_url
4. The user uploads fastq into s3
5. Lambda triggers the step function
6. Lambda retrieves the status for a specific ID from s3 (I store status in s3, not in a database XD)

### step function

AWS Step Functions = data pipeline

1. Lambda copies the s3 object to EFS
2. Batch runs hisat2 or bwakit
3. Lambda parses the result and uploads it to s3
4. Lambda sets the running status

## Setup IAM by root user (In console)

I create a HLA user/group to run the aws commands below. The user has the rights to create, list, update, and write the settings, so the permissions are not minimal. (This excludes the IAM part — IAM is the only part we need to set up manually.)

Here is the policy

* AmazonEC2FullAccess
* AmazonEC2ContainerRegistryFullAccess
* AmazonS3FullAccess
* AmazonAPIGatewayAdministrator
* AWSBatchFullAccess
* AmazonVPCFullAccess
* AmazonElasticFileSystemFullAccess
* AWSStepFunctionsFullAccess
* AWSLambda_FullAccess
* Permission_for_assigning_role_to_service(Inline Policy)

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "*"
        }
    ]
}
```

## AWS cli

After the user is set up, you can get the access key and token under its security credentials. Then configure your awscli (the AWS command line tool). I recommend setting it as the default profile; otherwise you will have to add `--profile=awshla` to every command.
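Alternatively, the AWS CLI also reads the standard `AWS_PROFILE` environment variable, so exporting it once per shell session is equivalent to passing `--profile` each time:

```bash
# Equivalent to adding --profile=awshla to every command:
# the AWS CLI (and the AWS SDKs) pick up AWS_PROFILE from the environment.
export AWS_PROFILE=awshla
```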
```bash
pip install awscli
aws configure --profile=awshla
```

## Setup IAM role (In console)

Here are our roles

### Lambda

Name: hla-lambda

* AWSLambdaVPCAccessExecutionRole
* Permission_for_trigger_stepfunctions(Inline Policy)

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "states:StartExecution",
            "Resource": "arn:aws:states:us-east-2:493445452763:stateMachine:hla"
        }
    ]
}
```

* Permission_for_readwrite_s3(Inline Policy)

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject"
            ],
            "Resource": "arn:aws:s3:::hla-bucket/*"
        }
    ]
}
```

### Step Functions

Name: hla-step

* AWSBatchServiceRole
* AWSLambdaRole
* Permission_for_run_batch(Inline Policy) (it's needed)

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "batch:SubmitJob",
                "batch:DescribeJobs",
                "batch:TerminateJob"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "events:PutTargets",
                "events:PutRule",
                "events:DescribeRule"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}
```

### API gateway

Name: hla-api

* AWSLambdaRole

### Batch

Name: hla

* AmazonECSTaskExecutionRolePolicy

## Index prepare

The index data will be saved into EFS.

### bwakit

Follow https://github.com/lh3/bwa/tree/master/bwakit#introduction

```bash
wget http://sourceforge.net/projects/bio-bwa/files/bwakit/bwakit-0.7.12_x64-linux.tar.bz2/download -O bwakit-0.7.12_x64-linux.tar.bz2
tar xf bwakit-0.7.12_x64-linux.tar.bz2
cd bwa.kit
dk quay.io/biocontainers/bwakit:0.7.17.dev1--0 run-gen-ref hs38DH
dk quay.io/biocontainers/bwakit:0.7.17.dev1--0 bwa index hs38DH.fa
mkdir bwakit_index
mv hs38* bwakit_index
tar zcf bwakit.tar.gz bwakit_index
cd ..
```

### hisat2

```bash
cd hisat2
mkdir hisat2_index_1
cd hisat2_index_1
wget ftp://ftp.ccb.jhu.edu/pub/infphilo/hisat-genotype/data/genotype_genome_20180128.tar.gz
tar xf genotype_genome_20180128.tar.gz
cd ..
git clone https://github.com/DaehwanKimLab/hisat-genotype.git
echo "2.2.1" >> hisat-genotype/hisat2/VERSION
docker build . -f Dockerfile_hisat2 -t linnil1/hisat2-conda
dk -e PYTHONPATH=hisat-genotype/hisatgenotype_modules linnil1/hisat2-conda hisat-genotype/hisatgenotype -z hisat2_index_1/ --base hla -v --keep-alignment --keep-extract -1 hla-a.R1.fq.gz -2 hla-a.R2.fq.gz --out-dir result --threads 16
mkdir hisat2_index
mv hisat2_index_1/hla* hisat2_index
mv hisat2_index_1/geno* hisat2_index
mv hisat2_index/genotype_genome_20180128.tar.gz hisat2_index_1
mkdir hisat2_index/grch38 hisat2_index/hisatgenotype_db
tar zcf hisat2.tar.gz hisat2_index
cd ..
```

## s3

Create an s3 bucket for saving

* fastq
* HLA results
* status

```bash
aws s3 mb s3://hla-bucket --region us-east-2
aws s3 ls
```

## EFS

Create two EFS filesystems

* `hla_index` for saving the index
* `hla_tmp` for saving temporary data

```bash
aws efs create-file-system --tags Key=Name,Value=hla_index --encrypted
aws efs create-file-system --tags Key=Name,Value=hla_tmp --encrypted
aws efs describe-file-systems
```

## ECR

Because samtools is not in the hisat2 container, we need to build a new docker image. (Change `493445452763.dkr.ecr.us-east-2.amazonaws.com` to your own registry URL)

```bash
# create repo
aws ecr create-repository --repository-name linnil1/hisat2_conda
aws ecr describe-repositories
# upload image
aws ecr get-login-password --region us-east-2 | docker login --username AWS --password-stdin 493445452763.dkr.ecr.us-east-2.amazonaws.com
docker tag linnil1/hisat2-conda 493445452763.dkr.ecr.us-east-2.amazonaws.com/linnil1/hisat2_conda:2.2.1
docker push 493445452763.dkr.ecr.us-east-2.amazonaws.com/linnil1/hisat2_conda:2.2.1
# check
aws ecr list-images --repository-name linnil1/hisat2_conda
```

## Network

### VPC

Create a private network (most AWS services need to use one), along with its subnets.
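Resource IDs such as `vpc-0ffe39707fa04e0e6` in the commands that follow are the ones created in my account; substitute the IDs your own create/describe commands return. A purely optional convenience sketch (the variable names are my own invention, not used by the repo's scripts) for keeping track of them:

```bash
# Hypothetical helper: collect your own resource IDs in one place.
# The values shown are the example IDs from this guide; replace them with
# the IDs that your `aws ec2 create-vpc`, `create-subnet`, etc. return.
VPC_ID=vpc-0ffe39707fa04e0e6
SUBNET_ID_1=subnet-0d2af03055f6c8198
SUBNET_ID_2=subnet-08822fdba8b2a6572
SG_LAMBDA=sg-0851d5b74a506b8e7
echo "$VPC_ID $SUBNET_ID_1 $SUBNET_ID_2 $SG_LAMBDA"
```

You can then write, for example, `--vpc-id "$VPC_ID"` instead of pasting the ID into each command.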
```bash
aws ec2 create-vpc --cidr-block 10.0.0.0/16 --tag-specifications ResourceType=vpc,Tags='[{Key=Name,Value="hla"}]'
aws ec2 describe-vpcs
aws ec2 modify-vpc-attribute --vpc-id vpc-0ffe39707fa04e0e6 --enable-dns-hostnames "{\"Value\": true}"
aws ec2 modify-vpc-attribute --vpc-id vpc-0ffe39707fa04e0e6 --enable-dns-support "{\"Value\": true}"
```

DNS support is important when mounting EFS.

### subnet

Create two subnets under the VPC. (Change `vpc-id` to your own VPC)

```bash
aws ec2 describe-availability-zones
aws ec2 create-subnet --cidr-block 10.0.0.0/18 \
    --tag-specifications ResourceType=subnet,Tags='[{Key=Name,Value="hla-1"}]' \
    --vpc-id vpc-0ffe39707fa04e0e6 \
    --availability-zone us-east-2b
aws ec2 create-subnet --cidr-block 10.0.64.0/18 \
    --tag-specifications ResourceType=subnet,Tags='[{Key=Name,Value="hla-2"}]' \
    --vpc-id vpc-0ffe39707fa04e0e6 \
    --availability-zone us-east-2a
aws ec2 describe-subnets
```

### Security group

Once you create the VPC, AWS creates a default security group for you.

```bash
aws ec2 describe-security-groups \
    --filters Name=vpc-id,Values=vpc-0ffe39707fa04e0e6
```

## Routing

This allows instances in the VPC to access the internet.
Internet gateway

```bash
aws ec2 create-internet-gateway \
    --tag-specifications "ResourceType=internet-gateway,Tags=[{Key=Name,Value=hla_admin_internet}]"
aws ec2 attach-internet-gateway \
    --internet-gateway-id igw-0f479995f0feaeae9 \
    --vpc-id vpc-0ffe39707fa04e0e6
aws ec2 describe-internet-gateways
```

Routing table (add another rule to the default route table)

```bash
aws ec2 describe-route-tables
aws ec2 create-route \
    --route-table-id rtb-077a46c1b42cd1f8c \
    --destination-cidr-block 0.0.0.0/0 \
    --gateway-id igw-0f479995f0feaeae9
aws ec2 describe-route-tables
```

## Mounting EFS

### Allow NFS (port 2049)

Add a security group that opens NFS port 2049.

```bash
aws ec2 create-security-group \
    --group-name hla-efs \
    --description "EFS group" \
    --vpc-id vpc-0ffe39707fa04e0e6
aws ec2 authorize-security-group-ingress \
    --group-id sg-01b3f3fbfc5be9118 \
    --cidr 10.0.0.0/16 --port 2049 --protocol tcp
```

### Accessible in VPC

To allow EC2 or ECS (containers) to access EFS, the EFS mount targets must be in the same VPC and subnets.

```bash
aws efs create-mount-target \
    --file-system-id fs-0b6fcc539fde3326d \
    --subnet-id subnet-0d2af03055f6c8198 \
    --security-groups sg-01b3f3fbfc5be9118
aws efs create-mount-target \
    --file-system-id fs-0b6fcc539fde3326d \
    --subnet-id subnet-08822fdba8b2a6572 \
    --security-groups sg-01b3f3fbfc5be9118
aws efs describe-mount-targets \
    --file-system-id fs-0b6fcc539fde3326d
aws efs create-mount-target \
    --file-system-id fs-02b3281e00a6df32a \
    --subnet-id subnet-0d2af03055f6c8198 \
    --security-groups sg-01b3f3fbfc5be9118
aws efs create-mount-target \
    --file-system-id fs-02b3281e00a6df32a \
    --subnet-id subnet-08822fdba8b2a6572 \
    --security-groups sg-01b3f3fbfc5be9118
```

### Accessible in lambda

To give lambda permission to access EFS, we need to create an access point on the `hla_tmp` EFS. (lambda does not need to read `hla_index`)

```bash
aws efs create-access-point \
    --file-system-id fs-0b6fcc539fde3326d \
    --posix-user Uid=0,Gid=0
aws efs describe-access-points
```

## lambda

Create the functions used by API Gateway and the step function.

```bash
zip -jr hla_lambda.zip lambda
aws lambda create-function \
    --function-name hla_init \
    --role arn:aws:iam::493445452763:role/hla-lambda \
    --runtime python3.9 --architectures arm64 \
    --zip-file fileb://hla_lambda.zip \
    --memory-size 1024 \
    --timeout 60 \
    --handler hla_init.main \
    --file-system-configs Arn=arn:aws:elasticfilesystem:us-east-2:493445452763:access-point/fsap-037456da3db417cbc,LocalMountPath=/mnt/data \
    --vpc-config SubnetIds=subnet-0d2af03055f6c8198,subnet-08822fdba8b2a6572,SecurityGroupIds=sg-0851d5b74a506b8e7
aws lambda create-function \
    --function-name hla_bwakit_result \
    --role arn:aws:iam::493445452763:role/hla-lambda \
    --runtime python3.9 --architectures arm64 \
    --zip-file fileb://hla_lambda.zip \
    --file-system-configs Arn=arn:aws:elasticfilesystem:us-east-2:493445452763:access-point/fsap-037456da3db417cbc,LocalMountPath=/mnt/data \
    --vpc-config SubnetIds=subnet-0d2af03055f6c8198,subnet-08822fdba8b2a6572,SecurityGroupIds=sg-0851d5b74a506b8e7 \
    --handler hla_bwakit_result.main
aws lambda create-function \
    --function-name hla_hisat2_result \
    --role arn:aws:iam::493445452763:role/hla-lambda \
    --runtime python3.9 --architectures arm64 \
    --zip-file fileb://hla_lambda.zip \
    --file-system-configs Arn=arn:aws:elasticfilesystem:us-east-2:493445452763:access-point/fsap-037456da3db417cbc,LocalMountPath=/mnt/data \
    --vpc-config SubnetIds=subnet-0d2af03055f6c8198,subnet-08822fdba8b2a6572,SecurityGroupIds=sg-0851d5b74a506b8e7 \
    --handler hla_hisat2_result.main
aws lambda create-function \
    --function-name hla_final \
    --role arn:aws:iam::493445452763:role/hla-lambda \
    --runtime python3.9 --architectures arm64 \
    --zip-file fileb://hla_lambda.zip \
    --file-system-configs Arn=arn:aws:elasticfilesystem:us-east-2:493445452763:access-point/fsap-037456da3db417cbc,LocalMountPath=/mnt/data \
    --vpc-config SubnetIds=subnet-0d2af03055f6c8198,subnet-08822fdba8b2a6572,SecurityGroupIds=sg-0851d5b74a506b8e7 \
    --handler hla_final.main \
    --timeout 5
aws lambda create-function \
    --function-name hla_api \
    --role arn:aws:iam::493445452763:role/hla-lambda \
    --runtime python3.9 --architectures arm64 \
    --zip-file fileb://hla_lambda.zip \
    --vpc-config SubnetIds=subnet-0d2af03055f6c8198,subnet-08822fdba8b2a6572,SecurityGroupIds=sg-0851d5b74a506b8e7 \
    --handler hla_api.main
aws lambda create-function \
    --function-name hla_set_method_status \
    --role arn:aws:iam::493445452763:role/hla-lambda \
    --runtime python3.9 --architectures arm64 \
    --zip-file fileb://hla_lambda.zip \
    --vpc-config SubnetIds=subnet-0d2af03055f6c8198,subnet-08822fdba8b2a6572,SecurityGroupIds=sg-0851d5b74a506b8e7 \
    --handler hla_set_method_status.main
```

### Access Step Functions and s3 in lambda

s3

```bash
aws ec2 create-vpc-endpoint \
    --tag-specifications "ResourceType=vpc-endpoint,Tags=[{Key=Name,Value=hla_lambda_s3_gateway}]" \
    --vpc-id vpc-0ffe39707fa04e0e6 \
    --service-name com.amazonaws.us-east-2.s3 \
    --vpc-endpoint-type Gateway \
    --route-table-ids rtb-077a46c1b42cd1f8c
```

step function

```bash
aws ec2 create-vpc-endpoint \
    --tag-specifications "ResourceType=vpc-endpoint,Tags=[{Key=Name,Value=hla_lambda_stepfunction_gateway}]" \
    --vpc-id vpc-0ffe39707fa04e0e6 \
    --service-name com.amazonaws.us-east-2.states \
    --vpc-endpoint-type Interface \
    --subnet-ids subnet-0d2af03055f6c8198 subnet-08822fdba8b2a6572 \
    --security-group-ids sg-0851d5b74a506b8e7
```

### Developing lambda

You can change the lambda code, then re-upload and test it.

```bash
zip -jr hla_lambda.zip lambda
aws lambda update-function-code \
    --function-name hla_init \
    --zip-file fileb://hla_lambda.zip
aws lambda invoke --function-name hla_init --payload '{ "name": "test1" }' test.json && cat test.json | jq
```

## EC2

Move the index data to EFS via an EC2 instance.

Add an ssh key and a security group for port 22.

```bash
aws ec2 create-key-pair --key-name hlakey | jq -r ".KeyMaterial" > hlakey.pem
aws ec2 create-security-group --group-name hla-admin --description "admin for HLA ec2" --vpc-id vpc-0ffe39707fa04e0e6
aws ec2 authorize-security-group-ingress \
    --group-id sg-01308e7f097c05a5c --cidr 0.0.0.0/0 --port 22 --protocol tcp
```

Create a t2.nano to write the data. (The subnet and security group go inside `--network-interfaces`; they cannot be given as separate instance-level flags at the same time.)

```bash
aws ec2 run-instances \
    --image-id ami-0b614a5d911900a9b \
    --instance-type t2.nano \
    --key-name hlakey \
    --network-interfaces AssociatePublicIpAddress=true,DeviceIndex=0,SubnetId=subnet-08822fdba8b2a6572,Groups=sg-01308e7f097c05a5c
aws ec2 describe-instances
```

You can run anything on the ec2 instance.

```bash
# init
ssh -i hlakey.pem ec2-user@18.221.199.110
sudo yum install -y amazon-efs-utils
mkdir index
sudo mount -t efs -o tls fs-02b3281e00a6df32a:/ index
sudo chown ec2-user:ec2-user index
exit

# copy
scp -i hlakey.pem bwakit/bwa.kit/bwakit.tar.gz ec2-user@18.221.199.110:~/index/
scp -i hlakey.pem hisat2/hisat2.tar.gz ec2-user@18.221.199.110:~/index/
scp -i hlakey.pem run_bwakit.sh ec2-user@18.221.199.110:~/index/
ssh -i hlakey.pem ec2-user@18.221.199.110
cd index
git clone https://github.com/DaehwanKimLab/hisat-genotype.git
echo "2.2.1" >> hisat-genotype/hisat2/VERSION
tar xf bwakit.tar.gz
rm bwakit.tar.gz
tar xf hisat2.tar.gz
rm hisat2.tar.gz
exit

# remember to stop it
# it's costly
aws ec2 stop-instances --instance-ids i-0dc46050bf7812889
```

## Batch

Batch is the system that queues our jobs and runs them in containers.

```bash
# set up the compute environment with FARGATE_SPOT (the cheapest option)
aws batch create-compute-environment \
    --compute-environment-name hla_env \
    --type MANAGED \
    --compute-resources type=FARGATE_SPOT,maxvCpus=32,subnets=subnet-0d2af03055f6c8198,subnet-08822fdba8b2a6572,securityGroupIds=sg-0851d5b74a506b8e7
aws batch describe-compute-environments

# set up the queue
aws batch create-job-queue --job-queue-name hla_queue --priority 1 \
    --compute-environment-order order=1,computeEnvironment=arn:aws:batch:us-east-2:493445452763:compute-environment/hla_env
aws batch describe-job-queues
```

Set up the job definitions (hisat2 and bwakit)

```bash
aws batch register-job-definition \
    --cli-input-json file://job_bwakit.json
aws batch register-job-definition \
    --cli-input-json file://job_hisat2.json
aws batch describe-job-definitions --status ACTIVE
```

### Developing batch

```bash
# a definition cannot be edited in place; re-registering automatically adds a new revision number
aws batch register-job-definition \
    --cli-input-json file://job_bwakit.json
# remove the previous revision
aws batch deregister-job-definition \
    --job-definition hla-bwakit:2
aws batch submit-job \
    --job-name hla_test2 \
    --job-queue hla_queue \
    --job-definition hla-bwakit:2 \
    --parameters read1=/mnt/data/test1/test1.R1.fq.gz,read2=/mnt/data/test1/test1.R2.fq.gz,outputname=/mnt/data/test1/bwakit/test1
aws batch submit-job \
    --job-name hla_test4 \
    --job-queue hla_queue \
    --job-definition hla_hisat2:1 \
    --parameters read1=/mnt/data/test1/test1.R1.fq.gz,read2=/mnt/data/test1/test1.R2.fq.gz,output_folder=/mnt/data/test1/hisat2_1
aws batch list-jobs \
    --job-queue hla_queue \
    --job-status FAILED
```

## Step Function

Create the pipeline. The state machine definition is written in `step_hla.json`.

```bash
aws stepfunctions create-state-machine \
    --name hla --role-arn "arn:aws:iam::493445452763:role/hla-step" \
    --definition "$(cat step_hla.json)"
aws stepfunctions list-state-machines
aws stepfunctions describe-state-machine \
    --state-machine-arn "arn:aws:states:us-east-2:493445452763:stateMachine:hla"
```

### Developing stepfunction

I recommend reading the results in the console and using Workflow Studio to write the state machine language.

```bash
aws stepfunctions update-state-machine \
    --state-machine-arn "arn:aws:states:us-east-2:493445452763:stateMachine:hla" \
    --definition "$(cat step_hla.json)"
aws stepfunctions start-execution \
    --state-machine-arn "arn:aws:states:us-east-2:493445452763:stateMachine:hla" \
    --input '{"name": "test1"}'
aws stepfunctions list-executions \
    --state-machine-arn "arn:aws:states:us-east-2:493445452763:stateMachine:hla"
```

## APIGateway

The API gateway can

* Associate a path and method with a lambda function
* Add a stage: in this project it is `/hla`
* Limit the API calling rate

https://docs.aws.amazon.com/apigateway/latest/developerguide/set-up-lambda-proxy-integrations.html

```bash
# create the API
aws apigateway create-rest-api --name hla_api
aws apigateway get-rest-apis
aws apigateway get-resources \
    --rest-api-id oy1431r9p1
# create path and method
aws apigateway create-resource \
    --rest-api-id oy1431r9p1 \
    --parent-id cinl4m8ph3 \
    --path-part "{proxy+}"
aws apigateway get-resources \
    --rest-api-id oy1431r9p1
aws apigateway put-method \
    --rest-api-id oy1431r9p1 \
    --resource-id n1qb8d \
    --http-method ANY \
    --authorization-type NONE
# lambda
aws apigateway put-integration \
    --rest-api-id oy1431r9p1 \
    --resource-id n1qb8d \
    --http-method ANY \
    --type AWS_PROXY \
    --uri "arn:aws:apigateway:us-east-2:lambda:path/2015-03-31/functions/arn:aws:lambda:us-east-2:493445452763:function:hla_api/invocations" \
    --integration-http-method POST \
    --credentials "arn:aws:iam::493445452763:role/hla-api"
aws apigateway test-invoke-method \
    --rest-api-id oy1431r9p1 \
    --resource-id n1qb8d \
    --http-method POST \
    --path-with-query-string "/create"
# deploy: the URL will become `https://oy1431r9p1.execute-api.us-east-2.amazonaws.com/hla`
aws apigateway create-deployment \
    --rest-api-id oy1431r9p1 \
    --stage-name hla
aws apigateway get-deployments \
    --rest-api-id oy1431r9p1
```

## Deploy Frontend

I wrote the web interface with Nuxt in `web/`.

Edit `wrangler.toml` and `nuxt.config.ts` to change `hla.linnil1.me` and `AWS_API`.

```bash
dk -p 2002:3000 -p 2003:24678 node:17-alpine sh
yarn install
yarn global add @cloudflare/wrangler
wrangler publish
```

See https://hla.linnil1.me/

## Add new tools (kourami)

### Kourami

Build the index locally

```bash
wget https://github.com/Kingsford-Group/kourami/releases/download/v0.9.6/kourami-0.9.6_bin.zip
unzip kourami-0.9.6_bin.zip
cd kourami-0.9.6
wget https://github.com/Kingsford-Group/kourami/releases/download/v0.9/kouramiDB_3.24.0.tar.gz
tar xf kouramiDB_3.24.0.tar.gz
dk quay.io/biocontainers/bwakit:0.7.17.dev1--0 bwa index db/All_FINAL_with_Decoy.fa.gz
bash ./scripts/download_grch38.sh hs38NoAltDH
dk quay.io/biocontainers/bwakit:0.7.17.dev1--0 bwa index ./resources/hs38NoAltDH.fa
mkdir kourami_index
mv db/* kourami_index/
mv resources/hs38NoAltDH.fa* kourami_index/
mv build/Kourami.jar kourami_index
tar czf kourami_index.tar.gz kourami_index
cd ..
```

### aws

```bash
# copy index
scp -i hlakey.pem run_bwakit.sh ec2-user@18.221.199.110:~/index/
scp -i hlakey.pem kourami-0.9.6/kourami_index.tar.gz ec2-user@18.221.199.110:~/index
ssh -i hlakey.pem ec2-user@18.221.199.110
cd index
tar xf kourami_index.tar.gz
rm kourami_index.tar.gz

# lambda
zip -jr hla_lambda.zip lambda
aws lambda create-function \
    --function-name hla_kourami_result \
    --role arn:aws:iam::493445452763:role/hla-lambda \
    --runtime python3.9 --architectures arm64 \
    --zip-file fileb://hla_lambda.zip \
    --file-system-configs Arn=arn:aws:elasticfilesystem:us-east-2:493445452763:access-point/fsap-037456da3db417cbc,LocalMountPath=/mnt/data \
    --vpc-config SubnetIds=subnet-0d2af03055f6c8198,subnet-08822fdba8b2a6572,SecurityGroupIds=sg-0851d5b74a506b8e7 \
    --handler hla_kourami_result.main \
    --timeout 5
aws batch register-job-definition \
    --cli-input-json file://job_kourami_preprocess.json
aws batch register-job-definition \
    --cli-input-json file://job_kourami_main.json
```

### testing

```bash
aws s3 cp hisat2/hla-a.R1.fq.gz s3://hla-bucket/test1.R1.fq.gz
aws s3 cp hisat2/hla-a.R2.fq.gz s3://hla-bucket/test1.R2.fq.gz
aws batch submit-job \
    --job-name hla_test6 \
    --job-queue hla_queue \
    --job-definition hla_kourami_preprocess:1 \
    --parameters bam=/mnt/data/test1/bwakit/test1.aln.bam,output_folder=/mnt/data/test1/kourami,kourami_panel=/mnt/index/kourami_index/All_FINAL_with_Decoy.fa.gz,kourami_hs38=/mnt/index/kourami_index/hs38NoAltDH.fa
aws batch submit-job \
    --job-name hla_test8 \
    --job-queue hla_queue \
    --job-definition hla_kourami:1 \
    --parameters bam=/mnt/data/test1/kourami/test1.aln.panel.bam,outputname=/mnt/data/test1/kourami/test1.aln.panel.kourami,kourami_db=/mnt/index/kourami_index,kourami_jar=/mnt/index/kourami_index/Kourami.jar
aws lambda invoke --function-name hla_kourami_result --payload '{ "name": "test1" }' test.json && cat test.json | jq
```

tmp (scratch commands for generating the small test fastq):

```bash
dk -e PYTHONPATH=hisat-genotype/hisatgenotype_modules linnil1/hisat2-conda hisat-genotype/hisatgenotype -z hisat2_index_1/ --base hla -v --keep-alignment --keep-extract -1 ERR194147_1.fastq.gz -2 ERR194147_2.fastq.gz --out-dir result --threads 16
samtools view ERR194147_1_fastq_gz-hla-extracted-1_fq.bam "A*BACKBONE" -o hla-a.bam
samtools sort -n hla-a.bam -o hla-a.sort.bam
samtools fastq hla-a.sort.bam -1 hla.R1.fq.gz -2 hla.R2.fq.gz -0 /dev/null -s /dev/null
aws s3 cp hisat2/hla-a.R1.fq.gz s3://hla-bucket/test1.R1.fq.gz
aws s3 cp hisat2/hla-a.R2.fq.gz s3://hla-bucket/test1.R2.fq.gz
```

## step function pipeline

![](https://i.imgur.com/pNI8yaN.png)
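The actual definition lives in `step_hla.json` in the repo and is not reproduced here. Purely as an illustration of the flow shown in the diagram (copy to EFS, run the aligner on Batch, parse the result, set the status), a minimal Amazon States Language sketch could look like this — state names are invented for the example, and only the bwakit branch is shown:

```json
{
  "Comment": "Illustrative sketch only; the real definition is step_hla.json",
  "StartAt": "CopyToEFS",
  "States": {
    "CopyToEFS": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-2:493445452763:function:hla_init",
      "Next": "RunBwakit"
    },
    "RunBwakit": {
      "Type": "Task",
      "Resource": "arn:aws:states:::batch:submitJob.sync",
      "Parameters": {
        "JobName": "hla-bwakit",
        "JobQueue": "hla_queue",
        "JobDefinition": "hla-bwakit"
      },
      "Next": "ParseResult"
    },
    "ParseResult": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-2:493445452763:function:hla_bwakit_result",
      "Next": "SetStatus"
    },
    "SetStatus": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-2:493445452763:function:hla_final",
      "End": true
    }
  }
}
```

The `batch:submitJob.sync` service integration makes the state machine wait for the Batch job to finish before moving on, which is why no polling lambda is needed between steps.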