Nanopore Guppy GPU basecalling on Windows using WSL2

Author: Miles Benton (GitHub; Twitter)
Created: 2021-06-16 21:05:32
Last modified: 2021-06-21 11:24:06

tags: `Nanopore` `GPU` `notes` `documentation` `Linux` `Windows`

WARNING: this is still very much 'experimental' in terms of the process and packages that are available. I will highlight throughout this section exactly which software and/or drivers are currently in beta - be warned it's pretty much everything.

Foreword

This is documentation of my notes and experiences getting a brand new laptop running Windows 10 to perform GPU basecalling via Windows Subsystem for Linux 2 (WSL2).

Those who know me well know that it has been a very long time since I owned (or really used) a Windows-based device. I made the shift to Linux during my masters, which would have been around the time that Windows 7 was quite new on the scene.

Disclaimer: I did not have a good time! But to be fair I did get it working and once working it seems pretty good. I still believe that Linux offers the best experience for Nanopore sequencing and all downstream processing and analysis. The Linux desktop experience has come a very long way in the last 5-10 years and it's actually very easy and quick to get up and running. Plus Nanopore seem to do all their development either on, or geared towards, Linux (Ubuntu runs on all ONT hardware), and I don't see that changing anytime soon.

Resources

Before I launch into how I got set up I want to provide a list of some of the sites/resources I visited during the process:

As I mentioned previously, I had some real issues getting the Windows 'WSL' drivers communicating with the Ubuntu installed inside WSL2. I 'think' it was the reinstall of the Nvidia preview CUDA drivers with WSL, followed by a random system update (that seemed to take >30 mins), that finally 'fixed' my issue… but your mileage may well vary. However, the above resources should be enough to help get around any issues that arise (fingers crossed).

The system

For completeness' sake I'll record the system specs of the laptop that was used for this experiment. It was a new HP ZBook Fury 17 G7 Mobile Workstation, a very 'beefy'/powerful laptop in the scheme of things.

OS Name:         Microsoft Windows 10 Pro for Workstations
Version:         10.0.21390 Build 21390
System Model:    HP ZBook Fury 17 G7 Mobile Workstation
System Type:     x64-based PC
Processor:       Intel(R) Xeon(R) W-10885M CPU @ 2.40GHz, 2400 Mhz, 8 Core(s), 16 Logical Processor(s)
System memory:   64 GB RAM
Display:         Nvidia Quadro RTX4000
Storage:         2x 2 TB NVMe SSD

The process

The basic overview of what is required looks something like this:

upgrade to Windows Insider -> install WSL -> install Nvidia drivers -> install CUDA toolkit -> basecall with GPU Guppy

It seems simple, it should be simple, but it's not as simple as it's made out to be. Hopefully the below provides some use to those embarking on this journey.

DISCLAIMER: I find writing 'guides' for setting things up on Windows difficult. It's so foreign to me trying to explain things like "navigate here", "click this", "install that", etc, as opposed to something reproducible like apt install package. Therefore I'm going to leverage websites and guides that have been already written for most of that type of thing. This is just a heads up that you will need to navigate around a little bit, mainly at the start - once we get into the Linux side of things life becomes 'simple' again.

Let's go!

Windows Insider installation/upgrade

BETA/PREVIEW SOFTWARE: this is just a warning that this software is still in preview/beta and may not behave as expected.

First you will need to sign up to the Windows Insider Program. You can do that here.

Once you have access to the Insider program have a read through this page and follow the instructions there to get upgraded to a fresh cutting edge Windows Insider build.

Installing WSL2

The below gives an overview of the process. It looks simple, it should be simple, I did not find it simple - again your mileage may vary.

Enable WSL 2
Enable ‘Virtual Machine Platform’
Set WSL 2 as default
Install a Linux distro

There is a 'preview' simplified approach:

The wsl --install simplified install command requires that you join the Windows Insiders Program and install a preview build of Windows 10 (OS build 20262 or higher), but eliminates the need to follow the manual install steps. All you need to do is open a command window with administrator privileges and run wsl --install, after a restart you will be ready to use WSL. (source)

So in theory once you are set up with Windows Insider it should now be possible to install WSL, including Ubuntu 20.04, with a single command (in an Admin PowerShell window):


wsl --install

For whatever reason this didn't work for me so I documented the manual approach I took below. Hopefully this one line approach works for others and improves with age.

So on to the manual setup…

Enable WSL 2

Run the below code in an elevated (Admin) PowerShell window:



# Enable Windows Subsystem for Linux
# PowerShell as Admin
dism.exe /online /enable-feature /featurename:Microsoft-Windows-Subsystem-Linux /all /norestart

Enable ‘Virtual Machine Platform’

Run the below code in an elevated (Admin) PowerShell window:

# Enable Virtual Machine Platform (for WSL2)
dism.exe /online /enable-feature /featurename:VirtualMachinePlatform /all /norestart

You should now restart your system before moving to the next step.

Download and install WSL2 update package

Run the below to download the required installer:

# Download the WSL2 Linux kernel update package for x64 machines
# PowerShell
Invoke-WebRequest -Uri https://wslstorestorage.blob.core.windows.net/wslblob/wsl_update_x64.msi -OutFile wsl_update_x64.msi

Run the downloaded update package (double-click to run - you will be prompted for elevated permissions, select ‘yes’ to approve this installation). The system may want to restart here.

Set WSL 2 as default

Now in the PowerShell window we set the default environment to WSL2:


# PowerShell as Admin
wsl --set-default-version 2

I can't recall if there was a system restart here, probably.

Install a Linux distro

Time to install our Linux distribution. I picked Ubuntu 18.04 as this at the moment is the easiest to 'mesh' with current ONT software. Other Debian/Ubuntu distros will work fine, you'll just need to change a few software versions in the below code.

So first go to the Windows Store and search for Ubuntu. Select Ubuntu 18.04 and follow the instructions to 'get' and install. Once it's installed you should be able to launch from the start menu. The first time you will be asked to create a user name and password - don't forget these!

Once you're in you can update the distro:



# Update system
sudo apt update
sudo apt upgrade # be careful, it takes time!

Hooray! We now have Linux in Windows. :)
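
If you want to confirm that the new distro really is running under WSL2 rather than WSL1, the kernel release string is a quick tell. The below is a small sketch rather than an official check: it assumes WSL2 kernels contain "microsoft-standard" and WSL1 kernels end in "-Microsoft", and `wsl_kind` is just a name I made up for the helper.

```shell
# Classify a kernel release string (defaults to the running kernel).
# Heuristic only: assumes WSL2 kernels contain "microsoft-standard"
# and WSL1 kernels carry a "-Microsoft" suffix.
wsl_kind() {
  case "${1:-$(uname -r)}" in
    *microsoft-standard*) echo "WSL2" ;;
    *Microsoft*)          echo "WSL1" ;;
    *)                    echo "not WSL" ;;
  esac
}

# Report on the current environment
wsl_kind
```

If this reports WSL1, revisit the 'Set WSL 2 as default' step before carrying on.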

Nvidia CUDA driver (Windows)

Time to grab and install some fancy preview drivers that allow a bridging between Windows and the Linux kernel in WSL2.

BETA SOFTWARE: this is just a warning that this software is still in preview/beta and may not behave as expected.

First you will need to join the Nvidia developers programme. Once you have done this you will be able to access the preview drivers, as well as a LOT more cool stuff that really is worth checking out if you are at all interested.

Once you have access it's time to download the preview driver, you can get that here. Depending on your card you'll want to grab either the GeForce or Quadro version of the driver. I grabbed the Quadro version for the mobile RTX4000.

Once that is downloaded it can be installed. Selecting the recommended options is fine and installation should go smoothly.

If you want more detailed instructions for the above please follow along with the first section of this guide (link).

Install CUDA-toolkit inside WSL

Now we need to install some packages within our new WSL2 environment.

IMPORTANT: At the moment, versions of the cuda-toolkit other than 11.0 and 10.2 (i.e. 11.2 and greater) seem to be 'broken'. I have gone with 11.0 here as this is required for the later versions of Guppy. It seems to work well so it is my current suggestion. I noted that this issue has been clocked and a fix is inbound shortly.

Because the latest versions of Guppy are built against CUDA 11 we'll grab the CUDA 11.0 toolkit and install it in our WSL2 environment.

First we need to set up the Nvidia public key:


sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub

Now we add the Nvidia repositories to our system:


sudo sh -c 'echo "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 /" > /etc/apt/sources.list.d/cuda.list'
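
The repository path above is hard-coded for Ubuntu 18.04. If you picked a different Ubuntu release, a sketch like the below can build the matching repository line from the release number. It assumes Nvidia's directory naming stays as "ubuntu" followed by the release without the dot; `cuda_repo_tag` is a made-up helper name, and the 18.04 fallback simply mirrors what I used.

```shell
# Build the repo directory name (e.g. ubuntu1804) from an Ubuntu VERSION_ID such as 18.04.
cuda_repo_tag() {
  echo "ubuntu$(printf '%s' "$1" | tr -d '.')"
}

# VERSION_ID comes from /etc/os-release; fall back to 18.04 if it can't be read.
version_id="$( . /etc/os-release 2>/dev/null && echo "$VERSION_ID" )" || version_id=""
tag="$(cuda_repo_tag "${version_id:-18.04}")"

# Print the apt source line for this release (nothing is written to disk here)
echo "deb http://developer.download.nvidia.com/compute/cuda/repos/${tag}/x86_64 /"
```

The printed line is in the same format as the one written to /etc/apt/sources.list.d/cuda.list above. The public key needs to come from the same release directory.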

Time to update:


sudo apt-get update

Now we are able to install the CUDA toolkit:


sudo apt-get --yes install cuda-toolkit-11-0 cuda-toolkit-10-2

I grabbed cuda-toolkit-10-2 as well for some 'legacy' needs that may arise for some other things I'm testing.
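
With two toolkits installed it can be handy to choose which one your shell picks up. The below is a minimal sketch, assuming the apt packages install each toolkit under /usr/local/cuda-<version> (which is where they landed for me); `use_cuda` is a made-up helper name, not something that ships with CUDA.

```shell
# Put the chosen toolkit's bin directory at the front of PATH (once only).
# Assumption: toolkits live under /usr/local/cuda-<major.minor>.
use_cuda() {
  cuda_home="/usr/local/cuda-$1"
  case ":$PATH:" in
    *":$cuda_home/bin:"*) ;;                 # already on PATH, nothing to do
    *) PATH="$cuda_home/bin:$PATH" ;;
  esac
  export PATH
}

use_cuda 11.0

# Show the compiler version if the toolkit is really there
if command -v nvcc >/dev/null 2>&1; then
  nvcc --version
else
  echo "nvcc not found under $cuda_home/bin"
fi
```

Calling `use_cuda 10.2` in a fresh shell would select the 'legacy' toolkit instead.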

Once the CUDA toolkit is installed in WSL it's probably worth another restart of the computer (Windows has way too many restarts!). After reboot you can check if a folder named /usr/lib/wsl is present in the WSL2 environment. If you can find the folder, the whole installation process has worked. If not, then like me you'll be going back over numerous steps trying to figure out what went wrong. For me re-installing the Nvidia preview driver and doing another system update seemed to do the trick. So hopefully people following along with this get it to work first time, if not don't give up as it is possible - or you could give up and install Linux and have a much easier time… ;)
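
The folder check described above is easy to script. This is only a sketch: `check_wsl_gpu_dir` is a made-up name, and the optional argument exists purely so the function can be tried against other paths.

```shell
# Report whether the WSL GPU driver directory exists.
# Defaults to /usr/lib/wsl, the folder mentioned above.
check_wsl_gpu_dir() {
  dir="${1:-/usr/lib/wsl}"
  if [ -d "$dir" ]; then
    echo "found: $dir"
  else
    echo "missing: $dir"
    return 1
  fi
}

check_wsl_gpu_dir || echo "Revisit the driver install and Windows update steps."
```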

Now you can check if CUDA and your GPU works on Ubuntu with a sample program.

Testing CUDA

CUDA ships with a large range of built-in samples. These can be built and used to test the GPU/CUDA/driver environment.

If all has gone well and we now have a working set up we should be able to build any of these samples, run them and get a 'pass'. Below I have done this and report the output (which will be specific to this system).

deviceQuery

First you'll need to move to the specific directory and make the sample 'program':


cd /usr/local/cuda/samples/1_Utilities/deviceQuery
sudo make

You can then run the program:

$ ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA Quadro RTX 4000 with Max-Q Design"
  CUDA Driver Version / Runtime Version          11.4 / 11.0
  CUDA Capability Major/Minor version number:    7.5
  Total amount of global memory:                 8192 MBytes (8589934592 bytes)
  (40) Multiprocessors, ( 64) CUDA Cores/MP:     2560 CUDA Cores
  GPU Max Clock rate:                            1380 MHz (1.38 GHz)
  Memory Clock rate:                             6001 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 4194304 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1024
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Managed Memory:                Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.4, CUDA Runtime Version = 11.0, NumDevs = 1
Result = PASS

Hooray! All looks like it is working here!

BlackScholes
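
Only the run output is recorded below, so here is a sketch of the build-and-run step. It assumes the CUDA 11.0 samples keep BlackScholes under the 4_Finance category, following the same layout as the deviceQuery path used earlier; `CUDA_SAMPLES` and `sample_dir` are names made up for this sketch.

```shell
# Assumed location of the CUDA samples (same root as the deviceQuery path above)
CUDA_SAMPLES="${CUDA_SAMPLES:-/usr/local/cuda/samples}"

# Print the directory of a sample, given its category folder and name.
sample_dir() {
  printf '%s/%s/%s\n' "$CUDA_SAMPLES" "$1" "$2"
}

dir="$(sample_dir 4_Finance BlackScholes)"
if [ -d "$dir" ]; then
  # Build and run in a subshell so the current directory is left alone
  ( cd "$dir" && sudo make && ./BlackScholes )
else
  echo "Sample directory not found: $dir"
fi
```

The output on this system was: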

$ ./BlackScholes
[./BlackScholes] - Starting...
GPU Device 0: "Turing" with compute capability 7.5

Initializing data...
...allocating CPU memory for options.
...allocating GPU memory for options.
...generating input data in CPU mem.
...copying input data to GPU mem.
Data init done.

Executing Black-Scholes GPU kernel (512 iterations)...
Options count             : 8000000
BlackScholesGPU() time    : 1.981922 msec
Effective memory bandwidth: 40.364860 GB/s
Gigaoptions per second    : 4.036486

BlackScholes, Throughput = 4.0365 GOptions/s, Time = 0.00198 s, Size = 8000000 options, NumDevsUsed = 1, Workgroup = 128

Reading back GPU results...
Checking the results...
...running CPU calculations.

Comparing the results...
L1 norm: 1.741792E-07
Max absolute error: 1.192093E-05

Shutting down...
...releasing GPU memory.
...releasing CPU memory.
Shutdown done.

[BlackScholes] - Test Summary

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Test passed

So with that we can conclude that CUDA is working in WSL2 and utilising the GPU. Feel free to explore the various samples contained in that directory, there are some fun little examples in there.

We can finally get to the point of the whole exercise.

Basecalling with GPU in WSL

So with all the above done and working the rest is actually straightforward.

First you will require access to the Nanopore Community space to download the software. If you haven't already you can sign up here.

Once that is sorted you can proceed with downloading Guppy.

Download and Extract Guppy

In the software section you'll find Guppy listed. You can either manually download the "Linux 64-bit GPU" binaries or you can right click and copy the url (link). If you modify the below code with that url you'll be able to run the following code block in a WSL2 terminal and download and extract Guppy in one go.



version=5.0.11
wget https://[paste your link here]/ont-guppy_${version}_linux64.tar.gz
tar -zxvf ont-guppy_${version}_linux64.tar.gz

This approach has the added benefit of being able to easily modify the version number should you wish to download an older version of Guppy.

We can now check that the downloaded binary works:


$ ~/ont-guppy/bin/guppy_basecaller -v
: Guppy Basecalling Software, (C) Oxford Nanopore Technologies, Limited. Version 5.0.11+2b6dbff

Great, this is looking good.

Basecall some data

Time to throw some data at Guppy and the GPU and see what sort of performance we get. I'm going to test with both the fast model as well as the new super high accuracy model - I'm very interested to see how this laptop performs with the sup model.

WARNING: OK, so I got caught out here. My years of not using Windows meant I overlooked the fact that it doesn't default to the best performance in terms of power mode when plugged in. What this means is that certain pieces of hardware (likely CPU and GPU) are 'scaled' down in their performance to save power. When I noticed and changed to the 'Best' performance profile, it greatly impacted basecalling - in a very positive way!

So please consider this a public service announcement if you are using, or plan to use, Windows for GPU basecalling with Nanopore data.

Ok, on with the fun stuff! First up we'll run the FAST model.

FAST model

First up the FAST model with default parameters.

$ ~/ont-guppy/bin/guppy_basecaller -c dna_r9.4.1_450bps_fast.cfg -i fast5/ -s fastq -x 'auto' --recursive
ONT Guppy basecalling software version 5.0.11+2b6dbff
config file:        /home/miles/ont-guppy/data/dna_r9.4.1_450bps_fast.cfg
model file:         /home/miles/ont-guppy/data/template_r9.4.1_450bps_fast.jsn
input path:         fast5/
save path:          fastq
chunk size:         2000
chunks per runner:  160
minimum qscore:     8
records per file:   4000
num basecallers:    4
gpu device:         auto
kernel path:
runners per device: 8
Found 10 fast5 files to process.
Init time: 803 ms

0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
Caller time: 15874 ms, Samples called: 189550286, samples/s: 1.19409e+07
Finishing up any open output files.
Basecalling completed successfully.

It took a total of 15 secs, nice! Before changing power mode profiles the FAST model with default settings took 37 secs, so that's a reduction of more than 50% in run time, or more than a doubling in basecalling speed. Cool!
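
For anyone wanting to check the arithmetic, here is a quick sketch using the 37 secs I noted before the power mode change and the 15.874 secs caller time from the run above. `speedup` is just a made-up helper name, and the 37 secs figure is rounded, so treat the result as approximate.

```shell
# Speed-up and time saved between two run times (any unit, as long as both match).
speedup() {
  awk -v before="$1" -v after="$2" 'BEGIN {
    printf "%.2fx faster, %.0f%% less time\n", before / after, (1 - after / before) * 100
  }'
}

speedup 37 15.874
# prints: 2.33x faster, 57% less time
```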

Now we have confirmation of a fully working GPU Guppy set up in Windows WSL2. Let's try the other basecalling models.

HAC model

This first run was before I noticed the power mode issue.

$ ~/ont-guppy/bin/guppy_basecaller -c dna_r9.4.1_450bps_hac.cfg -i fast5/ -s fastq -x 'auto' --recursive
ONT Guppy basecalling software version 5.0.11+2b6dbff
config file:        /home/miles/ont-guppy/data/dna_r9.4.1_450bps_hac.cfg
model file:         /home/miles/ont-guppy/data/template_r9.4.1_450bps_hac.jsn
input path:         fast5/
save path:          fastq
chunk size:         2000
chunks per runner:  256
minimum qscore:     9
records per file:   4000
num basecallers:    4
gpu device:         auto
kernel path:
runners per device: 4
Found 10 fast5 files to process.
Init time: 973 ms

0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
Caller time: 251015 ms, Samples called: 189550286, samples/s: 755135
Finishing up any open output files.
Basecalling completed successfully.

That's 4mins 11secs.
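
The run times quoted in these notes come straight from the "Caller time" line that Guppy prints. A small sketch for pulling that out of saved output, assuming the summary line keeps the format shown above (`caller_time_hms` is a made-up helper name):

```shell
# Convert the "Caller time: N ms" field of a Guppy summary line to minutes and seconds.
caller_time_hms() {
  sed -n 's/^Caller time: \([0-9][0-9]*\) ms.*/\1/p' |
    awk '{ s = int($1 / 1000); printf "%dmins %dsecs\n", int(s / 60), s % 60 }'
}

echo 'Caller time: 251015 ms, Samples called: 189550286, samples/s: 755135' | caller_time_hms
# prints: 4mins 11secs
```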

Once I had clocked that the laptop was not running at full power while plugged in I made the required change and reran the models. The HAC model with default parameters (as above) completed in around 3mins 30secs, so a good increase in basecalling rate just by changing the power profile.

I then had a play with tweaking the model. Trying to optimise basecalling speed for this RTX4000 mobile GPU I increased the parameter --chunks_per_runner. I did this in small increments, keeping an eye on GPU memory being used. At around 384 the basecalling rate 'stabilised', by which I mean that increasing the parameter led to smaller and smaller gains in speed. At 412 I recorded the below run:

$ ~/ont-guppy/bin/guppy_basecaller -c dna_r9.4.1_450bps_hac.cfg -i fast5/ -s fastq -x 'auto' --recursive --chunks_per_runner 412
ONT Guppy basecalling software version 5.0.11+2b6dbff
config file:        /home/miles/ont-guppy/data/dna_r9.4.1_450bps_hac.cfg
model file:         /home/miles/ont-guppy/data/template_r9.4.1_450bps_hac.jsn
input path:         fast5/
save path:          fastq
chunk size:         2000
chunks per runner:  412
minimum qscore:     9
records per file:   4000
num basecallers:    4
gpu device:         auto
kernel path:
runners per device: 4
Found 10 fast5 files to process.
Init time: 927 ms

0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
Caller time: 123090 ms, Samples called: 189550286, samples/s: 1.53993e+06
Finishing up any open output files.
Basecalling completed successfully.

That completed in 2mins 3secs - which is nearly a minute and a half faster than the default HAC model parameters. That's some really nice gains!

NOTE: It should be noted that this is going to be very different between GPU models. Some GPUs will respond well to parameter optimisation, some won't. Most of the time the default model will be a fine option.
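
If you want to try a similar sweep on your own GPU, the below sketch prints one guppy_basecaller command per --chunks_per_runner value rather than running anything. The values are only examples (256 is the HAC default shown in the log above, 384 and 412 are the ones I mention, 320 is just an in-between step), the per-run output folder name is made up so results don't overwrite each other, and `sweep_commands` and `GUPPY` are names invented for this sketch.

```shell
# Path to the basecaller; override GUPPY if yours lives elsewhere.
GUPPY="${GUPPY:-$HOME/ont-guppy/bin/guppy_basecaller}"

# Dry run: print one command per --chunks_per_runner value.
# Usage: sweep_commands <config> <value> [<value> ...]
sweep_commands() {
  cfg="$1"
  shift
  for n in "$@"; do
    printf '%s -c %s -i fast5/ -s fastq_cpr%s -x auto --recursive --chunks_per_runner %s\n' \
      "$GUPPY" "$cfg" "$n" "$n"
  done
}

sweep_commands dna_r9.4.1_450bps_hac.cfg 256 320 384 412
```

Piping the output to `sh` would run the commands one after another; comparing the samples/s figure from each run shows where the gains level off.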

SUP model

The SUP (super high accuracy) model came in with Guppy 5.0.7 and is much more taxing on hardware than the HAC model. So we expect this model to run anywhere from 2-8 times slower, depending greatly on the hardware that you have at hand.

This first run is the SUP model (default parameters) before adjusting the performance profile of the laptop.

$ ~/ont-guppy/bin/guppy_basecaller -c dna_r9.4.1_450bps_sup.cfg -i fast5/ -s fastq -x 'auto' --recursive
ONT Guppy basecalling software version 5.0.11+2b6dbff
config file:        /home/miles/ont-guppy/data/dna_r9.4.1_450bps_sup.cfg
model file:         /home/miles/ont-guppy/data/template_r9.4.1_450bps_sup.jsn
input path:         fast5/
save path:          fastq
chunk size:         2000
chunks per runner:  256
minimum qscore:     10
records per file:   4000
num basecallers:    4
gpu device:         auto
kernel path:
runners per device: 12
Found 10 fast5 files to process.
Init time: 2666 ms

0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
Caller time: 553242 ms, Samples called: 189550286, samples/s: 342617
Finishing up any open output files.
Basecalling completed successfully.

So that SUP model run completed in 9mins 13secs. I've yet to do enough testing with SUP to have a feel for how this result stacks up performance wise, but I'm pleasantly surprised with what the mobile RTX4000 is able to do here; I was expecting it to take quite a long time.

This next run is still SUP model default but now running in full performance mode.

$ ~/ont-guppy/bin/guppy_basecaller -c dna_r9.4.1_450bps_sup.cfg -i fast5/ -s fastq -x 'auto' --recursive
ONT Guppy basecalling software version 5.0.11+2b6dbff
config file:        /home/miles/ont-guppy/data/dna_r9.4.1_450bps_sup.cfg
model file:         /home/miles/ont-guppy/data/template_r9.4.1_450bps_sup.jsn
input path:         fast5/
save path:          fastq
chunk size:         2000
chunks per runner:  256
minimum qscore:     10
records per file:   4000
num basecallers:    4
gpu device:         auto
kernel path:
runners per device: 12
Found 10 fast5 files to process.
Init time: 1625 ms

0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
Caller time: 261553 ms, Samples called: 189550286, samples/s: 724711
Finishing up any open output files.
Basecalling completed successfully.

Well that made a MASSIVE difference. We've gone from 9mins 13secs down to 4mins 21secs! I am even more impressed with the mobile RTX4000 GPU that is in this HP laptop. I believe that it will easily keep up with HAC live basecalling of one, maybe even two, MinIONs. I also think it might be able to SUP live basecall a single MinION, but we'll need to get it into the lab and start some real-life sequencing runs to confirm.

NOTE: I played with model parameter optimisation for all 3 models but it was only the HAC model where I found it made any significant difference. This likely means that the FAST and SUP models are fairly well optimised, at least for the GPU performance that the mobile RTX4000 brings to the table. Your mileage may vary based on the specific GPU(s) that you are using. In general the default models are going to do a really good job, but sometimes a little tweaking can eke out a bit more performance.

I was actually very surprised that I was able to drastically increase the basecalling rate of the HAC model with the change in power mode and adjustment of the chunks per runner parameter - that must have just been the sweet spot for this particular GPU.

Native Linux vs WSL2

I finally found some time to install Linux in a dual boot setup on the HP ZBook Fury 17 G7. I ended up going with Pop!_OS (21.10) as I've heard lots of great things about it, and I haven't been disappointed - I'll find time to write about that experience elsewhere (spoiler: everything just works!).

So now I've got a native Linux environment I thought it might be fun to see how basecalling compares between WSL2 and "full-blown" Linux. Here is a comparison table based on the same data and models above. I've recorded samples per second as the metric (rate of basecalling).

| Model | WSL2 (samples/s) | Pop!_OS (samples/s) | Speed-up |
| --- | --- | --- | --- |
| FAST | 1.19409e+07 | 2.88644e+07 | 2.4x |
| HAC | 1.53993e+06 | 4.8192e+06 | 3.1x |
| SUP | 724711 | 1.36953e+06 | 1.9x |

This seems crazy to me! I wasn't sure what I was expecting, maybe a little faster under native Linux but not this much faster. There are obviously a lot of overheads that are part of the WSL2 system. I reached out on Twitter and got a reply with some suggestions for WSL2, but at this stage I don't believe WSL2 is going to give the same level of performance that you will see under native Linux.
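
The speed-up column is simply the native rate divided by the WSL2 rate. A quick sketch to reproduce it from the samples/s figures in the table (`speed_up` is a made-up helper name):

```shell
# Ratio of native to WSL2 throughput, rounded to one decimal place.
speed_up() {
  awk -v wsl="$1" -v native="$2" 'BEGIN { printf "%.1fx\n", native / wsl }'
}

# One row per model: name, WSL2 samples/s, native samples/s
for row in "FAST 1.19409e+07 2.88644e+07" "HAC 1.53993e+06 4.8192e+06" "SUP 724711 1.36953e+06"; do
  set -- $row
  printf '%s\t%s\n' "$1" "$(speed_up "$2" "$3")"
done
```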

What's next?

So after a fair few pain points with getting set up the goal was achieved and GPU basecalling inside WSL2 seems to work nicely. Instead of wiping Windows I might actually dual boot this laptop with Ubuntu 20.04 (or similar). This will allow me to do some more robust testing between the two operating systems. I don't imagine there is much, if any, performance hit to basecalling in Windows vs Linux but I don't think anyone has officially documented this (to the best of my knowledge).

If I find the time I would like to see if MinKNOW can be run inside WSL and accessed remotely for a sequencing run with live basecalling. There are going to be some rather large hurdles in that experiment so I'll probably let sleeping dogs lie for a while longer before revisiting.

So hopefully the process above was of some use to anyone that wants to use a Windows machine for GPU basecalling. I personally will continue to advocate for Linux, but it's nice to have been able to get this process working and I learned a lot, which is always a win in my books.

As with my other notes and documents I see this as fairly dynamic and will update as and when I find time. So please do feel free to check back occasionally.

Thanks for reading!

Andrew D. Montecillo

2021/06/22 03:29:30

Do you think the RTX3050Ti (Mobile) will be able to handle the SUP model? Thanks!

Miles Benton

2021/06/24 02:27:01

Hard to say without testing. If you look at the specs at techpowerup (https://www.techpowerup.com/gpu-specs/geforce-rtx-3050-ti-mobile.c3778) it's actually a lot less powerful than the RTX4000 mobile that I'm benchmarking here. The big thing for me that is a warning sign for the RTX3050Ti mobile GPU is that it only has 4Gb of RAM - that is actually too low at the moment for out of the box live basecalling (you can do it but have to modify some parameters). I'd look at RTX3060 mobil

2021/06/24 02:36:18

Something like this would probably be quite nice: https://www.amazon.com/HIDevolution-Zephyrus-GA401QM-3200MHz-Windows/dp/B08WPNS9PY?th=1 Plus it's quite a bargain price wise.

Guest Moreno

2022/07/08 18:34:00

sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub

I am having trouble with the public key, do you have anything more recent?

Guest Dawson

2023/01/05 20:50:41

The public key that worked for me was:

Guest Nunez

2023/01/05 20:50:50

sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub

Guest Nunez

2023/01/05 20:53:58

Make sure the Ubuntu versions align. I had an issue with the pub key because I was running Ubuntu 2204, not 1804. If this does not work, paste this into a search engine to find a directory... https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/
