# NLP in PyTorch for Monkey: 1. PyTorch Basics
This tutorial is a cheat sheet for the book "Natural Language Processing with PyTorch: Build Intelligent Language Applications Using Deep Learning".
https://github.com/delip/PyTorchNLPBook
Outline
---
1. [PyTorch Basics](https://hackmd.io/@martinliu/Hkt4VBggi)
2. Feed-forward Networks for NLP
3. Embedding Words and Types
4. Sequence Modeling for NLP
5. Intermediate Sequence Modeling for NLP
6. Advanced Sequence Modeling for NLP
7. My Note
---
PyTorch Basics
---
Environment setup and library loading.
`describe` is a small helper for printing a tensor's type, shape, and values.
Usage: `describe(tensor)`
```
import torch
import numpy as np
torch.manual_seed(1234)
def describe(x):
print("Type: {}".format(x.type()))
print("Shape/size: {}".format(x.shape))
print("Values: \n{}".format(x))
```
Create a random tensor
```
x = torch.rand(2, 3)
describe(x)
```
Create a zero-filled or one-filled tensor, then fill it in place with 5
```
x = torch.zeros(2, 3)
describe(x)
x = torch.ones(2, 3)
describe(x)
x.fill_(5)
describe(x)
y = torch.Tensor(3,4).fill_(5)
describe(y)
```
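As an aside (not in the book's snippet), `torch.full` creates a constant-filled tensor in one step; a quick sketch:
```
import torch

# create a 3x4 tensor pre-filled with 5.0 (float32, since the fill value is a float)
z = torch.full((3, 4), 5.0)
print(z)
```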
Tensors can be initialized from a list of lists
```
x = torch.Tensor([[1, 2],
                  [2, 4]])
describe(x)
```
Tensors can also be initialized from NumPy arrays.
Note that the resulting tensor inherits NumPy's default dtype, float64 (a DoubleTensor rather than a FloatTensor).
```
npy = np.random.rand(2, 3)
describe(torch.from_numpy(npy))
print(npy.dtype)
```
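If the default float32 is wanted instead, a minimal sketch is to cast after the conversion (this copies the data):
```
import numpy as np
import torch

npy = np.random.rand(2, 3)          # NumPy default dtype: float64
t = torch.from_numpy(npy).float()   # cast to float32 (makes a copy)
print(t.dtype)                      # torch.float32
```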
Tensor Types
The FloatTensor (float32) is the default type for tensors created with functions such as rand, zeros, and ones. Note that torch.arange with integer arguments returns a LongTensor:
```
import torch
x = torch.arange(6).view(2, 3)
describe(x)
```
FloatTensor and LongTensor
We can create them directly or convert between them:
```
# create a float tensor
x = torch.FloatTensor([[1, 2, 3],
                       [4, 5, 6]])
describe(x)

# then convert it to a long tensor
x = x.long()
describe(x)

# create a long tensor directly
x = torch.tensor([[1, 2, 3],
                  [4, 5, 6]], dtype=torch.int64)
describe(x)

# then convert it back to float
x = x.float()
describe(x)
```
---
## Tensor Basics
### Add
There are two ways:
```
x = torch.randn(2, 3)
describe(x)
x2 = torch.add(x, x)
describe(x2)
x2 = x + x
describe(x2)
```
### Transform a vector into a matrix
Create a vector, then reshape it into the shape you want with view:
```
x = torch.arange(6)
describe(x)
x = x.view(2, 3)
describe(x)
```
```
Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0, 1, 2],
[3, 4, 5]])
```
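As a side note, one dimension passed to view can be -1 and PyTorch infers it from the total number of elements; a quick sketch:
```
import torch

x = torch.arange(6)
print(x.view(2, -1))   # the -1 is inferred as 3
print(x.view(-1, 2))   # the -1 is inferred as 3
```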
Then we can sum it by column or by row.
#### by column
```
describe(torch.sum(x, dim=0))
```
```
Type: torch.LongTensor
Shape/size: torch.Size([3])
Values:
tensor([3, 5, 7])
```
#### by row
```
describe(torch.sum(x, dim=1))
```
```
Type: torch.LongTensor
Shape/size: torch.Size([2])
Values:
tensor([ 3, 12])
```
#### transpose
```
describe(torch.transpose(x, 0, 1))
```
```
Type: torch.LongTensor
Shape/size: torch.Size([3, 2])
Values:
tensor([[0, 3],
[1, 4],
[2, 5]])
```
### Indexing: taking specific positions
```
x = torch.arange(6).view(2, 3)
describe(x)
describe(x[:1, :2])
```
```
Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0, 1, 2],
[3, 4, 5]])
Type: torch.LongTensor
Shape/size: torch.Size([1, 2])
Values:
tensor([[0, 1]])
```
```
describe(x[0, 1])
```
```
Type: torch.LongTensor
Shape/size: torch.Size([])
Values:
1
```
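When the result is a 0-dimensional tensor like this, .item() pulls it out as a plain Python number (handy for logging); a quick sketch:
```
import torch

x = torch.arange(6).view(2, 3)
print(x[0, 1])         # a 0-dim tensor: tensor(1)
print(x[0, 1].item())  # a plain Python int: 1
```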
#### Take specific positions with index_select
```
x = torch.arange(6).view(2, 3)
describe(x)
indices = torch.LongTensor([0, 2])
describe(torch.index_select(x, dim=1, index=indices))
```
```
Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0, 1, 2],
[3, 4, 5]])
Type: torch.LongTensor
Shape/size: torch.Size([2, 2])
Values:
tensor([[0, 2],
[3, 5]])
```
#### Take the same position more than once (duplication)
```
indices = torch.LongTensor([0, 0])
describe(torch.index_select(x, dim=0, index=indices))
```
```
Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0, 1, 2],
[0, 1, 2]])
```
#### Long Tensors are used for indexing operations
```
row_indices = torch.arange(2).long()
col_indices = torch.LongTensor([0, 1])
describe(x[row_indices, col_indices])
```
```
Type: torch.LongTensor
Shape/size: torch.Size([2])
Values:
tensor([0, 4])
```
#### LongTensor mirrors NumPy's int64 type
```
x = torch.LongTensor([[1, 2, 3],
                      [4, 5, 6],
                      [7, 8, 9]])
describe(x)
print(x.dtype)
print(x.numpy().dtype)
```
#### You can convert a FloatTensor to a LongTensor
```
x = torch.FloatTensor([[1, 2, 3],
                       [4, 5, 6],
                       [7, 8, 9]])
x = x.long()
describe(x)
```
#### Special Tensor initializations: long format for indexing
```
x = torch.arange(0, 10).long()
print(x)
```
```
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
```
---
## Operations
### view: reshaping a tensor
view presents the same elements in a different shape; the element order is always preserved.
```
x = torch.arange(0, 20)
print(x.view(1, 20))
print(x.view(2, 10))
print(x.view(4, 5))
print(x.view(5, 4))
print(x.view(10, 2))
print(x.view(20, 1))
```
```
tensor([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19]])
tensor([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])
tensor([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]])
tensor([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19]])
tensor([[ 0, 1],
[ 2, 3],
[ 4, 5],
[ 6, 7],
[ 8, 9],
[10, 11],
[12, 13],
[14, 15],
[16, 17],
[18, 19]])
tensor([[ 0],
[ 1],
[ 2],
[ 3],
[ 4],
[ 5],
[ 6],
[ 7],
[ 8],
[ 9],
[10],
[11],
[12],
[13],
[14],
[15],
[16],
[17],
[18],
[19]])
```
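One caveat not covered above: view requires the requested shape to be compatible with the tensor's memory layout, so after an operation like transpose the tensor may be non-contiguous. In that case .contiguous().view(...) or .reshape(...) is the safe route; a minimal sketch:
```
import torch

x = torch.arange(0, 20).view(4, 5)
t = x.t()                        # a transposed view; non-contiguous memory
# t.view(20) would raise a RuntimeError here
print(t.reshape(20))             # reshape copies the data when it has to
print(t.contiguous().view(20))   # equivalent: make it contiguous first
```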
We can use view to add size-1 dimensions, which can be useful for combining with other tensors; the automatic expansion of size-1 dimensions in such operations is called broadcasting.
```
x = torch.arange(12).view(3, 4)
y = torch.arange(4).view(1, 4)
z = torch.arange(3).view(3, 1)
print(x)
print(y)
print(z)
print(x + y)
print(x + z)
```
```
tensor([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
tensor([[0, 1, 2, 3]])
tensor([[0],
[1],
[2]])
tensor([[ 0, 2, 4, 6],
[ 4, 6, 8, 10],
[ 8, 10, 12, 14]])
tensor([[ 0, 1, 2, 3],
[ 5, 6, 7, 8],
[10, 11, 12, 13]])
```
unsqueeze and squeeze add and remove size-1 dimensions.
```
x = torch.arange(12).view(3, 4)
print(x.shape)
x = x.unsqueeze(dim=1)
print(x.shape)
x = x.squeeze()
print(x.shape)
```
```
torch.Size([3, 4])
torch.Size([3, 1, 4])
torch.Size([3, 4])
```
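Note that squeeze() with no argument removes every size-1 dimension; passing dim squeezes only that axis (and is a no-op if that axis is not size 1). A quick sketch:
```
import torch

x = torch.arange(12).view(3, 1, 4, 1)
print(x.squeeze().shape)        # torch.Size([3, 4])      : all size-1 dims removed
print(x.squeeze(dim=1).shape)   # torch.Size([3, 4, 1])   : only dim 1 removed
print(x.squeeze(dim=0).shape)   # torch.Size([3, 1, 4, 1]): unchanged, dim 0 is size 3
```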
All of the standard mathematical operations apply (such as addition below):
```
x = torch.rand(3,4)
print("x: \n", x)
print("--")
print("torch.add(x, x): \n", torch.add(x, x))
print("--")
print("x+x: \n", x + x)
```
```
x:
tensor([[0.6662, 0.3343, 0.7893, 0.3216],
[0.5247, 0.6688, 0.8436, 0.4265],
[0.9561, 0.0770, 0.4108, 0.0014]])
--
torch.add(x, x):
tensor([[1.3324, 0.6686, 1.5786, 0.6433],
[1.0494, 1.3377, 1.6872, 0.8530],
[1.9123, 0.1540, 0.8216, 0.0028]])
--
x+x:
tensor([[1.3324, 0.6686, 1.5786, 0.6433],
[1.0494, 1.3377, 1.6872, 0.8530],
[1.9123, 0.1540, 0.8216, 0.0028]])
```
The convention of _ indicating in-place operations continues:
```
x = torch.arange(12).reshape(3, 4)
print(x)
print(x.add_(x))
```
```
tensor([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
tensor([[ 0, 2, 4, 6],
[ 8, 10, 12, 14],
[16, 18, 20, 22]])
```
There are many operations that reduce a dimension, such as sum:
```
x = torch.arange(12).reshape(3, 4)
print("x: \n", x)
print("---")
print("Summing across rows (dim=0): \n", x.sum(dim=0))
print("---")
print("Summing across columns (dim=1): \n", x.sum(dim=1))
```
```
x:
tensor([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
---
Summing across rows (dim=0):
tensor([12, 15, 18, 21])
---
Summing across columns (dim=1):
tensor([ 6, 22, 38])
```
---
### Indexing, Slicing, Joining and Mutating
```
x = torch.arange(6).view(2, 3)
print("x: \n", x)
print("---")
print("x[:2, :2]: \n", x[:2, :2])
print("---")
print("x[0][1]: \n", x[0][1])
print("---")
print("Setting [0][1] to be 8")
x[0][1] = 8
print(x)
```
```
x:
tensor([[0, 1, 2],
[3, 4, 5]])
---
x[:2, :2]:
tensor([[0, 1],
[3, 4]])
---
x[0][1]:
tensor(1)
---
Setting [0][1] to be 8
tensor([[0, 8, 2],
[3, 4, 5]])
```
We can select a subset of a tensor using index_select:
```
x = torch.arange(9).view(3,3)
print(x)
print("---")
indices = torch.LongTensor([0, 2])
print(torch.index_select(x, dim=0, index=indices))
print("---")
indices = torch.LongTensor([0, 2])
print(torch.index_select(x, dim=1, index=indices))
```
```
tensor([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
---
tensor([[0, 1, 2],
[6, 7, 8]])
---
tensor([[0, 2],
[3, 5],
[6, 8]])
```
We can also use numpy-style advanced indexing:
```
x = torch.arange(9).view(3,3)
indices = torch.LongTensor([0, 2])
print(x[indices])
print("---")
print(x[indices, :])
print("---")
print(x[:, indices])
```
```
tensor([[0, 1, 2],
[6, 7, 8]])
---
tensor([[0, 1, 2],
[6, 7, 8]])
---
tensor([[0, 2],
[3, 5],
[6, 8]])
```
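Boolean masks work NumPy-style as well; this supplementary sketch selects the elements greater than 4 (the result is flattened):
```
import torch

x = torch.arange(9).view(3, 3)
mask = x > 4           # element-wise boolean mask
print(mask)
print(x[mask])         # tensor([5, 6, 7, 8])
```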
We can combine tensors by concatenating them. First, concatenating along the rows (dim=0):
```
x = torch.arange(6).view(2,3)
describe(x)
describe(torch.cat([x, x], dim=0))
describe(torch.cat([x, x], dim=1))
describe(torch.stack([x, x]))
```
```
Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0, 1, 2],
[3, 4, 5]])
Type: torch.LongTensor
Shape/size: torch.Size([4, 3])
Values:
tensor([[0, 1, 2],
[3, 4, 5],
[0, 1, 2],
[3, 4, 5]])
Type: torch.LongTensor
Shape/size: torch.Size([2, 6])
Values:
tensor([[0, 1, 2, 0, 1, 2],
[3, 4, 5, 3, 4, 5]])
Type: torch.LongTensor
Shape/size: torch.Size([2, 2, 3])
Values:
tensor([[[0, 1, 2],
[3, 4, 5]],
[[0, 1, 2],
[3, 4, 5]]])
```
We can also concatenate along dimension 1, the columns.
```
x = torch.arange(9).view(3,3)
print(x)
print("---")
new_x = torch.cat([x, x, x], dim=1)
print(new_x.shape)
print(new_x)
```
```
tensor([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
---
torch.Size([3, 9])
tensor([[0, 1, 2, 0, 1, 2, 0, 1, 2],
[3, 4, 5, 3, 4, 5, 3, 4, 5],
[6, 7, 8, 6, 7, 8, 6, 7, 8]])
```
We can also concatenate on a new 0th dimension to "stack" the tensors:
```
x = torch.arange(9).view(3,3)
print(x)
print("---")
new_x = torch.stack([x, x, x])
print(new_x.shape)
print(new_x)
```
```
tensor([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
---
torch.Size([3, 3, 3])
tensor([[[0, 1, 2],
[3, 4, 5],
[6, 7, 8]],
[[0, 1, 2],
[3, 4, 5],
[6, 7, 8]],
[[0, 1, 2],
[3, 4, 5],
[6, 7, 8]]])
```
---
### Linear Algebra Tensor Functions
Transposing swaps two axes of a tensor, so all the rows become columns and vice versa.
```
x = torch.arange(0, 12).view(3,4)
print("x: \n", x)
print("---")
print("x.tranpose(1, 0): \n", x.transpose(1, 0))
```
```
x:
tensor([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
---
x.transpose(1, 0):
tensor([[ 0, 4, 8],
[ 1, 5, 9],
[ 2, 6, 10],
[ 3, 7, 11]])
```
A three-dimensional tensor might represent a batch of sequences, where each sequence item has a feature vector.
It is common to swap the batch and sequence dimensions so that we can more easily index into the sequence in a sequence model.
Note: transpose only lets you swap two axes; permute (in the next cell) can reorder any number of them.
```
batch_size = 3
seq_size = 4
feature_size = 5
x = torch.arange(batch_size * seq_size * feature_size).view(batch_size, seq_size, feature_size)
print("x.shape: \n", x.shape)
print("x: \n", x)
print("-----")
print("x.transpose(1, 0).shape: \n", x.transpose(1, 0).shape)
print("x.transpose(1, 0): \n", x.transpose(1, 0))
```
```
x.shape:
torch.Size([3, 4, 5])
x:
tensor([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39]],
[[40, 41, 42, 43, 44],
[45, 46, 47, 48, 49],
[50, 51, 52, 53, 54],
[55, 56, 57, 58, 59]]])
-----
x.transpose(1, 0).shape:
torch.Size([4, 3, 5])
x.transpose(1, 0):
tensor([[[ 0, 1, 2, 3, 4],
[20, 21, 22, 23, 24],
[40, 41, 42, 43, 44]],
[[ 5, 6, 7, 8, 9],
[25, 26, 27, 28, 29],
[45, 46, 47, 48, 49]],
[[10, 11, 12, 13, 14],
[30, 31, 32, 33, 34],
[50, 51, 52, 53, 54]],
[[15, 16, 17, 18, 19],
[35, 36, 37, 38, 39],
[55, 56, 57, 58, 59]]])
```
permute is a more general version of transpose:
```
batch_size = 3
seq_size = 4
feature_size = 5
x = torch.arange(batch_size * seq_size * feature_size).view(batch_size, seq_size, feature_size)
print("x.shape: \n", x.shape)
print("x: \n", x)
print("-----")
print("x.permute(1, 0, 2).shape: \n", x.permute(1, 0, 2).shape)
print("x.permute(1, 0, 2): \n", x.permute(1, 0, 2))
```
```
x.shape:
torch.Size([3, 4, 5])
x:
tensor([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39]],
[[40, 41, 42, 43, 44],
[45, 46, 47, 48, 49],
[50, 51, 52, 53, 54],
[55, 56, 57, 58, 59]]])
-----
x.permute(1, 0, 2).shape:
torch.Size([4, 3, 5])
x.permute(1, 0, 2):
tensor([[[ 0, 1, 2, 3, 4],
[20, 21, 22, 23, 24],
[40, 41, 42, 43, 44]],
[[ 5, 6, 7, 8, 9],
[25, 26, 27, 28, 29],
[45, 46, 47, 48, 49]],
[[10, 11, 12, 13, 14],
[30, 31, 32, 33, 34],
[50, 51, 52, 53, 54]],
[[15, 16, 17, 18, 19],
[35, 36, 37, 38, 39],
[55, 56, 57, 58, 59]]])
```
Tensors can also be created with requires_grad=True so that operations on them are tracked for gradients (covered in Computing Gradients below):
```
torch.randn(2, 3, requires_grad=True)
```
```
tensor([[-0.4790,  0.8539, -0.2285],
        [ 0.3081,  1.1171,  0.1585]], requires_grad=True)
```
mm: Matrix multiplication
```
x1 = torch.arange(6).view(2, 3).float()
describe(x1)
x2 = torch.ones(3, 2)
x2[:, 1] += 1
describe(x2)
describe(torch.mm(x1, x2))
```
```
Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0., 1., 2.],
[3., 4., 5.]])
Type: torch.FloatTensor
Shape/size: torch.Size([3, 2])
Values:
tensor([[1., 2.],
[1., 2.],
[1., 2.]])
Type: torch.FloatTensor
Shape/size: torch.Size([2, 2])
Values:
tensor([[ 3., 6.],
[12., 24.]])
```
```
x = torch.arange(0, 12).view(3,4).float()
print(x)
x2 = torch.ones(4, 2)
x2[:, 1] += 1
print(x2)
print(x.mm(x2))
```
```
tensor([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 11.]])
tensor([[1., 2.],
[1., 2.],
[1., 2.],
[1., 2.]])
tensor([[ 6., 12.],
[22., 44.],
[38., 76.]])
```
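torch.mm only handles 2-D matrices; torch.matmul (or the @ operator) also covers batched inputs. A minimal sketch:
```
import torch

x1 = torch.arange(6).view(2, 3).float()
x2 = torch.ones(3, 2)
x2[:, 1] += 1
print(x1 @ x2)                   # same result as torch.mm(x1, x2) above

# batched: (batch, n, m) @ (batch, m, p) -> (batch, n, p)
a = torch.rand(4, 2, 3)
b = torch.rand(4, 3, 5)
print(torch.matmul(a, b).shape)  # torch.Size([4, 2, 5])
```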
### Computing Gradients
We create a tensor and multiply it by 3.
```
x = torch.tensor([[2.0, 3.0]], requires_grad=True)
z = 3 * x
print(z)
```
```
tensor([[6., 9.]], grad_fn=<MulBackward0>)
```
Then we create a scalar output using sum(); a scalar is needed to play the role of the loss variable. Calling backward() on the loss computes its rate of change with respect to the inputs. Because the scalar was created with sum, each position in z and x is independent with respect to the loss scalar.
The rate of change of the output with respect to x is just the constant 3 that we multiplied x by.
```
x = torch.tensor([[2.0, 3.0]], requires_grad=True)
print("x: \n", x)
print("---")
z = 3 * x
print("z = 3*x: \n", z)
print("---")
loss = z.sum()
print("loss = z.sum(): \n", loss)
print("---")
loss.backward()
print("after loss.backward(), x.grad: \n", x.grad)
```
```
x:
tensor([[2., 3.]], requires_grad=True)
---
z = 3*x:
tensor([[6., 9.]], grad_fn=<MulBackward0>)
---
loss = z.sum():
tensor(15., grad_fn=<SumBackward0>)
---
after loss.backward(), x.grad:
tensor([[3., 3.]])
```
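One detail to keep in mind (not shown in the book's snippet): gradients accumulate in .grad across backward calls, so they are normally reset between steps (optimizers do this via zero_grad()). A quick sketch:
```
import torch

x = torch.tensor([[2.0, 3.0]], requires_grad=True)
(3 * x).sum().backward()
print(x.grad)      # tensor([[3., 3.]])
(3 * x).sum().backward()
print(x.grad)      # accumulated: tensor([[6., 6.]])
x.grad.zero_()     # reset in place before the next backward pass
print(x.grad)      # tensor([[0., 0.]])
```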
Example: Computing a conditional gradient
```
def f(x):
    if (x.data > 0).all():
        return torch.sin(x)
    else:
        return torch.cos(x)

x = torch.tensor([1.0], requires_grad=True)
y = f(x)
y.backward()
print(x.grad)
```
```
tensor([0.5403])
```
We could apply this to a larger vector too, but we need to make sure the output is a scalar:
```
x = torch.tensor([1.0, 0.5], requires_grad=True)
y = f(x)
y.sum().backward()
print(x.grad)
```
```
tensor([0.5403, 0.8776])
```
But there is an issue: this isn't right for the following edge cases:
```
x = torch.tensor([1.0, -1], requires_grad=True)
y = f(x)
y.sum().backward()
print(x.grad)   # tensor([-0.8415,  0.8415])

x = torch.tensor([-0.5, -1], requires_grad=True)
y = f(x)
y.sum().backward()
print(x.grad)   # tensor([0.4794, 0.8415])
```
This is because the boolean check, and the subsequent choice between sin and cos, is not applied element-wise: a single branch is taken for the whole tensor.
To solve this, it is common to use masking:
```
def f2(x):
    # element-wise mask: 1.0 where x > 0, else 0.0
    mask = torch.gt(x, 0).float()
    return mask * torch.sin(x) + (1 - mask) * torch.cos(x)

x = torch.tensor([1.0, -1], requires_grad=True)
y = f2(x)
y.sum().backward()
print(x.grad)
```
```
tensor([0.5403, 0.8415])
```
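An element-wise alternative to the arithmetic mask is torch.where, which picks between two tensors based on a condition; this sketch should give the same gradients as f2:
```
import torch

def f3(x):
    # element-wise: sin where x > 0, cos elsewhere
    return torch.where(x > 0, torch.sin(x), torch.cos(x))

x = torch.tensor([1.0, -1], requires_grad=True)
f3(x).sum().backward()
print(x.grad)   # tensor([0.5403, 0.8415])
```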
A small helper for inspecting gradient information, used below:
```
def describe_grad(x):
    if x.grad is None:
        print("No gradient information")
    else:
        print("Gradient: \n{}".format(x.grad))
        print("Gradient Function: {}".format(x.grad_fn))
```
```
import torch
x = torch.ones(2, 2, requires_grad=True)
describe(x)
describe_grad(x)
print("--------")
y = (x + 2) * (x + 5) + 3
describe(y)
z = y.mean()
describe(z)
describe_grad(x)
print("--------")
z.backward(create_graph=True, retain_graph=True)
describe_grad(x)
print("--------")
```
```
Type: torch.FloatTensor
Shape/size: torch.Size([2, 2])
Values:
tensor([[1., 1.],
[1., 1.]], requires_grad=True)
No gradient information
--------
Type: torch.FloatTensor
Shape/size: torch.Size([2, 2])
Values:
tensor([[21., 21.],
[21., 21.]], grad_fn=<AddBackward0>)
Type: torch.FloatTensor
Shape/size: torch.Size([])
Values:
21.0
No gradient information
--------
Gradient:
tensor([[2.2500, 2.2500],
[2.2500, 2.2500]], grad_fn=<CloneBackward>)
Gradient Function: None
--------
```
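When gradient tracking is not wanted (e.g., during evaluation), computations can be wrapped in torch.no_grad(), or a tensor can be detached from the graph; a minimal sketch:
```
import torch

x = torch.ones(2, 2, requires_grad=True)
with torch.no_grad():
    y = (x + 2) * (x + 5) + 3
print(y.requires_grad)    # False: no graph was built inside no_grad

z = ((x + 2) * (x + 5) + 3).detach()
print(z.requires_grad)    # False: z is cut off from the graph
```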
### CUDA Tensors
PyTorch's operations can seamlessly be used on the GPU or on the CPU.
There are a couple of basic operations for moving tensors between the two.
```
print(torch.cuda.is_available())   # True (on a machine with a GPU)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)                      # cuda
```
The difference between a tensor on the CPU and one on the GPU:
```
x = torch.rand(3, 3)
describe(x)
```
```
Type: torch.FloatTensor
Shape/size: torch.Size([3, 3])
Values:
tensor([[0.9149, 0.3993, 0.1100],
        [0.2541, 0.4333, 0.4451],
        [0.4966, 0.7865, 0.6604]])
```
```
x = torch.rand(3, 3).to(device)
describe(x)
print(x.device)
```
```
Type: torch.cuda.FloatTensor
Shape/size: torch.Size([3, 3])
Values:
tensor([[0.1303, 0.3498, 0.3824],
        [0.8043, 0.3186, 0.2908],
        [0.4196, 0.3728, 0.3769]], device='cuda:0')
cuda:0
```
Two tensors must be on the same device before they can be combined.
Use to(device):
```
x = torch.rand(3, 3).to(device)
y = torch.rand(3, 3)
# x + y would raise an error here: the tensors are on different devices

# move them onto the same device first
cpu_device = torch.device("cpu")
y = y.to(cpu_device)
x = x.to(cpu_device)
x + y
```
```
tensor([[0.8394, 0.5273, 0.8267],
        [0.9273, 1.2824, 1.0603],
        [0.4574, 0.5968, 1.0541]])
```
Alternatively, move tensors with .cuda() or .cpu():
```
if torch.cuda.is_available():  # only if a GPU is available
    a = torch.rand(3, 3).to(device='cuda:0')  # CUDA tensor
    print(a)
    # tensor([[0.5274, 0.6325, 0.0910],
    #         [0.2323, 0.7269, 0.1187],
    #         [0.3951, 0.7199, 0.7595]], device='cuda:0')

    b = torch.rand(3, 3).cuda()
    print(b)
    # tensor([[0.5311, 0.6449, 0.7224],
    #         [0.4416, 0.3634, 0.8818],
    #         [0.9874, 0.7316, 0.2814]], device='cuda:0')

    print(a + b)
    # tensor([[1.0585, 1.2775, 0.8134],
    #         [0.6739, 1.0903, 1.0006],
    #         [1.3825, 1.4515, 1.0409]], device='cuda:0')

    a = a.cpu()   # a is now on the CPU, b is still on the GPU
    print(a + b)  # error expected: the tensors are on different devices
```
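A related gotcha: a CUDA tensor cannot be converted to NumPy directly; it must be moved to the CPU first. A short sketch (assuming a GPU is available):
```
import torch

if torch.cuda.is_available():
    a = torch.rand(3, 3).cuda()
    # a.numpy() would raise an error here
    print(a.cpu().numpy())   # copy back to host memory first
```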
###### tags: `Python` `pytorch` `NLP`