# **3.7 Mini-Batch Gradient Descent and Stochastic Gradient Descent**
## **3.7.1 The MNIST Handwritten Digit Dataset**
1. MNIST is a set of images of handwritten digits; each image contains a single handwritten digit (0, 1, 2, ..., 9, ten classes in total).
2. Each image has 784 (28*28) pixels, and each pixel value is an integer between 0 and 255.
3. There are 60,000 training images and 10,000 test images.
```python=
import numpy as np
import pandas as pd
np.random.seed(10)
# load the MNIST dataset
from keras.datasets import mnist
(x_train_image,y_train_label),(x_test_image,y_test_label)=mnist.load_data()
print('train data= ',len(x_train_image))
print('test data=', len(x_test_image))
# train data= 60000
# test data= 10000
```
The code above loads MNIST. However, to avoid exploding gradients and to speed up convergence, we usually divide every pixel value by 255.0 so that it becomes a floating-point number between 0 and 1.
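A minimal sketch of that rescaling, continuing from the arrays loaded above (the names `x_train_norm` and `x_test_norm` are illustrative):
```python=
# rescale pixel values from 0-255 integers to 0-1 floats
x_train_norm = x_train_image.astype('float32') / 255.0
x_test_norm = x_test_image.astype('float32') / 255.0
print(x_train_norm.min(), x_train_norm.max())  # 0.0 1.0
```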
## **3.7.2 Training a Logistic Regression Model on Part of the Training Samples**
The MNIST training set contains 60,000 samples. Using all of them costs a great deal of compute and time, so we can instead train on only a portion of the data, as follows:
```python=
import torch

subset_size = 500  # size of the subset to use
# torchvision datasets do not support slicing, so wrap the first
# subset_size samples in a Subset instead of writing trainset[:subset_size]
trainset_subset = torch.utils.data.Subset(trainset, range(subset_size))
# build a DataLoader for the subset
batch_size = 32
trainloader_subset = torch.utils.data.DataLoader(trainset_subset, batch_size=batch_size, shuffle=True)
```
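The snippet above assumes `trainset` already holds the MNIST training data as a torchvision dataset; one way to obtain it (the same call used in the full example later in this section) is:
```python=
import torchvision
import torchvision.transforms as transforms

# download MNIST and convert each image to a normalized tensor
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
trainset = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transform)
```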
## **3.7.3 Mini-Batch Gradient Descent**
Mini-batch gradient descent updates the model parameters using a small, randomly drawn batch of samples at a time. The usual procedure is as follows (see the NumPy sketch after this list):
1. Shuffle (reorder) the original training set.
2. Walk through the shuffled training set from the beginning, taking a small batch of samples at a time; for each batch, compute the model's loss and update the parameters.
3. Repeat steps 1 and 2 many times (one pass of steps 1 and 2 over the whole training set is called an epoch).
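A minimal NumPy sketch of this procedure on a toy linear-regression problem; the data, learning rate, and variable names here are illustrative assumptions, not part of the MNIST example:
```python=
import numpy as np

# toy data: y = 3x + 2 plus noise (illustrative only)
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(1000, 1))
y = 3 * X[:, 0] + 2 + 0.1 * rng.standard_normal(1000)

w, b = 0.0, 0.0          # model parameters
lr = 0.1                 # learning rate
batch_size = 32
num_epochs = 20

for epoch in range(num_epochs):
    # step 1: shuffle the training set
    order = rng.permutation(len(X))
    # step 2: walk through the shuffled set one small batch at a time
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        xb, yb = X[idx, 0], y[idx]
        err = (w * xb + b) - yb
        # gradients of the mean-squared-error loss on this batch
        grad_w = 2 * np.mean(err * xb)
        grad_b = 2 * np.mean(err)
        w -= lr * grad_w
        b -= lr * grad_b

print(w, b)  # should approach 3 and 2
```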
## **3.7.4 Stochastic Gradient Descent**
Stochastic gradient descent is the special case of mini-batch gradient descent in which each batch contains exactly one sample.
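Concretely, with the `trainset_subset` from Section 3.7.2, setting `batch_size=1` in the DataLoader makes every parameter update use a single sample (`trainloader_sgd` is an illustrative name):
```python=
# one sample per parameter update = stochastic gradient descent
trainloader_sgd = torch.utils.data.DataLoader(trainset_subset, batch_size=1, shuffle=True)
```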
## **Example**
The following program trains a handwritten-digit classifier on MNIST using only 500 training samples and mini-batch gradient descent:
```python=
import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import matplotlib.pyplot as plt
import numpy as np
# preprocess the data and load the MNIST training set
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
# download the full MNIST training set
trainset = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transform)
# size of the subset to use (500 samples)
subset_size = 500
# build the MNIST training subset from 500 randomly chosen indices
indices = np.random.choice(len(trainset), subset_size, replace=False)
trainset_subset = torch.utils.data.Subset(trainset, indices)
# set the batch size
batch_size = 32
# build the DataLoader
trainloader_subset = torch.utils.data.DataLoader(trainset_subset, batch_size=batch_size, shuffle=True)
# CNN model definition
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3)
        self.conv2 = nn.Conv2d(32, 64, 3)
        self.fc1 = nn.Linear(64 * 5 * 5, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2)
        x = x.view(x.size(0), -1)  # flatten the feature maps
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
# initialize the model, loss function, and optimizer
net = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

# train the model
num_epochs = 10
history = {'train_acc': [], 'train_loss': []}
for epoch in range(num_epochs):
    net.train()  # set the model to training mode
    running_loss = 0.0
    correct = 0
    total = 0
    for i, data in enumerate(trainloader_subset, 0):
        inputs, labels = data
        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        _, predicted = outputs.max(1)
        total += labels.size(0)
        correct += predicted.eq(labels).sum().item()
    train_acc = 100. * correct / total
    train_loss = running_loss / len(trainloader_subset)
    # record the training accuracy and loss in the history dict
    history['train_acc'].append(train_acc)
    history['train_loss'].append(train_loss)
    print(f"Epoch {epoch+1}, Train Loss: {train_loss:.4f}, Train Acc: {train_acc:.2f}%")
# plot the learning curves
plt.plot(range(1, num_epochs+1), history['train_acc'], label='Train Accuracy')
plt.plot(range(1, num_epochs+1), history['train_loss'], label='Train Loss')
plt.xlabel('Epoch')
plt.legend()
plt.title('Training Curve')
plt.show()
```
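After training, the model can be checked against the 10,000-image test set mentioned in Section 3.7.1. The sketch below assumes the `transform` and `net` defined above are still in scope:
```python=
# load the MNIST test set with the same preprocessing as the training set
testset = torchvision.datasets.MNIST(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=256, shuffle=False)

net.eval()                      # switch to evaluation mode
correct, total = 0, 0
with torch.no_grad():           # no gradients needed for evaluation
    for inputs, labels in testloader:
        outputs = net(inputs)
        _, predicted = outputs.max(1)
        total += labels.size(0)
        correct += predicted.eq(labels).sum().item()
print(f"Test Acc: {100. * correct / total:.2f}%")
```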