# **3.7 Batch Gradient Descent and Stochastic Gradient Descent**

## **3.7.1 The MNIST Handwritten Digit Dataset**

1. MNIST is a collection of images of handwritten digits; each image shows a single digit (0, 1, 2, ..., 9, i.e. 10 classes in total).
2. Each image consists of 784 (28*28) pixels, and each pixel value lies between 0 and 255.
3. There are 60,000 training images and 10,000 test images.

```python=
import numpy as np
np.random.seed(10)

# Load the data
from keras.datasets import mnist
(x_train_image, y_train_label), (x_test_image, y_test_label) = mnist.load_data()
print('train data= ', len(x_train_image))
print('test data=', len(x_test_image))
# train data= 60000
# test data= 10000
```

The code above is enough to load MNIST. In practice, however, we usually divide every pixel value by 255.0 so that the inputs become floating-point numbers between 0 and 1; this helps avoid exploding gradients and speeds up convergence.

## **3.7.2 Training a Logistic Regression Model on a Subset of the Training Samples**

The MNIST training set contains 60,000 samples. Using all of them would consume a large amount of compute and time, so we can train on only a portion of the data instead, for example:

```python=
import torch

# trainset is the torchvision MNIST training dataset (loaded as in the full example below)
subset_size = 500                                                    # choose the subset size
trainset_subset = torch.utils.data.Subset(trainset, range(subset_size))  # create the subset

# Create a new DataLoader for the subset
batch_size = 32
trainloader_subset = torch.utils.data.DataLoader(trainset_subset,
                                                 batch_size=batch_size,
                                                 shuffle=True)
```

## **3.7.3 Batch Gradient Descent**

Batch gradient descent updates the parameters using a small, randomly drawn batch of samples at a time. The usual procedure is:

1. Reshuffle the samples of the training set.
2. Walk through the reshuffled training set from the beginning, taking a small batch of samples at a time, computing the model's loss on that batch, and updating the model parameters.
3. Repeat steps 1 and 2 many times (one pass through steps 1 and 2 is called an epoch).

## **3.7.4 Stochastic Gradient Descent**

When batch gradient descent uses only one sample per update, it is called stochastic gradient descent.

## **Example**

The following program performs handwritten digit recognition on MNIST, training on only 500 samples with batch gradient descent:

```python=
import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import matplotlib.pyplot as plt
import numpy as np

# Preprocess and load the MNIST training dataset
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])

# Download the full MNIST training dataset
trainset = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transform)

# Size of the subset to use (500 samples)
subset_size = 500

# Create the MNIST training subset
indices = np.random.choice(len(trainset), subset_size, replace=False)
trainset_subset = torch.utils.data.Subset(trainset, indices)

# Set the batch size
batch_size = 32

# Create the DataLoader
trainloader_subset = torch.utils.data.DataLoader(trainset_subset, batch_size=batch_size, shuffle=True)

# CNN model definition
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3)
        self.conv2 = nn.Conv2d(32, 64, 3)
        self.fc1 = nn.Linear(64 * 5 * 5, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2)
        x = x.view(x.size(0), -1)  # flatten the features
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Initialize the model, loss function, and optimizer
net = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

# Train the model
num_epochs = 10
history = {'train_acc': [], 'train_loss': []}

for epoch in range(num_epochs):
    net.train()  # switch to training mode
    running_loss = 0.0
    correct = 0
    total = 0

    for i, data in enumerate(trainloader_subset, 0):
        inputs, labels = data
        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        _, predicted = outputs.max(1)
        total += labels.size(0)
        correct += predicted.eq(labels).sum().item()

    train_acc = 100. * correct / total
    train_loss = running_loss / len(trainloader_subset)

    # Record the training accuracy and loss in the history dictionary
    history['train_acc'].append(train_acc)
    history['train_loss'].append(train_loss)

    print(f"Epoch {epoch+1}, Train Loss: {train_loss:.4f}, Train Acc: {train_acc:.2f}%")

# Plot the learning curves
plt.plot(range(1, num_epochs+1), history['train_acc'], label='Train Accuracy')
plt.plot(range(1, num_epochs+1), history['train_loss'], label='Train Loss')
plt.xlabel('Epoch')
plt.legend()
plt.title('Training Curve')
plt.show()
```
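
To connect the example back to Section 3.7.4 in code: the same training loop becomes pure stochastic gradient descent simply by setting the DataLoader's batch size to 1, so that every parameter update is computed from a single sample. The sketch below is only an illustration of that change, not part of the original example; `trainloader_sgd` is a hypothetical name, and the model, loss function, and optimizer defined above are reused unchanged.

```python=
# A minimal sketch (not from the original example): the only change needed for
# stochastic gradient descent is batch_size=1, so each update uses one sample.
trainloader_sgd = torch.utils.data.DataLoader(trainset_subset, batch_size=1, shuffle=True)

# Reuse the same loop body as above, iterating over trainloader_sgd instead of
# trainloader_subset; expect many more (and noisier) updates per epoch.
for inputs, labels in trainloader_sgd:
    optimizer.zero_grad()
    loss = criterion(net(inputs), labels)
    loss.backward()
    optimizer.step()
```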