tags: `machine learning`|`python`

Machine Learning - Polynomial Regression

Introduction

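Polynomial regression is a form of regression analysis that models the relationship between a dependent variable ($Y$) and one or more independent variables ($X$) as a polynomial.
One independent variable -> univariate polynomial regression
Multiple independent variables -> multivariate polynomial regression
In regression with a single independent variable, univariate polynomial regression can be used when the dependent variable ($Y$) and the independent variable ($X$) have a nonlinear relationship.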
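To make the univariate/multivariate distinction concrete, the sketch below expands one sample with a single feature and one sample with two features to degree 2. The sample values and variable names are made up for illustration; only `PolynomialFeatures` comes from scikit-learn.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One sample with a single feature: x = 2 (hypothetical value)
one_feature = np.array([[2.0]])
# One sample with two features: x1 = 2, x2 = 3 (hypothetical values)
two_features = np.array([[2.0, 3.0]])

expander = PolynomialFeatures(degree=2)

# Univariate case: columns are 1, x, x^2
univariate_terms = expander.fit_transform(one_feature)
# Multivariate case: columns are 1, x1, x2, x1^2, x1*x2, x2^2
multivariate_terms = expander.fit_transform(two_features)

print(univariate_terms)    # [[1. 2. 4.]]
print(multivariate_terms)  # [[1. 2. 3. 4. 6. 9.]]
```

With two features the expansion also includes the cross term `x1*x2`, which is why the number of columns grows quickly as features or degree increase.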

Purpose:

Explain past phenomena in the data

Use the independent variable ($X$) to predict likely future values of the dependent variable ($Y$)

Equation:

$$ y = b_{0} + b_{1} x_{1} + b_{2} x_{1}^{2} + \dots + b_{n} x_{1}^{n} $$

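(For degree 2, the graph is a parabola.)
Why "Linear"?
The model is still linear in the coefficients $b_{0}, \dots, b_{n}$: each term is a coefficient multiplied by a fixed power of $x_{1}$, and the terms are then added together.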
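The "linear" point can be checked with plain least squares. The sketch below uses made-up, noise-free data generated from an assumed rule, $y = 1 + 2x + 3x^{2}$, treats $1$, $x$ and $x^{2}$ as three ordinary columns, and solves for the coefficients with `np.linalg.lstsq`.

```python
import numpy as np

# Hypothetical noise-free data generated from y = 1 + 2x + 3x^2
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x + 3.0 * x**2

# Design matrix with columns [1, x, x^2]; each column is just another "feature"
design = np.column_stack([np.ones_like(x), x, x**2])

# Ordinary least squares: solve design @ b ~= y for the coefficients b
coefficients, *_ = np.linalg.lstsq(design, y, rcond=None)

# Should be close to the assumed coefficients 1, 2, 3
print(np.round(coefficients, 6))
```

Nothing nonlinear is solved here: the curve is nonlinear in $x$, but the unknowns $b_{i}$ enter only through multiplication and addition.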

Code walkthrough


```python
from sklearn.preprocessing import PolynomialFeatures
```

Import the `PolynomialFeatures` class from `sklearn.preprocessing`.
It provides methods for generating polynomial features.


```python
poly_reg = PolynomialFeatures(degree = 2)
```

`degree = 2` means the highest power is 2.


```python
x_poly = poly_reg.fit_transform(x)
```

Use `fit_transform()` from `PolynomialFeatures` to fit the transformer to the data and transform it in one step.
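As a small illustration of what the transformed matrix looks like, the sketch below applies the expansion to made-up position levels 1 to 4. The array `levels` and the other variable names are assumptions for the example, not part of the original exercise.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical position levels 1..4 as a column vector of shape (n_samples, 1)
levels = np.arange(1, 5).reshape(-1, 1)

poly_deg2 = PolynomialFeatures(degree=2)
levels_poly = poly_deg2.fit_transform(levels)  # columns: 1, x, x^2

# fit() followed by transform() gives the same matrix as fit_transform()
same_result = PolynomialFeatures(degree=2).fit(levels).transform(levels)

# A higher degree only adds more columns (degree + 1 for a single feature)
levels_poly5 = PolynomialFeatures(degree=5).fit_transform(levels)

print(levels_poly)
print(levels_poly.shape)   # (4, 3)
print(levels_poly5.shape)  # (4, 6)
```

The first column of ones is the bias term that `PolynomialFeatures` adds by default.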



```python
x_grid = np.arange(x.min(), x.max(), 0.1)
```

Change the step between x values to 0.1 (the original step was 1).


With more points, the plotted curve is smoother.
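The effect of the finer step can be seen from the array shapes alone. This sketch assumes position levels 1 to 10, shaped like `x` in the walkthrough; `x_grid_inclusive` is a hypothetical name for an alternative built with `np.linspace`.

```python
import numpy as np

# Hypothetical position levels 1..10, shaped (10, 1) like x in the walkthrough
x = np.arange(1, 11, dtype=float).reshape(-1, 1)

# Step of 0.1 instead of 1; np.arange does not include the stop value
x_grid = np.arange(x.min(), x.max(), 0.1).reshape(-1, 1)

# np.linspace is an alternative that includes both endpoints
x_grid_inclusive = np.linspace(x.min(), x.max(), 91).reshape(-1, 1)

print(x.shape, x_grid.shape, x_grid_inclusive.shape)
```

Because `np.arange` stops before the maximum, the smooth curve ends slightly short of the last data point; the `np.linspace` variant is one way to cover the full range.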






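```python
new_x = 6.5
new_x = np.array(new_x).reshape(-1, 1)

lin_reg.predict(new_x)

# transform() reuses the expansion already fitted on the training data
lin_reg2.predict(poly_reg.transform(new_x))
```

Suppose an applicant is at level 6.5. The salary is predicted with both simple linear regression and polynomial regression.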


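This shows that the simple linear regression model predicts a much higher salary for this applicant, so a company relying on it would overpay by a wide margin.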
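The same comparison can be reproduced on made-up data where the true answer is known. The sketch below does not use `Position_Salaries.csv`; it assumes a hypothetical rule in which salary equals 1000 times the square of the level, then predicts level 6.5 with both models.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical data (not Position_Salaries.csv): salary = 1000 * level^2
levels = np.arange(1, 11, dtype=float).reshape(-1, 1)
salaries = 1000.0 * levels.ravel() ** 2

linear_model = LinearRegression().fit(levels, salaries)

poly = PolynomialFeatures(degree=2)
poly_model = LinearRegression().fit(poly.fit_transform(levels), salaries)

new_level = np.array([[6.5]])
true_salary = 1000.0 * 6.5**2  # value under the assumed rule

linear_pred = linear_model.predict(new_level)[0]
# transform(), not fit_transform(): reuse the expansion fitted on the training data
poly_pred = poly_model.predict(poly.transform(new_level))[0]

print(f"true={true_salary:.0f} linear={linear_pred:.0f} polynomial={poly_pred:.0f}")
```

On this synthetic curve the straight line overshoots in the middle of the range, while the degree-2 model recovers the assumed rule almost exactly. Real salary data will not follow such a clean rule, so the size of the gap there will differ.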

Exercise

Machine Learning - Assignment 10

```python
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset
dataset = pd.read_csv("Position_Salaries.csv")
x = dataset.iloc[:, 1:2].values  # position level, kept 2-D: shape (n_samples, 1)
y = dataset.iloc[:, 2].values    # salary, 1-D

# Missing Data
# Categorical Data
# (both steps are skipped in this exercise)
#
# Splitting the Dataset into the Training set and Test set: skipped because
# 1. there are only 10 rows, so the training set would be too small and the model error large
# 2. every level should be analyzed to see its full relationship with salary
#
# Feature Scaling: skipped, since ordinary least squares predictions
# do not depend on the scale of the features

# Simple Linear Regression
from sklearn.linear_model import LinearRegression

lin_reg = LinearRegression()
lin_reg.fit(x, y)

# Graph of Simple Linear Regression
plt.scatter(x, y, color = 'red')
plt.plot(x, lin_reg.predict(x), color = 'blue')
plt.title("Truth or Bluff (Simple Linear Regression)")
plt.xlabel("Position Level")
plt.ylabel("Salary")
plt.show()


# Polynomial Regression
from sklearn.preprocessing import PolynomialFeatures

poly_reg = PolynomialFeatures(degree = 5)
x_poly = poly_reg.fit_transform(x)

lin_reg2 = LinearRegression()
lin_reg2.fit(x_poly, y)

# Graph of Polynomial Regression
plt.scatter(x, y, color = 'red')
plt.plot(x, lin_reg2.predict(x_poly), color = 'blue')
plt.title("Truth or Bluff (Polynomial Regression)")
plt.xlabel("Position Level")
plt.ylabel("Salary")
plt.show()

# Smooth the curve by evaluating the model on a finer grid (step 0.1)
x_grid = np.arange(x.min(), x.max(), 0.1)
x_grid = x_grid.reshape(len(x_grid), 1)

plt.scatter(x, y, color = 'red')
# transform() reuses the expansion already fitted on x
plt.plot(x_grid, lin_reg2.predict(poly_reg.transform(x_grid)), color = 'blue')
plt.title("Truth or Bluff (Polynomial Regression)")
plt.xlabel("Position Level")
plt.ylabel("Salary")
plt.show()

# Predict the salary for a new applicant at level 6.5
new_x = 6.5
new_x = np.array(new_x).reshape(-1, 1)

print(lin_reg.predict(new_x))                         # simple linear regression
print(lin_reg2.predict(poly_reg.transform(new_x)))    # polynomial regression
```
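As a possible alternative to calling the feature expansion by hand before every prediction, scikit-learn's `make_pipeline` can chain `PolynomialFeatures` and `LinearRegression` into one estimator. This is only a sketch on made-up data (an assumed cubic salary rule), not the exercise's own solution.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical stand-in for the dataset: levels 1..10 and salary = 500 * level^3
x = np.arange(1, 11, dtype=float).reshape(-1, 1)
y = 500.0 * x.ravel() ** 3

# The pipeline expands the features and fits the regression as one object
model = make_pipeline(PolynomialFeatures(degree=3), LinearRegression())
model.fit(x, y)

# predict() applies the already-fitted expansion before the regression step
prediction = model.predict(np.array([[6.5]]))[0]
print(round(prediction, 2))
```

The design benefit is that new inputs always pass through the same fitted expansion, so there is no separate transform call to forget or to refit by mistake.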
