Estimating Boston House Prices with Least Squares: Solving the Boston Housing Price Problem

Author: 小小林熬夜学编程 | 2024-02-22 21:44:33


Derivation of linear regression

$f(\theta, b) = ||A\theta+b-y||_2^2$

where

$A\theta+b-y = \begin{bmatrix}x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots\\ x_{n1} & x_{n2} & \cdots & x_{nn} \\ \end{bmatrix} \begin{bmatrix}\theta_1\\ \theta_2 \\ \vdots \\\theta_n \\ \end{bmatrix} + \begin{bmatrix}b_1\\ b_2\\ \vdots \\b_n \\ \end{bmatrix} - \begin{bmatrix}y_1\\ y_2\\ \vdots \\y_n \\ \end{bmatrix}$

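To turn the expression above into the form below, prepend a column of ones to the matrix $A$, so that the intercept (the same value $b$ in every entry $b_i$) becomes the first component of the parameter vector:

$\hat A = \begin{bmatrix} 1 & x_{11} & \cdots & x_{1n} \\ 1 & x_{21} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & \cdots & x_{nn} \\ \end{bmatrix}, \quad \hat\theta = \begin{bmatrix} b \\ \theta_1 \\ \vdots \\ \theta_n \\ \end{bmatrix}$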

Objective function:

$f(\hat\theta) = ||\hat A \hat\theta-y||_2^2$

Differentiate the objective with respect to $\hat\theta$ and solve for $\arg\min_{\hat\theta} f(\hat\theta)$.
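The column-of-ones trick can be checked numerically. The snippet below is a minimal sketch on made-up data: the matrix `A`, the coefficients `theta`, and the intercept `b` are all illustrative assumptions, not values from the housing dataset. It confirms that the augmented product equals the original expression with an explicit intercept.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))          # 5 samples, 3 features (made-up data)
theta = np.array([2.0, -1.0, 0.5])   # hypothetical coefficients
b = 4.0                              # hypothetical intercept, same for every sample

# Prepend a column of ones to A and put b in front of theta.
A_hat = np.hstack([np.ones((A.shape[0], 1)), A])
theta_hat = np.concatenate([[b], theta])

original = A @ theta + b             # A theta + b
augmented = A_hat @ theta_hat        # A_hat theta_hat

print(np.allclose(original, augmented))
```

Folding the intercept into the parameter vector leaves a single unknown to differentiate with respect to.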

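To simplify notation, write $A$ for $\hat A$ and $\theta$ for $\hat\theta$ from here on.

$||A\theta-y||_2^2 = (A\theta - y)^T(A\theta - y) = \theta^T A^TA\theta - \theta^TA^Ty - y^TA\theta + y^Ty$
Differentiating this with respect to $\theta$ and setting the result to zero gives
$2A^TA\theta - 2A^Ty = 0$
Solving for $\theta$,
$\theta = (A^TA)^{-1}A^Ty$, where $(A^TA)^{-1}A^T$ is the pseudoinverse of $A$ (this requires $A^TA$ to be invertible, i.e. $A$ to have full column rank).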
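As a sanity check on the closed-form solution, here is a small sketch on synthetic data. The data, the `true_theta` values, and the noise level are assumptions chosen for illustration. It compares the normal-equation solution with NumPy's `np.linalg.pinv` and `np.linalg.lstsq`, and verifies that the gradient $2A^TA\theta - 2A^Ty$ vanishes at the solution.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples, n_features = 200, 4

X = rng.normal(size=(n_samples, n_features))
A = np.hstack([np.ones((n_samples, 1)), X])          # column of ones first
true_theta = np.array([3.0, 1.5, -2.0, 0.0, 0.7])    # hypothetical, intercept first
y = A @ true_theta + 0.1 * rng.normal(size=n_samples)

# Closed form from the derivation: theta = (A^T A)^{-1} A^T y
theta_normal = np.linalg.inv(A.T @ A) @ A.T @ y

# Same solution via the pseudoinverse and via NumPy's least-squares solver
theta_pinv = np.linalg.pinv(A) @ y
theta_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)

# Gradient of the objective at the solution; should be numerically zero
grad = 2 * A.T @ A @ theta_normal - 2 * A.T @ y

print(np.allclose(theta_normal, theta_pinv))
print(np.allclose(theta_normal, theta_lstsq))
print(np.max(np.abs(grad)))
```

In practice `np.linalg.lstsq` or `np.linalg.pinv` is usually a safer choice than an explicit inverse, since both still return a solution when $A^TA$ is singular or badly conditioned.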

Code:

# Load libraries
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt

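# Load the data and split it into training and test sets
# Note: load_boston was removed in scikit-learn 1.2, so this call needs an older version
house = datasets.load_boston()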
x = house.data
y = house.target
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.3)

# Fit scikit-learn's LinearRegression model
model = LinearRegression()
model.fit(x_train, y_train)
y_sk_pred = model.predict(x_test)
error_1 = mean_squared_error(y_test, y_sk_pred)
print(y_sk_pred)
print(error_1)

# Linear regression model built from the least-squares closed form
class LR:
    def fit(self, X, Y):
        X = np.asarray(X, dtype=float)
        Y = np.asarray(Y, dtype=float).reshape(-1, 1)  # column vector
        # Closed-form least-squares solution: w = (X^T X)^{-1} X^T Y
        self.w = np.linalg.inv(X.T @ X) @ X.T @ Y

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        return (X @ self.w).ravel()


# Prepend a column of ones to the input matrices (intercept term)
b = np.ones(len(x_train))
c = np.ones(len(x_test))
x_train = np.insert(x_train, 0, values = b, axis = 1)
x_test = np.insert(x_test, 0, values = c, axis = 1)

# Fit and evaluate the hand-written linear regression model
lr = LR()
lr.fit(x_train, y_train)
y_lr_pred = lr.predict(x_test)
print(y_lr_pred)
print(lr.w)  # coefficient vector theta (intercept first)
error_2 = mean_squared_error(y_test, y_lr_pred)
print(error_2)
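The two approaches above should give the same fit, since `LinearRegression` also solves an ordinary least-squares problem. The sketch below checks this on synthetic data from `make_regression`. This is a stand-in chosen here because it needs no dataset download; it is not the Boston data, and the helper `add_ones` is a name introduced only for this example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the housing data (an assumption, not the Boston dataset)
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# scikit-learn model
sk_model = LinearRegression().fit(X_train, y_train)
sk_mse = mean_squared_error(y_test, sk_model.predict(X_test))


def add_ones(X):
    """Prepend a column of ones (hypothetical helper for this sketch)."""
    return np.hstack([np.ones((X.shape[0], 1)), X])


# Closed-form least squares: theta = (A^T A)^{-1} A^T y
A_train = add_ones(X_train)
theta = np.linalg.inv(A_train.T @ A_train) @ A_train.T @ y_train
ls_mse = mean_squared_error(y_test, add_ones(X_test) @ theta)

# Intercept first, then the feature coefficients, to match theta's layout
sk_theta = np.concatenate([[sk_model.intercept_], sk_model.coef_])

print(np.allclose(theta, sk_theta))
print(sk_mse, ls_mse)
```

Fixing `random_state` in `train_test_split` makes the split reproducible, so both models are compared on the same test set from run to run.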


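Disclaimer: This article was contributed by a user and does not represent the position of 【wpsshop博客】. Copyright belongs to the original author, and this site assumes no legal liability. If you find infringing content, please contact us. When reposting, please cite the source: https://www.wpsshop.cn/w/小小林熬夜学编程/article/detail/131473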