Solving the Boston House Price Problem with the Least Squares Method

Derivation of Linear Regression

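$$
f(\theta, b) = \|A\theta + b - y\|_2^2
$$

where $A$ has $m$ rows (samples) and $n$ columns (features), and every entry of the vector $b$ is the same scalar intercept:

$$
A\theta + b - y =
\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1n} \\
x_{21} & x_{22} & \cdots & x_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
x_{m1} & x_{m2} & \cdots & x_{mn}
\end{bmatrix}
\begin{bmatrix} \theta_1 \\ \theta_2 \\ \vdots \\ \theta_n \end{bmatrix}
+
\begin{bmatrix} b \\ b \\ \vdots \\ b \end{bmatrix}
-
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix}
$$

To rewrite this in the form below, prepend a column of ones to $A$ and absorb the intercept $b$ into the parameter vector as its first entry:

$$
A\theta + b - y =
\begin{bmatrix}
1 & x_{11} & \cdots & x_{1n} \\
1 & x_{21} & \cdots & x_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
1 & x_{m1} & \cdots & x_{mn}
\end{bmatrix}
\begin{bmatrix} b \\ \theta_1 \\ \vdots \\ \theta_n \end{bmatrix}
-
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix}
= \hat A \hat\theta - y
$$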
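
The equivalence of the two forms can be checked numerically. The following is a minimal sketch, assuming a scalar intercept shared by all samples; the values of `A`, `theta`, and `b` are arbitrary toy numbers chosen for illustration, not taken from the Boston data.

```python
import numpy as np

# Hypothetical toy data: 4 samples, 2 features (values chosen arbitrarily)
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0],
              [7.0, 8.0]])
theta = np.array([0.5, -1.0])  # feature weights
b = 2.0                        # scalar intercept

# Original form: A @ theta + b (b is broadcast to every sample)
pred_original = A @ theta + b

# Augmented form: prepend a column of ones and fold b into the parameter vector
A_hat = np.hstack([np.ones((A.shape[0], 1)), A])
theta_hat = np.concatenate([[b], theta])
pred_augmented = A_hat @ theta_hat

# Both forms should give the same predictions
print(np.allclose(pred_original, pred_augmented))
```

Because the intercept becomes just another coefficient, the rest of the derivation only has to deal with a single matrix and a single parameter vector.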

Objective function:

$$
f(\hat\theta) = \|\hat A \hat\theta - y\|_2^2
$$

Differentiate the objective with respect to $\hat\theta$ to find $\arg\min_{\hat\theta} f(\hat\theta)$.

To simplify notation, write $A$ for $\hat A$ and $\theta$ for $\hat\theta$ from here on.

$$
f(\theta) = \|A\theta - y\|_2^2 = (A\theta - y)^T(A\theta - y) = \theta^T A^T A\theta - \theta^T A^T y - y^T A\theta + y^T y
$$

Differentiating with respect to $\theta$ and setting the result to zero gives

$$
2A^TA\theta - 2A^Ty = 0
$$

Solving for $\theta$,

$$
\theta = (A^TA)^{-1}A^Ty
$$

where $(A^TA)^{-1}A^T$ is the pseudo-inverse of $A$. If $A^TA$ is not invertible, its pseudo-inverse can be used in place of $(A^TA)^{-1}$.
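
The closed-form solution can be sanity-checked on data where the true parameters are known. The sketch below uses synthetic data (the sizes, `true_theta`, and the noise level are assumptions made up for illustration) and solves the normal equations with `np.linalg.solve` rather than forming an explicit inverse, then compares the result with the pseudo-inverse route.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: 50 samples, 3 features, known parameters plus small noise
m, n = 50, 3
X = rng.normal(size=(m, n))
true_theta = np.array([4.0, 1.5, -2.0, 0.7])  # [intercept, w1, w2, w3]
A = np.hstack([np.ones((m, 1)), X])           # augmented design matrix
y = A @ true_theta + 0.01 * rng.normal(size=m)

# Solve the normal equations A^T A theta = A^T y without an explicit inverse
theta_normal = np.linalg.solve(A.T @ A, A.T @ y)

# Same estimate via the pseudo-inverse of A
theta_pinv = np.linalg.pinv(A) @ y

# The gradient 2 A^T A theta - 2 A^T y should vanish at the solution
grad = 2 * A.T @ A @ theta_normal - 2 * A.T @ y

print(theta_normal)
print(np.allclose(theta_normal, theta_pinv))
print(np.allclose(grad, 0.0, atol=1e-8))
```

Solving the linear system is usually preferred over computing $(A^TA)^{-1}$ directly, since it tends to be cheaper and less sensitive to rounding error.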

Code:

# Load libraries
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt

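# Load the data and split it into training and test sets
# Note: load_boston was removed in scikit-learn 1.2, so this call needs an older version
house = datasets.load_boston()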
x = house.data
y = house.target
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.3)

# Fit the LinearRegression model from sklearn
model = LinearRegression()
model.fit(x_train, y_train)
y_sk_pred = model.predict(x_test)
error_1 = mean_squared_error(y_test, y_sk_pred)
print(y_sk_pred)
print(error_1)

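# Build a linear regression model from the least-squares solution
class LR:
    def fit(self, X, Y):
        X = np.asmatrix(X.copy())
        Y = np.asmatrix(Y).reshape(-1, 1)  # column vector
        print(np.shape(X)[1])  # number of columns, including the column of ones
        self.w = (X.T * X).I * X.T * Y  # closed-form solution w = (X^T X)^{-1} X^T Y

    def predict(self, X):
        X = np.asmatrix(X.copy())
        result = X * self.w
        return np.asarray(result).ravel()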


# Augment the input matrices by prepending a column of ones
b = np.ones(len(x_train))
c = np.ones(len(x_test))
x_train = np.insert(x_train, 0, values = b, axis = 1)
x_test = np.insert(x_test, 0, values = c, axis = 1)

# Fit and evaluate the custom linear regression class
lr = LR()
lr.fit(x_train, y_train)
y_lr_pred = lr.predict(x_test)
print(y_lr_pred)
print(lr.w)  # coefficient vector theta (first entry is the intercept)
error_2 = mean_squared_error(y_test, y_lr_pred)
print(error_2)
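
NumPy's documentation no longer recommends the `np.matrix` class, so the same idea can also be written with plain arrays. The following is only a sketch under stated assumptions: `LeastSquaresRegressor` is a hypothetical name, the data is synthetic rather than the Boston dataset, and `np.linalg.lstsq` is swapped in for the explicit inverse because it also handles a rank-deficient design matrix.

```python
import numpy as np


class LeastSquaresRegressor:
    """Hypothetical array-based variant of the LR class (name is illustrative)."""

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        # Prepend the column of ones so the first coefficient acts as the intercept
        A = np.hstack([np.ones((X.shape[0], 1)), X])
        # lstsq minimizes ||A w - y||_2^2 and also copes with a rank-deficient A
        self.w, *_ = np.linalg.lstsq(A, y, rcond=None)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        A = np.hstack([np.ones((X.shape[0], 1)), X])
        return A @ self.w


# Synthetic stand-in data (an assumption; not the Boston dataset)
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
true_w = np.array([3.0, 1.0, -2.0, 0.5, 0.0, 4.0])  # [intercept, 5 weights]
y = true_w[0] + X @ true_w[1:] + 0.1 * rng.normal(size=200)

# Simple 70/30 split without shuffling (the rows are already random)
X_train, X_test = X[:140], X[140:]
y_train, y_test = y[:140], y[140:]

reg = LeastSquaresRegressor().fit(X_train, y_train)
mse = np.mean((reg.predict(X_test) - y_test) ** 2)
print(reg.w)  # estimated [intercept, weights]
print(mse)    # test mean squared error
```

Adding the column of ones inside `fit` and `predict` means callers pass the raw feature matrix and cannot forget to augment the test set.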
