
Andrew Ng's Machine Learning (3) Logistic Regression 2/2: Model Vectorization


Note: for the vectorized form of linear regression, see Linear Regression Vectorization.

In practice, computation is far more convenient when expressed with matrices; for example, the programming assignments are written entirely in terms of matrix operations (see Programming Assignment (2): Logistic Regression). We can therefore vectorize the whole model.

For the entire training set:

1. Inputs, outputs, and parameters

As in linear regression, we describe all features with the feature matrix $X$, all parameters with the parameter vector $\theta$, and all output variables with the output vector $y$:

$$
X=\begin{bmatrix}
x_0^{(1)} & x_1^{(1)} & x_2^{(1)} & \cdots & x_n^{(1)}\\
x_0^{(2)} & x_1^{(2)} & x_2^{(2)} & \cdots & x_n^{(2)}\\
\vdots & \vdots & \vdots & & \vdots\\
x_0^{(m)} & x_1^{(m)} & x_2^{(m)} & \cdots & x_n^{(m)}
\end{bmatrix}
\ ,\quad
\theta=\begin{bmatrix}\theta_0\\ \theta_1\\ \vdots\\ \theta_n\end{bmatrix}
\ ,\quad
y=\begin{bmatrix}y^{(1)}\\ y^{(2)}\\ \vdots\\ y^{(m)}\end{bmatrix}
$$

Here $X$ has dimension $m \times (n+1)$ (with the convention $x_0 = 1$), $\theta$ has dimension $(n+1) \times 1$, and $y$ has dimension $m \times 1$, with each label $y^{(i)} \in \{0,1\}$.
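As a minimal sketch of how these objects are laid out in code (the library is NumPy; the toy feature values and variable names are illustrative assumptions, not data from the course):

```python
import numpy as np

# Toy training set: m = 4 examples, n = 2 features (values are made up).
m, n = 4, 2
X_raw = np.array([[34.6, 78.0],
                  [30.3, 43.9],
                  [35.8, 72.9],
                  [60.2, 86.3]])            # shape (m, n)

# Prepend the bias column x_0 = 1, so X has shape m x (n + 1).
X = np.hstack([np.ones((m, 1)), X_raw])     # shape (4, 3)

theta = np.zeros((n + 1, 1))                # shape (n + 1, 1)
y = np.array([[0], [0], [1], [1]])          # shape (m, 1), labels in {0, 1}
```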

2. Hypothesis function

The hypothesis values for the whole training set can likewise be written as a single $m \times 1$ vector:

$$
h_\theta(x)=g(X\theta)=
\begin{bmatrix}
g(x_0^{(1)}\theta_0 + x_1^{(1)}\theta_1 + x_2^{(1)}\theta_2 + \cdots + x_n^{(1)}\theta_n)\\
g(x_0^{(2)}\theta_0 + x_1^{(2)}\theta_1 + x_2^{(2)}\theta_2 + \cdots + x_n^{(2)}\theta_n)\\
\vdots\\
g(x_0^{(m)}\theta_0 + x_1^{(m)}\theta_1 + x_2^{(m)}\theta_2 + \cdots + x_n^{(m)}\theta_n)
\end{bmatrix}
=\begin{bmatrix}h_\theta(x^{(1)})\\ h_\theta(x^{(2)})\\ \vdots\\ h_\theta(x^{(m)})\end{bmatrix}
=\hat{y}=\begin{bmatrix}\hat{y}^{(1)}\\ \hat{y}^{(2)}\\ \vdots\\ \hat{y}^{(m)}\end{bmatrix}
$$

This introduces the new symbol $\hat{y}$ (read "y hat"), defined by $\hat{y} = h_\theta(x)$. Some texts use $\hat{y}$ to denote a sample's predicted value; it means the same thing as the hypothesis function $h_\theta(x)$.
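In code, the entire column of hypotheses comes from one matrix product. A sketch continuing the arrays above ($g$ is the standard logistic sigmoid; the function name is mine):

```python
def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)), applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

# X @ theta has shape (m, 1); applying g element-wise yields the
# hypothesis for every training example at once.
y_hat = sigmoid(X @ theta)                  # shape (m, 1), entries in (0, 1)
```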

3. Cost function

The original formula is:

$$
J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\Big[y^{(i)}*\log\big(h_\theta(x^{(i)})\big)+(1-y^{(i)})*\log\big(1-h_\theta(x^{(i)})\big)\Big]
=-\frac{1}{m}\sum_{i=1}^{m}\Big[y^{(i)}*\log\big(\hat{y}^{(i)}\big)+(1-y^{(i)})*\log\big(1-\hat{y}^{(i)}\big)\Big]
$$

In vectorized form:

$$
J(\theta)=-\frac{1}{m}\,SUM\Big[y*\log\big(h_\theta(x)\big)+(1-y)*\log\big(1-h_\theta(x)\big)\Big]
=-\frac{1}{m}\,SUM\Big[y*\log(\hat{y})+(1-y)*\log(1-\hat{y})\Big]
$$

In the vectorized form, $*$ denotes element-wise multiplication, so the expression inside the brackets is still a vector; $SUM$ therefore stands for summing all of its entries, which yields a single scalar.
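A one-function sketch of this cost, reusing the sigmoid above (`np.sum` plays the role of $SUM$, and Python's `*` is element-wise on NumPy arrays):

```python
def cost(theta, X, y):
    """Vectorized logistic-regression cost J(theta)."""
    m = len(y)
    y_hat = sigmoid(X @ theta)              # (m, 1) vector of predictions
    # Element-wise products inside the brackets, then one sum over all m entries.
    return -np.sum(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat)) / m

print(cost(theta, X, y))   # with theta = 0 every y_hat is 0.5, so J = -log(0.5) ≈ 0.693
```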

4. Gradient descent

The original formula is:

$$
\theta_j := \theta_j - \alpha\frac{1}{m}\sum_{i=1}^{m}\big(h_\theta(x^{(i)})-y^{(i)}\big)x_j^{(i)}
$$

The update of all parameters can now be written as a single vector operation, $\theta = \theta - \alpha\delta$, where:

$$
\theta=\begin{bmatrix}\theta_0\\ \theta_1\\ \vdots\\ \theta_n\end{bmatrix}
\ ,\quad
\delta=\frac{1}{m}\begin{bmatrix}
\sum_{i=1}^{m}\big(h_\theta(x^{(i)})-y^{(i)}\big)x_0^{(i)}\\
\sum_{i=1}^{m}\big(h_\theta(x^{(i)})-y^{(i)}\big)x_1^{(i)}\\
\vdots\\
\sum_{i=1}^{m}\big(h_\theta(x^{(i)})-y^{(i)}\big)x_n^{(i)}
\end{bmatrix}
$$

Moreover:

$$
\delta=\frac{1}{m}
\begin{bmatrix}
x_0^{(1)} & x_0^{(2)} & \cdots & x_0^{(m)}\\
x_1^{(1)} & x_1^{(2)} & \cdots & x_1^{(m)}\\
\vdots & \vdots & & \vdots\\
x_n^{(1)} & x_n^{(2)} & \cdots & x_n^{(m)}
\end{bmatrix}
\begin{bmatrix}
h_\theta(x^{(1)})-y^{(1)}\\
h_\theta(x^{(2)})-y^{(2)}\\
\vdots\\
h_\theta(x^{(m)})-y^{(m)}
\end{bmatrix}
=\frac{1}{m}X^T\big[g(X\theta)-y\big]
$$

Therefore, gradient descent can be written as:

$$
\theta = \theta - \alpha\frac{1}{m}X^T\big[g(X\theta)-y\big]
$$
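Putting it together, a minimal sketch of the vectorized update loop, continuing the script above (the learning rate and iteration count are arbitrary choices for the toy data; real data would usually be feature-scaled first so a larger learning rate converges):

```python
def gradient_descent(theta, X, y, alpha=1e-4, iters=1000):
    """Repeat the vectorized update theta := theta - (alpha/m) * X^T (g(X theta) - y)."""
    m = len(y)
    for _ in range(iters):
        delta = X.T @ (sigmoid(X @ theta) - y) / m   # (n + 1, 1), matches the derivation
        theta = theta - alpha * delta
    return theta

theta = gradient_descent(theta, X, y)
print(cost(theta, X, y))   # the cost should have decreased from its initial 0.693
```

Note that no loop over the parameters $\theta_0,\dots,\theta_n$ appears: the single product `X.T @ (...)` computes every component of $\delta$ at once, which is the whole point of the vectorized form.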
