Machine Learning: Logistic Regression in Python

Author: 我家自动化 | 2024-05-15 21:45:37


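Today's topic is logistic regression. Despite the word "regression" in its name, it is actually a classification method. Its output is a number between 0 and 1, which can be read as a probability. The classifier is trained with backpropagation-style (BP) iterations and stochastic gradient descent, which estimate the parameters and build the classification model.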

First, let us look at the main function this classifier relies on, the sigmoid function:

$y = \sigma(x) = \frac{1}{1 + e^{-x}}$

This function has a very convenient property: its derivative can be written in terms of the function itself,

$\frac{\partial y}{\partial x} = \sigma(x) (1 - \sigma(x))$
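
The derivative identity is easy to check numerically. The following is a minimal sketch, not part of the original example: the helper names `sigmoid` and `sigmoid_grad`, the sample points, and the step `h` are all illustrative choices. It compares the analytic form with a central finite difference.

```python
import numpy as np

def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    """Analytic derivative sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

# Compare the analytic derivative with a central finite difference
# on a few sample points (the points and step size are arbitrary).
x = np.linspace(-5.0, 5.0, 11)
h = 1e-5
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2.0 * h)
max_diff = np.max(np.abs(numeric - sigmoid_grad(x)))
print("max abs difference:", max_diff)
```

The two should agree up to finite-difference and rounding error, which is why the derivative never has to be computed separately once $\sigma(x)$ is known.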

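Now let us see how to use this function for classification. Suppose a sample is a vector $\mathbf{x}$. The weight vector $\mathbf{w}$ and the bias $b$ map it to $u = \mathbf{w}^{T} \mathbf{x} + b$, and the sigmoid function then turns $u$ into a predicted probability $y = \sigma(u)$. If the sample's ground-truth label is $t$, the error between the prediction and the true label can be measured by the squared error: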

$e = \frac{1}{2} (y - t)^2$

By repeatedly adjusting $\mathbf{w}$ and $b$, we can bring the prediction closer and closer to the true label. By the chain rule, we get:

$\frac{\partial e}{\partial \mathbf{w}} = \frac{\partial e}{\partial y} \frac{\partial y}{\partial u} \frac{\partial u}{\partial \mathbf{w}}$

Each of these partial derivatives can be computed:

$\frac{\partial e }{\partial y} = y- t$
$\frac{\partial y }{\partial u} = \sigma(u) (1 - \sigma(u))$
$\frac{\partial u }{\partial \mathbf{w}} = \mathbf{x}$
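
Multiplying the three factors gives $\frac{\partial e}{\partial \mathbf{w}} = (y - t)\, y (1 - y)\, \mathbf{x}$. As a sanity check, here is a small sketch for a single sample that compares this product with a finite-difference estimate. The helper names `error` and `grad_w` and all numeric values are made up for illustration.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def error(w, b, x, t):
    """Squared error e = 0.5 * (y - t)^2 for one sample."""
    y = sigmoid(np.dot(w, x) + b)
    return 0.5 * (y - t) ** 2

def grad_w(w, b, x, t):
    """Chain rule: de/dw = (y - t) * y * (1 - y) * x."""
    y = sigmoid(np.dot(w, x) + b)
    return (y - t) * y * (1.0 - y) * x

# Illustrative values only (not taken from the iris example).
w = np.array([0.2, -0.4, 0.1])
b = 0.05
x = np.array([1.0, 2.0, -1.5])
t = 1.0

analytic = grad_w(w, b, x, t)

# Central finite difference, one weight component at a time.
h = 1e-6
numeric = np.zeros_like(w)
for i in range(len(w)):
    step = np.zeros_like(w)
    step[i] = h
    numeric[i] = (error(w + step, b, x, t) - error(w - step, b, x, t)) / (2.0 * h)

print("analytic:", analytic)
print("numeric: ", numeric)
```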

With these partial derivatives, the weights can be updated by gradient descent:

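$\mathbf{w} := \mathbf{w} - \alpha \frac{\partial e}{\partial \mathbf{w}}$

Here $\alpha$ is the step size (learning rate). The minus sign moves $\mathbf{w}$ against the gradient so that the error decreases. The bias $b$ is updated in the same way, using $\frac{\partial u}{\partial b} = 1$.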
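
To see the update in action, the sketch below repeats the gradient-descent step on a single toy sample and records the error at each iteration. It moves the parameters against the gradient, as the training code further down does. The sample, the step size `alpha`, and the number of steps are assumptions picked only for illustration.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

# One toy sample with target t = 1 (illustrative values).
x = np.array([1.0, 2.0, -1.5])
t = 1.0

w = np.zeros(3)
b = 0.0
alpha = 0.5  # step size, chosen arbitrarily for this sketch

errors = []
for step in range(50):
    y = sigmoid(np.dot(w, x) + b)
    errors.append(0.5 * (y - t) ** 2)
    delta = (y - t) * y * (1.0 - y)  # de/du from the chain rule
    w = w - alpha * delta * x        # du/dw = x
    b = b - alpha * delta            # du/db = 1

print("first error: %.4f, last error: %.6f" % (errors[0], errors[-1]))
```

With a single sample the error should shrink at every step, because each update pushes $u$, and therefore $y$, toward the target.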

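Below is an example that uses logistic regression for classification (the code uses the iris dataset, with one sigmoid output per class):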

import numpy as np
from sklearn import datasets

def Sigmoid(x):
    return 1.0/(1 + np.exp(-x))

def Generate_label(y, N_class):
    N_sample = len(y)
    label = np.zeros((N_sample, N_class))
    for ii in range(N_sample):
        label[ii, int(y[ii])]=1     
    return label

# load the iris data
iris = datasets.load_iris()
x_data = iris.data
y_label = iris.target
class_name = iris.target_names

n_sample = len(x_data)
n_class = len(set(y_label))

np.random.seed(0)
index = np.random.permutation(n_sample)
x_data = x_data[index]
y_label = y_label[index].astype(float)

train_x = x_data[: int(.8 * n_sample)]
train_y = y_label[: int( .8 * n_sample)]
test_x = x_data[int(.8 * n_sample) :]
test_y = y_label[int(.8 * n_sample) :]

train_label = Generate_label(train_y, n_class)
test_label = Generate_label(test_y, n_class)

# training process
D = train_x.shape[1]
W = 0.01 * np.random.rand(D, n_class)
b = np.zeros((1, n_class))    

step_size = 1e-1
reg = 1e-3
train_sample = train_x.shape[0]    
batch_size = 10
num_batch = train_sample // batch_size
train_epoch = 1000

for ii in range (train_epoch):

    for batch_ii in range(num_batch):

        batch_x = train_x[batch_ii * batch_size:
            (batch_ii+1) * batch_size, :]
        batch_y = train_label[batch_ii * batch_size:
            (batch_ii+1) * batch_size, :]

        scores = np.dot(batch_x, W) + b
        y_out = Sigmoid(scores)

        e = y_out - batch_y

        dataloss = 0.5 * np.sum(e*e) / batch_size
        regloss = 0.5 * reg *  np.sum(W*W)

        L = dataloss + regloss

        dscores = e * y_out * (1 - y_out) / batch_size
        dw = np.dot(batch_x.T, dscores)
        db = np.sum(dscores, axis=0, keepdims=True)

        dw += reg*W

        W = W - step_size * dw
        b = b - step_size * db

    if (ii % 10 == 0):
        print('the training loss is: %.4f' % L)

# test process
scores = np.dot(test_x, W) + b
y_out = Sigmoid(scores)

predict_out = np.argmax(y_out, axis=1)

print('test accuracy: %.2f' % (np.mean(predict_out == test_y)))
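
The listing above turns integer class ids into one-hot rows with an explicit loop (`Generate_label`) and later maps the sigmoid outputs back to class ids with `np.argmax`. As a side note, the encoding can also be written without a loop. The sketch below is one possible way to do it; the name `one_hot` and the sample labels are hypothetical and not part of the original code.

```python
import numpy as np

def one_hot(y, n_class):
    """Vectorized one-hot encoding: row i has a 1 in column y[i]."""
    return np.eye(n_class)[np.asarray(y, dtype=int)]

# Labels stored as floats, as in the listing above (values are made up).
y = np.array([0.0, 2.0, 1.0, 2.0])
label = one_hot(y, 3)

# argmax along each row recovers the class id, the same decoding
# step the test phase applies to the sigmoid outputs.
decoded = np.argmax(label, axis=1)
print(label)
print(decoded)
```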
