Deep Learning Basics, "TensorFlow Framework (9): A Linear Regression Case Study"

Author: 知新_RL | 2024-03-23 14:56:19


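I. Review of Linear Regression

1. What is linear regression
(1) Start from a hypothesis function that assumes the features and the target are related by
w1x1 + w2x2 + ... + wnxn + b = y
(2) Construct a loss function
Mean squared error (the least-squares criterion)
(3) Optimize the loss
Normal equation or gradient descent
(4) Once gradient descent has driven the loss to a sufficiently small value, the corresponding weights and bias are the model parameters we are looking for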
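
The four steps above can be written out as formulas. This is the standard textbook formulation, not something taken from the original post: m (the number of samples), the sample index i, and the learning rate \alpha are symbols introduced here only for illustration.

```latex
\begin{align}
\hat{y}^{(i)} &= \sum_{j=1}^{n} w_j x_j^{(i)} + b \\
J(w, b) &= \frac{1}{m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right)^2 \\
\frac{\partial J}{\partial w_j} &= \frac{2}{m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right) x_j^{(i)},
\qquad
\frac{\partial J}{\partial b} = \frac{2}{m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right) \\
w_j &\leftarrow w_j - \alpha \frac{\partial J}{\partial w_j},
\qquad
b \leftarrow b - \alpha \frac{\partial J}{\partial b}
\end{align}
```

Gradient descent repeats the last line until J stops decreasing noticeably.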

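II. Case Study: Training a Linear Regression Model

1. Background
(1) Suppose we randomly generate 100 points, each with a single feature
(2) The data follow y = 0.8 * x + 0.7
(3) The relationship is fixed in advance so that the trained parameters can be compared with the true ones (0.8 and 0.7) to check whether training is accurate

2. Preparing the ground-truth data
x: feature values
y_true: target values
y_true = 0.8 * x + 0.7
Assume x and y are related by y = kx + b
Linear regression should then yield k≈0.8 and b≈0.7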
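
As a quick sanity check on the data setup, here is a minimal sketch that uses only the Python standard library rather than TensorFlow. It generates 100 points on y = 0.8 * x + 0.7 and recovers k and b with the closed-form least-squares solution (the single-feature case of the normal equation). The helper names make_data and fit_closed_form and the fixed seed are assumptions made for this illustration.

```python
import random


def make_data(n=100, seed=0):
    """Generate n points (x, y) that lie exactly on y = 0.8 * x + 0.7."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [0.8 * x + 0.7 for x in xs]
    return xs, ys


def fit_closed_form(xs, ys):
    """Return the least-squares slope k and intercept b for y = k * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); intercept follows from the means.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    k = cov_xy / var_x
    b = mean_y - k * mean_x
    return k, b


xs, ys = make_data()
k, b = fit_closed_form(xs, ys)
print("k = %.6f, b = %.6f" % (k, b))
```

Because the data contain no noise, the closed-form fit reproduces the true parameters up to floating-point rounding, which gives a reference value for the gradient descent result.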

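3. Workflow
(1) Prepare 100 samples
(100, 1) * (1, 1) = (100, 1)
A 100-row, 1-column matrix times a 1-row, 1-column matrix gives a 100-row, 1-column matrix

y_predict = x * weights(1, 1) + bias(1, 1)
y_predict = tf.matmul(x, weights) + bias

(2) Construct the loss function
error = tf.reduce_mean(tf.square(y_predict - y_true))
This computes the mean squared error

(3) Optimize the loss function
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(error)
(This is the TensorFlow 1.x API; in TensorFlow 2.x it is available as tf.compat.v1.train.GradientDescentOptimizer)

(4) Train
Run this optimizer op repeatedly so that the parameters are updated on every iteration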
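
To make the four steps concrete without any framework, the workflow can be sketched in plain Python. This is an illustration, not TensorFlow's implementation: the function name train, the zero initialization, the fixed seed, and the 2000-step default are all assumptions chosen for this example.

```python
import random


def train(xs, ys, learning_rate=0.01, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0          # initial parameters (arbitrary choice)
    n = len(xs)
    for _ in range(steps):
        # Step 1: predict, and take the residual of each sample.
        errs = [w * x + b - y for x, y in zip(xs, ys)]
        # Steps 2 and 3: gradients of mean(err^2) with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errs, xs))
        grad_b = 2.0 / n * sum(errs)
        # Step 4: move the parameters against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    loss = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n
    return w, b, loss


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(100)]
ys = [0.8 * x + 0.7 for x in xs]
w, b, loss = train(xs, ys)
print("w = %.6f, b = %.6f, loss = %.6f" % (w, b, loss))
```

The TensorFlow version replaces the hand-written gradient lines with automatic differentiation and the two update lines with an optimizer.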

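4. APIs used
Matrix multiplication: tf.matmul(x, w)
Square: tf.square(error)
Mean: tf.reduce_mean(error)

Gradient descent optimization:
tf.train.GradientDescentOptimizer(learning_rate)
Notes:
(1) This is the gradient descent optimizer
(2) learning_rate: the learning rate, usually a small value between 0 and 1
(3) Method available on the instantiated optimizer
minimize(loss): minimizes the given loss (error in this example)
(4) return: the gradient descent op

In TensorFlow 2.0, use tf.keras.optimizers.SGD instead
Optimizer comparison: https://blog.csdn.net/u013587606/article/details/105138271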
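
As a small illustration of what one SGD step does in TensorFlow 2.x, the sketch below applies a single update to one variable. The toy loss w * w and the starting value 1.0 are assumptions picked so the arithmetic is easy to follow by hand; this is a sketch of the update rule, not a full training loop.

```python
import tensorflow as tf

w = tf.Variable(1.0)                                    # single parameter to optimize
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

with tf.GradientTape() as tape:
    loss = w * w                                        # toy loss, d(loss)/dw = 2 * w

grad = tape.gradient(loss, w)                           # gradient evaluated at w = 1.0
optimizer.apply_gradients([(grad, w)])                  # w <- w - learning_rate * grad
print(grad.numpy(), w.numpy())
```

With a gradient of 2 and a learning rate of 0.1, the variable moves from 1.0 to roughly 0.8 (up to float32 rounding). Unlike the 1.x minimize(loss) op, the gradient is computed explicitly with tf.GradientTape and then handed to the optimizer.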

5. Code implementation (written for TensorFlow 2.0)


import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
import tensorflow as tf

def tensorflow_demo():
    """
    Basic structure of a TensorFlow program
    """

    # Addition with TensorFlow
    a_t = tf.constant(2)
    b_t = tf.constant(3)
    c_t = a_t + b_t
    print("TensorFlow addition result:\n", c_t)
    print(c_t.numpy())

    # TensorFlow 2.0 executes eagerly, so no session has to be opened
    # (sessions only remain available under tf.compat.v1)

    return None
 
def graph_demo():
    """
    Graph demo
    """
    # Addition with TensorFlow
    a_t = tf.constant(2)
    b_t = tf.constant(3)
    c_t = a_t + b_t
    print("TensorFlow addition result:\n", c_t)
    print(c_t.numpy())

    # Inspect the default graph
    # Option 1: call the function
    default_g = tf.compat.v1.get_default_graph()
    print("default_g:\n", default_g)

    # Option 2: read the graph attribute
    # (left commented out: eager tensors do not expose .graph)
    # print("graph attribute of a_t:\n", a_t.graph)
    # print("graph attribute of c_t:\n", c_t.graph)

    # Custom graph
    new_g = tf.Graph()
    # Define data and operations inside our own graph
    with new_g.as_default():
        a_new = tf.constant(20)
        b_new = tf.constant(30)
        c_new = a_new + b_new
        print("c_new:\n", c_new)
        print("graph attribute of a_new:\n", a_new.graph)
        print("graph attribute of b_new:\n", b_new.graph)

    # Open a session on new_g
    with tf.compat.v1.Session(graph=new_g) as sess:
        c_new_value = sess.run(c_new)
        print("c_new_value:\n", c_new_value)
        print("The graph we created:\n", sess.graph)

    # Visualize the custom graph
    # 1) Create a writer
    writer = tf.summary.create_file_writer("./tmp/summary")
    # 2) Write the graph
    with writer.as_default():
        tf.summary.graph(new_g)

    return None
 
def session_run_demo():
    """
    feed operation
    """
    tf.compat.v1.disable_eager_execution()

    # Define placeholders
    a = tf.compat.v1.placeholder(tf.float32)
    b = tf.compat.v1.placeholder(tf.float32)
    sum_ab = tf.add(a, b)
    print("a:\n", a)
    print("b:\n", b)
    print("sum_ab:\n", sum_ab)
    # Open a session
    with tf.compat.v1.Session() as sess:
        print("Result from the placeholders:\n", sess.run(sum_ab, feed_dict={a: 1.1, b: 2.2}))

    return None
 
def tensor_demo():
    """
    Tensor demo
    """
    tensor1 = tf.constant(4.0)
    tensor2 = tf.constant([1, 2, 3, 4])
    linear_squares = tf.constant([[4], [9], [16], [25]], dtype=tf.int32)
    print("tensor1:\n", tensor1)
    print("tensor2:\n", tensor2)
    print("linear_squares:\n", linear_squares)

    # Change the tensor dtype
    l_cast = tf.cast(linear_squares, dtype=tf.float32)
    print("before:\n", linear_squares)
    print("l_cast:\n", l_cast)

    return None
 
def variable_demo():
    """
    Variable demo
    """
    a = tf.Variable(initial_value=50)
    b = tf.Variable(initial_value=40)
    c = tf.add(a, b)
    print("a:\n", a)
    print("b:\n", b)
    print("c:\n", c)
    with tf.compat.v1.variable_scope("my_scope"):
        d = tf.Variable(initial_value=30)
        e = tf.Variable(initial_value=20)
        f = tf.add(d, e)
    print("d:\n", d)
    print("e:\n", e)
    print("f:\n", f)
    return None
 
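def linear_regression():
    """
    A hand-written linear regression
    """
    # 1. Prepare the data
    x = tf.random.normal(shape=[100,1])
    y_true = tf.matmul(x, [[0.8]]) + 0.7

    # 2. Build the model
    # Model parameters are defined as variables
    weights = tf.Variable(initial_value=tf.random.normal(shape=[1, 1]))
    bias = tf.Variable(initial_value=tf.random.normal(shape=[1, 1]))
    y_predict = tf.matmul(x, weights) + bias

    # 3. Construct the loss function
    error = tf.reduce_mean(tf.square(y_predict - y_true))

    # 4. Optimize the loss
    # TF 1.x: optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(error)
    optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

    # 5. Inspect the freshly initialized parameters
    # weights and bias have shape (1, 1), so extract Python scalars before formatting
    print("Model parameters before training: weight %f, bias %f, loss %f" % (weights.numpy().item(), bias.numpy().item(), error.numpy()))

    # 6. Start training
    num_epoch = 10000 # number of iterations

    for e in range(num_epoch): # iterate many times
        with tf.GradientTape() as tape:
            y_predict = tf.matmul(x, weights) + bias
            error = tf.reduce_mean(tf.square(y_predict - y_true))
        grads = tape.gradient(error, [weights, bias]) # gradients of the loss with respect to weights and bias
        optimizer.apply_gradients(grads_and_vars=zip(grads, [weights, bias])) # update weights and bias from the gradients so that the loss decreases

    print("Model parameters after training: weight %f, bias %f, loss %f" % (weights.numpy().item(), bias.numpy().item(), error.numpy()))

    return None

if __name__ == "__main__":
    # Code 1: basic structure of a TensorFlow program
    # tensorflow_demo()
    # Code 2: graph demo
    #graph_demo()
    # Code 3: feed operation
    #session_run_demo()
    # Code 4: tensor demo
    #tensor_demo()
    # Code 5: variable demo
    #variable_demo()
    # Code 6: hand-written linear regression
    linear_regression()

Output:


Model parameters before training: weight -0.941857, bias -0.845241, loss 4.750606
Model parameters after training: weight 0.799998, bias 0.699998, loss 0.000000

After training, the weight is very close to 0.8 and the bias is very close to 0.7 (the initial values are random, so the exact numbers differ from run to run)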

References:
https://stackoverflow.com/questions/68879963/valueerror-tape-is-required-when-a-tensor-loss-is-passed
https://blog.csdn.net/AwesomeP/article/details/123787448

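III. Choosing the Learning Rate and the Number of Steps

1. As long as training stays stable, a larger learning rate (learning_rate) needs fewer steps (iterations) to reach a good result, and a smaller learning rate needs more steps

2. However, a learning rate that is too large leads to gradient explosion
For example, learning_rate=5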
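
The effect of the learning rate can be reproduced with a small plain-Python sketch on the same y = 0.8 * x + 0.7 data. The helper name run_gd, the fixed seed, and the specific step counts are assumptions chosen for this illustration; the three learning rates are 0.01, 0.1, and the value 5 mentioned above.

```python
import math
import random


def run_gd(learning_rate, steps):
    """Run gradient descent on y = 0.8 * x + 0.7 and return (w, b)."""
    rng = random.Random(0)
    xs = [rng.gauss(0.0, 1.0) for _ in range(100)]
    ys = [0.8 * x + 0.7 for x in xs]
    w, b, n = 0.0, 0.0, len(xs)
    for _ in range(steps):
        errs = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errs, xs))
        grad_b = 2.0 / n * sum(errs)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w_slow, b_slow = run_gd(0.01, steps=200)   # small rate: still short of 0.8 after 200 steps
w_fast, b_fast = run_gd(0.1, steps=200)    # larger rate: converged within the same 200 steps
w_big, b_big = run_gd(5.0, steps=500)      # too large: the parameters blow up

print("lr=0.01:", w_slow, b_slow)
print("lr=0.1: ", w_fast, b_fast)
print("lr=5:   ", w_big, b_big, "finite:", math.isfinite(w_big))
```

With learning_rate=5 every step overshoots the minimum by more than the previous error, so the parameters grow in magnitude until they overflow to inf or NaN.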

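IV. Gradient Explosion

1. What is gradient explosion
In extreme cases the weights grow so large that they overflow and turn into NaN values

2. How to deal with gradient explosion (it occurs more readily in deep neural networks)
(1) Redesign the network
(2) Adjust the learning rate
(3) Use gradient clipping (check and limit the size of the gradients during training)
(4) Choose a suitable activation function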
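
Of the remedies listed above, gradient clipping is the easiest to show in a few lines. The sketch below is a plain-Python illustration of clipping by global L2 norm; the function is written for this example and only mirrors the idea behind TensorFlow's tf.clip_by_global_norm, which likewise returns the clipped values together with the norm measured before clipping.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Scale a list of gradient values so their joint L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads), norm          # small enough, leave untouched
    scale = max_norm / norm               # shrink all values by the same factor
    return [g * scale for g in grads], norm


clipped, norm_before = clip_by_global_norm([30.0, -40.0], max_norm=5.0)
norm_after = math.sqrt(sum(g * g for g in clipped))
print(clipped, norm_before, norm_after)
```

Scaling every gradient by the same factor keeps the update direction and only limits the step length, which is why clipping by norm is usually preferred over clipping each value independently.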

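V. The trainable Attribute of Variables

1. trainable is an argument and attribute of a tf.Variable object
It specifies whether the variable can be trained

2. With trainable=True, the variable is updated by the optimizer during training; with trainable=False, it is not

3. It is typically used to freeze pretrained model parameters, or to hold auxiliary parameters that do not need training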
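
A minimal sketch of how trainable behaves together with tf.GradientTape in TensorFlow 2.x: the tape automatically watches only trainable variables, so a variable created with trainable=False gets no gradient. The variable names and the toy loss are assumptions made for this illustration.

```python
import tensorflow as tf

w = tf.Variable(2.0, trainable=True)         # will receive a gradient
frozen = tf.Variable(3.0, trainable=False)   # not watched by the tape

with tf.GradientTape() as tape:
    loss = tf.square(w * 1.5 + frozen)       # (2 * 1.5 + 3)^2

grads = tape.gradient(loss, [w, frozen])
print(w.trainable, frozen.trainable)
print(grads)
```

The gradient for frozen comes back as None, so a training loop that only passes real gradients to the optimizer leaves that variable unchanged.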
