
LSTM probability prediction: producing predicted probabilities with an LSTM in Python

Contents

1. Background

1.1 LSTM

1.2 Random walk model

2. Problem description

3. Code

3.1 Data preparation

3.2 Results


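1. Background

1.1 LSTM

LSTM (Long Short-Term Memory) is a recurrent neural network architecture that was designed specifically to solve the long-term dependency problem of ordinary RNNs (recurrent neural networks).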

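RNN (Recurrent Neural Network): the computation of a gated recurrent unit, in the simplified GRU form, is as follows:

\tilde{c}^{<t>}=\tanh(W_c[c^{<t-1>},x^{<t>}]+b_c),

\Gamma_u=\sigma(W_u[c^{<t-1>},x^{<t>}]+b_u),

c^{<t>}=\Gamma_u*\tilde{c}^{<t>}+{\color{Red} (1-\Gamma_u)}*c^{<t-1>},

a^{<t>}=c^{<t>},

In an LSTM, {\color{Red} (1-\Gamma_u)} is replaced by a separate forget gate \Gamma_f, and the update of a^{<t>} becomes a^{<t>}=\Gamma_o*\tanh(c^{<t>}).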

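I strongly recommend Andrew Ng's course. The figure here comes from his lecture on LSTM, which explains it very clearly. The LSTM computation is as follows:

[Figure: LSTM computation, from Andrew Ng's LSTM lecture]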
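Since the figure is not reproduced in the text, the computation can also be written out as equations. This is a sketch of the standard LSTM formulation in the notation used above; the names W_f, W_o, b_f and b_o are assumed by analogy with W_c, W_u, b_c and b_u.

```latex
\begin{align*}
\tilde{c}^{<t>} &= \tanh(W_c[a^{<t-1>},x^{<t>}]+b_c) \\
\Gamma_u &= \sigma(W_u[a^{<t-1>},x^{<t>}]+b_u) \\
\Gamma_f &= \sigma(W_f[a^{<t-1>},x^{<t>}]+b_f) \\
\Gamma_o &= \sigma(W_o[a^{<t-1>},x^{<t>}]+b_o) \\
c^{<t>} &= \Gamma_u*\tilde{c}^{<t>}+\Gamma_f*c^{<t-1>} \\
a^{<t>} &= \Gamma_o*\tanh(c^{<t>})
\end{align*}
```

Compared with the GRU form, the gates read a^{<t-1>} instead of c^{<t-1>}, and a^{<t>} is no longer equal to c^{<t>}.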

Besides a^{<t-1>} and x^{<t>}, the computation of the three gates can also take the previous cell state c^{<t-1>} as an input. This is called a peephole connection.
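A single LSTM time step, with the optional peephole input, can be sketched in NumPy as follows. This is only an illustration under assumptions: `init_params` and `lstm_step` are hypothetical helper names, the weights are random, and the peephole variant simply appends c^{<t-1>} to the input of all three gates, as described in the text.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(n_a, n_x, peephole=False, seed=0):
    """Random weights for a toy LSTM cell (hypothetical helper, illustration only)."""
    rng = np.random.default_rng(seed)
    # the gates see [a_prev, x_t], plus c_prev when peephole connections are used
    n_gate_in = n_a + n_x + (n_a if peephole else 0)
    params = {"W_c": rng.normal(scale=0.1, size=(n_a, n_a + n_x)),
              "b_c": np.zeros(n_a)}
    for gate in ("u", "f", "o"):
        params["W_" + gate] = rng.normal(scale=0.1, size=(n_a, n_gate_in))
        params["b_" + gate] = np.zeros(n_a)
    return params


def lstm_step(a_prev, c_prev, x_t, params, peephole=False):
    """One LSTM time step for a single sample; returns (a_t, c_t)."""
    concat = np.concatenate([a_prev, x_t])
    gate_in = np.concatenate([concat, c_prev]) if peephole else concat
    c_tilde = np.tanh(params["W_c"] @ concat + params["b_c"])      # candidate state
    gamma_u = sigmoid(params["W_u"] @ gate_in + params["b_u"])     # update gate
    gamma_f = sigmoid(params["W_f"] @ gate_in + params["b_f"])     # forget gate
    gamma_o = sigmoid(params["W_o"] @ gate_in + params["b_o"])     # output gate
    c_t = gamma_u * c_tilde + gamma_f * c_prev
    a_t = gamma_o * np.tanh(c_t)
    return a_t, c_t


# run a short sequence of probability values through the cell
n_a, n_x = 4, 1
params = init_params(n_a, n_x, peephole=True)
a, c = np.zeros(n_a), np.zeros(n_a)
for p in [0.76, 0.66, 0.92, 0.09, 0.98]:
    a, c = lstm_step(a, c, np.array([p]), params, peephole=True)
print(a.shape, c.shape)
```

Each input here is a single probability value, so n_x=1, which matches the input_size=1 used later in the post.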

1.2 Random walk model

A random walk model is used to generate the trajectory of the probability. The code below is something to play with.

    __author__ = 'Administrator'
    import os
    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    # time span
    T = 500
    # drift rate
    mu = 0.00005
    # volatility
    sigma = 0.04
    # initial value at t=0
    S0 = np.random.random()
    # step length
    dt = 1
    N = round(T / dt)
    t = np.linspace(0, T, N)
    W = np.random.standard_normal(size=N)
    print("W ", W.shape)
    # W.shape=(500,)
    # geometric Brownian motion
    W = np.cumsum(W) * np.sqrt(dt)
    X = (mu - 0.5 * sigma**2) * t + sigma * W
    S = S0 * np.exp(X)
    # create the output directory if it does not exist, then save the trajectory
    os.makedirs('pic', exist_ok=True)
    fd = pd.DataFrame({'pro': S})
    fd.to_csv('pic/random_walk.csv', sep=',', index=False)
    plt.plot(t, S, lw=2)
    plt.show()

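2. Problem description

The goal is to predict the probability that an app will be opened again at some time in the future. The probability curve is generated with the random walk model, so the data should be decimals between 0 and 1. Every step is random, so a generated curve can easily leave [0,1]; if the plotted values fall outside [0,1], just generate it again a few times. Assume 500 minutes in total. The data set is shuffled, 70% of the data is used for training and the remaining 30% for testing, which is meant to improve generalization.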
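Redrawing by hand until the curve stays inside [0,1] can also be automated. The sketch below repeats the geometric Brownian motion with the same parameters and resamples until the whole path lies in [0,1]. The function names `gbm_path` and `sample_path_in_unit_interval` and the `max_tries` limit are assumptions for illustration, not part of the original code.

```python
import numpy as np


def gbm_path(T=500, mu=0.00005, sigma=0.04, dt=1, rng=None):
    """One geometric Brownian motion path with a random start value in [0, 1)."""
    rng = np.random.default_rng() if rng is None else rng
    n = round(T / dt)
    t = np.linspace(0, T, n)
    s0 = rng.random()
    w = np.cumsum(rng.standard_normal(n)) * np.sqrt(dt)
    return s0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * w)


def sample_path_in_unit_interval(max_tries=1000, seed=None):
    """Draw paths until one stays inside [0, 1]; give up after max_tries draws."""
    rng = np.random.default_rng(seed)
    for _ in range(max_tries):
        path = gbm_path(rng=rng)
        if path.min() >= 0.0 and path.max() <= 1.0:
            return path
    raise RuntimeError("no path stayed inside [0, 1] within max_tries draws")


path = sample_path_in_unit_interval(seed=0)
print(path.shape, path.min(), path.max())
```

Rejection sampling like this changes the distribution of the curves slightly (only well-behaved paths survive), which is fine for a toy data set but worth keeping in mind.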

Setup: everything was run on the CPU, which is quite slow. Keras is used.

3. Code

    import os
    # must be set before keras/TensorFlow is imported to take effect
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

    import pandas as pd
    import numpy as np
    import keras
    import matplotlib.pyplot as plt
    from keras.models import Sequential
    from keras.layers import LSTM

    train_times = 1000


    # random walk model that generates the probability trajectory of the task
    def random_walk_model():
        # time span
        T = 500
        # drift rate
        mu = 0.00005
        # volatility
        sigma = 0.04
        # initial value at t=0
        S0 = np.random.random()
        # step length
        dt = 1
        N = round(T / dt)
        # generate 500 time points and collect them in t
        t = np.linspace(0, T, N)
        # W holds standard normal samples, W.shape=(500,)
        W = np.random.standard_normal(size=N)
        print("W ", W)
        # geometric Brownian motion, produces the probability trajectory
        W = np.cumsum(W) * np.sqrt(dt)
        X = (mu - 0.5 * sigma**2) * t + sigma * W
        S = S0 * np.exp(X)
        # save the trajectory as CSV and as a picture
        os.makedirs('pic', exist_ok=True)
        fd = pd.DataFrame({'pro': S})
        fd.to_csv('pic/random_walk_model.csv', sep=',', index=False)
        plt.plot(t, S, lw=2)
        # save before show(), otherwise an empty figure is written
        plt.savefig('pic/random_data.png')
        plt.show()
        return S


    def random_test(sequence_length=5, split=0.7):
        # read the stored data with pandas
        test_data = pd.read_csv('pic/random_walk_model.csv', sep=',', usecols=[0])
        # convert to a 2D array, test_data.shape=(500, 1)
        test_data = np.array(test_data).astype('float64')
        # 70% is used for training, the rest for testing; split_boundary=350
        split_boundary = int(test_data.shape[0] * split)
        pro_test = np.linspace(split_boundary, test_data.shape[0], test_data.shape[0] - split_boundary)
        pro_x = np.linspace(1, split_boundary, split_boundary)
        plt.plot(pro_x, test_data[:split_boundary])
        plt.plot(pro_test, test_data[split_boundary:], 'red')
        plt.legend(['train_data', 'test_data'])
        plt.xlabel('times')
        plt.ylabel('probability')
        plt.show()
        # build overlapping windows of sequence_length + 1 values (3D data)
        data = []
        for i in range(len(test_data) - sequence_length - 1):
            data.append(test_data[i: i + sequence_length + 1])
        # reshaped_data.shape=(494, 6, 1)
        reshaped_data = np.array(data).astype('float64')
        # shuffle the order of the windows to improve robustness
        np.random.shuffle(reshaped_data)
        # the first 5 values of each window are the input x, the 6th is the true value y
        x = reshaped_data[:, :-1]
        y = reshaped_data[:, -1]
        # x (494, 5, 1), y (494, 1)
        # training data: train_x (350, 5, 1), train_y (350, 1)
        train_x = x[: split_boundary]
        train_y = y[: split_boundary]
        # test data: test_x (144, 5, 1), test_y (144, 1)
        test_x = x[split_boundary:]
        test_y = y[split_boundary:]
        return train_x, train_y, test_x, test_y


    def build_model(sequence_length=5):
        # train_x has shape (n_samples, time_steps, input_dim); input_dim, the last
        # dimension, is 1 because every time step has a single feature
        # return_sequences=True returns a 3D tensor (samples, timesteps, units),
        # otherwise a 2D tensor (samples, units) is returned
        # units is the size of the hidden state and therefore of the output;
        # with units=1 the LSTM output is the predicted value itself
        model = Sequential()
        # use RMSprop as the optimizer
        rmsprop = keras.optimizers.RMSprop(learning_rate=0.001, rho=0.9, epsilon=1e-08)
        # build one LSTM layer; input_shape=(time_steps, input_dim) replaces the
        # deprecated input_dim argument
        model.add(LSTM(units=1, input_shape=(sequence_length, 1), return_sequences=False,
                       use_bias=True, activation='tanh'))
        # model.add(LSTM(100, return_sequences=False, use_bias=True, activation='tanh'))
        # compile the model
        model.compile(loss='mse', optimizer=rmsprop)
        return model


    def train_model(train_x, train_y, test_x, test_y):
        # build the model
        model = build_model()
        try:
            # keep the history to read the training loss from it
            history = model.fit(train_x, train_y, batch_size=20, epochs=train_times, verbose=2)
            lossof_history = history.history['loss']
            predict = model.predict(test_x)
            predict = np.reshape(predict, (predict.size, ))
            # evaluate the model, returns the loss on the test data
            loss = model.evaluate(test_x, test_y)
            print("loss is ", loss)
        except KeyboardInterrupt:
            # training was interrupted, so there is nothing to plot
            print("training interrupted")
            return
        try:
            # x1 is the x axis for the test values (30% of the data)
            x1 = np.linspace(1, test_y.shape[0], test_y.shape[0])
            # x2 is the x axis for the loss values, one per epoch
            x2 = np.linspace(1, len(lossof_history), len(lossof_history))
            plt.figure(1)
            # plot the predicted values and the true values
            plt.title("test with rmsprop lr=0.001")
            plt.plot(x1, predict, 'ro-')
            plt.plot(x1, test_y, 'go-')
            plt.legend(['predict', 'true'])
            plt.xlabel('times')
            plt.ylabel('probability')
            plt.savefig('pic/train_with_rmsprop_lr=0.001.png')
            # plot the loss
            plt.figure(2)
            plt.title("loss lr=0.001")
            plt.plot(x2, lossof_history)
            plt.savefig('pic/train_with_rmsprop_lr=0.001_LOSS_.png')
            plt.show()
        # a length mismatch between x1 and predict/test_y, or between x2 and
        # lossof_history, raises an exception
        except Exception as e:
            print("error: ", e)


    if __name__ == '__main__':
        # random_walk_model() only needs to run once, because the data are stored
        # in pic/random_walk_model.csv
        # random_walk_model()
        # prepare the data in the format the LSTM expects
        train_x, train_y, test_x, test_y = random_test()
        # make sure the test input has the 3D LSTM input format
        test_x = np.reshape(test_x, (test_x.shape[0], test_x.shape[1], 1))
        # train_x.shape (350, 5, 1)
        train_model(train_x, train_y, test_x, test_y)

Details

3.1 Data preparation

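def random_test(sequence_length=6,split=0.7):

This function converts the loaded data into the 3D tensor format needed for training. Note that this walkthrough uses 600 data points and sequence_length=6, while the listing above uses 500 data points and sequence_length=5. During training,

history=model.fit(train_x, train_y, batch_size=20, epochs=train_times,verbose=2)

train_x.shape=  (415, 6, 1)  # 600 data points, 70% training data, 30% test data

The training data that is finally passed in has to be a 3D tensor like this.

Question: with 600 data points and 70% used for training, there should be 420 training samples, so why are there 415?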

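Answer: after the data have been preprocessed, a one-layer LSTM is built:

model.add(LSTM(input_dim=1, units=50, return_sequences=False))

The console shows a warning about input_dim. The cause turned out to be that input_shape should be used instead. The LSTM input is a 3D tensor (batch_size, time_step, input_size); input_shape describes a single sample, (time_step, input_size), and leaves out the batch dimension.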

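Setting batch_size:

The batch size is not defined in the model.add(LSTM(input_shape=(...), units=..., return_sequences=...)) statement. With input_shape=(5, 1) only time_step and input_size are defined, and the batch dimension is left as None.
With this approach model.train_on_batch() could not be used; calling train_on_batch() needs data in batches of batch_size, at test time as well.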

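time_step is the length of the time series (or the maximum sentence length for text).

input_size is the dimension of the input x at each time step (or the embedding dimension for text). This example predicts a probability: recalling the LSTM diagram, the input x(t) is a single probability value, so input_size=1, i.e. there is only one feature.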
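The three dimensions can be checked with plain NumPy before anything is handed to Keras. The helper below is a sketch under assumptions (`as_lstm_input` is a made-up name): it adds the feature axis when it is missing and shows which part of the shape corresponds to input_shape.

```python
import numpy as np


def as_lstm_input(windows):
    """Return the data as a float array shaped (batch, time_step, input_size)."""
    arr = np.asarray(windows, dtype="float64")
    if arr.ndim == 2:
        # (batch, time_step) -> (batch, time_step, 1): one feature per time step
        arr = arr[..., np.newaxis]
    if arr.ndim != 3:
        raise ValueError("expected 2D or 3D data, got %d dimensions" % arr.ndim)
    return arr


batch = as_lstm_input(np.random.random((20, 5)))
print(batch.shape)      # (20, 5, 1)
print(batch.shape[1:])  # (5, 1): time_step and input_size, without the batch dimension
```

The last two entries, (5, 1), are what one sample looks like; the first entry is the number of samples in the batch and can vary between calls.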

So how does this relate to the 415 samples?

Note the default parameter sequence_length=6.

Here is part of the data:

    '''
    0.7586717205211277
    0.6628550358816061
    0.9184003785782959
    0.09365662435769384
    0.9791582266747239
    0.8700739252039772
    0.7924134549615585
    0.3983410609045436
    0.38988445126231197
    0.8167186985712294
    0.879351951255656
    0.9468282424096985
    0.7060727836006101
    0.7650727081508003
    0.3633755461129521
    0.3489589275449808
    '''
    for i in range(len(test_data) - sequence_length - 1):
        data.append(test_data[i: i + sequence_length + 1])

The statement above works as follows:

When i=0 it runs data.append(test_data[0:6]), and the loop runs 594 times in total, so data looks like this:

    [[[0.75867172]
      [0.66285504]
      [0.91840038]
      [0.09365662]
      [0.97915823]
      [0.87007393]]
     [[0.66285504]
      [0.91840038]
      [0.09365662]
      [0.97915823]
      [0.87007393]
      [0.79241345]]
     [[0.91840038]
      [0.09365662]
      [0.97915823]
      [0.87007393]
      [0.79241345]
      [0.39834106]]
     ...
    ]

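That is, with a stride of 1, the generated data.shape=(594, 6, 1)=(samples, window_length, input_size).

The first entry is the number of samples (windows).

The second entry is the window length: looking at data, the windows advance with a stride of 1 (chosen by hand) and each one holds 6 values. This 6 is the number of time steps in a window, not a batch size.

The third entry is the input_size from the input_shape discussion above: the dimension of the input x at each time step (or the embedding dimension for text). This example predicts a probability, the input x(t) is a single probability value, so input_size=1, i.e. there is only one feature.

The count of 415 is consistent with splitting the windows rather than the raw data points: windowing leaves 594 samples instead of 600, and int(0.7 × 594) = 415.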
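The window arithmetic can be reproduced with a few lines of NumPy. This is a sketch that mirrors the loop above on stand-in random data (not the CSV from the post); `make_windows` is a name chosen here for illustration, and windows of 6 values correspond to sequence_length=5.

```python
import numpy as np


def make_windows(series, sequence_length):
    """Windows of sequence_length + 1 values with stride 1, as in the loop above."""
    series = np.asarray(series, dtype="float64").reshape(-1, 1)
    windows = [series[i: i + sequence_length + 1]
               for i in range(len(series) - sequence_length - 1)]
    return np.array(windows)


series = np.random.default_rng(0).random(600)   # stand-in for 600 probability values
windows = make_windows(series, sequence_length=5)
x, y = windows[:, :-1], windows[:, -1]
print(windows.shape, x.shape, y.shape)   # (594, 6, 1) (594, 5, 1) (594, 1)
print(int(0.7 * len(windows)))           # 415
```

Splitting the 600 raw points at 70% would give 420; splitting the 594 windows gives 415, which matches the number in the question.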

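3.2 Results

The predictions for the 30% of the data used for testing, compared with the true values, are as follows:

[Figure: predicted values vs. true values on the test data]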
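To put the test loss printed by model.evaluate into perspective, one option (not part of the original post) is to compare it with a naive baseline that simply repeats the last value of each input window. The sketch below uses NumPy only; `mse` and `last_value_baseline` are hypothetical helpers, and test_x and test_y are stand-in arrays with the same layout that random_test() returns.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error, the same quantity as the 'mse' loss used for training."""
    y_true = np.asarray(y_true, dtype="float64").reshape(-1)
    y_pred = np.asarray(y_pred, dtype="float64").reshape(-1)
    return float(np.mean((y_true - y_pred) ** 2))


def last_value_baseline(x):
    """Predict the next value as the last value of each window; x is (n, steps, 1)."""
    return np.asarray(x)[:, -1, 0]


# stand-in data: a slowly drifting series cut into windows of 5 inputs + 1 target
rng = np.random.default_rng(0)
series = np.cumsum(rng.normal(scale=0.01, size=200)) + 0.5
windows = np.array([series[i: i + 6] for i in range(len(series) - 6)])[..., np.newaxis]
test_x, test_y = windows[:, :-1], windows[:, -1]

baseline_mse = mse(test_y, last_value_baseline(test_x))
print("baseline mse:", baseline_mse)
```

If the LSTM's test loss is not clearly below this baseline, the network has mostly learned to copy the previous value, which is a common outcome on random-walk data.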
