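Machine learning is a branch of artificial intelligence (AI) that aims to let computers learn from data automatically so that they can make decisions and predictions. Deep learning is a subset of machine learning that relies on multi-layer neural networks, loosely modeled on how the human brain processes information. Natural language processing (NLP) is a branch of AI that aims to let computers understand and generate human language.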
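In the layer equation $y = f(Wx + b)$, $y$ is the output, $x$ is the input, $W$ is the weight matrix, $b$ is the bias vector, and $f$ is the activation function.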
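To make the roles of these symbols concrete, here is a minimal sketch of a single layer's forward pass. It assumes the usual form $y = f(Wx + b)$ that the symbol definitions suggest; the ReLU activation, the function names, and the numeric values are illustrative assumptions, not taken from the original.

```python
def relu(z):
    """Element-wise ReLU, used here as the activation f (an assumed choice)."""
    return [max(0.0, v) for v in z]


def dense_forward(W, x, b, f):
    """Compute y = f(Wx + b) for a single input vector x."""
    z = [sum(w * x_j for w, x_j in zip(row, x)) + b_i
         for row, b_i in zip(W, b)]
    return f(z)


# Hypothetical 2x2 weights, input, and bias chosen for easy hand-checking
W = [[1.0, -1.0], [0.5, 0.5]]
x = [2.0, 3.0]
b = [0.0, 1.0]
y = dense_forward(W, x, b, relu)
print(y)
```

The first pre-activation is negative, so ReLU clips it to zero; the second passes through unchanged.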
$$ L(y, \hat{y}) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 $$
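The mean squared error above can be computed directly from its definition. The following is a small sketch in plain Python; the function name `mse` and the sample values are made up for illustration.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared differences."""
    n = len(y_true)
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n


# Hypothetical targets and predictions
loss = mse([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])
print(loss)
```

Squared errors here are 0, 0.25, and 1, so the loss is their sum divided by 3.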
$$ L(y, \hat{y}) = -\sum_{i=1}^{n} \left[ y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i) \right] $$
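A sketch of the binary cross-entropy loss, following the summed form of the formula (no division by $n$). The clipping constant `eps` is an added assumption to avoid `log(0)`; it is not part of the formula itself.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Summed binary cross-entropy over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        # Assumption: clip predictions so that log() never sees 0
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total


# Hypothetical labels and predicted probabilities
loss = binary_cross_entropy([1, 0], [0.9, 0.2])
print(loss)
```

A confident correct prediction contributes a small term, while a confident wrong one would contribute a large term.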
$$ W_{t+1} = W_t - \eta \nabla L(W_t) $$
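The gradient descent update can be tried on a toy problem. This sketch assumes a one-dimensional loss $L(W) = (W - 3)^2$, chosen only because its minimum at $W = 3$ is easy to verify; the learning rate and step count are arbitrary.

```python
def grad(w):
    """Gradient of the assumed toy loss L(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


def gradient_descent(w0, eta, steps):
    """Repeatedly apply w <- w - eta * grad(w)."""
    w = w0
    for _ in range(steps):
        w = w - eta * grad(w)
    return w


w_final = gradient_descent(w0=0.0, eta=0.1, steps=100)
print(w_final)
```

With $\eta = 0.1$ the distance to the minimum shrinks by a factor of 0.8 per step, so 100 steps bring it very close to 3.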
$$ W_{t+1} = W_t - \eta \nabla L(W_t, x_i, y_i) $$
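Stochastic gradient descent applies the same update using the gradient for a single sample $(x_i, y_i)$ at a time. The sketch below assumes a one-parameter linear model with per-sample loss $(w x_i - y_i)^2$; the data, seed, and hyperparameters are hypothetical.

```python
import random


def sgd(data, w0, eta, epochs, seed=0):
    """Fit y = w * x by updating w on one sample at a time."""
    rng = random.Random(seed)
    w = w0
    for _ in range(epochs):
        samples = data[:]
        rng.shuffle(samples)  # visit samples in random order each epoch
        for x_i, y_i in samples:
            # Gradient of (w * x_i - y_i)^2 with respect to w
            g = 2.0 * (w * x_i - y_i) * x_i
            w = w - eta * g
    return w


# Hypothetical data generated by y = 2x
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w_hat = sgd(data, w0=0.0, eta=0.05, epochs=50)
print(w_hat)
```

Because the data are noise-free, every per-sample update pulls `w` toward 2, and the estimate settles there.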
$$
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1 - \beta_1) \nabla L(W_{t-1}) \\
v_t &= \beta_2 v_{t-1} + (1 - \beta_2) (\nabla L(W_{t-1}))^2 \\
W_t &= W_{t-1} - \eta \frac{m_t}{\sqrt{v_t} + \epsilon}
\end{aligned}
$$
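The Adam update can be sketched for a single scalar parameter. This version follows the three formulas exactly as written, which means it omits the bias-correction terms that the standard Adam algorithm applies to $m_t$ and $v_t$. The toy loss $(W - 3)^2$ and the hyperparameter values are assumptions for illustration.

```python
import math


def adam_step(w, m, v, grad, eta=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    """One update following the formulas as given (no bias correction)."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    w = w - eta * m / (math.sqrt(v) + eps)
    return w, m, v


def grad_fn(w):
    """Gradient of the assumed toy loss L(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


# First step from w = 0 with zero-initialized moments
w1, m1, v1 = adam_step(0.0, 0.0, 0.0, grad_fn(0.0))

# Continue for 199 more steps
w, m, v = w1, m1, v1
for _ in range(199):
    w, m, v = adam_step(w, m, v, grad_fn(w))
print(w1, w)
```

The step size depends on the ratio $m_t / \sqrt{v_t}$ rather than on the raw gradient magnitude, which is why the first step is not proportional to the gradient of $-6$.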
$$ w_i = \sum_{j=1}^{n} a_{ij} v_j + b_i $$
$$ P(w_t \mid w_{t-1}, \ldots, w_1) = \frac{C(w_{t-1}, \ldots, w_1, w_t)}{C(w_{t-1}, \ldots, w_1)} $$
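The count-based estimate above is a ratio of two sequence counts. Here is a minimal sketch that counts contiguous token sequences in a tiny, made-up corpus; the helper names are hypothetical, and returning `0.0` for an unseen context is an assumed convention (real systems typically apply smoothing).

```python
def count_ngram(tokens, ngram):
    """Count contiguous occurrences of ngram in tokens."""
    n = len(ngram)
    target = tuple(ngram)
    return sum(1 for i in range(len(tokens) - n + 1)
               if tuple(tokens[i:i + n]) == target)


def conditional_prob(tokens, context, word):
    """Estimate P(word | context) = C(context + word) / C(context)."""
    denom = count_ngram(tokens, context)
    if denom == 0:
        return 0.0  # assumption: unseen context gets probability 0
    return count_ngram(tokens, list(context) + [word]) / denom


# Hypothetical toy corpus
tokens = "the cat sat on the mat the cat ran".split()
p_cat = conditional_prob(tokens, ["the"], "cat")
p_sat = conditional_prob(tokens, ["the", "cat"], "sat")
print(p_cat, p_sat)
```

In this corpus "the" occurs three times and "the cat" twice, so the first estimate is 2/3; "the cat sat" occurs once out of two "the cat" contexts, giving 1/2.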
$$
\begin{aligned}
h_t &= f(W_{hh} h_{t-1} + W_{xh} x_t + b_h) \\
y_t &= W_{hy} h_t + b_y
\end{aligned}
$$
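One step of the recurrent update can be written out in plain Python. This sketch assumes $f = \tanh$, a hidden size of 2, and an input size of 1; all weight values are arbitrary placeholders.

```python
import math


def matvec(M, v):
    """Matrix-vector product for nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def rnn_step(x_t, h_prev, W_xh, W_hh, b_h, W_hy, b_y):
    """h_t = tanh(W_hh h_prev + W_xh x_t + b_h); y_t = W_hy h_t + b_y."""
    pre = [a + b + c for a, b, c in
           zip(matvec(W_hh, h_prev), matvec(W_xh, x_t), b_h)]
    h_t = [math.tanh(p) for p in pre]
    y_t = [a + b for a, b in zip(matvec(W_hy, h_t), b_y)]
    return h_t, y_t


# Hypothetical parameters
W_xh = [[1.0], [-1.0]]
W_hh = [[0.5, 0.0], [0.0, 0.5]]
b_h = [0.0, 0.0]
W_hy = [[1.0, 1.0]]
b_y = [0.0]

h = [0.0, 0.0]
outputs = []
for x_t in ([1.0], [0.0]):  # a two-step input sequence
    h, y = rnn_step(x_t, h, W_xh, W_hh, b_h, W_hy, b_y)
    outputs.append(y)
print(h, outputs)
```

The second input is zero, so the second hidden state depends only on the previous hidden state through `W_hh`, which is the recurrence the formula describes.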
$$
\begin{aligned}
i_t &= \sigma(W_{ii} x_t + W_{hi} h_{t-1} + b_i) \\
f_t &= \sigma(W_{if} x_t + W_{hf} h_{t-1} + b_f) \\
o_t &= \sigma(W_{io} x_t + W_{ho} h_{t-1} + b_o) \\
g_t &= \tanh(W_{ig} x_t + W_{hg} h_{t-1} + b_g) \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$
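The LSTM gate equations are easier to follow in a scalar version, where every weight, state, and input is a single number. This is a simplification assumed for readability; the parameter dictionary and its all-zero values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, p):
    """One scalar LSTM step; p maps parameter names to floats."""
    i_t = sigmoid(p["W_ii"] * x_t + p["W_hi"] * h_prev + p["b_i"])  # input gate
    f_t = sigmoid(p["W_if"] * x_t + p["W_hf"] * h_prev + p["b_f"])  # forget gate
    o_t = sigmoid(p["W_io"] * x_t + p["W_ho"] * h_prev + p["b_o"])  # output gate
    g_t = math.tanh(p["W_ig"] * x_t + p["W_hg"] * h_prev + p["b_g"])  # candidate
    c_t = f_t * c_prev + i_t * g_t
    h_t = o_t * math.tanh(c_t)
    return h_t, c_t


names = ["W_ii", "W_hi", "b_i", "W_if", "W_hf", "b_f",
         "W_io", "W_ho", "b_o", "W_ig", "W_hg", "b_g"]
params = {name: 0.0 for name in names}  # all-zero parameters for hand-checking

h1, c1 = lstm_step(x_t=1.0, h_prev=0.0, c_prev=1.0, p=params)
print(h1, c1)
```

With all parameters at zero, every gate evaluates to 0.5 and the candidate to 0, so the cell state is simply halved: the forget gate alone decides how much of the old memory survives.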
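$$
\begin{aligned}
\mathrm{Attention}(Q, K, V) &= \mathrm{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V \\
\mathrm{MultiHead}(Q, K, V) &= \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)W^O \\
\text{Encoder: } N &= \left\lfloor \frac{L}{2} \right\rfloor \\
E_i &= \mathrm{MultiHead}(H^{2i-1}, H^{2i}, E^{2i-1}W^E) + E^{2i-1} \\
\text{Decoder: } N &= \left\lfloor \frac{L}{2} \right\rfloor \\
D_i &= \mathrm{MultiHead}(E^{2N+i}, E^{2N+i-1}, D^{2N+i-1}W^D) + D^{2N+i-1} \\
P(y_1, \ldots, y_T) &= \prod_{i=1}^{T} P(y_i \mid y_{\ldots})
\end{aligned}
$$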
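The scaled dot-product attention formula can be sketched in plain Python with nested lists. This covers only a single attention head, not the multi-head or encoder/decoder parts; the function names and the input values are illustrative assumptions.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]


def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V, returning (output, weights)."""
    d_k = len(K[0])
    scores = [[sum(q * k for q, k in zip(q_row, k_row)) / math.sqrt(d_k)
               for k_row in K] for q_row in Q]
    weights = [softmax(r) for r in scores]
    out = [[sum(w * v_row[j] for w, v_row in zip(w_row, V))
            for j in range(len(V[0]))] for w_row in weights]
    return out, weights


# Hypothetical inputs: one query, two identical keys, two distinct values
Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [1.0, 0.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
out, weights = attention(Q, K, V)
print(out, weights)
```

Because both keys match the query equally, the weights are uniform and the output is the average of the two value rows.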
```python
import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Flatten
from keras.utils import to_categorical

# Load the MNIST digits: 28x28 grayscale images with integer labels 0-9
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Flatten each image to a 784-vector and scale pixel values to [0, 1]
x_train = x_train.reshape(-1, 28 * 28).astype('float32') / 255
x_test = x_test.reshape(-1, 28 * 28).astype('float32') / 255
# One-hot encode the labels for the 10 digit classes
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# One hidden layer of 128 ReLU units and a softmax output over 10 classes
model = Sequential()
model.add(Flatten(input_shape=(28 * 28,)))
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train on the training set
model.fit(x_train, y_train, epochs=10, batch_size=32)

# Evaluate on the held-out test set and report accuracy as a percentage
loss, accuracy = model.evaluate(x_test, y_test)
print('Accuracy: %.2f%%' % (accuracy * 100))
```
