
Deep Convolutional and LSTM Recurrent Neural Networks (Wearable Activity Recognition)


Overview

This post is based on the paper "Deep Convolutional and LSTM Recurrent Neural Networks for Multimodal Wearable Activity Recognition" and documents an attempt to reproduce it.
The paper, published in 2016, applies the DeepConvLSTM deep learning model to an existing multi-sensor human activity dataset. It has been quite influential, reaching an F-score of 0.915 on an 18-class recognition task.
Main contributions:

  • DeepConvLSTM: a deep learning framework composed of convolutional and LSTM recurrent layers that automatically learns feature representations and models the temporal dependencies between activities.
  • The framework can be applied seamlessly to different sensors (accelerometers, gyroscopes) individually, and can fuse them to improve performance.

Paper framework:

[Figure: DeepConvLSTM network architecture from the paper]
The network architecture proposed in the paper is shown in the figure above.
The dataset consists of time-series data with 113 sensor channels, and there are 18 classes to recognize. The paper uses 24 frames of data as one input sample, with a sliding-window step of 12 frames. Each sample takes the label of its last frame as the sample label.
The network consists of an input layer, four convolutional layers, two LSTM layers and a softmax output layer, with a batch size of 100.
Each convolutional layer uses 64 kernels of size (5, 1) with no padding, so each convolutional layer shortens the time axis by 4. Each LSTM layer contains 128 LSTM units.
Since 2D convolutional layers are used here (many of the reproductions on GitHub use 1D convolutions), the parameter table is as follows:
[Table: layer-by-layer parameters of the 2D-convolution implementation]
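
As a quick sanity check on these sizes (a sketch, not part of the original code), the shrinking time axis and the LSTM input size can be computed directly: each valid (5, 1) convolution removes 4 frames, leaving 8 of the original 24, and each remaining time step carries 64 × 113 = 7232 features.

# Sketch: shape bookkeeping for the configuration described above
window_length = 24    # frames per sample
num_channels = 113    # sensor channels
kernel_length = 5     # (5, 1) kernels, no padding
num_filters = 64

length = window_length
for _ in range(4):                  # four convolutional layers
    length -= kernel_length - 1     # a valid convolution shrinks the time axis by 4
print(length)                       # 8 time steps remain

print(num_filters * num_channels)   # 7232 features per time step fed to the LSTM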

Experimental Setup

The experiments were run on Ubuntu with Anaconda3 and PyTorch.
The following GitHub implementations were used as references:
https://github.com/sussexwearlab/DeepConvLSTM
https://github.com/yminoh/DeepConvLSTM_Python3

Experiment Code

Dataset download and preprocessing (preprocess_data.py)

Run the following commands:

# Download the OpportunityUCIDataset dataset
wget https://archive.ics.uci.edu/ml/machine-learning-databases/00226/OpportunityUCIDataset.zip
# Preprocess the data in the dataset
python preprocess_data.py -h
python preprocess_data.py -i data/OpportunityUCIDataset.zip -o oppChallenge_gestures.data

The code in preprocess_data.py:

import os
import zipfile
import argparse
import numpy as np
import _pickle as cp
from io import BytesIO
from pandas import Series
# Hardcoded number of sensor channels employed in the OPPORTUNITY challenge
NB_SENSOR_CHANNELS = 113

# Hardcoded names of the files defining the OPPORTUNITY challenge data. As named in the original data.
OPPORTUNITY_DATA_FILES = ['OpportunityUCIDataset/dataset/S1-Drill.dat',
                          'OpportunityUCIDataset/dataset/S1-ADL1.dat',
                          'OpportunityUCIDataset/dataset/S1-ADL2.dat',
                          'OpportunityUCIDataset/dataset/S1-ADL3.dat',
                          'OpportunityUCIDataset/dataset/S1-ADL4.dat',
                          'OpportunityUCIDataset/dataset/S1-ADL5.dat',
                          'OpportunityUCIDataset/dataset/S2-Drill.dat',
                          'OpportunityUCIDataset/dataset/S2-ADL1.dat',
                          'OpportunityUCIDataset/dataset/S2-ADL2.dat',
                          'OpportunityUCIDataset/dataset/S2-ADL3.dat',
                          'OpportunityUCIDataset/dataset/S3-Drill.dat',
                          'OpportunityUCIDataset/dataset/S3-ADL1.dat',
                          'OpportunityUCIDataset/dataset/S3-ADL2.dat',
                          'OpportunityUCIDataset/dataset/S3-ADL3.dat',
                          'OpportunityUCIDataset/dataset/S2-ADL4.dat',
                          'OpportunityUCIDataset/dataset/S2-ADL5.dat',
                          'OpportunityUCIDataset/dataset/S3-ADL4.dat',
                          'OpportunityUCIDataset/dataset/S3-ADL5.dat'
                          ]


# Hardcoded thresholds to define global maximums and minimums for every one of the 113 sensor channels employed in the
# OPPORTUNITY challenge
NORM_MAX_THRESHOLDS = [3000,   3000,   3000,   3000,   3000,   3000,   3000,   3000,   3000,
                       3000,   3000,   3000,   3000,   3000,   3000,   3000,   3000,   3000,
                       3000,   3000,   3000,   3000,   3000,   3000,   3000,   3000,   3000,
                       3000,   3000,   3000,   3000,   3000,   3000,   3000,   3000,   3000,
                       3000,   3000,   3000,   10000,  10000,  10000,  1500,   1500,   1500,
                       3000,   3000,   3000,   10000,  10000,  10000,  1500,   1500,   1500,
                       3000,   3000,   3000,   10000,  10000,  10000,  1500,   1500,   1500,
                       3000,   3000,   3000,   10000,  10000,  10000,  1500,   1500,   1500,
                       3000,   3000,   3000,   10000,  10000,  10000,  1500,   1500,   1500,
                       250,    25,     200,    5000,   5000,   5000,   5000,   5000,   5000,
                       10000,  10000,  10000,  10000,  10000,  10000,  250,    250,    25,
                       200,    5000,   5000,   5000,   5000,   5000,   5000,   10000,  10000,
                       10000,  10000,  10000,  10000,  250, ]

NORM_MIN_THRESHOLDS = [-3000,  -3000,  -3000,  -3000,  -3000,  -3000,  -3000,  -3000,  -3000,
                       -3000,  -3000,  -3000,  -3000,  -3000,  -3000,  -3000,  -3000,  -3000,
                       -3000,  -3000,  -3000,  -3000,  -3000,  -3000,  -3000,  -3000,  -3000,
                       -3000,  -3000,  -3000,  -3000,  -3000,  -3000,  -3000,  -3000,  -3000,
                       -3000,  -3000,  -3000,  -10000, -10000, -10000, -1000,  -1000,  -1000,
                       -3000,  -3000,  -3000,  -10000, -10000, -10000, -1000,  -1000,  -1000,
                       -3000,  -3000,  -3000,  -10000, -10000, -10000, -1000,  -1000,  -1000,
                       -3000,  -3000,  -3000,  -10000, -10000, -10000, -1000,  -1000,  -1000,
                       -3000,  -3000,  -3000,  -10000, -10000, -10000, -1000,  -1000,  -1000,
                       -250,   -100,   -200,   -5000,  -5000,  -5000,  -5000,  -5000,  -5000,
                       -10000, -10000, -10000, -10000, -10000, -10000, -250,   -250,   -100,
                       -200,   -5000,  -5000,  -5000,  -5000,  -5000,  -5000,  -10000, -10000,
                       -10000, -10000, -10000, -10000, -250, ]


def select_columns_opp(data):
    """Selection of the 113 columns employed in the OPPORTUNITY challenge
    :param data: numpy integer matrix
        Sensor data (all features)
    :return: numpy integer matrix
        Selection of features
    """

    #                     included-excluded
    features_delete = np.arange(46, 50)
    features_delete = np.concatenate([features_delete, np.arange(59, 63)])
    features_delete = np.concatenate([features_delete, np.arange(72, 76)])
    features_delete = np.concatenate([features_delete, np.arange(85, 89)])
    features_delete = np.concatenate([features_delete, np.arange(98, 102)])
    features_delete = np.concatenate([features_delete, np.arange(134, 243)])
    features_delete = np.concatenate([features_delete, np.arange(244, 249)])
    return np.delete(data, features_delete, 1)


def normalize(data, max_list, min_list):
    """Normalizes all sensor channels
    :param data: numpy integer matrix
        Sensor data
    :param max_list: numpy integer array
        Array containing maximums values for every one of the 113 sensor channels
    :param min_list: numpy integer array
        Array containing minimum values for every one of the 113 sensor channels
    :return:
        Normalized sensor data
    """
    max_list, min_list = np.array(max_list), np.array(min_list)
    diffs = max_list - min_list
    for i in np.arange(data.shape[1]):
        data[:, i] = (data[:, i]-min_list[i])/diffs[i]
    #     Checking the boundaries
    data[data > 1] = 0.99
    data[data < 0] = 0.00
    return data


def divide_x_y(data, label):
    """Segments each sample into features and label
    :param data: numpy integer matrix
        Sensor data
    :param label: string, ['gestures' (default), 'locomotion']
        Type of activities to be recognized
    :return: numpy integer matrix, numpy integer array
        Features encapsulated into a matrix and labels as an array
    """

    data_x = data[:, 1:114]
    if label not in ['locomotion', 'gestures']:
            raise RuntimeError("Invalid label: '%s'" % label)
    if label == 'locomotion':
        data_y = data[:, 114]  # Locomotion label
    elif label == 'gestures':
        data_y = data[:, 115]  # Gestures label

    return data_x, data_y


def adjust_idx_labels(data_y, label):
    """Transforms original labels into the range [0, nb_labels-1]
    :param data_y: numpy integer array
        Sensor labels
    :param label: string, ['gestures' (default), 'locomotion']
        Type of activities to be recognized
    :return: numpy integer array
        Modified sensor labels
    """

    if label == 'locomotion':  # Labels for locomotion are adjusted
        data_y[data_y == 4] = 3
        data_y[data_y == 5] = 4
    elif label == 'gestures':  # Labels for gestures are adjusted
        data_y[data_y == 406516] = 1
        data_y[data_y == 406517] = 2
        data_y[data_y == 404516] = 3
        data_y[data_y == 404517] = 4
        data_y[data_y == 406520] = 5
        data_y[data_y == 404520] = 6
        data_y[data_y == 406505] = 7
        data_y[data_y == 404505] = 8
        data_y[data_y == 406519] = 9
        data_y[data_y == 404519] = 10
        data_y[data_y == 406511] = 11
        data_y[data_y == 404511] = 12
        data_y[data_y == 406508] = 13
        data_y[data_y == 404508] = 14
        data_y[data_y == 408512] = 15
        data_y[data_y == 407521] = 16
        data_y[data_y == 405506] = 17
    return data_y


def check_data(data_set):
    """Try to access to the file and checks if dataset is in the data directory
       In case the file is not found try to download it from original location
    :param data_set:
            Path with original OPPORTUNITY zip file
    :return:
    """
    print('Checking dataset {0}'.format(data_set))
    data_dir, data_file = os.path.split(data_set)
    # When a directory is not provided, check if dataset is in the data directory
    if data_dir == "" and not os.path.isfile(data_set):
        new_path = os.path.join(os.path.split(__file__)[0], "data", data_set)
        if os.path.isfile(new_path) or data_file == 'OpportunityUCIDataset.zip':
            data_set = new_path

    # When dataset not found, try to download it from UCI repository
    if (not os.path.isfile(data_set)) and data_file == 'OpportunityUCIDataset.zip':
        print('... dataset path {0} not found'.format(data_set))
        import urllib.request  # required for urlretrieve under Python 3
        origin = (
            'https://archive.ics.uci.edu/ml/machine-learning-databases/00226/OpportunityUCIDataset.zip'
        )
        if not os.path.exists(data_dir):
            print('... creating directory {0}'.format(data_dir))
            os.makedirs(data_dir)
        print('... downloading data from {0}'.format(origin))
        urllib.request.urlretrieve(origin, data_set)

    return data_dir


def process_dataset_file(data, label):
    """Function defined as a pipeline to process individual OPPORTUNITY files
    :param data: numpy integer matrix
        Matrix containing data samples (rows) for every sensor channel (column)
    :param label: string, ['gestures' (default), 'locomotion']
        Type of activities to be recognized
    :return: numpy integer matrix, numpy integer array
        Processed sensor data, segmented into features (x) and labels (y)
    """

    # Select correct columns
    data = select_columns_opp(data)

    # Columns are segmented into features and labels
    data_x, data_y =  divide_x_y(data, label)
    data_y = adjust_idx_labels(data_y, label)
    data_y = data_y.astype(int)

    # Perform linear interpolation
    data_x = np.array([Series(i).interpolate() for i in data_x.T]).T

    # Remaining missing data are converted to zero
    data_x[np.isnan(data_x)] = 0

    # All sensor channels are normalized
    data_x = normalize(data_x, NORM_MAX_THRESHOLDS, NORM_MIN_THRESHOLDS)

    return data_x, data_y


def generate_data(dataset, target_filename, label):
    """Function to read the OPPORTUNITY challenge raw data and process all sensor channels
    :param dataset: string
        Path with original OPPORTUNITY zip file
    :param target_filename: string
        Processed file
    :param label: string, ['gestures' (default), 'locomotion']
        Type of activities to be recognized. The OPPORTUNITY dataset includes several annotations to perform
        recognition modes of locomotion/postures and recognition of sporadic gestures.
    """

    data_dir = check_data(dataset)

    data_x = np.empty((0, NB_SENSOR_CHANNELS))
    data_y = np.empty((0))

    zf = zipfile.ZipFile(dataset)
    print('Processing dataset files ...')
    for filename in OPPORTUNITY_DATA_FILES:
        try:
            data = np.loadtxt(BytesIO(zf.read(filename)))
            print('... file {0}'.format(filename))
            x, y = process_dataset_file(data, label)
            data_x = np.vstack((data_x, x))
            data_y = np.concatenate([data_y, y])
        except KeyError:
            print('ERROR: Did not find {0} in zip file'.format(filename))

    # Dataset is segmented into train and test
    nb_training_samples = 557963
    # The first 18 OPPORTUNITY data files define the traning dataset, comprising 557963 samples
    X_train, y_train = data_x[:nb_training_samples,:], data_y[:nb_training_samples]
    X_test, y_test = data_x[nb_training_samples:,:], data_y[nb_training_samples:]

    print("Final datasets with size: | train {0} | test {1} | ".format(X_train.shape,X_test.shape))

    obj = [(X_train, y_train), (X_test, y_test)]
    f = open(os.path.join(data_dir, target_filename), 'wb')
    cp.dump(obj, f, protocol=-1)
    f.close()


def get_args():
    '''This function parses and return arguments passed in'''
    parser = argparse.ArgumentParser(
        description='Preprocess OPPORTUNITY dataset')
    # Add arguments
    parser.add_argument(
        '-i', '--input', type=str, help='OPPORTUNITY zip file', required=True)
    parser.add_argument(
        '-o', '--output', type=str, help='Processed data file', required=True)
    parser.add_argument(
        '-t', '--task', type=str.lower, help='Type of activities to be recognized', default="gestures", choices = ["gestures", "locomotion"], required=False)
    # Array for all arguments passed to script
    args = parser.parse_args()
    # Assign args to variables
    dataset = args.input
    target_filename = args.output
    label = args.task
    # Return all variable values
    return dataset, target_filename, label
if __name__ == '__main__':
    OpportunityUCIDataset_zip, output, l = get_args();
    generate_data(OpportunityUCIDataset_zip, output, l)
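
As a side note on why exactly 113 channels remain, here is a quick check (a sketch; it assumes the raw .dat files contain 250 columns, i.e. a time stamp, 242 sensor channels and 7 label columns, as described in the dataset documentation): select_columns_opp removes 134 columns, and divide_x_y then drops the time stamp and keeps one of the two remaining label columns.

# Sketch: column bookkeeping behind select_columns_opp and divide_x_y
import numpy as np

deleted = np.concatenate([np.arange(46, 50), np.arange(59, 63), np.arange(72, 76),
                          np.arange(85, 89), np.arange(98, 102),
                          np.arange(134, 243), np.arange(244, 249)])
print(len(deleted))        # 134 columns removed
print(250 - len(deleted))  # 116 left: time stamp + 113 sensor channels + 2 label columns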

Sliding-window segmentation (sliding_window.py)

Following the paper's design, a sliding window is used to segment the data: every 24 consecutive samples form one window, and the label of the last sample in the window is used as the label of the whole window.

import numpy as np
from numpy.lib.stride_tricks import as_strided as ast

def norm_shape(shape):
    '''
    Normalize numpy array shapes so they're always expressed as a tuple,
    even for one-dimensional shapes.
    Parameters
        shape - an int, or a tuple of ints
    Returns
        a shape tuple
    '''
    try:
        i = int(shape)
        return (i,)
    except TypeError:
        # shape was not a number
        pass

    try:
        t = tuple(shape)
        return t
    except TypeError:
        # shape was not iterable
        pass

    raise TypeError('shape must be an int, or a tuple of ints')

def sliding_window(a,ws,ss = None,flatten = True):
    '''
    Return a sliding window over a in any number of dimensions
    Parameters:
        a  - an n-dimensional numpy array
        ws - an int (a is 1D) or tuple (a is 2D or greater) representing the size
             of each dimension of the window
        ss - an int (a is 1D) or tuple (a is 2D or greater) representing the
             amount to slide the window in each dimension. If not specified, it
             defaults to ws.
        flatten - if True, all slices are flattened, otherwise, there is an
                  extra dimension for each dimension of the input.
    Returns
        an array containing each n-dimensional window from a
    '''

    if None is ss:
        # ss was not provided. the windows will not overlap in any direction.
        ss = ws
    ws = norm_shape(ws)
    ss = norm_shape(ss)

    # convert ws, ss, and a.shape to numpy arrays so that we can do math in every
    # dimension at once.
    ws = np.array(ws)
    ss = np.array(ss)
    shape = np.array(a.shape)


    # ensure that ws, ss, and a.shape all have the same number of dimensions
    ls = [len(shape),len(ws),len(ss)]
    if 1 != len(set(ls)):
        raise ValueError(\
        'a.shape, ws and ss must all have the same length. They were %s' % str(ls))

    # ensure that ws is smaller than a in every dimension
    if np.any(ws > shape):
        raise ValueError(\
        'ws cannot be larger than a in any dimension.\
 a.shape was %s and ws was %s' % (str(a.shape),str(ws)))

    # how many slices will there be in each dimension?
    newshape = norm_shape(((shape - ws) // ss) + 1)
    # the shape of the strided array will be the number of slices in each dimension
    # plus the shape of the window (tuple addition)
    newshape += norm_shape(ws)
    # the strides tuple will be the array's strides multiplied by step size, plus
    # the array's strides (tuple addition)
    newstrides = norm_shape(np.array(a.strides) * ss) + a.strides
    strided = ast(a,shape = newshape,strides = newstrides)
    if not flatten:
        return strided

    # Collapse strided so that it has one more dimension than the window.  I.e.,
    # the new array is a flat list of slices.
    meat = len(ws) if ws.shape else 0
    firstdim = (np.product(newshape[:-meat]),) if ws.shape else ()
    dim = firstdim + (newshape[-meat:])
    # remove any dimensions with size 1
#     dim = filter(lambda i : i != 1,dim)
    return strided.reshape(dim)
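
A minimal usage sketch (not part of the original script) to show what sliding_window returns: applying a window of 4 with a step of 2 to a 1-D signal of length 10 yields four overlapping windows.

# Sketch: toy example of sliding_window on a 1-D array
import numpy as np
from sliding_window import sliding_window

a = np.arange(10)
windows = sliding_window(a, 4, 2)
print(windows.shape)   # (4, 4)
print(windows)         # [[0 1 2 3] [2 3 4 5] [4 5 6 7] [6 7 8 9]]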

Main program

Import PyTorch and the other required libraries and initialize the relevant constants.
The script sets the length of each sample (seq_len) to 24 with a sliding step of 12; there are 113 sensor channels and 18 activity classes.

import numpy as np
import _pickle as cp
import matplotlib.pyplot as plt
from sliding_window import sliding_window

NB_SENSOR_CHANNELS = 113
NUM_CLASSES = 18
SLIDING_WINDOW_LENGTH = 24
FINAL_SEQUENCE_LENGTH = 8
SLIDING_WINDOW_STEP = 12
BATCH_SIZE = 100
NUM_FILTERS = 64
FILTER_SIZE = 5
NUM_UNITS_LSTM = 128

Define the data-loading module, which loads the previously preprocessed dataset and applies the sliding window to generate the sample set.

#load sensor data
def load_dataset(filename):
    f = open(filename, 'rb')
    data = cp.load(f)
    f.close()
    X_train, y_train = data[0]
    X_test, y_test = data[1]
    print(" ..from file {}".format(filename))
    print(" ..reading instances: train {0}, test {1}".format(X_train.shape, X_test.shape))
    X_train = X_train.astype(np.float32)
    X_test = X_test.astype(np.float32)
    # The targets are cast to uint8 for GPU compatibility.
    y_train = y_train.astype(np.uint8)
    y_test = y_test.astype(np.uint8)
    return X_train, y_train, X_test, y_test

#Segmentation and Reshaping
def opp_sliding_window(data_x, data_y, ws, ss):
    data_x = sliding_window(data_x, (ws, data_x.shape[1]), (ss, 1))
    data_y = np.asarray([[i[-1]] for i in sliding_window(data_y, ws, ss)])
    return data_x.astype(np.float32), data_y.reshape(len(data_y)).astype(np.uint8)
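
As a rough check of the expected sizes (a sketch based on the numbers reported later in this post), the 557963-row training split with a 24-sample window and a step of 12 yields 46495 windows of shape (24, 113), each with a single label:

# Sketch: number of windows produced from the training split
n_samples = 557963
n_windows = (n_samples - SLIDING_WINDOW_LENGTH) // SLIDING_WINDOW_STEP + 1
print(n_windows)   # 46495 windows, i.e. X_train -> (46495, 24, 113), y_train -> (46495,)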

Subclass PyTorch's Dataset to build a custom dataset class that converts the numpy arrays into the Tensor-based Dataset format PyTorch can work with.

import torch
import torchvision
import torchvision.transforms as transforms

from torch.autograd import Variable
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

import torch.utils.data.dataset as Dataset
import torch.utils.data.dataloader as DataLoader

# Create the Dataset subclass
class subDataset(Dataset.Dataset):
    # Initialize with the data and labels
    def __init__(self, Data, Label):
        self.Data = Data
        self.Label = Label
    # Return the size of the dataset
    def __len__(self):
        return len(self.Data)
    # Return one sample and its label
    def __getitem__(self, index):
        data = torch.Tensor(self.Data[index])
        label = torch.Tensor(self.Label[index])
        return data, label
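
A brief usage sketch (the names demo_x and demo_y are illustrative only): wrap two small numpy arrays in subDataset and index it.

# Sketch: minimal usage of the subDataset wrapper
demo_x = np.random.rand(4, 1, 24, 113).astype(np.float32)
demo_y = np.zeros((4, 1), dtype=np.uint8)
demo_set = subDataset(demo_x, demo_y)
print(len(demo_set))               # 4
sample, label = demo_set[0]
print(sample.shape, label.shape)   # torch.Size([1, 24, 113]) torch.Size([1])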

Next the network structure is defined: four Conv2d layers and two LSTM layers. Following the paper, each convolutional layer uses a kernel size of (5, 1) and outputs 64 feature maps.

Since the batch size is 100, the tensor size through the convolutional layers changes as follows:
100 × 1 × 24 × 113 → 100 × 64 × 20 × 113 → 100 × 64 × 16 × 113 → 100 × 64 × 12 × 113 → 100 × 64 × 8 × 113

To connect to the LSTM layers, the tensor is then reshaped into a 100 × 8 × 7232 tensor.
Each LSTM layer has 128 hidden units.
The output layer uses a softmax activation; all other layers use ReLU.

Note:
A dropout layer with p = 0.5 is added before each dense layer.

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=NUM_FILTERS, kernel_size=(5, 1))
        self.conv2 = nn.Conv2d(NUM_FILTERS, NUM_FILTERS, (5, 1))
        self.conv3 = nn.Conv2d(NUM_FILTERS, NUM_FILTERS, (5, 1))
        self.conv4 = nn.Conv2d(NUM_FILTERS, NUM_FILTERS, (5, 1))

        # The reshaped conv output is batch-major, hence batch_first=True
        self.lstm1 = nn.LSTM(input_size=NUM_FILTERS * NB_SENSOR_CHANNELS,
                             hidden_size=NUM_UNITS_LSTM, num_layers=1, batch_first=True)
        self.lstm2 = nn.LSTM(input_size=NUM_UNITS_LSTM, hidden_size=NUM_UNITS_LSTM,
                             num_layers=1, batch_first=True)
        # The 8 remaining time steps of LSTM output are flattened before the output layer
        self.out = nn.Linear(FINAL_SEQUENCE_LENGTH * NUM_UNITS_LSTM, NUM_CLASSES)

    def forward(self, x):
        # (batch, 1, 24, 113) -> (batch, 64, 8, 113) after four valid (5, 1) convolutions
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.relu(self.conv3(x))
        x = F.relu(self.conv4(x))
        # rearrange to (batch, 8, 64, 113) and merge the filter and sensor axes
        x = x.permute(0, 2, 1, 3).contiguous()
        x = x.view(-1, FINAL_SEQUENCE_LENGTH, NUM_FILTERS * NB_SENSOR_CHANNELS)
        x = F.dropout(x, p=0.5, training=self.training)
        x, (h_n, c_n) = self.lstm1(x)
        x = F.dropout(x, p=0.5, training=self.training)
        x, (h_n, c_n) = self.lstm2(x)
        x = x.contiguous().view(-1, FINAL_SEQUENCE_LENGTH * NUM_UNITS_LSTM)
        x = F.dropout(x, p=0.5, training=self.training)
        x = F.softmax(self.out(x), dim=1)
        return x
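
A quick shape check (a sketch, assuming the input is reshaped to (batch, 1, 24, 113) as in the main program below): pushing one random batch through the network should give an output of shape (100, 18).

# Sketch: verify the output shape of the network with a random batch
demo_net = Net()
demo_input = torch.randn(BATCH_SIZE, 1, SLIDING_WINDOW_LENGTH, NB_SENSOR_CHANNELS)
print(demo_net(demo_input).shape)   # torch.Size([100, 18])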

Weight initialization: the parameters of every layer are initialized orthogonally.

Orthogonal initialization:
Used to mitigate vanishing and exploding gradients in deep networks; it is a common parameter initialization method for RNNs.
https://blog.csdn.net/shenxiaolu1984/article/details/71508892

def weights_init(m):
    # Note: module class names are case-sensitive ('Conv2d', 'LSTM', 'Linear'),
    # and orthogonal initialization only applies to matrices, not to 1-D biases.
    classname = m.__class__.__name__
    if classname.find('Conv') != -1 or classname.find('Linear') != -1:
        torch.nn.init.orthogonal_(m.weight.data)
        m.bias.data.fill_(0)
    if classname.find('LSTM') != -1:
        for name, param in m.named_parameters():
            if 'weight' in name:
                torch.nn.init.orthogonal_(param.data)
            else:
                param.data.fill_(0)
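
The initializer is applied recursively to every submodule with net.apply, as done in the weighted-loss experiment below:

# Apply orthogonal initialization to every layer of the network
net = Net().cuda()
net.apply(weights_init)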

Main function: load the modules described above, train the network, run prediction, and use the F-score to measure the model's performance on the test set.


if __name__ ==  "__main__":
    
    print("Loading data...")
    X_train, y_train, X_test, y_test = load_dataset('data/oppChallenge_gestures.data')
    assert NB_SENSOR_CHANNELS == X_train.shape[1]
    # Sensor data is segmented using a sliding window mechanism
    X_train, y_train = opp_sliding_window(X_train, y_train, SLIDING_WINDOW_LENGTH, SLIDING_WINDOW_STEP)
    X_test, y_test = opp_sliding_window(X_test, y_test, SLIDING_WINDOW_LENGTH, SLIDING_WINDOW_STEP)

   
    # Data is reshaped to (samples, 1, 24, 113), the input layout expected by Conv2d
    X_train = X_train.reshape((-1, 1, SLIDING_WINDOW_LENGTH, NB_SENSOR_CHANNELS))
    X_test = X_test.reshape((-1, 1, SLIDING_WINDOW_LENGTH, NB_SENSOR_CHANNELS))
        
    print(" ..after sliding and reshaping, train data: inputs {0}, targets {1}".format(X_train.shape, y_train.shape))
    print(" ..after sliding and reshaping, test data : inputs {0}, targets {1}".format(X_test.shape, y_test.shape))
    
    y_train= y_train.reshape(len(y_train),1)
    y_test= y_test.reshape(len(y_test),1)

    net = Net().cuda()
        
    # optimizer, loss function
    optimizer=torch.optim.Adam(net.parameters(), lr=0.00001)
    loss_F=torch.nn.CrossEntropyLoss()
    
    # create My Dateset
    
    train_set  = subDataset(X_train,y_train)
    test_set  = subDataset(X_test,y_test)
    
    print(train_set.__len__())
    print(test_set.__len__())
    
    
    trainloader = DataLoader.DataLoader(train_set, batch_size=100,
                                          shuffle=True, num_workers=2)
    
    testloader = DataLoader.DataLoader(test_set, batch_size=9894,
                                         shuffle=False, num_workers=2)
    
  
    for epoch in range(800):  # loop over the dataset multiple times
        running_loss = 0.0
        for i, data in enumerate(trainloader, 0):
            # get the inputs
            inputs, labels = data
            labels = labels.long()
    
            # wrap them in Variable
            inputs, labels = Variable(inputs).cuda(), Variable(labels).cuda()
            labels = labels.squeeze(1)
            # zero the parameter gradients
            optimizer.zero_grad()
            # forward + backward + optimize
            outputs = net(inputs)
            loss = loss_F(outputs, labels)
            loss.backward()
            optimizer.step()
    
            # print statistics
            running_loss += loss.item()
            if i % 100 == 99:    # print every 100 mini-batches
                print('[%d, %5d] loss: %.3f' %
                      (epoch + 1, i + 1, running_loss / 100))
                running_loss = 0.0
            
    print('Finished Training')

    net.eval()   # switch to evaluation mode so dropout is disabled during testing
    corret,total = 0,0
    for datas,labels in testloader:
        datas = datas.cuda()
        labels = labels.cuda()
        outputs = net(datas)
        _,predicted = torch.max(outputs.data,1)
        
        labels = labels.long()
        total += labels.size(0)
        corret += (predicted == labels.squeeze(1)).sum()
        
        predicted = predicted.cpu().numpy()
        labels = labels.cpu().squeeze(1).numpy()
        # F_score
        import sklearn.metrics as metrics
        print("\tTest fscore:\t{:.4f} ".format(metrics.f1_score(labels, predicted, average='weighted')))
  

Results and Further Adjustments

Results

Running the code above gives an F-score of 0.7564, far below the 0.915 reported in the paper.
Testing showed that this is most likely caused by severe class imbalance in the training set:
the training set contains 46495 samples in total, of which 32348 belong to class 0. Because this class dominates, the network tends to classify everything as class 0.
The class distribution is as follows (number of samples):
class 0: 32348; class 1: 864; class 2: 887; class 3: 806; class 4: 846; class 5: 921; class 6: 850; class 7: 666; class 8: 628; class 9: 490; class 10: 413; class 11: 457; class 12: 457; class 13: 566; class 14: 564; class 15: 904; class 16: 3246; class 17: 623

There are usually two ways to handle class imbalance:

  1. Downsample or upsample the data so that the class distribution becomes balanced
  2. Adjust the sample weights: classes with few samples get higher weights, classes with many samples get lower weights

Downsampling the data

import random
x_0 = list(np.array(np.where(y_train==0))[0])
x_1 = list(np.array(np.where(y_train==1))[0])
x_2 = list(np.array(np.where(y_train==2))[0])
x_3 = list(np.array(np.where(y_train==3))[0])
x_4 = list(np.array(np.where(y_train==4))[0])
x_5 = list(np.array(np.where(y_train==5))[0])
x_6 = list(np.array(np.where(y_train==6))[0])
x_7 = list(np.array(np.where(y_train==7))[0])
x_8 = list(np.array(np.where(y_train==8))[0])
x_9 = list(np.array(np.where(y_train==9))[0])
x_10 = list(np.array(np.where(y_train==10))[0])
x_11 = list(np.array(np.where(y_train==11))[0])
x_12 = list(np.array(np.where(y_train==12))[0])
x_13 = list(np.array(np.where(y_train==13))[0])
x_14 = list(np.array(np.where(y_train==14))[0])
x_15 = list(np.array(np.where(y_train==15))[0])
x_16 = list(np.array(np.where(y_train==16))[0])
x_17 = list(np.array(np.where(y_train==17))[0])

x_0 = random.sample(x_0, 2000)   # keeping 1600 class-0 samples gives F-score 0.8151; 2000 gives 0.8275
x_16 = random.sample(x_16, 800)
Down_sample = x_0+x_1+x_2+x_3+x_4+x_5+x_6+x_7+x_8+x_9+x_10+x_11+x_12+x_13+x_14+x_15+x_16+x_17
print("Down_sample_size:", len(Down_sample))
X_train = X_train[Down_sample]
y_train = y_train[Down_sample]

After downsampling:
keeping 1600 class-0 samples gives an F-score of 0.8151; keeping 2000 gives 0.8275.
Further tuning of the training-set class distribution could likely push the score higher, but this is not explored exhaustively here.

Adjusting the sample weights

The class weights are adjusted according to the number of samples of each class in the training set.


sample_numbers = [32348,864,887,806,846,921,850,666,628,490,413,457,457,566,564,904,3246,623]

# inverse-frequency class weights
weights = []
for i in sample_numbers:
    weights.append(1000.0/i)

weights = torch.from_numpy(np.array(weights)).cuda()
weights = weights.float()

net.apply(weights_init)

# optimizer, loss function
optimizer = torch.optim.RMSprop(net.parameters(), lr=0.000001)
#optimizer = torch.optim.Adam(net.parameters(), lr=0.000001)

loss_F = torch.nn.CrossEntropyLoss(weight=weights)

After further manually adjusting the learning rate according to the loss, the final F-score reached 0.8665.
