【机器学习与遥感】sklearn与rasterio实现遥感影像监督分类_遥感分类用一个特征的机器学习方法

作者：知新_RL | 2024-02-09 09:48:17

踩

遥感分类用一个特征的机器学习方法

在学习遥感的过程中，我们都了解到了监督分类与非监督分类，二者是遥感解译的基础。之前更多的是使用Erdas与ENVI来进行这两种分类。这里使用python语言，基于机器学习库sklearn与遥感影像处理库rasterio，使用朴素贝叶斯法（Naive Bayes model）方法实现监督分类。

项目完整代码+数据下载地址https://download.csdn.net/download/GISuuser/88014799

监督分类

监督分类又称训练场地法、训练分类法，是以建立统计识别函数为理论基础、依据典型样本训练方法进行分类的技术，即根据已知训练区提供的样本，通过选择特征参数，求出特征参数作为决策规则，建立判别函数以对各待分类影像进行的图像分类，是模式识别的一种方法。要求训练区域具有典型性和代表性。判别准则若满足分类精度要求，则此准则成立；反之，需重新建立分类的决策规则，直至满足分类精度要求为止。

朴素贝叶斯法（Naive Bayes model）

最为广泛的两种分类模型是决策树模型(Decision Tree Model)和朴素贝叶斯模型（Naive Bayesian Model，NBM）。和决策树模型相比，朴素贝叶斯分类器(Naive Bayes Classifier 或 NBC)发源于古典数学理论，有着坚实的数学基础，以及稳定的分类效率。同时，NBC模型所需估计的参数很少，对缺失数据不太敏感，算法也比较简单。理论上，NBC模型与其他分类方法相比具有最小的误差率。但是实际上并非总是如此，这是因为NBC模型假设属性之间相互独立，这个假设在实际应用中往往是不成立的，这给NBC模型的正确分类带来了一定影响。

贝叶斯分类器中的一种模型，用已知类别的数据集训练模型，从而实现对未知类别数据的类别判断。其理论基础是贝叶斯决策论（Bayesian decision theory）。朴素贝叶斯算法假设了数据集属性之间是相互独立的，因此算法的逻辑性十分简单，并且算法较为稳定，当数据呈现不同的特点时，朴素贝叶斯的分类性能不会有太大的差异。换句话说就是朴素贝叶斯算法的健壮性比较好，对于不同类型的数据集不会呈现出太大的差异性。当数据集属性之间的关系相对比较独立时，朴素贝叶斯分类算法会有较好的效果。

数据准备

样本数据（矢量shp格式）
真彩影像（绘制样本使用）
哨兵影像8波段（坐标系与样本shape文件一致）

样本准备

基于真彩色影像，准备土地分类数据，因为是实验，初步分为耕地、房子和道路三类。样本绘制

程序开发

库引入

import rasterio
from rasterio.mask import mask
import geopandas as gpd
import numpy as np
from shapely.geometry import mapping
import matplotlib.pyplot as plt
from sklearn.naive_bayes import GaussianNB
from rasterio.plot import reshape_as_raster, reshape_as_image
1
2
3
4
5
6
7
8

样本数据读取与影像叠加处理

   shapefile = gpd.read_file(r'E:\测试数据\无人机影像\New_Shapefile.shp')
    geoms = shapefile.geometry.values

    X = None  # pixels for training
    y = np.array([], dtype=np.string_)  # labels for training

    # extract the raster values within the polygon
    with rasterio.open(img_fp) as src:
        band_count = src.count
        X = np.array([], dtype=np.int8).reshape(0, band_count)
        print(band_count)
        for index, geom in enumerate(geoms):
            # 每个多边形转geojson
            feature = [mapping(geom)]

            # 提取多边形内对应的像素值
            out_image, out_transform = mask(src, feature, crop=True)
            # 移除所有波段中全是nodata值的无效像素
            out_image_trimmed = out_image[:, ~np.all(out_image == src.nodata, axis=0)]
            # 矩阵转置
            out_image_reshaped = out_image_trimmed.reshape(-1, band_count)
            # 提取标签
            y = np.append(y, [shapefile["Classname"][index]] * out_image_reshaped.shape[0])
            # NumPy vstack 与 NumPy concatenate 和 NumPy hstack 相关，除了它使您能够将 NumPy 数组垂直组合在一起
            X = np.vstack((X, out_image_reshaped))
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25

均方值比对分析

fig, ax = plt.subplots(1, 3, figsize=[20, 8])
    #
    # # 波段1-8
    band_count = np.arange(1, 9)
    print(band_count)
    classes = np.unique(y)
    for class_type in classes:
        # np.mean计算平均值的，不过它可以沿指定轴计算算术平均值
        band_intensity = np.mean(X[y == class_type, :], axis=0)
        ax[0].plot(band_count, band_intensity, label=class_type)
        ax[1].plot(band_count, band_intensity, label=class_type)
        ax[2].plot(band_count, band_intensity, label=class_type)
    # plot them as lines

    # Add some axis labels
    ax[0].set_xlabel('Band #')
    ax[0].set_ylabel('Reflectance Value')
    ax[1].set_ylabel('Reflectance Value')
    ax[1].set_xlabel('Band #')
    ax[2].set_ylabel('Reflectance Value')
    ax[2].set_xlabel('Band #')
    # ax[0].set_ylim(32,38)
    ax[1].set_ylim(32, 38)
    ax[2].set_ylim(70, 140)
    # ax.set
    ax[1].legend(loc="upper right")
    # band intensity 谱带强度
    ax[0].set_title('Band Intensities Full Overview')
    ax[1].set_title('Band Intensities Lower Ref Subset')
    ax[2].set_title('Band Intensities Higher Ref Subset')
    plt.text(0.5, 1.0, 'Band Intensities Higher Ref Subset')
    plt.show()
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32

波段特征
4. 使用分类器进行训练

    gnb = GaussianNB()
    gnb.fit(X, y)
1
2

进行预测

    rs = rasterio.open(r'E:\temp\43261.tif')
    img = rs.read()
    reshaped_img = reshape_as_image(img)
    class_prediction = gnb.predict(reshaped_img.reshape(-1, 8))
    class_prediction = class_prediction.reshape(reshaped_img[:, :, 0].shape)
    class_prediction = str_class_to_int(class_prediction)
    
def str_class_to_int(class_array):
    """
    分类汉字转整数
    :param class_array:
    :return:
    """
    class_array[class_array == '耕地'] = 1
    class_array[class_array == '房子'] = 2
    class_array[class_array == '道路'] = 3
    return class_array.astype(int)
    
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18

训练结果写入本地栅格文件

    prof = rs.profile
    # 更新为1个波段
    prof.update(count=1)
    with rasterio.open(r'E:\测试数据\无人机影像\0926_01result2.tif', 'w', **prof) as dst:
        dst.write(class_prediction, 1)

1
2
3
4
5
6

训练结果比对展示

    # find the highest pixel value in the prediction image
    n = int(np.max(class_prediction))

    # next setup a colormap for our map
    colors = dict((
        (0, (48, 156, 214, 255)),  # Blue - Water
        (1, (139, 69, 19, 255)),  # Brown - WetSand
        (2, (96, 19, 134, 255)),  # Purple - Emergent Wetland
        (3, (244, 164, 96, 255)),  # Tan - Sand
        (4, (206, 224, 196, 255)),  # Lime - Herbaceous
        (5, (34, 139, 34, 255)),  # Forest Green - Forest
    ))

    # Put 0 - 255 as float 0 - 1
    for k in colors:
        v = colors[k]
        _v = [_v / 255.0 for _v in v]
        colors[k] = _v

    index_colors = [colors[key] if key in colors else
                    (255, 255, 255, 0) for key in range(0, n + 1)]

    cmap = plt.matplotlib.colors.ListedColormap(index_colors, 'Classification', n + 1)
    fig, axs = plt.subplots(2, 1, figsize=(10, 7))

    img_stretched = color_stretch(reshaped_img, [4, 3, 2])
    axs[0].imshow(img_stretched)

    axs[1].imshow(class_prediction, cmap=cmap, interpolation='none')
    plt.show()
    print("展示")
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31

结果对比
预测时使用的是一景影像，所以使用plt无法很好的展示。

结果验证

对识别结果生成shape文件，叠加在地图上显示；因为道路的样本准备的太少，只有一个，导致了识别结果里没有道路
房屋识别结果展示

总结

各类样本越多，准确度越高
本人也使用了3波段的真彩色影像进行训练，最后发现8波段影像训练的准确度远远大于3波段，3波段的真彩色影像只适合于纹理识别。

项目完整代码+数据下载地址https://download.csdn.net/download/GISuuser/88014799

声明：本文内容由网友自发贡献，不代表【wpsshop博客】立场，版权归原作者所有，本站不承担相应法律责任。如您发现有侵权的内容，请联系我们。转载请注明出处：https://www.wpsshop.cn/w/知新_RL/article/detail/71190