The K-Nearest Neighbors algorithm (KNN) is an instance-based learning method used mainly for classification and regression tasks. Its basic idea: given a training dataset and a new input instance, find the K instances in the training data closest to that instance; the majority class among those K instances is the predicted class of the input.
Approach:
KNN is used for tasks such as classification, where the predicted label is the majority class among the K nearest neighbors, and regression, where the prediction is typically the average of the neighbors' target values. When using KNN, the choice of K, the distance metric, and the scale of the features all affect the result, as the examples and sketches below illustrate. First, a from-scratch implementation evaluated on the iris dataset:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

import numpy as np
from collections import Counter

def euclidean_distance(x1, x2):
    # Euclidean distance between two feature vectors
    return np.sqrt(np.sum((x1 - x2) ** 2))

class KNN:
    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # KNN is a lazy learner: "fitting" just stores the training data
        self.X_train = X
        self.y_train = y

    def predict(self, X):
        y_pred = [self._predict(x) for x in X]
        return np.array(y_pred)

    def _predict(self, x):
        # Distance from the input instance to every training instance
        distances = [euclidean_distance(x, x_train) for x_train in self.X_train]
        # Indices of the K nearest training instances
        k_indices = np.argsort(distances)[:self.k]
        # Majority vote among the labels of those K instances
        k_nearest_labels = [self.y_train[i] for i in k_indices]
        most_common = Counter(k_nearest_labels).most_common(1)
        return most_common[0][0]


data = load_iris()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

knn = KNN(k=3)
knn.fit(X_train, y_train)
predictions = knn.predict(X_test)

print("Accuracy:", accuracy_score(y_test, predictions))
The same task with sklearn.neighbors.KNeighborsClassifier (see the scikit-learn 1.4.0 documentation):
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier

data = load_iris()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# K=3 neighbors with the default 'uniform' weighting (plain majority vote)
knc = KNeighborsClassifier(n_neighbors=3)
knc.fit(X_train, y_train)
predictions = knc.predict(X_test)

print("Accuracy:", accuracy_score(y_test, predictions))
In this example, we first load the iris dataset from scikit-learn and split it into training and test sets. We then create a KNeighborsClassifier with K set to 3, fit it on the training set, and predict on the test set. Finally, we compute the accuracy of the predictions.
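Because KNN compares raw distances, features on larger scales can dominate the vote. As a minimal sketch (not part of the original post), standardizing the features inside a scikit-learn Pipeline is a common safeguard:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Scale each feature to zero mean / unit variance before distances are measured
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3))
model.fit(X_train, y_train)
print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))

The Pipeline ensures the scaler is fit only on the training data, so no information from the test set leaks into the distance computation.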
Regression with sklearn.neighbors.KNeighborsRegressor (see the scikit-learn 1.4.0 documentation):
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error

# Load the iris dataset
data = load_iris()
X = data.data
y = data.target

# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a KNeighborsRegressor with K set to 3
knn = KNeighborsRegressor(n_neighbors=3)

# Fit the model on the training set
knn.fit(X_train, y_train)

# Predict on the test set
y_pred = knn.predict(X_test)

# Compute the mean squared error of the predictions
mse = mean_squared_error(y_test, y_pred)
print("Mean squared error:", mse)
In this example, we again load the iris dataset from scikit-learn and split it into training and test sets. We then create a KNeighborsRegressor with K set to 3, fit it on the training set, predict on the test set, and compute the mean squared error of the predictions. Note that the iris targets are class labels (0, 1, 2), so treating them as continuous values here is purely for demonstration; a genuine regression dataset would be a better fit.
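As a minimal sketch on an actual regression target (an assumption on our part, since the original post sticks with iris), the same regressor can be applied to scikit-learn's diabetes dataset, optionally weighting neighbors by inverse distance:

from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# weights='distance' gives closer neighbors a larger say in the averaged prediction
knn = KNeighborsRegressor(n_neighbors=3, weights="distance")
knn.fit(X_train, y_train)
print("Mean squared error:", mean_squared_error(y_test, knn.predict(X_test)))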