Feature engineering converts raw data into features in order to improve prediction accuracy.
Sample | Height | Weight | Gender
---|---|---|---
1 | 176 | 62 |
2 | 185 | 74 |
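Feature values: height, weight
Samples: 1, 2
Target value: gender (the quantity to be determined)
Duplicate values: no deduplication needed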
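The terms above map onto the array layout that scikit-learn expects: one row per sample and one column per feature. The following is a minimal sketch of that layout, assuming NumPy is available; the names `feature_names` and `X` are illustrative and not from the original post, and the target is left out because the table gives no gender labels.

```python
import numpy as np

# Feature names and values copied from the table; the source does not state units.
feature_names = ["height", "weight"]

# Feature matrix: each row is a sample, each column is a feature.
X = np.array([
    [176, 62],  # sample 1
    [185, 74],  # sample 2
])

n_samples, n_features = X.shape
print(n_samples, n_features)

# The target (gender) is what a model would predict, so no labels are filled in here.
```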
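# !pip install -U scikit-learn -i https://pypi.doubanio.com/simple/ --trusted-host pypi.doubanio.com
# Import the text feature extractor
from sklearn.feature_extraction.text import CountVectorizer
# Instantiate CountVectorizer
vector = CountVectorizer()
# Call fit_transform to learn the vocabulary and convert the texts into a count matrix
res = vector.fit_transform(["life is short,i like python","life is too long,i dislike python"])
# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out() instead
print(vector.get_feature_names_out())
# res is a sparse matrix; toarray() converts it to a dense NumPy array
print(res.toarray())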
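from sklearn.feature_extraction import DictVectorizer
onehot = DictVectorizer() # Pass sparse=False to get a dense array without calling toarray()
instances = [{'city': '北京','temperature':100},{'city': '上海','temperature':60}, {'city': '深圳','temperature':30}]
# The string-valued 'city' is one-hot encoded; the numeric 'temperature' is passed through unchanged
X = onehot.fit_transform(instances).toarray()
print(X)
print('*'*30)
# print(onehot.inverse_transform(X))
# Without toarray(), fit_transform returns a SciPy sparse matrix, printed as (row, column) value entries
X = onehot.fit_transform(instances)
print(X)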