Code and data URL: https://work.datafountain.cn/forum?id=68&type=2&source=1
- ### Import the dataset
- import numpy as np # numerical computing library
- import pandas as pd # data processing library
- import warnings
- warnings.filterwarnings('ignore') # suppress warning messages
-
- data = pd.read_csv("loan.csv", encoding="utf-8")
- print(data.head(5)) # preview the first five rows
- print(data.shape) # (number of rows, number of columns)
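For readers who do not have loan.csv at hand, the loading step can be tried on a small in-memory CSV. This is only a sketch: the column names and values below are made up for illustration and are not taken from the loan dataset.

```python
import io

import pandas as pd

# Hypothetical stand-in for loan.csv: three rows, some fields left empty.
csv_text = """loan_amnt,term,emp_title
5000,36 months,teacher
12000,60 months,
8000,,
"""

# read_csv accepts a file-like object as well as a path.
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.head(5))  # empty fields are parsed as NaN
print(sample.shape)    # (rows, columns)
```

Empty fields in the CSV are read as NaN, which is what the missing-value check in the next step counts.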
Checking for missing values
- ### Check missing values in the dataset
- missingDf = data.isnull().sum().sort_values(ascending=False).reset_index()
- missingDf.columns = ['feature', 'miss_num']
- missingDf['miss_percentage'] = missingDf['miss_num'] / data.shape[0] # fraction of missing values; data.shape = (400000, 145)
- missingDf.head(10) # the ten features with the most missing values
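The same calculation can be checked on a toy DataFrame where the answer is easy to verify by hand. The sketch below wraps the steps in a helper; the function name `missing_summary` and the toy columns `a`, `b`, `c` are assumptions introduced for illustration, not part of the original code.

```python
import numpy as np
import pandas as pd


def missing_summary(df):
    """Return one row per column with its missing count and missing fraction."""
    # Count NaN values per column, largest first, and turn the Series into a frame.
    summary = df.isnull().sum().sort_values(ascending=False).reset_index()
    summary.columns = ["feature", "miss_num"]
    # Divide by the number of rows to get the fraction of missing values.
    summary["miss_percentage"] = summary["miss_num"] / len(df)
    return summary


# Toy data: column a has 3 of 4 values missing, b has 1, c has none.
toy = pd.DataFrame({
    "a": [1.0, np.nan, np.nan, np.nan],
    "b": [1.0, 2.0, np.nan, 4.0],
    "c": [1.0, 2.0, 3.0, 4.0],
})

summary = missing_summary(toy)
print(summary)
```

Note that `miss_percentage` is a fraction between 0 and 1 despite its name; multiply by 100 if a percentage is wanted.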