Kaggle -- Titanic - Machine Learning from Disaster

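A beginner's Kaggle journey, part 1: Titanic

The model is a simple decision tree and reaches 75.8% accuracy (a bit low, but this is only the start).

The complete code is as follows: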

    import pandas as pd
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    # Load the training data and print a summary of its columns
    df = pd.read_csv("train.csv")
    df.info()

    # Select the feature columns and the target
    label = ['Pclass', 'Sex', 'Age', 'SibSp', 'Fare', 'Embarked']
    x = df[label].copy()  # copy so the assignments below do not touch a view of df
    y = df['Survived']
    print(x.loc[0])

    # Encode the two categorical columns as integers
    x['Embarked'] = x['Embarked'].map({'C': 1, 'Q': 2, 'S': 3})
    x['Sex'] = x['Sex'].map({'male': 1, 'female': 2})
    print(x.loc[0])

    # Fill missing values with each column's mean
    x = x.fillna(x.mean())

    # Hold out 20% of the training data for validation
    train_x, test_x, train_y, test_y = train_test_split(
        x, y, test_size=0.2, random_state=42, shuffle=True)

    # Fit the decision tree and measure accuracy on the held-out split
    clf = DecisionTreeClassifier()
    clf.fit(train_x, train_y)
    y_pred = clf.predict(test_x)
    accuracy = accuracy_score(test_y, y_pred)  # argument order is (y_true, y_pred)
    print(f"Accuracy: {accuracy * 100:.2f}%")

    # Apply the same preprocessing to the Kaggle test set
    res = pd.read_csv('test.csv')
    print(res.loc[0])
    res_x = res[label].copy()
    res_x['Embarked'] = res_x['Embarked'].map({'C': 1, 'Q': 2, 'S': 3})
    res_x['Sex'] = res_x['Sex'].map({'male': 1, 'female': 2})
    print(res_x.loc[0])
    res_x = res_x.fillna(res_x.mean())

    # Predict and write the submission file
    pred = clf.predict(res_x)
    print(pred[0])
    ans = res[['PassengerId']].copy()
    ans['Survived'] = pred
    print(ans.loc[0])
    ans.to_csv("ans.csv", index=False)  # write only PassengerId and Survived, no index column
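
The script above repeats the same encoding and mean-filling steps for the training and test sets. One possible way to factor that out is sketched below. The helper name `preprocess`, the constant `FEATURES`, and the `fill_values` parameter are hypothetical names introduced here for illustration, and the toy rows are made up rather than taken from the Titanic data. Unlike the original script, this sketch reuses the training means when filling the test set, which is a design choice and not something the post itself does.

```python
import pandas as pd

# Same feature columns as the script above (constant name is hypothetical)
FEATURES = ['Pclass', 'Sex', 'Age', 'SibSp', 'Fare', 'Embarked']


def preprocess(frame, fill_values=None):
    """Encode Sex/Embarked as integers and fill missing values.

    fill_values is an optional Series of per-column fill values
    (hypothetical parameter); if omitted, the frame's own means are used.
    Returns the processed frame and the fill values that were applied.
    """
    out = frame[FEATURES].copy()  # copy so the caller's frame is untouched
    out['Embarked'] = out['Embarked'].map({'C': 1, 'Q': 2, 'S': 3})
    out['Sex'] = out['Sex'].map({'male': 1, 'female': 2})
    if fill_values is None:
        fill_values = out.mean()
    return out.fillna(fill_values), fill_values


# Made-up rows, only to show the behaviour (not real Titanic data)
toy_train = pd.DataFrame({
    'Pclass': [3, 1, 3],
    'Sex': ['male', 'female', 'female'],
    'Age': [20.0, float('nan'), 40.0],
    'SibSp': [1, 1, 0],
    'Fare': [7.0, 71.0, 8.0],
    'Embarked': ['S', 'C', None],
})
toy_test = pd.DataFrame({
    'Pclass': [2],
    'Sex': ['male'],
    'Age': [float('nan')],
    'SibSp': [0],
    'Fare': [float('nan')],
    'Embarked': ['Q'],
})

train_x, train_means = preprocess(toy_train)
test_x, _ = preprocess(toy_test, fill_values=train_means)
print(train_x)
print(test_x)
```

With the real data, `preprocess(df)` and `preprocess(res, fill_values=train_means)` would take the place of the two duplicated blocks, under the assumptions stated above.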
