
Titanic Survival Prediction, Level 3: Feature Engineering and Survival Prediction (头歌 / EduCoder answer)


Platform: 头歌 (EduCoder) learning platform

P.S. If a submission times out, just resubmit it a few times.

 

    import pandas as pd
    import numpy as np
    import sklearn
    #********* Begin *********#
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.ensemble import RandomForestRegressor

    titanic = pd.read_csv('./train.csv')

    def set_missing_ages(df):
        """Fill missing Age values with predictions from a random forest regressor."""
        # Age is the target (column 0); the other numeric columns are the features.
        age_df = df[['Age', 'Fare', 'Parch', 'SibSp', 'Pclass']]
        known_age = age_df[age_df.Age.notnull()].values
        unknown_age = age_df[age_df.Age.isnull()].values
        y = known_age[:, 0]
        X = known_age[:, 1:]
        rfr = RandomForestRegressor(random_state=0, n_estimators=2000, n_jobs=-1)
        rfr.fit(X, y)
        # Predict the missing ages and write them back into the original frame.
        predictedAges = rfr.predict(unknown_age[:, 1:])
        df.loc[df.Age.isnull(), 'Age'] = predictedAges
        return df

    # Training set: impute Age, then one-hot encode the categorical columns.
    titanic = set_missing_ages(titanic)
    dummies_Embarked = pd.get_dummies(titanic['Embarked'], prefix='Embarked')
    dummies_Sex = pd.get_dummies(titanic['Sex'], prefix='Sex')
    dummies_Pclass = pd.get_dummies(titanic['Pclass'], prefix='Pclass')
    df = pd.concat([titanic, dummies_Embarked, dummies_Sex, dummies_Pclass], axis=1)
    # Drop the raw categorical columns and the free-text columns.
    df.drop(['Pclass', 'Name', 'Sex', 'Ticket', 'Cabin', 'Embarked'], axis=1, inplace=True)
    train_label = df['Survived']
    # axis must be passed by keyword; the positional form was removed in pandas 2.0.
    train_titanic = df.drop('Survived', axis=1)

    # Test set: apply the same preprocessing steps.
    titanic_test = pd.read_csv('./test.csv')
    titanic_test = set_missing_ages(titanic_test)
    dummies_Embarked = pd.get_dummies(titanic_test['Embarked'], prefix='Embarked')
    dummies_Sex = pd.get_dummies(titanic_test['Sex'], prefix='Sex')
    dummies_Pclass = pd.get_dummies(titanic_test['Pclass'], prefix='Pclass')
    df_test = pd.concat([titanic_test, dummies_Embarked, dummies_Sex, dummies_Pclass], axis=1)
    df_test.drop(['Pclass', 'Name', 'Sex', 'Ticket', 'Cabin', 'Embarked'], axis=1, inplace=True)

    # Train the classifier and write the predictions to predict.csv.
    model = RandomForestClassifier(n_estimators=10)
    model.fit(train_titanic, train_label)
    predictions = model.predict(df_test)
    result = pd.DataFrame({'Survived': predictions.astype(np.int32)})
    result.to_csv("./predict.csv", index=False)
    #********* End *********#
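The script one-hot encodes train.csv and test.csv separately, so the two feature matrices only match if both files happen to contain the same categories. The sketch below shows one way to guard against that by reindexing the test columns to the training columns. The toy frames and the `one_hot` helper are made up for illustration and are not part of the original answer.

```python
import pandas as pd

def one_hot(df, columns):
    """One-hot encode the given categorical columns (hypothetical helper)."""
    return pd.get_dummies(df, columns=columns)

# Toy frames standing in for train.csv / test.csv; the values are made up.
train = pd.DataFrame({"Sex": ["male", "female", "male"],
                      "Embarked": ["S", "C", "Q"],
                      "Age": [22.0, 38.0, 26.0]})
test = pd.DataFrame({"Sex": ["female", "male"],
                     "Embarked": ["S", "S"],
                     "Age": [30.0, 41.0]})

train_X = one_hot(train, ["Sex", "Embarked"])
test_X = one_hot(test, ["Sex", "Embarked"])

# Give the test frame exactly the training columns, in the same order.
# Categories that never occur in the test data become all-zero columns.
test_X = test_X.reindex(columns=train_X.columns, fill_value=0)
```

In the script itself, the equivalent step would be reindexing `df_test` to `train_titanic.columns` just before calling `model.predict`.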
