赞
踩
因为一些不可抗力,下面仅展示部分代码(第一问的部分),其余代码看文末
首先导入本题需要的包:
- import numpy as np
- import pandas as pd
- import matplotlib.pyplot as plt
- from datetime import timedelta
- from sklearn.model_selection import train_test_split, cross_val_score
- from sklearn.ensemble import RandomForestClassifier
- from sklearn.metrics import accuracy_score, classification_report
读取数据:
- match_data_path = './Wimbledon_featured_matches.csv'
- match_data = pd.read_csv(match_data_path)
- match_data
数据预处理(部分):
- # 数据预处理
-
- # Convert `elapsed_time` to timedelta
- match_data['elapsed_time_td'] = pd.to_timedelta(match_data['elapsed_time'])
-
- # Calculate the time difference in seconds within each match_id group
- match_data['time_diff'] = match_data.groupby('match_id')['elapsed_time_td'].diff().dt.total_seconds()
-
- # Fill NaN values with the first elapsed_time value in each group, converted to seconds
- match_data['time_diff'] = match_data.groupby('match_id')['time_diff'].fillna(
- match_data['elapsed_time_td'].dt.total_seconds()
- )
-
- # Show the updated dataframe to verify changes
- match_data[['match_id', 'elapsed_time', 'time_diff']].head()
接着来看后面的部分可视化展示:(仅部分代码)
上图代码:
- # 假设p1_scores和p2_scores是您计算得到的分数
- # 创建一个与分数长度相同的索引列表,以用作X轴
- index = list(range(len(p1_scores)))
-
- # 绘制p1和p2的折线图
- plt.plot(index, p1_scores, label='Player 1', linestyle='-', marker='o')
- plt.plot(index, p2_scores, label='Player 2', linestyle='-', marker='o')
-
- # 添加标签和标题
- plt.xlabel('Point Index')
- plt.ylabel('Score')
- plt.title('Player 1 vs Player 2 Performance')
-
- # 添加图例
- plt.legend()
-
- # 显示图形
- plt.show()
添加图片注释,不超过 140 字(可选)
- correlation_matrix = match_data.iloc[:,[1,2,]].corr()
-
- # 可视化相关性矩阵
- plt.figure(figsize=(20, 16))
- sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt=".2f")
-
- # 设置热图的标题
- plt.title('Correlation Matrix')
-
- # 显示图形
- plt.show()
添加图片注释,不超过 140 字(可选)
有关思路、相关代码、讲解视频、参考文献等相关内容可以点击下方群名片哦!
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。