
[Python] How to Draw Good-Looking Analysis Charts with matplotlib


Feature Tips: How to Draw Good-Looking Analysis Charts with matplotlib

Index

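Earlier tips in this series already touched on plotting with matplotlib (see "Feature Tips: How to Draw Common Statistical Charts in Python?"), but those charts were fairly crude and hardly presentable. A good analyst should know a few tricks for making charts look polished, so that they are something you can be proud to show around. So today's tip walks through the common chart types and how to draw each of them.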

Loading the datasets

First, load the datasets. We use the same two as before: Salary_Ranges_by_Job_Classification and GlobalLandTemperaturesByCity. (Reply "plot" to the official account to get the datasets.)

    # Import the usual packages
    import pandas as pd
    import numpy as np
    import seaborn as sns
    %matplotlib inline
    import matplotlib.pyplot as plt
    import matplotlib as mpl
    plt.style.use('fivethirtyeight')
    # Workaround for displaying Chinese characters (Mac)
    from matplotlib.font_manager import FontProperties
    # List the styles available on this machine
    print(plt.style.available)
    # Pick one of the available styles; ggplot is known to look good, so it is used here
    # (this overrides the fivethirtyeight style set above)
    mpl.style.use(['ggplot'])
    # ['_classic_test', 'bmh', 'classic', 'dark_background', 'fast', 'fivethirtyeight', 'ggplot', 'grayscale', 'seaborn-bright', 'seaborn-colorblind', 'seaborn-dark-palette', 'seaborn-dark', 'seaborn-darkgrid', 'seaborn-deep', 'seaborn-muted', 'seaborn-notebook', 'seaborn-paper', 'seaborn-pastel', 'seaborn-poster', 'seaborn-talk', 'seaborn-ticks', 'seaborn-white', 'seaborn-whitegrid', 'seaborn', 'Solarize_Light2']
    # Load the datasets
    # Dataset 1: Salary_Ranges_by_Job_Classification
    salary_ranges = pd.read_csv('./data/Salary_Ranges_by_Job_Classification.csv')
    # Dataset 2: GlobalLandTemperaturesByCity
    climate = pd.read_csv('./data/GlobalLandTemperaturesByCity.csv')
    # Drop rows with missing values
    climate.dropna(axis=0, inplace=True)
    # Convert dt to a datetime and extract the year (note the use of map)
    climate['dt'] = pd.to_datetime(climate['dt'])
    climate['year'] = climate['dt'].map(lambda value: value.year)
    # Keep only China; .copy() avoids a SettingWithCopyWarning when adding a column
    climate_sub_china = climate.loc[climate['Country'] == 'China'].copy()
    climate_sub_china['Century'] = climate_sub_china['year'].map(lambda x: int(x/100 + 1))
    climate.head()
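
The year and century columns above can also be derived without `map`. The following is a minimal sketch, assuming a tiny hand-made frame (`toy`, with invented values) in place of the real CSV; it uses the `.dt` accessor and integer division, which give the same numbers as the lambda versions.

```python
import pandas as pd

# Hypothetical mini-frame standing in for GlobalLandTemperaturesByCity.csv
toy = pd.DataFrame({
    "dt": ["1899-12-01", "1950-06-01", "2013-08-01"],
    "AverageTemperature": [3.1, 24.8, 29.4],
    "City": ["Shanghai", "Shanghai", "Shanghai"],
    "Country": ["China", "China", "China"],
})

# Parse the date strings, then pull the year out with the .dt accessor
toy["dt"] = pd.to_datetime(toy["dt"])
toy["year"] = toy["dt"].dt.year

# Same formula as int(x/100 + 1), written with integer division
toy["Century"] = toy["year"] // 100 + 1

print(toy[["dt", "year", "Century"]])
```

The vectorized form avoids calling a Python function per row, which tends to matter on a file as large as the city temperature dataset.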

Line charts

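A line chart is about as simple as charts get, so there is not much to optimize; just pick colors that are easy on the eyes. A table of named colors found online is a handy place to choose from.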

    # Select part of the Shanghai temperature data
    df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
                      .loc[:,['dt','AverageTemperature']]\
                      .set_index('dt')
    df1.head()

    # Line chart (DataFrame.plot takes `color`, not `colors`, for line plots)
    df1.plot(color=['lime'])
    plt.title('AverageTemperature Of ShangHai')
    plt.ylabel('AverageTemperature')
    plt.xlabel('Years')
    plt.show()
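
For choosing a color name such as 'lime', matplotlib itself ships a dictionary of named colors. The following is a small sketch of browsing it; the filter on "green" is only an illustration, not something from the original post.

```python
from matplotlib import colors as mcolors

# CSS4_COLORS maps every named color to its hex code
named = sorted(mcolors.CSS4_COLORS)
print(len(named), "named colors available")

# Names used in this article are all in the table
for name in ["lime", "darkred", "steelblue", "lightgreen"]:
    print(name, mcolors.to_hex(name))

# Narrow the list down, e.g. to every name containing "green"
greens = [name for name in named if "green" in name]
print(greens)
```

Any of these names can be passed wherever a plot call accepts a color.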

That was a single line. Multiple lines can be drawn as well: just add more columns.

    # Multiple lines: one column per city
    df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
                      .loc[:,['dt','AverageTemperature']]\
                      .rename(columns={'AverageTemperature':'SH'})
    df2 = climate.loc[(climate['Country']=='China')&(climate['City']=='Tianjin')&(climate['dt']>='2010-01-01')]\
                      .loc[:,['dt','AverageTemperature']]\
                      .rename(columns={'AverageTemperature':'TJ'})
    df3 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shenyang')&(climate['dt']>='2010-01-01')]\
                      .loc[:,['dt','AverageTemperature']]\
                      .rename(columns={'AverageTemperature':'SY'})
    # Merge on the date
    df123 = df1.merge(df2, how='inner', on=['dt'])\
                    .merge(df3, how='inner', on=['dt'])\
                    .set_index(['dt'])
    df123.head()

    # Multi-line chart
    df123.plot()
    plt.title('AverageTemperature Of 3 City')
    plt.ylabel('AverageTemperature')
    plt.xlabel('Years')
    plt.show()
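
The three filter-rename-merge steps above can probably be collapsed into one reshape. The following is a hedged sketch using `DataFrame.pivot` on a hand-made frame (`toy`, with invented temperatures), assuming each (dt, City) pair appears only once; if the real data has duplicates, `pivot` raises an error and an aggregating `pivot_table` would be needed instead.

```python
import pandas as pd

# Hypothetical long-format data: one row per (date, city)
toy = pd.DataFrame({
    "dt": pd.to_datetime(["2010-01-01"] * 3 + ["2010-02-01"] * 3),
    "City": ["Shanghai", "Tianjin", "Shenyang"] * 2,
    "AverageTemperature": [4.0, -3.0, -12.0, 6.0, 0.0, -7.0],
})

# Long to wide: dates become the index, cities become the columns
wide = toy.pivot(index="dt", columns="City", values="AverageTemperature")

# Short column names in a fixed order, matching SH / TJ / SY above
wide = wide.rename(columns={"Shanghai": "SH", "Tianjin": "TJ", "Shenyang": "SY"})
wide = wide[["SH", "TJ", "SY"]]

# dropna() keeps only dates present for every city, like the inner merges
wide = wide.dropna()
print(wide)
```

Calling `wide.plot()` then draws one line per city, the same as `df123.plot()`.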

Pie charts

Next up is the pie chart, which offers more to tweak, such as how far the slices are pulled apart. Let's start with a "budget" version.

    # Sum Step per SetID; selecting the numeric Step column keeps text columns out of the sum
    df1 = salary_ranges.groupby('SetID')[['Step']].sum()

    # "Budget" pie chart
    df1['Step'].plot(kind='pie', figsize=(7,7),
                      autopct='%1.1f%%',
                      shadow=True)
    plt.axis('equal')
    plt.show()

    # "Premium" pie chart
    colors = ['lightgreen', 'lightblue'] # slice colors; more options: ['lightgreen', 'lightblue', 'pink', 'purple', 'grey', 'gold']
    explode = [0, 0.2] # slice offsets, one per slice; the larger the value, the further out the slice
    df1['Step'].plot(kind='pie', figsize=(7, 7),
                      autopct='%1.1f%%', startangle=90,
                      shadow=True, labels=None, pctdistance=1.12, colors=colors, explode=explode)
    plt.axis('equal')
    plt.legend(labels=df1.index, loc='upper right', fontsize=14)
    plt.show()
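
The `explode` list has to contain one entry per slice, so hard-coding it breaks as soon as the number of groups changes. The following is a minimal sketch that builds it from the data instead, assuming a made-up series (`steps`) and the arbitrary rule "pull out the smallest slice"; it uses the matplotlib Axes API directly rather than the pandas wrapper.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical per-group totals standing in for df1['Step']
steps = pd.Series([120, 30, 50], index=["A", "B", "C"], name="Step")

# One offset per slice: 0.2 for the smallest group, 0 for the rest
explode = [0.2 if label == steps.idxmin() else 0 for label in steps.index]

fig, ax = plt.subplots(figsize=(7, 7))
wedges, texts, autotexts = ax.pie(
    steps, explode=explode, autopct="%1.1f%%",
    startangle=90, shadow=True, pctdistance=1.12,
)
ax.axis("equal")  # keep the pie circular
ax.legend(wedges, steps.index, loc="upper right", fontsize=14)
```

In a script, `plt.show()` or `fig.savefig(...)` would follow; in a notebook the figure displays on its own.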

Scatter plots

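A scatter plot leaves less room for optimization. The default colors of the ggplot style already look good; as the saying goes, pick a good style and you save a lot of effort!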

    # Select part of the Shanghai and Shenyang temperature data
    df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
                      .loc[:,['dt','AverageTemperature']]\
                      .rename(columns={'AverageTemperature':'SH'})
    df2 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shenyang')&(climate['dt']>='2010-01-01')]\
                      .loc[:,['dt','AverageTemperature']]\
                      .rename(columns={'AverageTemperature':'SY'})
    # Merge on the date
    df12 = df1.merge(df2, how='inner', on=['dt'])
    df12.head()

    # Scatter plot
    df12.plot(kind='scatter', x='SH', y='SY', figsize=(10, 6), color='darkred')
    plt.title('Average Temperature Between ShangHai - ShenYang')
    plt.xlabel('ShangHai')
    plt.ylabel('ShenYang')
    plt.show()
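
Since most of the polish here comes from the style, it can help to apply a style to a single figure instead of globally. The following is a sketch using `plt.style.context` with random stand-in data (`x`, `y` are invented, not the real temperatures); outside the `with` block the global style stays whatever it was.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Invented data loosely shaped like two correlated temperature series
rng = np.random.default_rng(0)
x = rng.normal(15, 8, 40)
y = x - 10 + rng.normal(0, 1.5, 40)

# Fall back to the default style if ggplot is somehow missing
style = "ggplot" if "ggplot" in plt.style.available else "default"

# The style applies only to figures created inside this block
with plt.style.context(style):
    fig, ax = plt.subplots(figsize=(10, 6))
    ax.scatter(x, y, color="darkred")
    ax.set_title("Average Temperature Between ShangHai - ShenYang")
    ax.set_xlabel("ShangHai")
    ax.set_ylabel("ShenYang")
```

Swapping `style` for another entry of `plt.style.available` is a quick way to compare looks on the same data.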

Area charts

    # Data for the area chart: the same three-city frame as the multi-line chart
    df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
                      .loc[:,['dt','AverageTemperature']]\
                      .rename(columns={'AverageTemperature':'SH'})
    df2 = climate.loc[(climate['Country']=='China')&(climate['City']=='Tianjin')&(climate['dt']>='2010-01-01')]\
                      .loc[:,['dt','AverageTemperature']]\
                      .rename(columns={'AverageTemperature':'TJ'})
    df3 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shenyang')&(climate['dt']>='2010-01-01')]\
                      .loc[:,['dt','AverageTemperature']]\
                      .rename(columns={'AverageTemperature':'SY'})
    # Merge on the date
    df123 = df1.merge(df2, how='inner', on=['dt'])\
                    .merge(df3, how='inner', on=['dt'])\
                    .set_index(['dt'])
    df123.head()

    colors = ['red', 'pink', 'blue'] # area colors; more options: ['lightgreen', 'lightblue', 'pink', 'purple', 'grey', 'gold']
    # DataFrame.plot takes `color`, not `colors`, for area plots
    df123.plot(kind='area', stacked=False,
            figsize=(20, 10), color=colors)
    plt.title('AverageTemperature Of 3 City')
    plt.ylabel('AverageTemperature')
    plt.xlabel('Years')
    plt.show()
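
With `stacked=False` the areas overlap, so transparency decides how readable the chart is. The following is a small sketch on invented monthly values (`toy`) that sets `alpha` explicitly. As far as I know, pandas also refuses to stack columns that mix positive and negative values, which is one reason an unstacked chart suits temperatures that dip below zero.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical monthly temperatures for three cities (invented values)
toy = pd.DataFrame(
    {
        "SH": [4.5, 6.0, 9.8, 15.2, 20.1, 24.3],
        "TJ": [-3.2, 0.1, 6.5, 14.8, 20.9, 25.0],
        "SY": [-11.5, -6.8, 1.2, 10.4, 17.6, 22.3],
    },
    index=pd.date_range("2010-01-01", periods=6, freq="MS"),
)

# Overlapping (unstacked) areas; a lower alpha lets the layers behind show through
ax = toy.plot(kind="area", stacked=False, alpha=0.4,
              color=["red", "pink", "blue"], figsize=(10, 5))
ax.set_title("AverageTemperature Of 3 City")
ax.set_ylabel("AverageTemperature")
ax.set_xlabel("Years")
```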

Histograms

    # Select part of the Shanghai temperature data
    df = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
                      .loc[:,['dt','AverageTemperature']]\
                      .set_index('dt')
    df.head()

    # The simplest histogram (Series.plot takes `color`, not `colors`)
    df['AverageTemperature'].plot(kind='hist', figsize=(8,5), color='grey')
    plt.title('ShangHai AverageTemperature Of 2010-2013') # add a title to the histogram
    plt.ylabel('Number of month') # add y-label
    plt.xlabel('AverageTemperature') # add x-label
    plt.show()
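
The histogram above uses the default bin count. The following is a sketch of controlling it with `bins`, on twelve invented monthly values (`temps`); `np.histogram` prints the counts and bin edges that the plot draws as bars.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical monthly average temperatures (invented values)
temps = pd.Series(
    [4.5, 6.0, 9.8, 15.2, 20.1, 24.3, 28.6, 28.1, 24.0, 18.9, 12.7, 6.4],
    name="AverageTemperature",
)

# Counts per bin and the bin edges, computed without plotting
counts, edges = np.histogram(temps, bins=5)
print(counts, edges)

# The same binning drawn as a histogram; white edges separate the bars
ax = temps.plot(kind="hist", bins=5, figsize=(8, 5),
                color="grey", edgecolor="white")
ax.set_xlabel("AverageTemperature")
ax.set_ylabel("Number of month")
```

Fewer bins give a smoother but coarser picture; with only a few dozen monthly values, a small bin count is usually easier to read.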

Bar charts

    # Select part of the Shanghai temperature data
    df = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
                      .loc[:,['dt','AverageTemperature']]\
                      .set_index('dt')
    df.head()

    # Vertical bar chart
    df.plot(kind='bar', figsize=(10, 6))
    plt.xlabel('Month')
    plt.ylabel('AverageTemperature')
    plt.title('AverageTemperature of shanghai')
    plt.show()

    # Horizontal bar chart
    df.plot(kind='barh', figsize=(12, 16), color='steelblue')
    plt.xlabel('AverageTemperature')
    plt.ylabel('Month')
    plt.title('AverageTemperature of shanghai')
    plt.show()
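
With a datetime index, bar charts label every bar with the full timestamp, which gets cluttered. The following is a minimal sketch of one workaround on invented data (`toy`): format the index as year-month strings before plotting. The `%Y-%m` format is my own choice, not something from the original post.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical monthly temperatures indexed by date (invented values)
toy = pd.DataFrame(
    {"AverageTemperature": [4.5, 6.0, 9.8, 15.2]},
    index=pd.date_range("2013-01-01", periods=4, freq="MS", name="dt"),
)

# Replace the timestamps with short 'YYYY-MM' strings for the tick labels
labeled = toy.copy()
labeled.index = labeled.index.strftime("%Y-%m")

ax = labeled.plot(kind="bar", figsize=(10, 6), color="steelblue", legend=False)
ax.set_xlabel("Month")
ax.set_ylabel("AverageTemperature")
ax.set_title("AverageTemperature of shanghai")

tick_labels = [tick.get_text() for tick in ax.get_xticklabels()]
print(tick_labels)
```

The same trick applies to `kind='barh'`, where the labels end up on the y-axis.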

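Today's post ran a bit long, so consider bookmarking it. Next time you have a spare moment, fold these snippets into your own code library so they are easier to reuse.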
