
Drawing significance annotations with seaborn and statannotations (adding significance marks to Python bar charts)


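Contents

1. Drawing with a custom function in Seaborn

2. Adding significance annotations with statannotations

3. Customizing the annotation symbols in statannotations

4. Data format expected by seaborn

5. seaborn.barplot parameters (bar plots)

6. Automatically marking significance on a correlation heatmap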


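This post shows how to draw statistical charts with significance annotations using Seaborn. It covers two approaches:

  • Drawing the annotation with a custom function in Seaborn
  • Adding significance annotations with the statannotations library

1. Drawing with a custom function in Seaborn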

    import matplotlib.pyplot as plt
    import numpy as np
    import seaborn as sns
    from scipy import stats  # "import scipy" alone does not reliably expose scipy.stats

    # ---------- Map a p-value to asterisks ----------
    def convert_pvalue_to_asterisks(pvalue):
        if pvalue <= 0.0001:
            return "****"
        elif pvalue <= 0.001:
            return "***"
        elif pvalue <= 0.01:
            return "**"
        elif pvalue <= 0.05:
            return "*"
        return "ns"

    # ---------- Compute the significance with scipy.stats ----------
    iris = sns.load_dataset("iris")
    data_p = iris[["sepal_length", "species"]]
    # Welch's t-test (equal_var=False) between setosa and versicolor
    stat, p_value = stats.ttest_ind(
        data_p[data_p["species"] == "setosa"]["sepal_length"],
        data_p[data_p["species"] == "versicolor"]["sepal_length"],
        equal_var=False)

    # ---------- Plot ----------
    plt.rcParams["font.family"] = ["Times New Roman"]
    plt.rcParams["axes.labelsize"] = 18
    palette = ["#0073C2FF", "#EFC000FF", "#868686FF"]
    fig, ax = plt.subplots(figsize=(5, 4), dpi=100, facecolor="w")
    # Note: seaborn >= 0.12 replaces ci="sd" with errorbar="sd";
    # newer releases also deprecate errwidth and errcolor.
    ax = sns.barplot(x="species", y="sepal_length", data=iris, palette=palette,
                     estimator=np.mean, ci="sd", capsize=.1, errwidth=1, errcolor="k",
                     ax=ax,
                     **{"edgecolor": "k", "linewidth": 1})

    # Bracket between the first two bars (x positions 0 and 1)
    x1, x2 = 0, 1
    y, h = data_p["sepal_length"].mean() + 1, .2
    ax.plot([x1, x1, x2, x2], [y, y + h, y + h, y], lw=1, c="k")
    # P-value label above the bracket, with the matching asterisks
    label = "T-test: p = {:.2e} ({})".format(p_value, convert_pvalue_to_asterisks(p_value))
    ax.text((x1 + x2) * .5, y + h, label, ha="center", va="bottom", color="k")

    # Axis styling
    ax.tick_params(which="major", direction="in", length=3, width=1., labelsize=14, bottom=False)
    for spine in ["top", "left", "right"]:
        ax.spines[spine].set_visible(False)
    ax.spines["bottom"].set_linewidth(2)
    ax.grid(axis="y", ls="--", c="gray")
    ax.set_axisbelow(True)
    plt.show()
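The bracket height in the code above comes from the overall column mean plus a fixed offset, so it has to be re-tuned by hand when the data change. A minimal sketch of one alternative: compute the bracket from the tops of the bars it spans. `bracket_coords` is a hypothetical helper (not part of seaborn or matplotlib), and the bar tops below are made-up numbers for illustration.

```python
def bracket_coords(x1, x2, bar_tops, pad=0.2, height=0.2):
    """Return (xs, ys, text_xy) for a significance bracket between bars x1 and x2.

    bar_tops maps each bar position to the top of that bar including its
    error bar. The bracket starts `pad` above the tallest bar it spans.
    """
    lo, hi = sorted((x1, x2))
    spanned = [top for pos, top in bar_tops.items() if lo <= pos <= hi]
    base = max(spanned) + pad
    xs = [x1, x1, x2, x2]
    ys = [base, base + height, base + height, base]
    text_xy = ((x1 + x2) / 2, base + height)
    return xs, ys, text_xy


# Hypothetical bar tops (mean + SD) for three bars at x = 0, 1, 2
tops = {0: 5.4, 1: 6.5, 2: 7.2}
xs, ys, text_xy = bracket_coords(0, 1, tops)
print(xs, ys, text_xy)
```

The result can be passed straight to matplotlib: `ax.plot(xs, ys, lw=1, c="k")` for the bracket and `ax.text(*text_xy, label, ha="center", va="bottom")` for the label.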

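2. Adding significance annotations with statannotations

statannotations is a library dedicated to adding significance annotations to Seaborn plots. It supports bar plots, box plots, violin plots, and other statistical charts, and it computes p-values with scipy.stats. A few short examples follow; for more detail, see the project repository and its tutorials.

Example 1: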

    import seaborn as sns
    import matplotlib.pyplot as plt
    from statannotations.Annotator import Annotator

    df = sns.load_dataset("tips")
    x = "day"
    y = "total_bill"
    order = ['Sun', 'Thur', 'Fri', 'Sat']
    fig, ax = plt.subplots(figsize=(5, 4), dpi=100, facecolor="w")
    ax = sns.boxplot(data=df, x=x, y=y, order=order, ax=ax)

    # Pairs of categories to compare
    pairs = [("Thur", "Fri"), ("Thur", "Sat"), ("Fri", "Sun")]
    annotator = Annotator(ax, pairs, data=df, x=x, y=y, order=order)
    # Mann-Whitney U test, results shown as stars
    annotator.configure(test='Mann-Whitney', text_format='star', line_height=0.03, line_width=1)
    annotator.apply_and_annotate()

    # Axis styling
    ax.tick_params(which='major', direction='in', length=3, width=1., labelsize=14, bottom=False)
    for spine in ["top", "left", "right"]:
        ax.spines[spine].set_visible(False)
    ax.spines['bottom'].set_linewidth(2)
    ax.grid(axis='y', ls='--', c='gray')
    ax.set_axisbelow(True)
    plt.show()
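The pairs list in Example 1 is typed by hand. When every pairwise comparison is wanted, or every comparison against one reference category, the list can be generated instead. A small sketch using only the standard library; the choice of "Thur" as the reference is an assumption made for illustration.

```python
from itertools import combinations

order = ["Sun", "Thur", "Fri", "Sat"]

# Every unordered pair of categories: C(4, 2) = 6 comparisons
all_pairs = list(combinations(order, 2))
print(all_pairs)

# Keep only the comparisons that involve a reference category
reference = "Thur"
vs_reference = [pair for pair in all_pairs if reference in pair]
print(vs_reference)
```

Either list can be passed as the pairs argument of Annotator. Keep in mind that each extra pair is one more statistical test on the same data.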

Example 2:

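    import numpy as np
    import pandas as pd
    import seaborn as sns
    import matplotlib.pyplot as plt
    from statannotations.Annotator import Annotator

    # group_data_p is not shown in the original post. The placeholder below is an
    # assumption: random long-format data with the column names the plot expects.
    rng = np.random.default_rng(0)
    group_data_p = pd.DataFrame({
        "order": np.repeat(["one", "two", "three"], 20),
        "class": np.tile(np.repeat(["type01", "type02"], 10), 3),
        "value": rng.normal(10, 2, 60),
    })

    plt.rcParams['font.family'] = ['Times New Roman']
    plt.rcParams["axes.labelsize"] = 18
    # Alternative palettes: ['#0073C2FF', '#EFC000FF'] or ["white", "black"]
    palette = ['#E59F01', '#56B4E8']
    fig, ax = plt.subplots(figsize=(5, 4), dpi=100, facecolor="w")
    # Note: seaborn >= 0.12 replaces ci="sd" with errorbar="sd"
    ax = sns.barplot(x="order", y="value", hue="class", data=group_data_p, palette=palette, ci="sd",
                     capsize=.1, errwidth=1, errcolor="k", ax=ax,
                     **{"edgecolor": "k", "linewidth": 1})

    # Each element of a pair is an (x value, hue value) tuple.
    # Here the same hue level is compared across different x categories.
    box_pairs = [(("one", "type01"), ("two", "type01")),
                 (("one", "type02"), ("two", "type02")),
                 (("one", "type01"), ("three", "type01")),
                 (("one", "type02"), ("three", "type02")),
                 (("two", "type01"), ("three", "type01")),
                 (("two", "type02"), ("three", "type02"))]
    annotator = Annotator(ax, data=group_data_p, x="order", y="value", hue="class",
                          pairs=box_pairs)
    annotator.configure(test='t-test_ind', text_format='star', line_height=0.03, line_width=1)
    annotator.apply_and_annotate()
    plt.show()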

Example 3: to compare the hue groups with each other inside each x category, set the pairs parameter as follows:

    box_pairs = [(("one", "type01"), ("one", "type02")),
                 (("two", "type01"), ("two", "type02")),
                 (("three", "type01"), ("three", "type02"))]
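Both pair layouts, the one from Example 2 and the one from Example 3, follow a regular pattern, so they can be built from the level names rather than written out. A sketch with the standard library only; the variable names `across_x` and `within_x` are just illustrative.

```python
from itertools import combinations

x_levels = ["one", "two", "three"]   # values of the x column ("order")
hue_levels = ["type01", "type02"]    # values of the hue column ("class")

# Same hue level, different x categories (the layout used in Example 2)
across_x = [((a, h), (b, h))
            for a, b in combinations(x_levels, 2)
            for h in hue_levels]

# Same x category, different hue levels (the layout used in Example 3)
within_x = [((x, h1), (x, h2))
            for x in x_levels
            for h1, h2 in combinations(hue_levels, 2)]

print(len(across_x), len(within_x))
```

Each element is an (x value, hue value) tuple, which is the form the examples above pass to pairs when hue is set.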

Example 4: supplying your own p-values

    import seaborn as sns
    import matplotlib.pyplot as plt
    from statannotations.Annotator import Annotator

    df = sns.load_dataset("tips")
    x = "day"
    y = "total_bill"
    order = ['Sun', 'Thur', 'Fri', 'Sat']
    pairs = [("Sun", "Thur"), ("Sun", "Sat"), ("Fri", "Sun")]
    ax = sns.boxplot(data=df, x=x, y=y, order=order)
    annot = Annotator(ax, [("Thur", "Fri"), ("Thur", "Sat"), ("Fri", "Sun")], data=df, x=x, y=y, order=order)
    # new_plot re-targets the annotator; the pairs given here replace those above
    annot.new_plot(ax, pairs=pairs, data=df, x=x, y=y, order=order)
    # test=None: no test is run, the p-values are supplied by hand
    annot.configure(test=None, loc='inside')
    # One p-value per pair, in the same order as `pairs`
    annot.set_pvalues([0.1, 0.1, 0.001])
    annot.annotate()
    plt.show()
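In Example 4 the p-values are a plain list, matched to the pairs by position, so reordering one list without the other silently mislabels the plot. One way to guard against that, sketched here with the same values as above, is to keep each p-value next to its pair in a dict and derive both lists from it. The name `pvalue_by_pair` is hypothetical.

```python
# P-values keyed by the pair they belong to (same values as Example 4)
pvalue_by_pair = {
    ("Sun", "Thur"): 0.1,
    ("Sun", "Sat"): 0.1,
    ("Fri", "Sun"): 0.001,
}

# Derive both lists from the same dict so their order cannot drift apart
pairs = list(pvalue_by_pair)
pvalues = [pvalue_by_pair[pair] for pair in pairs]

print(pairs)
print(pvalues)
```

`pairs` then goes to the Annotator and `pvalues` to set_pvalues, exactly as in the example.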

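3. Customizing the annotation symbols in statannotations

Open the PValueFormat.py file in the folder of the installed statannotations package.

In that file, find the function that formats the p-value annotations; by editing it you can add whatever symbols you want.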
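Edits made inside the installed package are lost on the next reinstall or upgrade. The statannotations documentation describes a pvalue_thresholds option for Annotator.configure that takes a list of [upper bound, symbol] entries; if your installed version supports it, passing such a list is a less invasive route. The sketch below only illustrates how a list of that shape maps a p-value to a symbol. It is not the library's own code, and the "#" symbols are arbitrary examples.

```python
# Custom thresholds in [upper_bound, symbol] form.
# The symbols are arbitrary examples.
custom_thresholds = [
    [1e-4, "####"],
    [1e-3, "###"],
    [1e-2, "##"],
    [0.05, "#"],
    [1, "ns"],
]


def format_pvalue(pvalue, thresholds=custom_thresholds):
    """Return the symbol of the smallest upper bound that pvalue does not exceed."""
    for upper_bound, symbol in sorted(thresholds):
        if pvalue <= upper_bound:
            return symbol
    return "ns"


for p in (0.00005, 0.004, 0.03, 0.2):
    print(p, format_pvalue(p))
```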

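4. Data format expected by seaborn

Take the tips dataset as an example: it is a long-format table with one row per observation and the columns total_bill, tip, sex, smoker, day, time, and size.

Example 1: for a simple bar plot, only x and y need to be set.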

    import seaborn as sns
    import matplotlib.pyplot as plt

    df = sns.load_dataset("tips")
    order = ['Sun', 'Thur', 'Fri', 'Sat']
    fig, ax = plt.subplots(figsize=(5, 4), dpi=100, facecolor="w")
    ax = sns.barplot(data=df, x="day", y="total_bill", order=order, ax=ax, capsize=0.2)
    plt.show()

Example 2: for a grouped bar plot, hue must also be set.

    import seaborn as sns
    import matplotlib.pyplot as plt

    df = sns.load_dataset("tips")
    order = ['Sun', 'Thur', 'Fri', 'Sat']
    fig, ax = plt.subplots(figsize=(5, 4), dpi=100, facecolor="w")
    ax = sns.barplot(data=df, x="day", y="total_bill", order=order, ax=ax, capsize=0.2, hue="sex")
    plt.show()
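Both examples rely on the data being in long format, where the category is a column of its own. Measurements often arrive in wide format instead, with one column per group. A minimal sketch of the conversion with pandas; the table and its column names are made up for illustration.

```python
import pandas as pd

# Hypothetical wide table: one column per group
wide = pd.DataFrame({
    "control": [4.1, 3.9, 4.4],
    "treated": [5.0, 5.3, 4.8],
})

# Long format: one row per observation, with the group in its own column
long = wide.melt(var_name="group", value_name="value")
print(long)
```

The result can be plotted with `sns.barplot(data=long, x="group", y="value")`.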

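5. seaborn.barplot parameters (bar plots)

    seaborn.barplot(x=None, y=None, hue=None, data=None, order=None, hue_order=None, estimator=mean, ci=95,
                    n_boot=1000, units=None, seed=None, orient=None, color=None, palette=None, saturation=0.75, errcolor='.26',
                    errwidth=None, capsize=None, dodge=True, ax=None, **kwargs)

  • x, y: str, column names in the DataFrame.
  • hue: a DataFrame column name; the bars are split into groups by the values of this column.
  • data: a DataFrame or an array.
  • order, hue_order (list of strings): control the order of the bars.
  • estimator: the statistic drawn as the bar height; the default is the mean and it can be changed to, for example, the median.
  • ci: size of the confidence interval (default 95). With "sd", bootstrapping is skipped and the standard deviation of the observations is drawn. Error bars can represent the standard deviation (SD), the standard error (SE), or a confidence interval (CI); seaborn 0.12 and later replace ci with errorbar, e.g. errorbar="sd", errorbar="se", or errorbar=("ci", 95).
  • orient: plot orientation, "v" or "h".
  • palette: the color palette, e.g. "Set3".
  • saturation: color saturation.
  • capsize: width of the caps (the short horizontal lines) at the ends of the error bars.
  • n_boot: confidence-interval error bars are computed by bootstrap resampling by default; this sets the number of bootstrap samples.
  • errcolor: color of the error bars; the default is '.26' (dark gray).
  • errwidth: line width of the error bars.
  • dodge: when hue is used, dodge=True draws the hue levels as separate bars side by side; False draws them at the same position in different colors.
  • ax: the Axes object to draw on; defaults to the current Axes.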
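The three kinds of error bar mentioned under ci differ a lot in size, so it helps to see them computed for the same sample. A rough sketch using only the standard library. The sample values are made up, and the percentile bootstrap here illustrates the idea behind n_boot; seaborn's own implementation may differ in detail.

```python
import math
import random
import statistics

# Hypothetical sample for a single bar
values = [18.2, 21.5, 19.8, 24.1, 20.3, 22.7, 17.9, 23.4]

mean = statistics.mean(values)
sd = statistics.stdev(values)         # sample standard deviation
se = sd / math.sqrt(len(values))      # standard error of the mean

# Percentile bootstrap for an approximate 95% confidence interval of the mean
rng = random.Random(0)
n_boot = 1000
boot_means = sorted(
    statistics.mean(rng.choices(values, k=len(values))) for _ in range(n_boot)
)
ci_low = boot_means[int(0.025 * n_boot)]
ci_high = boot_means[int(0.975 * n_boot) - 1]

print(f"mean={mean:.2f} sd={sd:.2f} se={se:.2f} ci=({ci_low:.2f}, {ci_high:.2f})")
```

SD describes the spread of the observations, while SE and the CI describe the uncertainty of the mean and shrink as the sample grows, so a figure caption should say which one is drawn.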

6. Automatically marking significance on a correlation heatmap

    import numpy as np
    import seaborn as sns
    import matplotlib as mpl
    import matplotlib.pyplot as plt
    from scipy.stats import pearsonr


    def cm2inch(x, y):
        """Convert a figure size from centimetres to inches."""
        return x / 2.54, y / 2.54


    def p_to_stars(p):
        """Return the star label for a p-value ('' when p >= 0.05)."""
        if p < 0.001:
            return "***"
        if p < 0.01:
            return "**"
        if p < 0.05:
            return "*"
        return ""


    size1 = 10.5
    mpl.rcParams.update({
        "text.usetex": False,
        "mathtext.fontset": "stix",
        "font.family": "serif",
        "font.size": size1,
        "font.serif": ["Times New Roman"],
    })

    # Placeholder data (random): each row is one variable with 10 observations.
    # Replace it with your own data.
    data = np.random.random((10, 10))
    n = data.shape[0]

    # Pearson r and p-value for every pair of variables.
    # The original plotted the random matrix itself while testing its rows;
    # here the heatmap shows the same r values that the stars refer to.
    rarr = np.ones((n, n))
    parr = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            rarr[i, j], parr[i, j] = pearsonr(data[i], data[j])

    fig = plt.figure(figsize=cm2inch(16, 12))
    ax1 = plt.gca()
    # Mask the strict upper triangle so each pair is shown only once
    mask = np.triu(np.ones((n, n), dtype=bool), k=1)
    sns.heatmap(rarr, annot=True, cmap="RdBu", mask=mask,
                vmax=1, vmin=-1, fmt=".2f", ax=ax1)
    ax1.tick_params(axis="both", length=0)

    # Add stars slightly above the number in every cell below the diagonal
    for row in range(n):
        for col in range(row):
            stars = p_to_stars(parr[row, col])
            if stars:
                # White text on strongly colored cells, black otherwise
                color = "white" if abs(rarr[row, col]) > 0.5 else "k"
                ax1.text(col + 0.5, row + 0.5 - 0.15, stars, ha="center", color=color)
    plt.show()
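Placing the stars with ax.text means working out cell coordinates by hand. Another option, sketched below, is to build one text label per cell (the r value plus its stars) and let the heatmap draw it. `corr_with_pvalues` and `star_labels` are hypothetical helpers, the data are random placeholders, and here the variables are assumed to be the columns of the array.

```python
import numpy as np
from scipy.stats import pearsonr


def corr_with_pvalues(data):
    """Return (r, p) matrices for the columns of a 2-D array.

    The diagonal of p is NaN because a variable is not tested against itself.
    """
    n_vars = data.shape[1]
    r = np.eye(n_vars)
    p = np.full((n_vars, n_vars), np.nan)
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            r[i, j], p[i, j] = pearsonr(data[:, i], data[:, j])
            r[j, i], p[j, i] = r[i, j], p[i, j]
    return r, p


def star_labels(r, p):
    """Build one label per cell: the r value, plus stars when p < 0.05."""
    labels = np.empty(r.shape, dtype=object)
    for idx in np.ndindex(r.shape):
        if p[idx] < 0.001:
            stars = "***"
        elif p[idx] < 0.01:
            stars = "**"
        elif p[idx] < 0.05:
            stars = "*"
        else:
            stars = ""  # also reached for NaN on the diagonal
        labels[idx] = f"{r[idx]:.2f}\n{stars}" if stars else f"{r[idx]:.2f}"
    return labels


rng = np.random.default_rng(0)
data = rng.random((30, 4))                        # placeholder: 30 samples, 4 variables
data[:, 1] = data[:, 0] + 0.05 * rng.random(30)   # make two columns strongly correlated
r, p = corr_with_pvalues(data)
labels = star_labels(r, p)
print(labels)
```

The labels can then be drawn with `sns.heatmap(r, annot=labels, fmt="", cmap="RdBu", vmin=-1, vmax=1)`; fmt="" is needed because the annotations are already strings.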
