
Merging data in Python with the pandas merge() function


The pandas merge function works much like a JOIN in SQL. Its parameters are:

merge(left, right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'), copy=True, indicator=False, validate=None)
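A few of the less obvious parameters can be illustrated with a small sketch. The frames `left` and `right` below are made up for this illustration and are not part of the examples that follow; the sketch assumes a reasonably recent pandas in which `indicator` and `validate` are available.

```python
import pandas as pd

# Hypothetical frames: both carry a non-key column named 'value'
left = pd.DataFrame({'key': ['a', 'b'], 'value': [1, 2]})
right = pd.DataFrame({'key': ['a', 'c'], 'value': [10, 30]})

# suffixes renames the clashing 'value' columns instead of the default _x/_y.
# indicator=True adds a '_merge' column saying which side each row came from.
# validate='one_to_one' raises pd.errors.MergeError if a key repeats on either side.
merged = pd.merge(left, right, on='key', how='outer',
                  suffixes=('_left', '_right'),
                  indicator=True, validate='one_to_one')
print(merged)
```

The `_merge` column takes the values `both`, `left_only` and `right_only`, which makes it easy to see which keys failed to match.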

I. Join keys with the same name on both sides

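  import pandas as pd
  # Two small frames that share the column name 'key'
  df1=pd.DataFrame({'key':['a','b','a','b','b'],'value1':range(5)})
  df2=pd.DataFrame({'key':['a','c','c','c','c'],'value2':range(5)})
  # display() is built into Jupyter/IPython; use print() in a plain script
  display(df1,df2,pd.merge(df1,df2))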

df1

    key  value1
  0   a       0
  1   b       1
  2   a       2
  3   b       3
  4   b       4

df2

    key  value2
  0   a       0
  1   c       1
  2   c       2
  3   c       3
  4   c       4

pd.merge(df1,df2) ## joins on the column name that df1 and df2 share (key); how defaults to 'inner', so this is equivalent to pd.merge(df1,df2,on='key',how='inner')

    key  value1  value2
  0   a       0       0
  1   a       2       0
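The equivalence between the implicit and the explicit call can be checked directly. The following is a minimal sketch that rebuilds df1 and df2 so it runs on its own; the variable names `implicit`, `explicit` and `method` are introduced here for illustration only.

```python
import pandas as pd

df1 = pd.DataFrame({'key': ['a', 'b', 'a', 'b', 'b'], 'value1': range(5)})
df2 = pd.DataFrame({'key': ['a', 'c', 'c', 'c', 'c'], 'value2': range(5)})

# No 'on' given: pandas joins on the columns common to both frames
implicit = pd.merge(df1, df2)
# The same join with the key and join type spelled out
explicit = pd.merge(df1, df2, on='key', how='inner')
# DataFrame.merge is the method form of the same operation
method = df1.merge(df2, on='key')

print(implicit.equals(explicit), implicit.equals(method))
```

Spelling out `on` is usually the safer habit, since an implicit join silently uses every shared column name as a key.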

pd.merge(df1,df2,how='outer') ## full outer join: the union of the keys from both sides

    key  value1  value2
  0   a     0.0     0.0
  1   a     2.0     0.0
  2   b     1.0     NaN
  3   b     3.0     NaN
  4   b     4.0     NaN
  5   c     NaN     1.0
  6   c     NaN     2.0
  7   c     NaN     3.0
  8   c     NaN     4.0

pd.merge(df1,df2,how='left')  ### left join: keep every row of the left frame and only the matching rows of the right; missing values are filled with NaN

    key  value1  value2
  0   a       0     0.0
  1   b       1     NaN
  2   a       2     0.0
  3   b       3     NaN
  4   b       4     NaN

pd.merge(df1,df2,how='right') ### right join: keep every row of the right frame and only the matching rows of the left; missing values are filled with NaN

    key  value1  value2
  0   a     0.0       0
  1   a     2.0       0
  2   c     NaN       1
  3   c     NaN       2
  4   c     NaN       3
  5   c     NaN       4
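The four join types above can be compared side by side by counting rows. This is only a sketch: the dictionaries `row_counts` and `unmatched` are names chosen for this illustration, and the frames are rebuilt so the block is self-contained.

```python
import pandas as pd

df1 = pd.DataFrame({'key': ['a', 'b', 'a', 'b', 'b'], 'value1': range(5)})
df2 = pd.DataFrame({'key': ['a', 'c', 'c', 'c', 'c'], 'value2': range(5)})

row_counts = {}
unmatched = {}
for how in ['inner', 'outer', 'left', 'right']:
    merged = pd.merge(df1, df2, on='key', how=how)
    row_counts[how] = len(merged)
    # A row is unmatched if either value column had to be filled with NaN
    unmatched[how] = int(merged[['value1', 'value2']].isna().any(axis=1).sum())

print(row_counts)  # {'inner': 2, 'outer': 9, 'left': 5, 'right': 6}
print(unmatched)   # {'inner': 0, 'outer': 7, 'left': 3, 'right': 4}
```

The counts match the tables above: only key a appears in both frames, so the inner join has two rows, while the outer join adds the three unmatched b rows and the four unmatched c rows.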

II. Join keys with different names

If the join-key columns have different names in the two DataFrames, specify them with left_on and right_on.

  df3=pd.DataFrame({'lkey':['a','b','a','b','b'],'data1':range(5)})
  df4=pd.DataFrame({'rkey':['a','c','c','c','c'],'data2':range(5)})

df3

    lkey  data1
  0    a      0
  1    b      1
  2    a      2
  3    b      3
  4    b      4

df4

    rkey  data2
  0    a      0
  1    c      1
  2    c      2
  3    c      3
  4    c      4

pd.merge(df3,df4,left_on='lkey',right_on='rkey')   ### inner join (the default, how='inner')

    lkey  data1 rkey  data2
  0    a      0    a      0
  1    a      2    a      0

pd.merge(df3,df4,left_on='lkey',right_on='rkey',how='outer')  ### full outer join

    lkey  data1 rkey  data2
  0    a    0.0    a    0.0
  1    a    2.0    a    0.0
  2    b    1.0  NaN    NaN
  3    b    3.0  NaN    NaN
  4    b    4.0  NaN    NaN
  5  NaN    NaN    c    1.0
  6  NaN    NaN    c    2.0
  7  NaN    NaN    c    3.0
  8  NaN    NaN    c    4.0

pd.merge(df3,df4,left_on='lkey',right_on='rkey',how='left')  ### left join

    lkey  data1 rkey  data2
  0    a      0    a    0.0
  1    b      1  NaN    NaN
  2    a      2    a    0.0
  3    b      3  NaN    NaN
  4    b      4  NaN    NaN

pd.merge(df3,df4,left_on='lkey',right_on='rkey',how='right')  ### right join

    lkey  data1 rkey  data2
  0    a    0.0    a      0
  1    a    2.0    a      0
  2  NaN    NaN    c      1
  3  NaN    NaN    c      2
  4  NaN    NaN    c      3
  5  NaN    NaN    c      4
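As the results above show, merging with left_on and right_on keeps both key columns. If only one is wanted, one possible approach is to drop the redundant column after the merge, or to rename it beforehand and join with `on`. The sketch below assumes a left join; the names `dropped` and `renamed` are illustrative only.

```python
import pandas as pd

df3 = pd.DataFrame({'lkey': ['a', 'b', 'a', 'b', 'b'], 'data1': range(5)})
df4 = pd.DataFrame({'rkey': ['a', 'c', 'c', 'c', 'c'], 'data2': range(5)})

# Option 1: merge on differently named keys, then drop the duplicate key column
dropped = pd.merge(df3, df4, left_on='lkey', right_on='rkey',
                   how='left').drop(columns='rkey')

# Option 2: rename the right-hand key first so a plain on='lkey' works
renamed = pd.merge(df3, df4.rename(columns={'rkey': 'lkey'}),
                   on='lkey', how='left')

print(dropped)
print(dropped.equals(renamed))
```

Dropping rkey is only safe for inner and left joins; in a right or outer join the unmatched rows carry their key in rkey alone, so that information would be lost.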

III. Using the index as the join key

  import numpy as np
  df5=pd.DataFrame(np.arange(12).reshape(3,4),index=list('abc'),columns=['v1','v2','v3','v4'])
  df6=pd.DataFrame(np.arange(12,24,1).reshape(3,4),index=list('abd'),columns=['v5','v6','v7','v8'])

df5

     v1  v2  v3  v4
  a   0   1   2   3
  b   4   5   6   7
  c   8   9  10  11

df6

     v5  v6  v7  v8
  a  12  13  14  15
  b  16  17  18  19
  d  20  21  22  23

pd.merge(df5,df6,left_index=True,right_index=True)

     v1  v2  v3  v4  v5  v6  v7  v8
  a   0   1   2   3  12  13  14  15
  b   4   5   6   7  16  17  18  19
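For index-on-index joins, `DataFrame.join` offers a shorter spelling of the same operation, and the `how` parameter works as in the earlier sections. The sketch below rebuilds df5 and df6 so it runs on its own; `by_merge`, `by_join` and `outer` are names introduced for this illustration.

```python
import numpy as np
import pandas as pd

df5 = pd.DataFrame(np.arange(12).reshape(3, 4), index=list('abc'),
                   columns=['v1', 'v2', 'v3', 'v4'])
df6 = pd.DataFrame(np.arange(12, 24).reshape(3, 4), index=list('abd'),
                   columns=['v5', 'v6', 'v7', 'v8'])

# Inner join on the index via merge, as in the example above
by_merge = pd.merge(df5, df6, left_index=True, right_index=True)
# DataFrame.join joins on the index; its default is how='left', so ask for inner
by_join = df5.join(df6, how='inner')
# An outer join keeps index labels found in either frame and fills gaps with NaN
outer = pd.merge(df5, df6, left_index=True, right_index=True, how='outer')

print(by_merge.equals(by_join))
print(outer)
```

Note the different defaults: `pd.merge` defaults to an inner join, while `DataFrame.join` defaults to a left join.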