赞
踩
左表转化为右表的形式,左表是多重列索引,右表为多重行索引,这在时间序列数据中是非常 常见的。
以下代码是在pycharm中的jupyter notebook中写的,然后在浏览器中打开,下载为md文档,再导入到csdn中,真是方便极了!!!
完整代码如下
import numpy as np
import pandas as pd
numbers = ['000001', '600000', '688001']
colors = [2016, 2017]
mindex=pd.MultiIndex.from_product([numbers, colors],names=['code', 'year'])
mindex
MultiIndex([('000001', 2016),
('000001', 2017),
('600000', 2016),
('600000', 2017),
('688001', 2016),
('688001', 2017)],
names=['code', 'year'])
#第1 2行为表头,自动转化为列的多重索引,第一列为行索引
df = pd.read_excel(io = r'test1.xlsx',header=[0,1],index_col=[0])
df.index.name='code'
df.head()
year | 2006 | 2007 | ||||||
---|---|---|---|---|---|---|---|---|
quarter | 1 | 2 | 3 | 4 | 1 | 2 | 3 | 4 |
code | ||||||||
000001.SZ | 100000 | 100001 | 100002 | 100003 | 200000 | 200000 | 200000 | 200000 |
000002.SZ | 100000 | 100001 | 100002 | 100003 | 200001 | 200001 | 200001 | 200001 |
000003.SZ | 100000 | 100001 | 100002 | 100003 | 200002 | 200002 | 200002 | 200002 |
000004.SZ | 100000 | 100001 | 100002 | 100003 | 200003 | 200003 | 200003 | 200003 |
000005.SZ | 100000 | 100001 | 100002 | 100003 | 200004 | 200004 | 200004 | 200004 |
df.columns
# 此时数据的列名是一个多重索引
MultiIndex([(2006, 1),
(2006, 2),
(2006, 3),
(2006, 4),
(2007, 1),
(2007, 2),
(2007, 3),
(2007, 4)],
names=['year', 'quarter'])
df_quarter=df.stack() #列转化为行索引
df_quarter.head()
year | 2006 | 2007 | |
---|---|---|---|
code | quarter | ||
000001.SZ | 1 | 100000 | 200000 |
2 | 100001 | 200000 | |
3 | 100002 | 200000 | |
4 | 100003 | 200000 | |
000002.SZ | 1 | 100000 | 200001 |
s_year=df_quarter.stack() #列转化为行索引,再转化一次,这个时候,行就是多重索引了,code,quarter,year
s_year
code quarter year 000001.SZ 1 2006 100000 2007 200000 2 2006 100001 2007 200000 3 2006 100002 2007 200000 4 2006 100003 2007 200000 000002.SZ 1 2006 100000 2007 200001 2 2006 100001 2007 200001 3 2006 100002 2007 200001 4 2006 100003 2007 200001 000003.SZ 1 2006 100000 2007 200002 2 2006 100001 2007 200002 3 2006 100002 2007 200002 4 2006 100003 2007 200002 000004.SZ 1 2006 100000 2007 200003 2 2006 100001 2007 200003 3 2006 100002 2007 200003 4 2006 100003 2007 200003 000005.SZ 1 2006 100000 2007 200004 2 2006 100001 2007 200004 3 2006 100002 2007 200004 4 2006 100003 2007 200004 dtype: int64
df_year=s_year.to_frame()
df_year=df_year.swaplevel(1,2,axis=0) #多级索引中quarter year 交换位置
df_year.rename(columns={0:'值'},inplace=True)
df_year.sort_index(inplace=True) #排个序
df_year
值 | |||
---|---|---|---|
code | year | quarter | |
000001.SZ | 2006 | 1 | 100000 |
2 | 100001 | ||
3 | 100002 | ||
4 | 100003 | ||
2007 | 1 | 200000 | |
2 | 200000 | ||
3 | 200000 | ||
4 | 200000 | ||
000002.SZ | 2006 | 1 | 100000 |
2 | 100001 | ||
3 | 100002 | ||
4 | 100003 | ||
2007 | 1 | 200001 | |
2 | 200001 | ||
3 | 200001 | ||
4 | 200001 | ||
000003.SZ | 2006 | 1 | 100000 |
2 | 100001 | ||
3 | 100002 | ||
4 | 100003 | ||
2007 | 1 | 200002 | |
2 | 200002 | ||
3 | 200002 | ||
4 | 200002 | ||
000004.SZ | 2006 | 1 | 100000 |
2 | 100001 | ||
3 | 100002 | ||
4 | 100003 | ||
2007 | 1 | 200003 | |
2 | 200003 | ||
3 | 200003 | ||
4 | 200003 | ||
000005.SZ | 2006 | 1 | 100000 |
2 | 100001 | ||
3 | 100002 | ||
4 | 100003 | ||
2007 | 1 | 200004 | |
2 | 200004 | ||
3 | 200004 | ||
4 | 200004 |
df_year.to_excel('s_year.xlsx') #保存
# s_year.to_excel('s_year.xlsx',merge_cells=False) 不合并单元格
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。