赞
踩
openpyxl官方网站: (openpyxl.readthedocs.io)
最后一趴记录了我目前碰到和解决的问题以及解决的方法,欢迎同刚入坑的小白参考,特别特别欢迎有大佬能指出可以更精进的地方
win+R进入cmd,输入以下代码
- # 清华大学
- pip install openpyxl -i https://pypi.tuna.tsinghua.edu.cn/simple/
- # 阿里云
- pip install openpyxl -i http://mirrors.aliyun.com/pypi/simple --trusted-host mirrors.aliyun.com
- # 中国科技大学
- https://pypi.mirrors.ustc.edu.cn/simple/
- # 豆瓣
- http://pypi.douban.com/simple/
- # 中国科技技术大学
- http://pypi.mirrors.ustc.edu.cn/simple/
-i后是国内的pip源,直接输入pip install openpyxl下载失败的可以尝试加入此串代码,可以都试试
想要永久的修改pip源可以试试如下操作:
Windows:
找到系统盘下C:\C:\Users\用户名\AppData\Roaming
查看在Roaming文件夹下有没有一个pip文件夹,如果没有创建一个;
进入pip文件夹,创建一个pip.ini文件;
使用记事本的方式打开pip.ini文件,写入:
- [global]
- index-url = http://mirrors.aliyun.com/pypi/simple # 指定下载源
- trusted-host = mirrors.aliyun.com # 指定域名
- from openpyxl import Workbook
-
- wb = Workbook()
- ws = wb.active # 获取默认的工作表
Workbook后一定记得要加括号!!!
- from openpyxl import Workbook, load_workbook
-
- wb = load_workbook('./test.xlsx') # 打开已存在的工作簿,./代替地址
- ws = wb.active
打开已经存在的工作簿需调用load_workbook,其中Workbook是新建工作簿需调用的
wb.save("./test.xlsx") #./代替地址,保存类型为xlsx,保存后将会覆盖原先文件,无提示
- from openpyxl import Workbook
-
- wb = Workbook()
- ws1 = wb.active # 获取默认的工作表
- print(ws1.title) # 返回工作表ws1的名
-
- ws2 = wb.create_sheet("sheet2", 1) # 创建工作表
- ws2 = wb.create_sheet("sheet3", 2) # 工作表下标的顺序左到右依次为0,1,2...
- wb.save("./test.xlsx") # 保存到硬盘
create_sheet(title,index)
title:工作表的名字
index:下标的位置
- print(ws.max_row) # 最大行数,例如14
- print(ws.max_column) # 最大列数,例如20
- print(ws.dimensions) # 已启用的单元格范围,例如A1:T14
- print(ws.encoding) # 编码类型,例如utf-8
- print(ws.sheet_view) # 对象信息
- from openpyxl import Workbook
-
- wb = Workbook()
- ws1 = wb.create_sheet("Sheet2", 1)
- ws2 = wb.create_sheet("Sheet3", 2)
-
- # 修改表名
- print(ws1.title) # 输出:Sheet2
- ws1.title = "Sheet22"
- print(ws1.title) # 输出:Sheet22
-
- # 获取所有表名
- sheet_names = wb.sheetnames
- print(type(sheet_names), sheet_names)
- # 输出:<class 'list'> ['Sheet', 'Sheet22', 'Sheet3']
-
- # 通过表名获取表
- sheet_name = wb["Sheet"]
-
- # 获取表的下标位置(下标从0开始)
- index = wb.index(ws2)
- print("get_index:", index) # 输出get_index: 2
-
- from openpyxl import Workbook
-
- wb = Workbook()
- ws1 = wb.create_sheet("Sheet2", 1)
- ws2 = wb.create_sheet("Sheet3", 2)
-
- # 移动位置
- wb.move_sheet("Sheet3", -1) # 向前移动一个位置,正数往后,负数往前
- print(wb.sheetnames) # 输出:['Sheet', 'Sheet3', 'Sheet2']
-
- # 删除工作表
- del wb["Sheet3"]
- print(wb.sheetnames) # 输出:['Sheet', 'Sheet2']
-
- # 复制工作表
- copy_sheet = wb.copy_worksheet(ws1)
- print(copy_sheet.title) # 输出:Sheet2 Copy
-
- from openpyxl import Workbook
- wb = Workbook()
- ws = wb.active
-
- cell1 = ws["a1"] # 通过坐标获取a1
- cell2 = ws.cell(1, 2) # 通过行列下标获取
-
- # 直接修改某个单元格的值
- ws["b2"] = "Hello World"
-
- # 先获取单元格对象然后再进行修改
- cell = ws["a1"]
- cell.value = "Hello World"
- print(cell, cell.value) # 输出:<Cell 'Sheet'.a1> Hello World
-
- # 单元格坐标信息
- print(cell.coordinate) # 单元格坐标,例如A1
- print(cell.column_letter) # 单元格列名,例如A
- print(cell.col_idx) # 单元列下标,例如1
- print(cell.row) # 单元格所在行,例如1
-
cell(row, column, value)
row:行
column:列
value:如果赋值则修改单元格
通过一段循环来创建以下数据
- from openpyxl import Workbook
-
- wb = Workbook()
- ws = wb.active
-
- x = 0
- for i in range(1, 11):
- for j in range(1, 6):
- ws.cell(i, j, x)
- x += 1
此循环创建了一个10*6(10行6列)的表格,数据从0开始递加
- from openpyxl import Workbook
- from openpyxl.utils import get_column_letter
-
- wb = Workbook()
- ws = wb.active
-
- for row in range(1, 5):
- for col in range(1, 6):
- char = get_column_letter(col)
- ws[char + str(row)].value = char + str(row)
-
- wb.save("text.xlsx")
通过调用get_column_letter来给一个范围内的单元格赋值
- ws.insert_cols(2, 3) # 第2列开始插入3列
- ws.insert_rows(3, 2) # 第3行开始插入2行
- ws.delete_cols(2, 2) # 第2列开始删除2列
- ws.delete_rows(3, 1) # 第3行开始删除1行
- from openpyxl import Workbook
-
- wb = Workbook()
-
- ws = wb.active
-
- x = 1
- for i in range(1, 11): # 创建一个10行5列的表格
- for i in range(1, 6):
- ws.cell(i, j, x)
- x += x
-
- row_cells = ws[2] # 选取第2行(下标从1开始)
- print(row_cells) # 输出:第二行的单元格
-
- col_cells = ws["b"] # 选取B列
- print(col_cells) # 输出:B列的单元格
-
- row_range_cells = ws[2:5] # 选取2、3、4、5共4行
- print(row_range_cells) # 输出:2-4行的单元格
-
- col_range_cells = ws["B:D"] # 选取B、C、D共3列
- print(col_range_cells) # 输出:B-D列的单元格
-
- range_cells = ws["c3:f6"] # 选取 C3到F6区域共16个元素
- print(range_cells) # 输出:C3-F6的单元格
-
- for cells in ws["a1:c4"] # 遍历出的a1-c1作为一个元组,a2-c2作为一个元组...
- for cell in cells # 把元组里所有元素拆分
- from openpyxl import Workbook
-
- wb = Workbook()
- ws = wb.active
-
- x = 0
- for i in range(1, 11):
- for j in range(1, 6):
- ws.cell(i, j, x)
- x += 1
-
- print(ws["C2"].value) # 输出:23
- ws.merge_cells("A1:C3") # 合并A1-C3的单元格
- ws.unmerge_cells("A1:C3") # 取消A1-C3合并的单元格
- # 等同于下面的代码
- # ws.merge_cells(start_row=1, start_column=1, end_row=3, end_column=6)
- # ws.unmerge_cells(start_row=1, start_column=1, end_row=3, end_column=6)
合并之后只保留左上第一个单元格里的数据,取消合并之后只有左上第一个单元格的数据会被还原
- from openpyxl import Workbook
-
- wb = Workbook()
- ws = wb.active
- x = 0
- for i in range(1, 11):
- for j in range(1, 6):
- ws.cell(i, j, x)
- x += 1
-
- ws.move_range("B2:C3", rows=2, cols=-1, translate=False)
- # 移动B2-C3的单元格,向下移动2行,向左移动1列
- from openpyxl.utils import FORMULAE
-
- print(FORMULAE) # 打印可以使用的公式
- print(len(FORMULAE)) # 352
-
- # 判断是否支持某个公式,公式名区分大小写
- print("SUM" in FORMULAE) # True
- print("PI" in FORMULAE) # True
- print("sum" in FORMULAE) # False
所有可用公式都是全大写 ,要使用公式需调用FORMULAE
- from openpyxl import Workbook
-
- wb = Workbook()
- ws = wb.active
-
- ws.append(["数量1", "数量2", "总和", "平均值"])
- ws.append([10,20])
- ws.append([15, 30])
- ws.append([20, 45])
-
- ws["c2"] = "=SUM(A2,B2)" # 求和
- ws["d2"] = "=AVERAGE(A2:B2)" # 求平均值
- wb.save("test.xlsx")
- from openpyxl import Workbook
- from openpyxl.utils import get_column_letter
-
- data = [
- {"name": "小A", "tall": 160, "age": 23, "weight": 70},
- {"name": "小B", "tall": 154, "age": 13, "weight": 40},
- {"name": "小C", "tall": 180, "age": 18, "weight": 87},
- {"name": "小D", "tall": 171, "age": 53, "weight": 70}
- ]
-
- wb = Workbook()
- ws = wb.active
-
- title = ["姓名", "身高", "年龄", "体重"]
- ws.append(title)
-
- for person in data:
- ws.append(list(person.values()))
-
- for col in range(2,5):
- char = get_column_letter(col)
- ws[char + "6"].value = f'=AVERAGE({char + "2"}:{char + "5"})'
-
- wb.save("text.xlsx")
在ws[char+"6"].value = 之后的值时,需要注意单双引号的冲突,f后若用双引号会报错
- from openpyxl import Workbook
-
- wb = Workbook()
- ws = wb.active
-
- ws.append(["数量1", "数量2", "总和", "平均值"])
- ws.append([10,20])
- ws.append([15, 30])
- ws.append([20, 45])
-
- ws["c2"] = "=SUM(A2,B2)" # 求和
- ws["d2"] = "=AVERAGE(A2:B2)" # 求平均值
- # C3、C4使用上面的C2的求和公式
- ws["C3"] = Translator(formula="=SUM(A2,B2)", origin="C2").translate_formula("C3")
- ws["C4"] = Translator(formula="=SUM(A2,B2)", origin="C2").translate_formula("C4")
- ws["D3"] = Translator(formula="=AVERAGE(A2,B2)", origin="D2").translate_formula("D3")
- ws["D4"] = Translator(formula="=AVERAGE(A2,B2)", origin="D2").translate_formula("D4")
-
- wb.save("test.xlsx")
-
相对于Excel里的填充
- for cell in ws["C3:C4"]:
- # ws["C3"] = Translator(formula="=SUM(A2,B2)", origin="C2").translate_formula("C3")
- cell[0].value = Translator(formula="=SUM(A2,B2)", origin="C2").translate_formula(cell[0].coordinate)
-
- for cell in ws["D3:D4"]:
- # ws["D3"] = Translator(formula="=AVERAGE(A2,B2)", origin="D2").translate_formula("D3")
- cell[0].value = Translator(formula="=AVERAGE(A2,B2)", origin="D2").translate_formula(cell[0].coordinate)
-
- wb.save("test.xlsx")
使用循环便捷的写出代码
- from openpyxl import Workbook
- from openpyxl.styles import Font
-
- wb = Workbook()
- ws = wb.active
-
- # 默认字体样式
- ws["A1"] = "A1"
-
- # 自定义字体样式
- ws["B2"] = "B2"
- font = Font(
- name="微软雅黑", # 字体
- size=15, # 字体大小
- color="0000FF", # 字体颜色,用16进制rgb表示
- bold=True, # 是否加粗,True/False
- italic=True, # 是否斜体,True/False
- strike=None, # 是否使用删除线,True/False
- underline=None, # 下划线, 可选'singleAccounting', 'double', 'single', 'doubleAccounting'
- )
- ws["B2"].font = font
-
- wb.save("./test.xlsx")
-
- from openpyxl import Workbook
- wb = Workbook()
- ws = wb.active
-
- ws.row_dimensions[2].height = 30 # 设置第2行高度为30
- ws.column_dimensions["B"].width = 30 # 设置B列宽度为30
-
- wb.save("./test.xlsx")
- from openpyxl import Workbook
- from openpyxl.styles import Alignment
-
- wb = Workbook()
- ws = wb.active
-
- ws.row_dimensions[2].height = 30 # 设置第2行高度为30
- ws.column_dimensions["B"].width = 30 # 设置B列宽度为30
-
- # 默认字体样式
- ws["A1"] = "A1"
-
- ws["B2"] = "B1"
- ws['B2'].alignment = Alignment(
- horizontal='left', # 水平对齐,可选general、left、center、right、fill、justify、centerContinuous、distributed
- vertical='top', # 垂直对齐, 可选top、center、bottom、justify、distributed
- text_rotation=0, # 字体旋转,0~180整数
- wrap_text=False, # 是否自动换行
- shrink_to_fit=False, # 是否缩小字体填充
- indent=0, # 缩进值
- )
-
- wb.save("./test.xlsx")
-
- from openpyxl import Workbook
- from openpyxl.styles import Border, Side
-
- wb = Workbook()
- ws = wb.active
-
- ws["B2"] = "B2"
-
- side = Side(
- style="medium", # 边框样式,可选dashDot、dashDotDot、dashed、dotted、double、hair、medium、mediumDashDot、mediumDashDotDot、mediumDashed、slantDashDot、thick、thin
- color="ff66dd", # 边框颜色,16进制rgb表示
- )
-
- ws["B2"].border = Border(
- top=side, # 上
- bottom=side, # 下
- left=side, # 左
- right=side, # 右
- diagonal=side # 对角线
- )
-
- wb.save("./test.xlsx")
-
- from openpyxl import Workbook
- from openpyxl.styles import PatternFill, GradientFill
-
- wb = Workbook()
- ws = wb.active
-
- ws["B2"] = "B2"
-
- fill = PatternFill(
- patternType="solid", # 填充类型,可选none、solid、darkGray、mediumGray、lightGray、lightDown、lightGray、lightGrid
- fgColor="F562a4", # 前景色,16进制rgb
- bgColor="0000ff", # 背景色,16进制rgb
- # fill_type=None, # 填充类型
- # start_color=None, # 前景色,16进制rgb
- # end_color=None # 背景色,16进制rgb
- )
- ws["B2"].fill = fill
- ws["B3"].fill = GradientFill(
- degree=60, # 角度
- stop=("000000", "FFFFFF") # 渐变颜色,16进制rgb
- )
-
- wb.save("./test.xlsx")
-
- from openpyxl import Workbook
-
- wb = Workbook()
- ws = wb.active
-
- # 准备数据
- rows = [
- ['月份', '桃子', '西瓜', '龙眼'],
- [1, 38, 28, 29],
- [2, 52, 21, 35],
- [3, 39, 20, 69],
- [4, 51, 29, 41],
- [5, 39, 39, 31],
- [6, 30, 41, 39],
- ]
- for row in rows:
- ws.append(row)
-
- ws.auto_filter.ref = "A1:D7" # 选择数据范围
- ws.auto_filter.add_filter_column(1, ["39", "29", "30"]) # 选择第2列为过滤数据(下标从0开始),并勾选需要过滤的数据项
- ws.auto_filter.add_sort_condition("C2:C7", True) # 设置排序范围,第二个参数是是否倒序,默认为否
-
- wb.save("./openpyxl/test.xlsx")
openpyxl并不能直接让Excel文件实现过滤或者排序,而是设置好,需要在Excel上确认
- import pandas as pd
- # 读取上一步保存的Excel文件
- df = pd.read_excel("./openpyxl/test.xlsx", sheet_name="Sheet")
- df_value = df.sort_values(by=["桃子", "西瓜"], ascending=False) # 如果"桃子"数据相同再按照"西瓜"进行排列,ascending是否正序
- # 保存文件
- writer = pd.ExcelWriter('./openpyxl/sort_file.xlsx')
- df_value.to_excel(writer, sheet_name='Sheet1', index=False)
- writer.save()
pandas排序可以直接让Excel实现排序
官方文档:https://openpyxl.readthedocs.io/en/stable/charts/introduction.html
- from openpyxl import Workbook
- from openpyxl.chart import LineChart, Reference
-
- wb = Workbook()
- ws = wb.active
-
- # 准备数据
- rows = [
- ['月份', '桃子', '西瓜', '龙眼'],
- [1, 38, 28, 29],
- [2, 52, 21, 35],
- [3, 39, 20, 69],
- [4, 51, 29, 41],
- [5, 29, 39, 31],
- [6, 30, 41, 39],
- ]
- for row in rows:
- ws.append(row)
-
- # 创建图表
- c1 = LineChart()
- c1.title = "折线图" # 标题
- c1.style = 13 # 样式
- c1.y_axis.title = '销量' # Y轴
- c1.x_axis.title = '月份' # X轴
-
- # 选择数据范围
- data = Reference(ws, min_col=2, min_row=1, max_col=4, max_row=7)
- c1.add_data(data, titles_from_data=True)
-
- # 线条样式
- s0 = c1.series[0]
- s0.marker.symbol = "triangle" # triangle为三角形标记, 可选circle、dash、diamond、dot、picture、plus、square、star、triangle、x、auto
- s0.marker.graphicalProperties.solidFill = "FF0000" # 填充颜色
- s0.marker.graphicalProperties.line.solidFill = "0000FF" # 边框颜色
- # s0.graphicalProperties.line.noFill = True # 改为True则隐藏线条,但显示标记形状
-
- s1 = c1.series[1]
- s1.graphicalProperties.line.solidFill = "00AAAA"
- s1.graphicalProperties.line.dashStyle = "sysDot" # 线条点状样式
- s1.graphicalProperties.line.width = 80000 # 线条大小,最大20116800EMUs
-
- s2 = c1.series[2] # 采用默认设置
- s2.smooth = True # 线条平滑
-
- ws.add_chart(c1, "A8") # 图表位置
-
- wb.save("line.xlsx")
-
大概过程是,创建一个图表(Chart)–指定数据范围(Reference)–设置系列(series)样式–添加到工作表中
前面我们使用的normal模式进行读写Excel文件,这是一种兼顾读写相对比较平衡的模式,但是,数据加载到内存占用的资源是比较大的,大概是文件的50倍,所以需要考虑是否使用只读或者只写模式来提高性能
只读模式,如果你需要读取很大的Excel文件,但是又不改变和保存,例如只读取数值用于其他数据分析,这时候我们完全可以使用只读模式提供性能
- from openpyxl import load_workbook
-
- # 加载Excel文件时使用read_only指定只读模式
- wb = load_workbook(filename='large_file.xlsx', read_only=True)
- ws = wb['big_data']
-
- # 可以正常读取值
- for row in ws.rows:
- for cell in row:
- print(cell.value)
-
- # 注意:读取完之后需要手动关闭避免内存泄露
- wb.close()
load_workbook(filename, read_only=False, keep_vba=KEEP_VBA, data_only=False, keep_links=True)
read_only:是否只读,默认False
keep_vba:是否使用VBA编程,默认False
data_only:是否只加载数据值,即丢弃公式、排序等操作,默认False
keep_links:是否保留超链接,默认True
如果文件是以写为主,可以在创建工作簿的时候指定为只写模式以便提高性能,不管文件有多大,都可以把内存保持在10M以下
- from openpyxl import Workbook
- from openpyxl.cell import WriteOnlyCell
- from openpyxl.comments import Comment
- from openpyxl.styles import Font
-
- wb = Workbook(write_only=True) # 创建工作簿时指定只写模式
- ws = wb.create_sheet() # 需要通过create_sheet创建一个sheet
-
- # 可以正常保存数据
- for _ in range(100):
- ws.append([i for i in range(200)]) # 只能通过append写
-
- # 如果需要保留公式、注释等操作,可以使用WriteOnlyCell
- cell = WriteOnlyCell(ws, value="冰冷的希望")
- cell.font = Font(name='黑体', size=15)
- cell.comment = Comment(text="这是注释", author="pan")
-
- ws.append([cell])
-
- wb.save('openpyxl/test.xlsx')
-
只写模式注意点:
1.需要通过create_sheet()创建表
2.只能通过append()增加数据,不能通过cell或iter_rows()
3.wb.save()之后不能再修改,否则抛出WorkbookAlreadySaved异常
- from openpyxl import load_workbook
-
- wb = load_workbook(r"E:\系统默认\桌面\工资.xlsx")
- ws = wb.active
-
- ws_rows = list(ws.rows) # 获取表单中所有的信息放入元组
-
- list_title = [i.value for i in ws_rows[0]] # 获取第一行标题
- list_row1 = []
-
- for data in ws_rows[1:]: # 遍历除第一行以外的所有行
- value = [i.value for i in data] # 每次换行之后将本行数据存入value
- dic_rows = dict(zip(list_title, value)) # 将第一行数据和本行数据打包成字典
- list_row1.append(dic_rows) # 把字典添加入列表
-
- print(list_row1)
- from openpyxl import Workbook
-
- data = [
- {"name": "小A", "tall": 160, "age": 23, "weight": 70},
- {"name": "小B", "tall": 154, "age": 13, "weight": 40},
- {"name": "小C", "tall": 180, "age": 18, "weight": 87},
- {"name": "小D", "tall": 171, "age": 53, "weight": 70}
- ]
-
- wb = Workbook()
- ws = wb.active
-
- title = ["姓名", "身高", "年龄", "体重"]
- ws.append(title)
-
- for person in data:
- ws.append(list(person.values()))
-
- wb.save("text.xlsx")
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。