Fixing the "MemoryError" raised when saving a DataFrame to Excel with pandas df.to_excel()_xue_11's blog


1. Cause

Writing a large DataFrame to a file with the pandas to_excel() function sometimes raises a "MemoryError".
For example, with the following code:

import pandas as pd
import numpy as np
# Generate the DataFrame and write it to an Excel sheet
df = pd.DataFrame(np.arange(12000000).reshape(300000,40))
# print(df)
df.to_excel('test.xlsx',index=False)
Running this raises a MemoryError: the amount of data being written is too large, so the process runs out of memory.
2. Solution
 
Write the data with the xlsxwriter module instead. Change the code to:
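import pandas as pd
import numpy as np
import xlsxwriter  # not used directly; importing it confirms the engine is installed
# Generate the DataFrame
df = pd.DataFrame(np.arange(12000000).reshape(300000,40))
# print(df)
# The options dict is optional. pandas 1.3+ takes it through engine_kwargs; older versions accepted options={...} directly.
writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter', engine_kwargs={'options': {'strings_to_urls': False}})
df.to_excel(writer, index=False)
writer.close()  # writer.save() was removed in pandas 2.0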
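With this change the error no longer occurs. xlsxwriter is a Python module dedicated to writing Excel files: it can only write them, not read them, and it offers a rich set of features.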
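If memory is still tight, one option that goes beyond the original post is to bypass pandas and write the rows with XlsxWriter directly in its documented constant_memory mode. By default XlsxWriter keeps the worksheet data in memory until the workbook is closed; in constant_memory mode each row is flushed to disk as soon as the next row is started, so rows have to be written in increasing order. The following is a minimal sketch under those assumptions. The helper name write_frame_constant_memory and the output file name are hypothetical, and the frame is kept small so the check runs quickly. NaN values are not handled here.

```python
import os
import tempfile
import zipfile

import numpy as np
import pandas as pd
import xlsxwriter


def write_frame_constant_memory(df, path, sheet_name="Sheet1"):
    """Write df to path row by row (hypothetical helper, not from the post)."""
    # constant_memory discards each row from memory once the next row begins,
    # so the rows below are written strictly top to bottom.
    workbook = xlsxwriter.Workbook(
        path, {"constant_memory": True, "strings_to_urls": False}
    )
    worksheet = workbook.add_worksheet(sheet_name)
    # Header row, comparable to to_excel(..., header=True, index=False).
    worksheet.write_row(0, 0, [str(col) for col in df.columns])
    # itertuples(name=None) yields one plain tuple per DataFrame row.
    for row_idx, row in enumerate(df.itertuples(index=False, name=None), start=1):
        worksheet.write_row(row_idx, 0, row)
    workbook.close()


# Small frame for a quick check; the post uses reshape(300000, 40).
df = pd.DataFrame(np.arange(2000).reshape(50, 40))
out_path = os.path.join(tempfile.mkdtemp(), "test_constant_memory.xlsx")
write_frame_constant_memory(df, out_path)
```

The trade-off is that pandas conveniences such as na_rep, float_format and index handling are not applied; the values are written as they are.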
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql import SQLContext
from pyspark import SparkContext
# Initialize the data
# Create the pandas DataFrame
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], index=['row1', 'row2'], columns=['c1', 'c2', 'c3'])
# Print the data
DataFrame.to_excel(excel_writer, sheet_name='Sheet1', na_rep='', float_format=None, columns=None, header=True, index=True, index_label=None, startrow=0, startcol=0, engine=None, merge_cells=True...
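As a rough illustration of a few parameters from the signature above, here is a minimal sketch. The column names, values and output file name are made up for the example, and it assumes an Excel writer engine (openpyxl or xlsxwriter) is installed.

```python
import os
import tempfile
import zipfile

import numpy as np
import pandas as pd

# Illustrative data with one missing value.
df = pd.DataFrame(
    {"c1": [1.23456, np.nan], "c2": [2.0, 5.5]},
    index=["row1", "row2"],
)

demo_path = os.path.join(tempfile.mkdtemp(), "to_excel_demo.xlsx")
df.to_excel(
    demo_path,
    sheet_name="Data",     # worksheet name instead of the default 'Sheet1'
    na_rep="N/A",          # text written in place of missing values
    float_format="%.2f",   # format string applied to floating-point numbers
    index=True,            # write the row labels as the first column
    index_label="row",     # header for that index column
    startrow=1,            # leave the first worksheet row empty
    startcol=0,            # start at the leftmost column
)
```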
Source: execution class code – ExcelExtractionClass.py – function list_to_excel
Purpose: writes data held in a pd.DataFrame to Excel
Usage: pd.DataFrame(effective_list).to_excel(excel_path)
This writes the data in the list effective_list to an Excel file at the path excel_path.
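Pandas makes reading and writing Excel files from Python convenient. To work with Excel formatting, however, you still need the openpyxl module; the older xlrd and xlwt modules may not offer enough support. Pandas relies on two main functions for Excel I/O, so the parameters of pandas.read_excel() and DataFrame.to_excel() are reviewed below for future reference.
Common parameters:
The to_excel function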
No deep analysis here. My skills are limited, so this is straight Stack Overflow-driven programming.
https://stackoverflow.com/questions/64264563/attributeerror-elementtree-object-has-no-attribute-getiterator-when-trying
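I found the following explanations.
Following pointers from experts abroad, I reached these conclusions:
pandas needs the xlrd module installed to read Excel files; that is, it is the default engine engi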