Reading and Writing Files in Python 3

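Reading and writing files is a common operation in Python. The program asks the operating system to open a file object, then reads data from it or writes data to it.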

1. Reading files

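f.read([size]) reads at most size characters in text mode (or size bytes in binary mode). If size is omitted or negative, it reads the whole file.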
with open('file.txt', 'r') as f:  # 'r' is enough for reading; 'r+' also requests write access
    print(f.read())
#testing
2020-01-25Python 3.9.0a3 now available for testing
2020-01-17Start using 2FA and API tokens on PyPI
2020-01-07Python Software Foundation Fellow Members for Q4 2019
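The meaning of size depends on the mode the file was opened in. The sketch below illustrates the difference using a temporary file and a made-up sample string (both are assumptions for illustration, not part of the original example): in text mode size counts characters, in binary mode it counts bytes.

```python
import os
import tempfile

# Hypothetical sample text; "é" occupies two bytes in UTF-8.
text = "héllo"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Text mode: read(2) returns two characters.
    with open(path, "r", encoding="utf-8") as f:
        text_chunk = f.read(2)

    # Binary mode: read(2) returns two bytes, which splits "é" in half.
    with open(path, "rb") as f:
        byte_chunk = f.read(2)

print(repr(text_chunk))  # 'hé'
print(repr(byte_chunk))  # b'h\xc3'
```

This matters when a chunk size is chosen to bound memory use: in text mode the number of bytes read can be larger than size.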

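read() loads the entire file into memory in one call. If the file is larger than the available memory, this raises MemoryError, so read a fixed number of characters at a time instead. The example below uses a small file purely for illustration.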

DATA_SIZE=20
with open('file.txt', 'r') as f:
    while True:
        block = f.read(DATA_SIZE)
        if block:
            print(block)
        else:
            break
#testing
2020-01-25Python 3.9
.0a3 now available f
or testing
2020-01-1
7Start using 2FA and
 API tokens on PyPI
2020-01-07Python Sof
tware Foundation Fel
low Members for Q4 2
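The while True loop above can also be written with the two-argument form of iter(), which keeps calling a function until it returns a sentinel value. The following is one possible sketch; the helper name read_in_blocks and the temporary sample file are assumptions made for illustration.

```python
import os
import tempfile
from functools import partial


def read_in_blocks(path, size=20):
    """Yield successive blocks of at most `size` characters (hypothetical helper)."""
    with open(path, "r", encoding="utf-8") as f:
        # iter(callable, sentinel) calls f.read(size) repeatedly and stops
        # when it returns the empty string, which signals end of file.
        for block in iter(partial(f.read, size), ""):
            yield block


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("a" * 45)  # 45 characters of sample data
    blocks = list(read_in_blocks(path, size=20))

print([len(b) for b in blocks])  # [20, 20, 5]
```

Both forms do the same work; the iter() version just removes the explicit if/else/break.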
f.readline() reads one whole line, including the trailing newline character.
with open('file.txt', 'r') as f:
    while True:
        line = f.readline()
        if line:
            # line keeps its trailing '\n', so suppress print's own newline
            print(line, end='')
        else:
            break
#testing
2020-01-25Python 3.9.0a3 now available for testing
2020-01-17Start using 2FA and API tokens on PyPI
2020-01-07Python Software Foundation Fellow Members for Q4 2019
f.readlines() reads all lines and returns them as a list.
with open('file.txt', 'r') as f:
    print(f.readlines())
#testing
['2020-01-25Python 3.9.0a3 now available for testing\n', '2020-01-17Start using 2FA and API tokens on PyPI\n', '2020-01-07Python Software Foundation Fellow Members for Q4 2019']
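As the output shows, readlines() keeps the trailing newline on each line. If the newlines are not wanted, they can be stripped while the list is built. A minimal sketch, assuming a small temporary file with made-up contents:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("first\nsecond\nthird")  # hypothetical sample content

    with open(path, "r", encoding="utf-8") as f:
        raw_lines = f.readlines()

    with open(path, "r", encoding="utf-8") as f:
        # rstrip('\n') removes only the line terminator, not other whitespace.
        clean_lines = [line.rstrip("\n") for line in f]

print(raw_lines)    # ['first\n', 'second\n', 'third']
print(clean_lines)  # ['first', 'second', 'third']
```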
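Treat the file object f as an iterable. Python then handles I/O buffering and memory management automatically, which makes this the more Pythonic approach.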
with open('file.txt', 'r') as f:
    for line in f:
        # line keeps its trailing '\n', so suppress print's own newline
        print(line, end='')
# testing
2020-01-25Python 3.9.0a3 now available for testing
2020-01-17Start using 2FA and API tokens on PyPI
2020-01-07Python Software Foundation Fellow Members for Q4 2019
Both f.read() and f.readlines() read the whole file at once. That is convenient, but with large files it can exhaust memory, so use f.read(size) or iterate over the file object f instead.

2. Reading large files quickly and safely

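  • A fairly Pythonic approach: treat the file object f as an iterable
    How it works: the file is not read all at once when it is opened; lines are read one at a time.
    Pros and cons: the code is concise, but a single line larger than the available memory still raises MemoryError.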
  • with open('file.txt', 'r') as f:
        for line in f:
            # line keeps its trailing '\n', so suppress print's own newline
            print(line, end='')
    
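  • Split the data into blocks and yield them
    How it works: split the file into small blocks and release each block's memory once it has been processed.
    Pros and cons: the code is slightly more involved, but it generally handles large files without trouble.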
  • def read_big_file(filename, size=4096):
        with open(filename, 'r') as f:
            while True:
                block = f.read(size)
                if block:
                    yield block
                else:
                    break
    for block in read_big_file('file.txt'):
        print(block)
    # testing
    2020-01-25Python 3.9.0a3 now available for testing
    2020-01-17Start using 2FA and API tokens on PyPI
    2020-01-07Python Software Foundation Fellow Members for Q4 2019
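Block-wise reading also works when a file has very long lines, because memory use is bounded by the block size rather than by the line length. As a rough sketch of that idea, the function below counts newline bytes in binary mode; the name count_newlines and the sample data are assumptions for illustration.

```python
import os
import tempfile


def count_newlines(path, size=4096):
    """Count b'\\n' bytes in a file while holding at most `size` bytes in memory."""
    count = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(size)
            if not block:  # empty bytes object means end of file
                break
            count += block.count(b"\n")
    return count


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.bin")
    with open(path, "wb") as f:
        f.write(b"a\nb\nc\n")  # hypothetical sample data with three newlines
    total = count_newlines(path, size=2)

print(total)  # 3
```

Binary mode is used here so that size is an exact byte count and no decoding is needed just to count line breaks.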
    

3. Writing files

f.write(string) writes a string to the file and returns the number of characters written.
    with open('file.txt', 'w+') as f:
        f.write('Hello!\n')
    # testing
    cat file.txt 
    Hello!
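The return value of f.write() is easy to overlook because the example above discards it. The short sketch below captures it, using a temporary file (an assumption for illustration) so that no existing file is overwritten.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w", encoding="utf-8") as f:
        # write() returns the number of characters written, newline included.
        written = f.write("Hello!\n")

    with open(path, "r", encoding="utf-8") as f:
        content = f.read()

print(written)        # 7
print(repr(content))  # 'Hello!\n'
```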
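f.writelines(lines) writes a sequence of strings to the file. It does not add line separators, so include a newline in each string if you want line breaks.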
    
    with open('file.txt', 'w+') as f:
        f.writelines(['Hello\n', 'World\n'])
    # testing
    cat file.txt 
    Hello
    World
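When the strings to write do not already end with a newline, one option is to add it on the fly with a generator expression, since writelines() accepts any iterable of strings. A minimal sketch, with the list rows and the temporary file being assumptions for illustration:

```python
import os
import tempfile

rows = ["Hello", "World"]  # hypothetical data without trailing newlines

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w", encoding="utf-8") as f:
        # writelines() adds nothing between items, so append '\n' to each one.
        f.writelines(f"{row}\n" for row in rows)

    with open(path, "r", encoding="utf-8") as f:
        content = f.read()

print(repr(content))  # 'Hello\nWorld\n'
```

Without the added "\n", the same call would produce the single line HelloWorld.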