Is it possible to read stdin as binary data in Python 2.6? If so, how?
I see in the Python 3.1 documentation that this is fairly simple, but the facilities for doing this in 2.6 don't seem to be there.
If the methods described in 3.1 aren't available, is there a way to close stdin and reopen it in binary mode?
Just to be clear, I am using 'type' in a MS-DOS shell to pipe the contents of a binary file to my python code. This should be the equivalent of a Unix 'cat' command, as far as I understand. But when I test this out, I always get one byte less than the expected file size.
The reason I'm going the Java/JAR/Jython route is that one of my main external libraries is only available as a Java JAR. Unfortunately, I had started my work in Python. It might have been easier to convert my code over to Java a while ago, but since this stuff was all supposed to be compatible, I figured I would try trucking through it and prove it could be done.
In case anyone was wondering, this is also related to this question I asked a few days ago.
Some of this was answered in this question.
So I'll try to update my original question with some notes on what I have figured out so far.
The standard streams are in text mode by default. To write or read binary data to these, use the underlying binary buffer. For example, to write bytes to stdout, use sys.stdout.buffer.write(b'abc').
But, as in the accepted answer, invoking python with -u is another option, which forces stdin, stdout and stderr to be totally unbuffered. See the python(1) manpage for details.
See the documentation on io for more information on text buffering, and use sys.stdin.detach() to separate the text layer from stdin and get the underlying binary stream from within Python (3.1 and later).
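As a rough Python 3 illustration of the binary-buffer approach quoted above, the sketch below reads raw bytes through a text stream's underlying buffer. The names read_all_binary and fake_stdin are hypothetical, and an in-memory stream stands in for a real piped sys.stdin, so this only shows the idea rather than the behaviour of any particular shell.

```python
import io
import sys


def read_all_binary(stream=None):
    """Read every byte from a stream, bypassing any text-mode layer."""
    if stream is None:
        stream = sys.stdin
    # Text wrappers such as sys.stdin expose their byte stream as .buffer;
    # a stream that is already binary has no such attribute and is used as-is.
    binary = getattr(stream, "buffer", stream)
    return binary.read()


# Stand-in for a piped sys.stdin: a text wrapper around in-memory bytes
# that include 0x1A and a \r\n pair.
fake_stdin = io.TextIOWrapper(io.BytesIO(b"\x23\x1a\x45\r\n"))
data = read_all_binary(fake_stdin)
print(len(data))
```

In a real script, calling read_all_binary() with no argument would read from sys.stdin.buffer.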
Here is the final cut for Linux/Windows Python 2/3 compatible code to read data from stdin without corruption:
import sys

PY3K = sys.version_info >= (3, 0)

if PY3K:
    # Python 3: the binary stream sits underneath the text wrapper
    source = sys.stdin.buffer
else:
    # Python 2 on Windows opens sys.stdin in text mode, and
    # binary data read from it becomes corrupted on \r\n
    if sys.platform == "win32":
        # set sys.stdin to binary mode
        import os, msvcrt
        msvcrt.setmode(sys.stdin.fileno(), os.O_BINARY)
    source = sys.stdin

b = source.read()
Use the -u command line switch to force Python 2 to treat stdin, stdout and stderr as binary unbuffered streams.
C:> type mydoc.txt | python.exe -u myscript.py
If you still need this...
I used this simple test to read a binary file that contains a 0x1A character in the middle:
# Python 2, Windows only
import os, sys, msvcrt
msvcrt.setmode(sys.stdin.fileno(), os.O_BINARY)
s = sys.stdin.read()
print len(s)
My test file data was:
0x23, 0x1A, 0x45
Without setting stdin to binary mode, this test prints 1, because text mode treats 0x1A as EOF.
Of course, it works on Windows only, because it depends on the msvcrt module.
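For comparison, here is a hedged Python 3 sketch of the same check that does not need msvcrt. It pipes the three test bytes into a child interpreter that reads sys.stdin.buffer and reports how many bytes arrived. CHILD and count_piped_bytes are hypothetical names, and this is not the Python 2.6 setup from the question.

```python
import subprocess
import sys

# Child program: read stdin as bytes and print how many were received.
CHILD = "import sys; sys.stdout.write(str(len(sys.stdin.buffer.read())))"


def count_piped_bytes(payload):
    """Pipe payload into a child Python process and return its byte count."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=payload,
        stdout=subprocess.PIPE,
        check=True,
    )
    return int(result.stdout)


# Same test data as above: 0x23, 0x1A, 0x45
n = count_piped_bytes(b"\x23\x1a\x45")
print(n)
```

If the child saw stdin in text mode on Windows, the count would stop short at the 0x1A byte; reading through the binary buffer should report all three bytes.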