Collectives™ on Stack Overflow

Find centralized, trusted content and collaborate around the technologies you use most.

Learn more about Collectives

Teams

Q&A for work

Connect and share knowledge within a single location that is structured and easy to search.

Learn more about Teams

I want to insert bytes into my PostgreSQL (9.5.7) database column with the type bytea , using the Psycopg2 (2.7.1) copy_from() method.

I can insert my bytes with the following code :

psycopg2_cursor.copy_from(
    StringIO("\x30\x40\x50"),
    "my_table",

By executing a SELECT into my table after the insertion, I get the expected value from the bytea column:

\x304050

Now, I want to prepend my bytes with the byte 0:

psycopg2_cursor.copy_from(
    StringIO("\x00\x30\x40\x50"),
    "my_table",

I get the error : psycopg2.DataError: invalid byte sequence for encoding "UTF-8": 0x00. From my understanding, this error should only be triggered when inserting a null byte into a text field, but should work as expected into a bytea field. Am I missing something ? Is there any simple way to insert a null byte into a bytea column ?

Thanks!

the following characters must be preceded by a backslash if they appear as part of a column value: backslash itself, newline, carriage return, and the current delimiter character.

just realized you are using COPY, so you have to escape backslash:

t=# copy b from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> \\x00
COPY 1
t=# copy b from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
ERROR:  invalid byte sequence for encoding "UTF8": 0x00
CONTEXT:  COPY b, line 1: "\x00"

this should do the trick:

psycopg2_cursor.copy_from(
    StringIO("\\x00\\x30\\x40\\x50"),
    "my_table",
                I added the following parameter to my psycopg2 connection : options="-c standard_conforming_strings=on", but I still get the same error. I tried StringIO('\x30\x40\x50') and it works correctly, but I really need to insert this null byte. I tried to use a BytesIO instead of a StringIO, but I still get the same result.
– jean553
                Jun 12, 2017 at 8:40
                Yes, it produces the wanted result when using it in PSQL (INSERT INTO my_table VALUES(decode('001060', 'hex'));), but I need to insert my bytes with the copy_from() method from my python code, not directly with a SQL query.
– jean553
                Jun 12, 2017 at 9:52
                last attempt: psycopg2_cursor.copy_from(     StringIO("\\x00\\x30\\x40\\x50"),     "my_table", ) ?..
– Vao Tsun
                Jun 12, 2017 at 10:16
                Still the same (invalid byte sequence for encoding "UTF8": 0x00). The good news is that I can insert bytes from my python code when using cursor.execute("INSERT INTO my_table VALUES(decode('005566', 'hex'));") but for performances issues (details on my github: github.com/jean553/massive-insert-postgresql-tornado), I would prefer to insert with copy_from().
– jean553
                Jun 12, 2017 at 12:08

To insert a binary with copy it is necessary to use the binary format which is not what you want. Use the extras.execute_values method

from psycopg2.extensions import Binary
binaries = [[Binary('\x00\x20')], [Binary('\x00\x30')]]
insert_query = 'insert into t (b) values %s'
psycopg2.extras.execute_values (
    cursor, insert_query, binaries, page_size=100
        

Thanks for contributing an answer to Stack Overflow!

  • Please be sure to answer the question. Provide details and share your research!

But avoid

  • Asking for help, clarification, or responding to other answers.
  • Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.