Collectives™ on Stack Overflow

Find centralized, trusted content and collaborate around the technologies you use most.

Learn more about Collectives

Teams

Q&A for work

Connect and share knowledge within a single location that is structured and easy to search.

Learn more about Teams

with this query:

SELECT to_number('DEADBEEF', 'FMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX');

I get the following error:

ERROR:  invalid input syntax for type numeric: " "

What am I doing wrong?

Ways without dynamic SQL

There is no cast from hex numbers in text representation to a numeric type, but we can use bit(n) as waypoint. There are undocumented casts from bit strings (bit(n)) to integer types (int2, int4, int8) - the internal representation is binary compatible. Quoting Tom Lane:

This is relying on some undocumented behavior of the bit-type input converter, but I see no reason to expect that would break. A possibly bigger issue is that it requires PG >= 8.3 since there wasn't a text to bit cast before that.

integer for max. 8 hex digits

Up to 8 hex digits can be converted to bit(32) and then coerced to integer (standard 4-byte integer):

SELECT ('x' || lpad(hex, 8, '0'))::bit(32)::int AS int_val
FROM  (
   VALUES
      ('1'::text)
    , ('f')
    , ('100')
    , ('7fffffff')
    , ('80000000')     -- overflow into negative number
    , ('deadbeef')
    , ('ffffffff')
    , ('ffffffff123')  -- too long
   ) AS t(hex);
   int_val
------------
 2147483647
-2147483648
 -559038737

Postgres uses a signed integer type, so hex numbers above '7fffffff' overflow into negative integer numbers. This is still a valid, unique representation but the meaning is different. If that matters, switch to bigint; see below.

For more than 8 hex digits the least significant characters (excess to the right) get truncated.

4 bits in a bit string encode 1 hex digit. Hex numbers of known length can be cast to the respective bit(n) directly. Alternatively, pad hex numbers of unknown length with leading zeros (0) as demonstrated and cast to bit(32). Example with 7 hex digits and int or 8 digits and bigint:

SELECT ('x'|| 'deafbee')::bit(28)::int
     , ('x'|| 'deadbeef')::bit(32)::bigint;
  int4     | int8
-----------+------------
 233503726 | 3735928559

bigint for max. 16 hex digits

Up to 16 hex digits can be converted to bit(64) and then coerced to bigint (int8, 8-byte integer) - overflowing into negative numbers in the upper half again:

SELECT ('x' || lpad(hex, 16, '0'))::bit(64)::bigint AS int8_val
FROM  (
   VALUES
      ('ff'::text)
    , ('7fffffff')
    , ('80000000')
    , ('deadbeef')
    , ('7fffffffffffffff')
    , ('8000000000000000')     -- overflow into negative number
    , ('ffffffffffffffff')
    , ('ffffffffffffffff123')  -- too long
   ) t(hex);
       int8_val
---------------------
          2147483647
          2147483648
          3735928559
 9223372036854775807
-9223372036854775808

uuid for max. 32 hex digits

The Postgres uuid data type is not a numeric type. But it's the most efficient type in standard Postgres to store up to 32 hex digits, only occupying 16 bytes of storage. There is a direct cast from text to uuid (no need for bit(n) as waypoint), but exactly 32 hex digits are required.

SELECT lpad(hex, 32, '0')::uuid AS uuid_val
FROM  (
   VALUES ('ff'::text)
        , ('deadbeef')
        , ('ffffffffffffffff')
        , ('ffffffffffffffffffffffffffffffff')
        , ('ffffffffffffffffffffffffffffffff123') -- too long
   ) t(hex);
              uuid_val
--------------------------------------
 00000000-0000-0000-0000-0000000000ff
 00000000-0000-0000-0000-0000deadbeef
 00000000-0000-0000-ffff-ffffffffffff
 ffffffff-ffff-ffff-ffff-ffffffffffff
 ffffffff-ffff-ffff-ffff-ffffffffffff

As you can see, standard output is a string of hex digits with typical separators for UUID.

md5 hash

This is particularly useful to store md5 hashes:

SELECT md5('Store hash for long string, maybe for index?')::uuid AS md5_hash;
           md5_hash
--------------------------------------
 02e10e94-e895-616e-8e23-bb7f8025da42
  • What is the optimal data type for an MD5 field?
  • to_number doesn't understand hexadecimal.
  • X doesn't have any meaning in a to_number format string and anything without a meaning apparently means "skip a character".
  • I don't have an authoritative justification for (2), just empirical evidence:

    => SELECT to_number('123', 'X999');
     to_number 
    -----------
    (1 row)
    => SELECT to_number('123', 'XX999');
     to_number 
    -----------
    

    The documentation mentions how double quoted patterns are supposed to behave:

    In to_date, to_number, and to_timestamp, double-quoted strings skip the number of input characters contained in the string, e.g. "XX" skips two input characters.

    but the behavior of non-quoted characters that are not formatting characters appears to be unspecified.

    In any case, to_number isn't the right tool for converting hex to numbers, you want to say something like this:

    select x'deadbeef'::int;
    

    so perhaps this function will work better for you:

    CREATE OR REPLACE FUNCTION hex_to_int(hexval varchar) RETURNS integer AS $$
    DECLARE
        result  int;
    BEGIN
        EXECUTE 'SELECT x' || quote_literal(hexval) || '::int' INTO result;
        RETURN result;
    $$ LANGUAGE plpgsql IMMUTABLE STRICT;
    

    Then:

    => select hex_to_int('DEADBEEF');
     hex_to_int 
    ------------
     -559038737 **
    (1 row)
    

    ** To avoid negative numbers like this from integer overflow error, use bigint instead of int to accommodate larger hex numbers (like IP addresses).

    Haha, odd, to_number supports the silliest things such as Roman numerals, ordinal suffixes and whatnot -- when's the last time anyone needed that. :) But no hex?! – intgr Jan 21, 2019 at 15:21

    pg-bignum

    Internally, pg-bignum uses the SSL library for big numbers. This method has none of the drawbacks mentioned in the other answers with numeric. Nor is it slowed down by plpgsql. It's fast and it works with a number of any size. Test case taken from Erwin's answer for comparison,

    CREATE EXTENSION bignum;
    SELECT hex, bn_in_hex(hex::cstring) 
    FROM   (
       VALUES ('ff'::text)
            , ('7fffffff')
            , ('80000000')
            , ('deadbeef')
            , ('7fffffffffffffff')
            , ('8000000000000000')
            , ('ffffffffffffffff')
            , ('ffffffffffffffff123')
       ) t(hex);
             hex         |        bn_in_hex        
    ---------------------+-------------------------
     ff                  | 255
     7fffffff            | 2147483647
     80000000            | 2147483648
     deadbeef            | 3735928559
     7fffffffffffffff    | 9223372036854775807
     8000000000000000    | 9223372036854775808
     ffffffffffffffff    | 18446744073709551615
     ffffffffffffffff123 | 75557863725914323415331
    (8 rows)
    

    You can get the type to numeric using bn_in_hex('deadbeef')::text::numeric.

    Interesting. It's a pity that most hosted DBs only allow a limited list of approved extensions. – Erwin Brandstetter Dec 9, 2017 at 2:06 Unfortunately, this solution suffers from not being in the official packages (available as described in wiki.postgresql.org/wiki/Apt) which I find a little risky in terms of reliability, x-platform installability and so on. It is also not very documented. I think bignum really belongs into core! – John Frazer Mar 11, 2018 at 16:58

    Here is a version which uses numeric, so it can handle arbitrarily large hex strings:

    create function hex_to_decimal(hex_string text)
    returns text
    language plpgsql immutable as $pgsql$
    declare
        bits bit varying;
        result numeric := 0;
        exponent numeric := 0;
        chunk_size integer := 31;
        start integer;
    begin
        execute 'SELECT x' || quote_literal(hex_string) INTO bits;
        while length(bits) > 0 loop
            start := greatest(1, length(bits) - chunk_size);
            result := result + (substring(bits from start for chunk_size)::bigint)::numeric * pow(2::numeric, exponent);
            exponent := exponent + chunk_size;
            bits := substring(bits from 1 for greatest(0, length(bits) - chunk_size));
        end loop;
        return trunc(result, 0);
    $pgsql$;
    

    For example:

    =# select hex_to_decimal('ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff');
    32592575621351777380295131014550050576823494298654980010178247189670100796213387298934358015
                    There is some bug in the function causing it to return wrong output for some input. Example:  SELECT hex_to_decimal('ffffffff00000001000000000000000000000000ffffffffffffffffffffffff'); -- Returned: 115792089223836222093134215979047740691678064040465439117495607658390113746943 -- Expected: 115792089210356248762697446949407573530086143415290314195533631308867097853951
    – Joel Jacobson
                    Dec 5, 2022 at 8:05
    

    bigint version:

    create or replace function hex_to_bigint(hexval text) returns bigint as $$
    select
      (get_byte(x,0)::int8<<(7*8)) |
      (get_byte(x,1)::int8<<(6*8)) |
      (get_byte(x,2)::int8<<(5*8)) |
      (get_byte(x,3)::int8<<(4*8)) |
      (get_byte(x,4)::int8<<(3*8)) |
      (get_byte(x,5)::int8<<(2*8)) |
      (get_byte(x,6)::int8<<(1*8)) |
      (get_byte(x,7)::int8)
    from (
      select decode(lpad($1, 16, '0'), 'hex') as x
    ) as a;
    language sql strict immutable;
    

    int version:

    create or replace function hex_to_int(hexval text) returns int as $$
    select
      (get_byte(x,0)::int<<(3*8)) |
      (get_byte(x,1)::int<<(2*8)) |
      (get_byte(x,2)::int<<(1*8)) |
      (get_byte(x,3)::int)
    from (
      select decode(lpad($1, 8, '0'), 'hex') as x
    ) as a;
    language sql strict immutable;
    

    Here is another implementation:

    CREATE OR REPLACE FUNCTION hex_to_decimal3(hex_string text)
     RETURNS numeric
     LANGUAGE plpgsql
     IMMUTABLE
    AS $function$
    declare
        hex_string_lower text := lower(hex_string);
        i int;
        digit int;
        s numeric := 0;
    begin
        for i in 1 .. length(hex_string) loop
            digit := position(substr(hex_string_lower, i, 1) in '0123456789abcdef') - 1;
            if digit < 0 then
                raise '"%" is not a valid hexadecimal digit', substr(hex_string_lower, i, 1) using errcode = '22P02'; 
            end if;
            s := s * 16 + digit;
        end loop;
        return s;
    $function$;
    

    It is a straightforward one that works digit by digit, using the position() function to compute the numeric value of each character in the input string. Its benefit over hex_to_decimal2() is that it seems to be much faster (4x or so for md5()-generated hex strings).

    Here is a poper way to convert hex to string... then you can check whether it's a numric type or not

    SELECT convert_from('\x7468697320697320612076657279206C6F6E672068657820737472696E67','utf8')
    

    returns

    this is a very long hex string
    

    Here is a other version which uses numeric, so it can handle arbitrarily large hex strings:

    create OR REPLACE function hex_to_decimal2(hex_string text)
    returns text
    language plpgsql immutable as $pgsql$
    declare
        bits bit varying;
        result numeric := 0;
    begin
        execute 'SELECT x' || quote_literal(hex_string) INTO bits;
        while length(bits) > 0 loop
            result := result + (substring(bits from 1 for 1)::bigint)::numeric * pow(2::numeric, length(bits) - 1);
        bits := substring(bits from 2 for length(bits) - 1);
        end loop;
        return trunc(result, 0);
    $pgsql$;
    

    For example:

    =# select hex_to_decimal('ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff');
    32592575621351777380295131014550050576823494298654980010178247189670100796213387298934358015
    

    For example:

    =# select hex_to_decimal('5f68e8131ecf80000');
    110000000000000000000
    begin
      execute 'SELECT x' || quote_literal(substr($1::text,3)) into bits;
      bit_pos := length(bits) + 1;
      exponent := 0;
      while bit_pos >= 56 loop
        bit_pos := bit_pos - 56;
        result := result + substring(bits from bit_pos for 56)::bigint::numeric * pow(2::numeric, exponent);
        exponent := exponent + 56;
      end loop;
      while bit_pos >= 8 loop
        bit_pos := bit_pos - 8;
        result := result + substring(bits from bit_pos for 8)::bigint::numeric * pow(2::numeric, exponent);
        exponent := exponent + 8;
      end loop;
      return trunc(result);
    

    In a future PostgreSQL version, when/if Dean Rasheed's patch 0001-Add-non-decimal-integer-support-to-type-numeric.patch gets committed, this can be simplified:

    CREATE OR REPLACE FUNCTION numeric_from_bytes(bytea)
    RETURNS numeric
    LANGUAGE sql
    AS $$
    SELECT ('0'||right($1::text,-1))::numeric
            

    Thanks for contributing an answer to Stack Overflow!

    • Please be sure to answer the question. Provide details and share your research!

    But avoid

    • Asking for help, clarification, or responding to other answers.
    • Making statements based on opinion; back them up with references or personal experience.

    To learn more, see our tips on writing great answers.