相关文章推荐
买醉的闹钟  ·  Postgresql中JSONB和text的 ...·  1 周前    · 
很酷的鸡蛋  ·  这些 hook ...·  1 年前    · 
坚强的香蕉  ·  .net ...·  1 年前    · 
Collectives™ on Stack Overflow

Find centralized, trusted content and collaborate around the technologies you use most.

Learn more about Collectives

Teams

Q&A for work

Connect and share knowledge within a single location that is structured and easy to search.

Learn more about Teams

How to prevent 'invalid input syntax for type json' in Postgres, when records contain a mix of json or strings

Ask Question

I have a text column that contains JSON and also plan text. I want to convert it to JSON, and then select a particular property. For example:

user_data
_________
{"user": {"name": "jim"}}
{"user": {"name": "sally"}}
some random data string

I've tried:

select user_data::json#>'{user,name}' from users

I get:

ERROR:  invalid input syntax for type json
DETAIL:  Token "some" is invalid.
CONTEXT:  JSON user_data, line 1: some...

Is it possible to prevent this?

If you want to skip the rows with invalid JSON, you must first test if the text is valid JSON. You can do this by creating a function which will attempt to parse the value, and catch the exception for invalid JSON values.

CREATE OR REPLACE FUNCTION is_json(input_text varchar) RETURNS boolean AS $$
  DECLARE
    maybe_json json;
  BEGIN
    BEGIN
      maybe_json := input_text;
    EXCEPTION WHEN others THEN
      RETURN FALSE;
    RETURN TRUE;
$$ LANGUAGE plpgsql IMMUTABLE;

When you have that, you could use the is_json function in a CASE or WHERE clause to narrow down the valid values.

-- this can eliminate invalid values
SELECT user_data::json #> '{user,name}'
FROM users WHERE is_json(user_data);
-- or this if you want to fill will NULLs
SELECT
    WHEN is_json(user_data)
      THEN user_data::json #> '{user,name}'
FROM users;
                I still got errors with maybe_json := input_text; so I used PERFORM input_text::json #>> '{}'; instead and everything works.
– Splitframe
                Jun 22, 2022 at 11:21
create or replace function is_json(text)
returns boolean language plpgsql immutable as $$
begin
    perform $1::json;
    return true;
exception
    when invalid_text_representation then 
        return false;
end $$;

Test:

with users(user_data) as (
values
    ('{"user": {"name": "jim"}}'),
    ('not json'),
    ('{"user": {"name": "sally"}}'),
    ('also not json')
select user_data::json#>'{user,name}' as name
from users
where is_json(user_data);
---------
 "jim"
 "sally"
(2 rows)

Others have already suggested ways to check if the JSON is valid, but I feel that rather than check that, why not already cast it? I use this function (JSONB, but you can easily change it to JSON):

CREATE OR REPLACE FUNCTION safe_cast_to_jsonb(input TEXT) RETURNS JSONB AS
DECLARE
    output JSONB DEFAULT NULL;
BEGIN
    BEGIN
        output := input::JSONB;
    EXCEPTION
        WHEN OTHERS THEN RAISE NOTICE 'INVALID JSONB';
        RETURN NULL;
    RETURN output;
$$ LANGUAGE plpgsql;

My solution is to do a preselect for solving this issue.

Let's say we got a given table users with columns id and user_data. user_data is declared as a text field, it should contain {firstName, lastName} usually.

  • Note that Postgres JSON operations work fine, if a given field is NULL - so you could omit the WHERE in the following statement for still selecting the rows which contain an invalid JSON in user_data (while the result set would only contain the id field in such a row)
  • user_data ~ '^{.*}$' ("curly brace check") checks if JSON in contained in the user_data field. Danger! Bear in mind that this check is incomplete. Incorrect JSON such as '{"key1": "value1", "key2"}' is still recognised as valid
  • Consider impacts on database performance
  • Gives as an example:

    WITH users_json_data AS (
      SELECT 
        CASE WHEN user_data ~ '^{.*}$' THEN user_data::json END AS user_json
      FROM users
    SELECT id, users_json_data.user_json->>firstName, users_json_data.user_json->>lastName
    FROM users
    INNER JOIN users_json_data USING (id)
    WHERE users_json_data.user_json IS NOT NULL
            

    Thanks for contributing an answer to Stack Overflow!

    • Please be sure to answer the question. Provide details and share your research!

    But avoid

    • Asking for help, clarification, or responding to other answers.
    • Making statements based on opinion; back them up with references or personal experience.

    To learn more, see our tips on writing great answers.