相关文章推荐
年轻有为的海龟  ·  使用Python ...·  2 月前    · 
坚强的韭菜  ·  轻信保健品能治病 ...·  1 周前    · 
一身肌肉的烤土司  ·  thinkphp ...·  1 年前    · 
豁达的地瓜  ·  Amazon.com·  1 年前    · 

Hello everyone, I'm having trouble understanding how to build a filter request on a JSONB field, which is a list of objects.

I implemented it in SQL, but I don't understand how to do it through sqlalchemy at all.

I will be very grateful if you help me to make a correct request.

Example table:

create table test_table_json
    id   serial
        constraint test_table_json_pk
            primary key,
    data jsonb
create unique index test_table_json_id_uindex
    on test_table_json (id);

Example data:

INSERT INTO public.test_table_json (id, data) VALUES (1, '[{"code": "work_id", "value": {"old_value": 10, "new_value": 20}}, {"code": "price", "value": {"old_value": 100, "new_value": 200}}]')
INSERT INTO public.test_table_json (id, data) VALUES (2, '[{"code": "work_id", "value": {"old_value": 20, "new_value": 30}}, {"code": "name", "value": {"old_value": "Alex", "new_value": "Stefan"}}]')

My query:

SELECT * FROM test_table_json
WHERE
      '{example_code}'::text[] <@ (ARRAY(
        SELECT elem->>'code' FROM jsonb_array_elements(data) elem))
      '{Stefan}'::text[] <@ (ARRAY(
        SELECT elem->'value'->>'new_value' FROM jsonb_array_elements(data) elem))

Pseudocode of what's happening:

["Stefan"] IN [i.value.new_value for i in data]
["example_code"] IN [i.code for i in data]
      

okey doke so this is a common kind of question so I will go through some of the top level things people need to know before showing how to do this query:

  • First thing, if you have a textual SQL query that does what you need, you might not need to convert it at all. SQLAlchemy runs textual queries just fine, and you can link them to ORM objects too. So in that sense, if your SQL works, you're done. Background on sticking with text is at https://docs.sqlalchemy.org/en/14/core/tutorial.html#sqlexpression-text and https://docs.sqlalchemy.org/en/14/orm/queryguide.html#getting-orm-results-from-textual-and-core-statements
  • if you haven't already, I'd recommend spending a good chunk of time with t…
  • okey doke so this is a common kind of question so I will go through some of the top level things people need to know before showing how to do this query:

  • First thing, if you have a textual SQL query that does what you need, you might not need to convert it at all. SQLAlchemy runs textual queries just fine, and you can link them to ORM objects too. So in that sense, if your SQL works, you're done. Background on sticking with text is at https://docs.sqlalchemy.org/en/14/core/tutorial.html#sqlexpression-text and https://docs.sqlalchemy.org/en/14/orm/queryguide.html#getting-orm-results-from-textual-and-core-statements
  • if you haven't already, I'd recommend spending a good chunk of time with the tutorial at https://docs.sqlalchemy.org/en/14/tutorial/index.html . this tutorial compared to our previous tutorials is long. But the point of it is that it is the only tutorial you'd need , providing an understanding of what's going on enough that everything else in the docs is grounded in what it all means.
  • OK so the query, let's assume we want it in fancy SQLAlchemy expression language. boilerplate:
  • from
    
    
    
    
        
     sqlalchemy import cast
    from sqlalchemy import Column
    from sqlalchemy import func
    from sqlalchemy import Integer
    from sqlalchemy import literal
    from sqlalchemy import or_
    from sqlalchemy import select
    from sqlalchemy import Text
    from sqlalchemy import type_coerce
    from sqlalchemy.dialects.postgresql import ARRAY
    from sqlalchemy.dialects.postgresql import array
    from sqlalchemy.dialects.postgresql import JSONB
    from sqlalchemy.orm import declarative_base
    Base = declarative_base()
    class Test(Base):
        __tablename__ = 'test_table_json'
        id = Column(Integer, primary_key=True)
        data = Column(JSONB)

    a. selecting FROM a SQL function. SQLAlchemy has recently gotten a lot better at this, background on SQL functions including PG's fancy JSON functions is in yes the tutorial at https://docs.sqlalchemy.org/en/14/tutorial/data_select.html#working-with-sql-functions . The form we have here looks like a column valued function. so like:

    elem = func.jsonb_array_elements(Test.data, type_=JSONB).column_valued("elem")
    

    looks good:

    >>> elem = func.jsonb_array_elements(Test.data, type_=JSONB).column_valued("elem")
    >>> print(select(elem))
    SELECT elem 
    FROM jsonb_array_elements(test_table_json.data) AS elem

    we now need the PG "astext" operator ->>. This is part of the JSON/JSONB operators for Postgresql and is mentioned here, so we can get that like:

    >>> print(select(elem['code'].astext))
    SELECT elem ->> :elem_1 AS anon_1 
    FROM jsonb_array_elements(test_table_json.data) AS elem

    b. wrapping that to an ARRAY. the PG ARRAY literal is avilable from postgresql.array

    >>> print(array([select(elem['code'].astext)]))
    <stdin>:1: SAWarning: implicitly coercing SELECT object to scalar subquery; please use the .scalar_subquery() method to produce a scalar subquery.
    ARRAY[(SELECT elem ->> %(elem_1)s AS anon_1 
    FROM jsonb_array_elements(test_table_json.data) AS elem)]

    OK lets do what the warning says:

    >>> print(array([select(elem['code'].astext).scalar_subquery()]))
    ARRAY[(SELECT elem ->> %(elem_1)s AS anon_1 
    FROM jsonb_array_elements(test_table_json.data) AS elem)]

    c. left hand literal expressions with the "::" operator, that's SQL CAST. we can use CAST directly to get this (found an issue in my tutorial, that the current CAST tutorial is in the old tutorial only at the moment, that needs to be migrated):

    >>> print(cast("example code", ARRAY(Text)).compile(dialect=postgresql.dialect()))
    CAST(%(param_1)s::TEXT[] AS TEXT[])

    d. the docs aren't good on this one. The PG <@ operator. This is called "contained by", and it's here: https://docs.sqlalchemy.org/en/14/dialects/postgresql.html?highlight=jsonb#sqlalchemy.dialects.postgresql.ARRAY.Comparator.contained_by however this doc should show the operator. anyway it looks like this:

    >>> print(
    ...     cast("example code", ARRAY(Text)).
    ...     contained_by(array([select(elem['code'].astext).scalar_subquery()]))
    ... )
    CAST(:param_1 AS ARRAY) <@ ARRAY[(SELECT elem ->> %(elem_1)s AS anon_1 
    FROM jsonb_array_elements(test_table_json.data) AS elem)]

    e. OK the whole thing, which also has the two-level JSONB element and the OR, see below

    from sqlalchemy import Column
    from sqlalchemy import func
    from sqlalchemy import Integer
    from sqlalchemy import or_
    from sqlalchemy import select
    from sqlalchemy import Text, cast
    from sqlalchemy.dialects.postgresql import ARRAY
    from sqlalchemy.dialects.postgresql import array
    from sqlalchemy.dialects.postgresql import JSONB
    from sqlalchemy.orm import declarative_base
    Base = declarative_base()
    class Test(Base):
        __tablename__ = 'test_table_json'
        id = Column(Integer, primary_key=True)
        data = Column(JSONB)
    elem = func.jsonb_array_elements(Test.data, type_=JSONB).column_valued("elem")
    stmt = select(Test).where(
        or_(
            cast("example code", ARRAY(Text)).contained_by(
                array([select(elem['code'].astext).scalar_subquery()])
            cast("stefan", ARRAY(Text)).contained_by(
                array([select(elem['code']['new_value'].astext).scalar_subquery()])
    print(stmt)

    this one had a lot of special APIs to use, so for now I'd advise just going through https://docs.sqlalchemy.org/en/14/dialects/postgresql.html to see what's available.

    @zzzeek, it doesn't work as expected.

  • bad contained_by(array([select(elem['code'].astext).scalar_subquery()
    Working RAW:
  • Generated RAW:

    additional square quotes are created after ARRAY, although they are not actually needed.

  • cast("example code", ARRAY(Text))
  • Generated param: ['s', 't', 'e', 'f', 'a', 'n']
    When expected: '{stefan}'

    you can solve it like this, but it seems to me that this is not the best option:
    cast(text("'{stefan}'"), ARRAY(Text))

    oh right array() literal doesn't do subqueries.

    just use cast() as needed and for the parameters, those should be lists since that's array.

    stmt = select(Test).where(
        or_(
            cast(["example code"], ARRAY(Text)).contained_by(
                array([select(elem['code'].astext).scalar_subquery()])
            cast(["stefan"], ARRAY(Text)).contained_by(
                cast(select(elem['code']['new_value'].astext).scalar_subquery(), ARRAY(Text))
    
    stmt = select(Test).where(
        or_(
            cast(["example code"], ARRAY(Text)).contained_by(
                array([select(elem['code'].astext).scalar_subquery()])
            cast(["stefan"], ARRAY(Text)).contained_by(
                func.array(select(elem['code']['new_value'].astext).scalar_subquery())