Collectives™ on Stack Overflow

Find centralized, trusted content and collaborate around the technologies you use most.

Learn more about Collectives

Teams

Q&A for work

Connect and share knowledge within a single location that is structured and easy to search.

Learn more about Teams

How can I count easily the number of rows where a particular column is true and the number where it is false ?

I can't (or can I ?) run the query with count() because I'm embedding this count in a having() clause, like :

.having(func.count(Question.accepted) >
        func.count(not_(Question.accepted)))

but with the above way, the function counts every line on both sides of the inequality.

I tried something like this

.having(func.count(func.if_(Question.accepted, 1, 0)) >
        func.count(func.if_(Question.accepted, 0, 1)))

But I get an error

function if(boolean, integer, integer) does not exist

(seems it doesn't exist in postgresql).

How can I count easily the number of rows where column is true and false ?

Using aggregate functions in a HAVING clause is very much legal, since HAVING eliminates group rows. Conditional counting can be achieved either by using the property that NULLs don't count:

count(expression) ... number of input rows for which the value of expression is not null

or if using PostgreSQL 9.4 or later, with the aggregate FILTER clause:

count(*) FILTER (WHERE something > 0)

You could also use a sum of ones (and zeros).

PostgreSQL >= 9.4 and SQLAlchemy >= 1.0.0

Using a filtered aggregate function:

.having(func.count(1).filter(Question.accepted) >
        func.count(1).filter(not_(Question.accepted)))

Older PostgreSQL and/or SQLAlchemy

The SQL analog for "if" is either CASE expression or in this case nullif() function. Both of them can be used together with the fact that NULLs don't count:

from sqlalchemy import case
.having(func.count(case([(Question.accepted, 1)])) >
        func.count(case([(not_(Question.accepted), 1)])))
.having(func.count(func.nullif(Question.accepted, False)) >
        func.count(func.nullif(Question.accepted, True)))

Using nullif() can be a bit confusing as the "condition" is what you don't want to count. You could device an expression that would make the condition more natural, but that's left for the reader. These 2 are more portable solutions, but on the other hand the FILTER clause is standard, though not widely available.

You can use this query:

Session.query(func.sum(case([(Question.accepted == True, 1)], else_=0).label('accepted_number'))

And the same column will be for False value, but with False in condition

Or, you can use window function:

Session.query(func.count(Question.id).over(partition_by=Question.accepted), Question.accepted).all()

The result will contain two rows (if there are only two possible values in Question.accepted), where the first column is the number of values, and the second is the values of 'accepted' column.

Most flavors of SQL store booleans as 1-bit integers, and calling SUM on them properly gives you the number of positive booleans/True rows. In your case, you can simply do:

func.sum(Question.accepted)

If your SQL engine returns a SUM of booleans as a boolean (not sure there is any), you can use expression function cast() to force it into an integer instead of using func.count(case([(Question.accepted, 1)])):

func.sum(sqlalchemy.cast(Question.accepted, sqlalchemy.Integer))

Finally, if you want the sum as an integer on the Python side the expression function type_coerce() does the trick:

sqlalchemy.type_coerce(func.sum(Question.accepted), sqlalchemy.Integer)
                What SQL "overhead"? That seems like it should be explained a bit more. Note that the SUM() trick is mentioned and linked to in "You could also use a sum of ones (and zeros)." The SUM() trick has one edge though: it produces a different result for 0 rows.
– Ilja Everilä
                Apr 20, 2020 at 15:49
                My bad, I would have thought the case statement would have added some processing time in SQL, but after a few tests in MySQL it seems like the impact is negligible.  Despite that, I still think it’s cleaner to not clutter up the code with a useless case statement when sums on booleans work just well and can be gotten thanks to type_coerce.
– Gary Mialaret
                Apr 21, 2020 at 13:13
                Agreed, the whole COUNT(CASE ... ) thing is not easy on the eyes. Personally I do quite like the <aggregate> FILTER (WHERE <predicate>) though, it's rather flexible, though not broadly supported.
– Ilja Everilä
                Apr 21, 2020 at 15:47
        

Thanks for contributing an answer to Stack Overflow!

  • Please be sure to answer the question. Provide details and share your research!

But avoid

  • Asking for help, clarification, or responding to other answers.
  • Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.