Collectives™ on Stack Overflow

Find centralized, trusted content and collaborate around the technologies you use most.

Learn more about Collectives

Teams

Q&A for work

Connect and share knowledge within a single location that is structured and easy to search.

Learn more about Teams

The other day, I gave an answer to this question but then other user solved that problem with sum + case conditional statement to add one edge condition in result. So, the question came to my mind, how statement sum(case when jobname = 'Analyst' then 1 else 0 end) in the below query works

select d.*
from (select deptno,
         sum(case when jobname = 'Analyst' then 1 else 0 end) as numAnalysts
      from employees
      group by deptno
      order by numAnalysts asc
where rownum = 1;`

and return the number of employees over a department. Also, I would like to understand the performance of this query.

Before posting this question, I read this, this and this but still didn't get how this works.

the first step to understanding performance of a query, is to use explain on the query. dev.mysql.com/doc/refman/5.0/en/using-explain.html. as for how the case conditional works, it is simply an if / else style construct. when you sum a field, you total all the values of the field. in the above, instead of summing a field, you are essentially summing the result of a function that operates on the field, and returns either 1 or 0. – pala_ Mar 30, 2015 at 0:02 rownum is the part of outer query and that is used to return the row with minimum no of analysts in department. That query is the answer for this – codiacTushki Mar 30, 2015 at 0:16 rownum is oracle. not sure why this is tagged mysql tbh, but its basically equivalent to 'limit 1' in this instance. in which case my comment above about mysql's 'explain' is pretty useless. – pala_ Mar 30, 2015 at 0:18
  select deptno,
         sum(case when jobname = 'Analyst' then 1 else 0 end) as numAnalysts
  from employees
  group by deptno

This is a simple aggregation query, really. What the query is doing is:

  • Look at each row in employees
  • If jobname is 'Analyst' then assign the value of 1 (this is the case statement. Otherwise, assign a value of0`.
  • Aggregate by department, summing the value just calculated. This has the effect of counting the number of analysts.
  • case is an expression that returns a value. The sum() is simply adding up that value for each group.

    Lets try to split the problem in 2 parts.

    First, let suppose you want a field saying if the jobname is 'Analyst' or not.

    SELECT
      deptno,
      CASE WHEN jobname = 'Analyst' THEN 1 ELSE 0 END AS IsAnalyst
      employees
    

    This query is going to return 0 for all the jobname that are no 'Analyst' and 1 for all the ones that are 'Analyst'

    At this point you will have something like this

    deptno IsAnalyst
    1      1
    1      0
    1      0
    2      0
    2      1
    2      1
    2      1
    2      0
    

    Second, You want to summarize this information by department

    SELECT
      deptno,
      SUM(CASE WHEN jobname = 'Analyst' THEN 1 ELSE 0 END) AS numAnalysts
      employees
    GROUP BY
      deptno
    

    You are applying a sum to all those value and grouping by deptno.

    At this point (I removed the order by from the query to simplify it) you will have the following output

    deptno numAnalysts
    1      1
    2      3
    

    I think that an example is worth a thousand words

    Hope this helps

    Thanks for contributing an answer to Stack Overflow!

    • Please be sure to answer the question. Provide details and share your research!

    But avoid

    • Asking for help, clarification, or responding to other answers.
    • Making statements based on opinion; back them up with references or personal experience.

    To learn more, see our tips on writing great answers.