Collectives™ on Stack Overflow

Find centralized, trusted content and collaborate around the technologies you use most.

Learn more about Collectives

Teams

Q&A for work

Connect and share knowledge within a single location that is structured and easy to search.

Learn more about Teams

The following image is the ER diagram of the database:

My task is to create a report that includes the following:

  • the storeID
  • the store name
  • the number of unique players that have purchased a badge from the store
  • the number of unique players that have not purchased a badge from the store
  • the total money spent at the store
  • the most expensive badge a player has purchased at the store
  • the cheapest badge a player has purchased at the store
  • the average price of the items that have been purchased at the store.
  • But when I am trying to execute the following SQL command, I am getting an error saying: Error Code 1111. Invalid use of group function.

    use treasurehunters;
    select rpt_category.storeId,
    rpt_category.storeName,
    total_purchased_user,
    non_purchased_player,
    total_spent,
    expensive_badge,
    cheapest_badge
    avg_spent
    (select badgename as expensive_badge
    from badge 
    inner join purchase
    where cost = max(cost)) rpt_data_2
    inner join
    (select badgename as cheapest_badge
    from badge 
    inner join purchase
    where cost = min(cost)) rpt_data_3
    inner join
    (select distinct count(username) as total_purchased_user,
    storeid,
    storename,
    sum(cost) as total_spent,
    average(cost) as avg_spent
    from player 
    inner join purchase 
    inner join store 
    inner join badge
    on store.storeID = purchase.storeID and 
    purchase.username= player.username and
    purchase.badgeID = badge.badgeId) rpt_category
    inner join 
    (select count (username) as non_purchased_player,
    storeid
    from player 
    inner join purchase 
    on purchase.storeid != store.storeid and
    player.userername= purchase.uername ) rpt_data_1;
    

    Now, what can I do to get rid of that error.

    There are multiple issues with your derived tables. For example, in rpt_category you have distinct, aggregate functions, and no group by. Then there's the fact that you have all those joins with no ON clause... – Andrew Jun 8, 2022 at 17:05 When you say the number of unique players that have not purchased a badge from the store, should that count players thst exist but never purchased Anything from that store? Or only players that purchased Something, but never a badge? – MatBailie Jun 8, 2022 at 17:32 @MatBailie, it meant the count players that exist but never purchased Anything from that store. – Siddikur Jun 9, 2022 at 18:20

    The cause of your error is likely that you're implying a store-level grouping without explicitly grouping on that column with a GROUP BY clause. Therefore, you're attempting to extract aggregate results that are impossible at the table-level.

    You can probably resolve this by adding GROUP BY store.storeID in each of your subqueries. However, there's a lot more wrong with this query that makes it unfavorable to attempt to diagnose and resolve it.

    This is all doable in a single query / grouping. Here's what your query should look like:

    SELECT
        store.storeID,
        MAX(store.storeName) AS storeName,
        COUNT(DISTINCT purchase.username) AS total_purchased_user,
        MAX(player_count.players) - COUNT(DISTINCT purchase.username) AS non_purchased_user,
        SUM(purchase.cost) AS total_spent,
        AVG(purchase.cost) AS avg_spent,
        SUBSTRING(MIN(CONCAT(LPAD(purchase.cost, 11, '0'), badge.badgeName)), 12) AS cheapest_badge,
        SUBSTRING(MAX(CONCAT(LPAD(purchase.cost, 11, '0'), badge.badgeName)), 12) AS expensive_badge
    FROM store
    LEFT JOIN purchase ON store.storeID = purchase.storeID
    LEFT JOIN badge ON purchase.badgeID = badge.badgeId
    CROSS JOIN (SELECT COUNT(*) AS players FROM player) AS player_count
    GROUP BY store.storeID;
    

    What's happening here (working bottom-up):

  • GROUP BY store to ensure the results are aggregated by that, and all other metrics are calculated
  • FROM store / LEFT JOIN all other tables ensures we get metrics from every store, whether or not there are purchases for it
  • CROSS JOIN (SELECT COUNT(*) FROM players) this is a hack to give us a running total of all players that we can reference against store player-purchase counts to get the "didn't purchase" count simply and quickly, without any additional joins
  • COUNT(DISTINCT purchase.username) ensures that user counts are referenced from purchases. This also means we don't have to join on the players table in this main portion of the query to get purchase counts.
  • SUM / AVERAGE work like you had them
  • SUBSTRING(MIN(CONCAT... these calculations are using Scalar-Aggregate Reduction, a technique I invented to prevent the need for self-joining a query to get associated min/max values. There's more on this technique here: SQL Query to get column values that correspond with MAX value of another column?
  • Cheers!

    Thanks for contributing an answer to Stack Overflow!

    • Please be sure to answer the question. Provide details and share your research!

    But avoid

    • Asking for help, clarification, or responding to other answers.
    • Making statements based on opinion; back them up with references or personal experience.

    To learn more, see our tips on writing great answers.