Collectives™ on Stack Overflow

Find centralized, trusted content and collaborate around the technologies you use most.

Learn more about Collectives

Teams

Q&A for work

Connect and share knowledge within a single location that is structured and easy to search.

Learn more about Teams

I'm trying to execute the test query like this:

SELECT COUNT(CASE WHEN name IN (SELECT name FROM requiredProducts) THEN name END)
FROM myProducts

which throws the following exception:

java.lang.ClassCastException:
org.apache.spark.sql.execution.datasources.LogicalRelation cannot be cast to
org.apache.spark.sql.execution.SparkPlan

I have a suggestion that IN operator can not be used in CASE WHEN. Is it really so? Spark documentation is silent about this.

In my experience, I've found that Spark doesn't handle all normal SQL queries. However, when redesigned to use the Spark SQL functions on top of DataFrames, they work - spark.apache.org/docs/2.0.2/api/java/org/apache/spark/sql/… – Dan W Apr 12, 2019 at 16:40

The IN operator using a subquery does not work in a projection regardless of whether it is contained in a CASE WHEN, it will only work in filters. It works fine if you specify values in the IN clause directly rather than using a subquery.

I am not sure how to generate the exact exception you got above, but when I attempt to run a similar query in Spark Scala, it returns a more descriptive error:

org.apache.spark.sql.AnalysisException: IN/EXISTS predicate sub-queries can only be used in a Filter: Project [CASE WHEN agi_label#5 IN (list#96 []) THEN 1 ELSE 0 END AS CASE WHEN (agi_label IN (listquery())) THEN 1 ELSE 0 END#97]

I have run into this issue in the past. Your best bet is probably to restructure it to use a left join to requiredProducts and then check for a null in the case statement. For example, something like this might work:

SELECT COUNT(CASE WHEN rp.name is not null THEN mp.name END)
FROM myProducts mp 
  LEFT JOIN requiredProducts rp ON mp.name = rp.name
        

Thanks for contributing an answer to Stack Overflow!

  • Please be sure to answer the question. Provide details and share your research!

But avoid

  • Asking for help, clarification, or responding to other answers.
  • Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.