It looks like Spark SQL treats "like" queries as case sensitive, right?
spark.sql("select distinct status, length(status) from table")
returns
Active|6
spark.sql("select distinct status from table where status like '%active%'")
returns no rows
spark.sql("select distinct status from table where status like '%Active%'")
Active
Posted on 2018-11-28 18:28:14
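Yes, LIKE in Spark SQL is case sensitive. Many RDBMSs also compare strings case-sensitively by default. If you want a case-insensitive match, try rlike, or convert the column to upper or lower case before comparing.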
scala> val df = Seq(("Active"),("Stable"),("Inactive")).toDF("status")
df: org.apache.spark.sql.DataFrame = [status: string]

scala> df.createOrReplaceTempView("tbl")

scala> df.show
+--------+
|  status|
+--------+
|  Active|
|  Stable|
|Inactive|
+--------+

scala> spark.sql(""" select status from tbl where status like '%Active%' """).show
+------+
|status|
+------+
|Active|
+------+

scala> spark.sql(""" select status from tbl where lower(status) like '%active%' """).show
+--------+
|  status|
+--------+
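
For reference, here is a minimal sketch of the two workarounds mentioned in the answer side by side. It assumes a local SparkSession and reuses the same three-row sample data; the value names (viaLower, viaRlike, viaApi) are made up for illustration, and the `(?i)` flag relies on rlike using Java regular expressions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lower}

// In spark-shell a session already exists; getOrCreate() simply reuses it.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("case-insensitive-like") // hypothetical app name
  .getOrCreate()
import spark.implicits._

// Same sample data as in the answer above.
val df = Seq("Active", "Stable", "Inactive").toDF("status")
df.createOrReplaceTempView("tbl")

// Option 1: normalize the column to lower case, then use a lower-case pattern.
val viaLower = spark.sql(
  "select status from tbl where lower(status) like '%active%'")

// Option 2: rlike with the (?i) inline flag, which makes the Java regex
// case insensitive. rlike matches anywhere in the string, so no % wildcards.
val viaRlike = spark.sql(
  "select status from tbl where status rlike '(?i)active'")

// The same idea as option 1, written with the DataFrame API instead of SQL.
val viaApi = df.filter(lower(col("status")).like("%active%"))

// Each of the three queries should keep the rows "Active" and "Inactive".
viaLower.show()
viaRlike.show()
viaApi.show()
```

Lower-casing the column is the more portable choice if the same SQL also has to run on another engine, while rlike is handier when the pattern is already a regular expression.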