相关文章推荐
谦逊的红豆  ·  清华大学美术学院·  8 月前    · 
开朗的山楂  ·  Java™ SE Development ...·  1 年前    · 
登录

Textblob模块在集群中找不到

内容来源于 Stack Overflow,遵循 CC BY-SA 4.0 许可协议进行翻译与使用。IT领域专用引擎提供翻译支持

腾讯云小微IT领域专用引擎提供翻译支持

原文
Konserwa 修改于2022-01-12
  • 该问题已被编辑
  • 提问者: Konserwa
  • 提问时间: 2022-01-12 14:13

我正在使用Dataproc云进行火花计算。问题是我的工作节点无法访问textblob包。我怎么才能修好它?我在jupyter笔记本上用火花放电内核编写代码

代码错误:

PythonException: 
  An exception was thrown from the Python worker. Please see the stack trace below.
Traceback (most recent call last):
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 588, in main
    func, profiler, deserializer, serializer = read_udfs(pickleSer, infile, eval_type)
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 447, in read_udfs
    udfs.append(read_single_udf(pickleSer, infile, eval_type, runner_conf, udf_index=i))
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 249, in read_single_udf
    f, return_type = read_command(pickleSer, infile)
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 69, in read_command
    command = serializer._read_with_length(file)
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 160, in _read_with_length
    return self.loads(obj)
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 430, in loads
    return pickle.loads(obj, encoding=encoding)
ModuleNotFoundError: No module named 'textblob'

失败的示例代码:

data = [{"Category": 'Aaaa'},
        {"Category": 'Bbbb'},
        {"Category": 'Cccc'},
        {"Category": 'Eeeee'}
df = spark.createDataFrame(data)
def sentPackage(text):
    import textblob
    return TextBlob(text).sentiment.polarity