Spark's default spark.driver.maxResultSize is 1g, so a Spark job will sometimes fail with:

ERROR TaskSetManager: Total size of serialized results of 8113 tasks (1131.0 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)

One fix is to raise the limit before the SparkContext is created:

from pyspark import SparkConf, SparkContext
SparkContext.setSystemProperty('spark.driver.maxResultSize', '10g')
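A side note on the units in that message: Spark parses size strings such as "1g" with binary multiples, which is why the 1g default is reported as 1024.0 MB. A minimal plain-Python mimic of that suffix handling (illustrative only, not Spark's actual parsing code):

```python
# Illustrative re-implementation of Spark's binary size-suffix parsing:
# "1g" means 1024 MB, not 1000 MB.
def size_to_mb(size_str: str) -> float:
    units = {"k": 1.0 / 1024, "m": 1.0, "g": 1024.0, "t": 1024.0 * 1024}
    s = size_str.strip().lower()
    if s[-1] in units:
        return float(s[:-1]) * units[s[-1]]
    return float(s) / (1024 * 1024)  # bare number = bytes

limit_mb = size_to_mb("1g")
print(limit_mb)            # 1024.0, matching "(1024.0 MB)" in the error
print(1131.0 > limit_mb)   # True: the 1131.0 MB of results exceeded the limit
```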
There are three common ways to set it:

set by SparkConf:
conf.set("spark.driver.maxResultSize", "3g")

set by spark-defaults.conf:
spark.driver.maxResultSize 3g

set when calling spark-submit:
--conf spark.driver.maxResultSize=3g
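These three routes differ in priority: properties set directly on the SparkConf in application code take precedence over flags passed to spark-submit, which in turn take precedence over entries in spark-defaults.conf. A small plain-Python sketch of that merge order (an illustration of the documented precedence, not Spark's actual implementation):

```python
# Merge Spark properties in increasing priority order:
# spark-defaults.conf < spark-submit --conf < SparkConf.set in code.
def effective_conf(defaults_conf, submit_conf, code_conf):
    merged = {}
    for layer in (defaults_conf, submit_conf, code_conf):
        merged.update(layer)  # later (higher-priority) layers overwrite earlier ones
    return merged

conf = effective_conf(
    {"spark.driver.maxResultSize": "1g"},  # spark-defaults.conf
    {"spark.driver.maxResultSize": "3g"},  # spark-submit --conf
    {},                                    # nothing set in code
)
print(conf["spark.driver.maxResultSize"])  # "3g" wins over the defaults file
```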
Configuration can also be set through the SparkSession builder:

from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .appName("Python Spark SQL basic example") \
    .config("spark.memory.fraction", 0.8) \
    .config("spark.driver.maxResultSize", "3g") \
    .getOrCreate()
Recently I needed to union a thousand or more datasets, then cache() and count() the result. When the count() action ran, the Spark job failed with:
org.apache.spark.SparkException: Job aborted due to stage failure: Total size of serialized results of 16092 tasks (16.0 GB) is bigger than spark.driver.maxResultSize
from pyspark import SparkConf, SparkContext

sc_conf = SparkConf()
sc_conf.setMaster('spark://master:7077')
sc_conf.setAppName('my-app')
sc_conf.set('spark.executor.memory', '2g')         # e.g. raise executor memory
sc_conf.set('spark.driver.maxResultSize', '3g')    # raise the result-size limit
sc = SparkContext(conf=sc_conf)
20/09/15 15:21:32 ERROR scheduler.TaskSetManager: Total size of serialized results of 423 tasks (4.0 GB) is bigger than spark.driver.maxResultSize (4.0 GB)
Exception in thread "main" org.apache.spark.SparkException: ...
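Note the arithmetic here: 4.0 GB spread over 423 tasks is only about 9.7 MB per task, so even modest per-partition results overflow the limit once the task count is large. A quick check in plain Python:

```python
# Average serialized result size per task in the error above.
total_gb = 4.0
tasks = 423
avg_mb = total_gb * 1024 / tasks
print(round(avg_mb, 1))  # ~9.7 MB per task is enough to hit a 4g limit
```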
Today I ran into the spark.driver.maxResultSize exception. Increasing the value fixed it, but I don't fully understand the mechanism behind it; I'm recording it here in the hope of working out the underlying behavior later.
The exception reports a message like this:
Job aborted due to stage failure: Total size of serialized results of 3979 tasks (1024.2 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)
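The mechanism behind the check is that the driver's TaskSetManager keeps a running total of the serialized result sizes it fetches back from finished tasks, and aborts the stage as soon as that total exceeds spark.driver.maxResultSize. A rough plain-Python simulation of that bookkeeping (an illustration of the idea, not Spark's actual code):

```python
# Simulate the driver-side running total: each finished task reports the
# size of its serialized result, and the stage is aborted once the
# accumulated total crosses the limit.
def first_failing_task(task_result_sizes_mb, max_result_size_mb):
    """Return (task_index, total_mb) at the moment the limit is crossed,
    or (None, total_mb) if all results fit under the limit."""
    total = 0.0
    for i, size in enumerate(task_result_sizes_mb):
        total += size
        if total > max_result_size_mb:
            return i, total
    return None, total

# 3979 tasks averaging ~0.257 MB each add up to ~1024.2 MB, just over a
# 1024.0 MB (1g) limit -- matching the error above: no single task is
# large, but the stage still aborts near the very last result.
sizes = [1024.2 / 3979] * 3979
idx, total = first_failing_task(sizes, 1024.0)
```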
Pinpointed it to the sp...
From the Spark configuration documentation:

spark.app.name      (none)   The name of your application. This will appear in the UI and in log data.
spark.driver.cores  1        Number of cores to use for the driver process, only in cluster mode.