Spark's default spark.driver.maxResultSize is 1g, so when you run a Spark job you may occasionally hit this error:

ERROR TaskSetManager: Total size of serialized results of 8113 tasks (1131.0 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)

One way to fix it is to raise the limit before the SparkContext is created:

from pyspark import SparkContext

# Must run before the SparkContext is created; changing it on an
# already-running context has no effect.
SparkContext.setSystemProperty('spark.driver.maxResultSize', '10g')
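To make that end-to-end, a quick sketch (the app name and the final check are illustrative additions, not from the original error report):

from pyspark import SparkContext

# Raise the limit first, then start the context.
SparkContext.setSystemProperty('spark.driver.maxResultSize', '10g')
sc = SparkContext(appName='max-result-size-demo')

# Verify the setting took effect.
print(sc.getConf().get('spark.driver.maxResultSize'))  # 10g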

There are a few other ways to set this parameter:

Set via SparkConf: conf.set("spark.driver.maxResultSize", "3g")
Set via spark-defaults.conf: spark.driver.maxResultSize 3g
Set when calling spark-submit: --conf spark.driver.maxResultSize=3g
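Putting the SparkConf route together, a minimal sketch (the app name is a placeholder):

from pyspark import SparkConf, SparkContext

conf = SparkConf() \
    .setAppName('my-app') \
    .set('spark.driver.maxResultSize', '3g')

# Driver settings must be in place when the context is constructed.
sc = SparkContext(conf=conf)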
The same kind of option can also be passed through the SparkSession builder, e.g.:

from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("Python Spark SQL basic example") \
    .config("spark.memory.fraction", 0.8) \
    .getOrCreate()
Recently I had a job that needed to union a thousand or more datasets, then cache() the result and call count(). When the count() action executed, the Spark job failed with: org.apache.spark.SparkException: Job aborted due to stage failure: Total size of serialized results of 16092 tasks (16.0 GB) is bigger than spark.driver.maxResultSize
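A rough sketch of that pattern (the loop bound and the per-piece DataFrames are made up for illustration):

from functools import reduce
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('union-demo').getOrCreate()

# Stand-ins for the real input datasets; the actual job had over a
# thousand of these.
parts = [spark.range(10).toDF('v') for _ in range(1000)]

# Each union stacks more partitions, so the final DataFrame can end
# up with tens of thousands of tasks.
df = reduce(lambda a, b: a.union(b), parts)

df.cache()
df.count()  # every task still sends a serialized result to the driver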
Setting options through SparkConf looks like this:

from pyspark import SparkConf, SparkContext

sc_conf = SparkConf()
sc_conf.setMaster('spark://master:7077')
sc_conf.setAppName('my-app')
sc_conf.set('spark.executor.memory', '2g')
sc = SparkContext(conf=sc_conf)

Note that even a raised limit can still be hit, e.g.:

20/09/15 15:21:32 ERROR scheduler.TaskSetManager: Total size of serialized results of 423 tasks (4.0 GB) is bigger than spark.driver.maxResultSize (4.0 GB)
Today I ran into the spark.driver.maxResultSize exception myself. Increasing the value fixed it, but I don't really understand the mechanism behind it yet, so I'm noting it down here and hope to work out what happens under the hood later. The exception reports: Job aborted due to stage failure: Total size of serialized results of 3979 tasks (1024.2 MB) is bigger than spark.driver.maxResultSize (1024.0 MB) I narrowed it down to sp...
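As far as I can tell, the mechanism is this: every finished task serializes its result (the actual rows for an action like collect(), or at least its status and accumulator updates) and sends it back to the driver; the TaskSetManager keeps a running total of those sizes and aborts the job once the total exceeds spark.driver.maxResultSize, which protects the driver from running out of memory. So besides raising the limit, the other fix is to stop pulling large results onto the driver at all. A sketch (the output path is a placeholder):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('avoid-big-collect').getOrCreate()
df = spark.range(100000000)

# Risky: collect() ships every row to the driver and will trip
# spark.driver.maxResultSize on a large result.
# rows = df.collect()

# Safer: keep the result distributed and write it out ...
df.write.mode('overwrite').parquet('/tmp/big_result')

# ... or pull it back a partition at a time.
for row in df.limit(10).toLocalIterator():
    print(row)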