File "build\bdist.win32\egg\pymongo\cursor.py", line 703, in next
File "build\bdist.win32\egg\pymongo\cursor.py", line 679, in _refresh
File "build\bdist.win32\egg\pymongo\cursor.py", line 628, in __send_message
File "build\bdist.win32\egg\pymongo\helpers.py", line 95, in _unpack_response
pymongo.errors.OperationFailure: cursor id '1236484850793' not valid at server
What does this error mean?
5 answers
0 votes
Maybe your cursor timed out on the server. To see whether that is the problem, try setting timeout=False:
for doc in coll.find(timeout=False):
    ...  # process doc here
See http://api.mongodb.org/python/1.6/api/pymongo/collection.html#pymongo.collection.Collection.find
If it is a timeout problem, another possible solution is to set batch_size (see the other answers).
0 votes
Setting timeout=False is dangerous and should never be used, because the cursor can stay open on the server for an unlimited time, which will affect system performance. The docs specifically mention the need to close the cursor manually.
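If you do end up using timeout=False anyway, closing the cursor yourself could look roughly like the sketch below. The helper name process_all and the handle callback are made up for illustration; the only thing it relies on is that the cursor is iterable and has a close() method, which PyMongo cursors do.

```python
from contextlib import closing


def process_all(cursor, handle):
    """Feed every document from `cursor` to `handle`, then close the cursor.

    `process_all` and `handle` are hypothetical names, not part of PyMongo.
    """
    # closing() calls cursor.close() on exit, even if handle() raises,
    # so a cursor without a server-side timeout is not left open.
    with closing(cursor):
        for doc in cursor:
            handle(doc)
```

You would call it as process_all(coll.find(timeout=False), do_something), where coll and do_something are whatever you already have in your own code.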
Setting batch_size to a small number will work, but it creates a big latency problem, because the client has to hit the DB far more often than necessary.
For example, 5M docs with a small batch size will take hours to retrieve, while a default batch_size returns the same data in several minutes.
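To get a feel for why small batches hurt, here is a rough sketch that only counts how many times the client has to go back to the server. The batch sizes 10 and 1000 are arbitrary illustration values, not PyMongo defaults, and round_trips is a hypothetical helper.

```python
def round_trips(total_docs, batch_size):
    """Number of batches needed to fetch total_docs documents."""
    # Integer ceiling division: a final partial batch still costs one trip.
    return (total_docs + batch_size - 1) // batch_size


# 5M documents, as in the example above (batch sizes are illustrative).
small = round_trips(5000000, 10)    # 500000 round trips
large = round_trips(5000000, 1000)  # 5000 round trips
```

Every one of those trips pays the network and server latency again, so the total overhead scales with the trip count rather than with the amount of data.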
In my solution, sorting the cursor is mandatory, so that skip resumes at the same position on every retry:
import pymongo

done = False
skip = 0
while not done:
    # Re-issue the query, skipping the documents already processed.
    cursor = coll.find()
    cursor.sort(indexed_parameter)  # recommended: a timestamp or other sequential field
    cursor.skip(skip)
    try:
        for doc in cursor:
            skip += 1
            do_something()
        done = True
    except pymongo.errors.OperationFailure as e:
        msg = str(e)
        # Retry only if the cursor expired on the server; re-raise anything else.
        if not (msg.startswith("cursor id") and msg.endswith("not valid at server")):
            raise
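0 votes
Setting timeout=False is a very bad practice. A better way to get rid of the cursor id timeout exception is to estimate how many documents your loop can process in 10 minutes and come up with a conservative batch size. That way the MongoDB client (PyMongo in this case) has to query the server every once in a while, whenever the documents from the previous batch are used up. This keeps the cursor active on the server, and you are still covered by the 10-minute timeout protection.
Here is how you set the batch size for a cursor: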
for doc in coll.find().batch_size(30):
    do_time_consuming_things()
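As a rough sketch of the estimate described above: measure (or guess) how long one document takes to process, work out how many fit in the 10-minute window, and keep only a fraction of that as the batch size. The function name, the safety_factor parameter and its default of 0.5 are my own assumptions, not anything PyMongo provides.

```python
def conservative_batch_size(seconds_per_doc, timeout_seconds=600, safety_factor=0.5):
    """Estimate a batch size that is consumed well before the cursor times out.

    Hypothetical helper: safety_factor=0.5 is an arbitrary margin, and
    timeout_seconds=600 is the 10-minute timeout mentioned above.
    """
    if seconds_per_doc <= 0:
        raise ValueError("seconds_per_doc must be positive")
    # How many documents the loop could process within the timeout window.
    docs_within_timeout = timeout_seconds / seconds_per_doc
    # Keep only a fraction of that, and never go below one document.
    return max(1, int(docs_within_timeout * safety_factor))


# If each document takes about 10 seconds to process:
batch = conservative_batch_size(10)  # 30
```

The result would then go into coll.find().batch_size(batch). If your per-document time varies a lot, base the estimate on the slow cases rather than the average.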
0 votes
You should choose a low batch_size value to fix this issue:
col.find({}).batch_size(10)
See the following answer.