
Got Exception Error “Exception in thread Thread-13 (most likely raised during interpreter shutdown)”


I wrote a simple script that uses threads to retrieve data from a service.

__author__ = 'Igor'
import requests
import time
from multiprocessing.dummy import Pool as ThreadPool
ip_list = []
good_ip_list = []
bad_ip_list = []
progress = 0
with open('/tmp/ip.txt') as f:
    ip_list = f.read().split()
def process_request(ip):
    global progress
    progress += 1
    if progress % 10000 == 0:
        print 'Processed ip:', progress, '...'
    r = requests.get('http://*****/?ip='+ip, timeout=None)
    if r.status_code == 200:
        good_ip_list.append(ip)
    elif r.status_code == 400:
        bad_ip_list.append(ip)
    else:
        print 'Unknown http code received, aborting'
        exit(1)
pool = ThreadPool(16)
try:
    pool.map(process_request, ip_list)
except:
    for name, ip_list in (('/tmp/out_good.txt', good_ip_list), ('/tmp/out_bad.txt', bad_ip_list)):
        with open(name, 'w') as f:
            for ip in ip_list:
                print >> f, ip

But after some requests have been processed (40k-50k) I receive:

Exception in thread Thread-7 (most likely raised during interpreter shutdown):
Traceback (most recent call last):

Process finished with exit code 0

I tried changing the service settings:

        <timeout>999</timeout>
        <connectionlimit>600</connectionlimit>
        <httpthreads>32</httpthreads>
        <workerthreads>128</workerthreads>

but I still get the same error. Can anybody help me figure out what's wrong?

progress += 1 in a language with mutable data, using multiple threads w/o any protection... I stopped looking at that point ;) – iced Mar 13, 2015 at 8:22

@PatrickCollins as I understood, the problem is a requests.exceptions.ConnectionError raise; I tried to catch it and continue pool.map, but got the same effect. – Igor Mar 13, 2015 at 8:34

Your code is wrong. The same goes for good/bad_ip_list.append (shared lists accessed from multiple threads w/o any protection). You need to learn how to make multithreaded apps in Python first. I can bet that is the reason for the error you are getting, but I'm not going to investigate it deeper while it's bad from the very start. – iced Mar 13, 2015 at 8:36

Simple explanation: when 2 threads try to append to the same list at the same time, bad things will happen. I have no idea why they advocate languages with mutable data as good for beginners... – iced Mar 13, 2015 at 8:40

Thanks to everybody who helped me solve this problem. I rewrote the whole thing and now it works perfectly:

__author__ = 'kulakov'
import requests
import time
from multiprocessing.dummy import Pool as ThreadPool
ip_list = []
good_ip_list = []
bad_ip_list = []
with open('/tmp/ip.txt') as f:
    ip_list = f.read().split()
s = requests.Session()
def process_request(ip):
    r = s.get('http://*****/?ip='+ip, timeout=None)
    if r.status_code == 200:
        # good_ip_list.append(ip)
        return (ip, True)
    elif r.status_code == 400:
        # bad_ip_list.append(ip)
        return (ip, False)
    else:
        print 'Unknown http code received, aborting'
        exit(1)
pool = ThreadPool(16)
for ip, isOk in pool.imap(process_request, ip_list):
    if isOk:
        good_ip_list.append(ip)
    else:
        bad_ip_list.append(ip)
pool.close()
pool.join()
for name, ip_list in (('/tmp/out_good.txt', good_ip_list), ('/tmp/out_bad.txt', bad_ip_list)):
    with open(name, 'w') as f:
        for ip in ip_list:
            print>>f, ip

Some new useful information:

1) It was a really bad idea to write to shared data from different threads inside process_request; now it returns a tuple of the ip and a status (True/False) instead.

2) Keep-alive is fully supported by requests by default, but to use it you must create an instance of the Session object and call the get method on it:

s = requests.Session()
r = s.get('http://*****/?ip='+ip, timeout=None)
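A small offline sketch of the point above (the example.com URL and ip value are placeholders): a Session keeps a connection pool, so sequential requests to the same host reuse one TCP connection. It also shows `params` as an alternative to concatenating the query string by hand:

```python
import requests

s = requests.Session()
# A Session maintains a connection pool, so repeated requests to the same
# host reuse the underlying TCP connection (HTTP keep-alive) instead of
# reconnecting for every call.
req = requests.Request('GET', 'http://example.com/', params={'ip': '1.2.3.4'})
prepared = s.prepare_request(req)
print(prepared.url)  # http://example.com/?ip=1.2.3.4
```

In the real script you would simply call `s.get(...)` in a loop (or from the pool workers); preparing the request here just shows the final URL without touching the network.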

Appending to shared lists from the workers is not safe to mix with Python threading. The correct approach is to return a tuple (or something) from each call to process_request and then concatenate them all at the end. It's also not safe to modify progress concurrently from multiple threads. I'm not positive what your error is, but I bet it's some synchronization problem that is killing Python as a whole.

Remove the shared state and try again.
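The shared-state-free pattern suggested above can be sketched as follows; the IP-validity check here is a hypothetical stand-in for the real HTTP call, so the whole example runs offline:

```python
from multiprocessing.dummy import Pool as ThreadPool  # thread-backed pool

def process_request(ip):
    # Stand-in for the HTTP check: return the ip plus a flag instead of
    # appending to a shared list from inside the worker thread.
    return ip, ip.count('.') == 3

ips = ['10.0.0.1', 'bad-ip', '192.168.1.1']
pool = ThreadPool(4)
results = pool.map(process_request, ips)  # results come back in input order
pool.close()
pool.join()

# All aggregation happens here, in the main thread, so no locking is needed.
good = [ip for ip, ok in results if ok]
bad = [ip for ip, ok in results if not ok]
print(good)  # ['10.0.0.1', '192.168.1.1']
print(bad)   # ['bad-ip']
```

This is exactly the shape the accepted rewrite uses, except it consumes `pool.imap` lazily instead of materializing the whole result list.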

Thanks, @PatrickCollins, I guess you're right with this suggestion: "I bet it's some synchronization problem that is killing Python as a whole". Can you explain this point a little more, please: "Remove the shared state and try again"? As input data I have a function process_request and a list ip_list. So what's the right way to map these 2 objects not in a loop (when I tried a loop it worked perfectly, but very slowly) but in different threads? Thank you. – Igor Mar 13, 2015 at 8:54

@Igor take out the references to good_ip_list, bad_ip_list and progress inside process_request. Don't modify anything inside process_request except objects you've created inside process_request. Instead, do something like returning true or false in the call depending on whether the IP was good or not. – Patrick Collins Mar 13, 2015 at 9:00
