Python Threading

0 votes
854 views

This is my first script using the Python threading module. I have a large dataset, a list of dicts that can contain as many as 200 dictionaries. The final goal is one histogram per dict, with 16 histograms on a page (4x4); this already works.
What I currently do is create a nested list [ [ {} ], [ {} ] ] in which each inner list contains 16 dictionaries, so each inner list is a single page of 16 histograms. Iterating over the outer list and creating the graphs takes too long, so I would like multiple inner lists to be processed simultaneously, creating the graphs in "parallel".
I am trying to use Python threading for this. I create 4 threads, loop over the outer list, and send an inner list to each thread. This seems to work if my nested list contains only 2 elements, that is, fewer elements than threads. Currently the script runs and then seems to hang. I monitor the resources on my Mac: Python starts off well, using 80% CPU, and when the fourth thread is created the CPU usage drops to 0%.

My thread creation is based on the following: http://www.tutorialspoint.com/python/python_multithreading.htm

posted May 18, 2013 by anonymous

Can you show us the code?
I will post code. The entire script is 1000 lines of code; can I post the threading functions only?

2 Answers

0 votes

Try to condense it to the relevant parts, but make sure that it can be run by us.

As a general note, when you add new stuff to an existing longish script it is always a good idea to write it in such a way that you can test it standalone so that you can have some confidence that it will work as
designed once you integrate it with your old code.

answer May 18, 2013 by anonymous
0 votes

The odds are good that this is just going to run slower...

One: The common Python implementation (CPython) uses a global interpreter lock (GIL) to prevent interpreted code from interfering with itself in multiple threads. So "number cruncher" applications don't gain any speed from
being partitioned into threads -- even on a multicore processor, only one thread can hold the GIL at a time. On top of that, you have the overhead of the interpreter switching between threads (GIL release on one thread, GIL acquire for the next thread).

Python threads work fine if the threads either rely on intelligent DLLs for number crunching (instead of doing nested Python loops to process a numeric array you pass it to something like NumPy which
releases the GIL while crunching a copy of the array) or they do lots of I/O and have to wait for I/O devices (while one thread is waiting for the write/read operation to complete, another thread can do some number
crunching).
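To illustrate the I/O case, here is a minimal sketch. The names fake_io and run_threaded are made up for this example, and time.sleep stands in for a blocking read or write (it releases the GIL while waiting, as blocking I/O does). Under those assumptions the four waits overlap instead of running one after another.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(task_id, delay=0.2):
    """Hypothetical stand-in for a blocking read/write call."""
    time.sleep(delay)  # the GIL is released while the thread sleeps
    return task_id


def run_threaded(n_tasks=4):
    """Run the simulated I/O tasks in threads; return results and elapsed time."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_tasks) as executor:
        # map() returns the results in the order the tasks were submitted
        results = list(executor.map(fake_io, range(n_tasks)))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_threaded()
    print(results, "took", round(elapsed, 2), "seconds")
```

Replacing the sleep with a pure-Python number-crunching loop would be expected to remove most of the benefit, for the GIL reasons given above.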

If you really need to do this type of number crunching in Python level code, you'll want to look into the multiprocessing library instead. That will create actual OS processes (each with a copy of the
interpreter, and not sharing memory) and each of those can run on a core without conflicting on the GIL.
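A minimal sketch of that approach, applied to the page-of-16 layout described in the question. The function names (chunk, render_page, render_all) and the sample data are hypothetical, and render_page only sums values as a placeholder for the real plotting code, which was not posted.

```python
from multiprocessing import Pool


def chunk(items, size):
    """Split a flat list into consecutive pages of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def render_page(page):
    """Placeholder for the real plotting code: summarise one page of dicts.

    In the real script this would draw the 4x4 grid and save it to a file.
    """
    return [sum(d.values()) for d in page]


def render_all(pages, workers=4):
    """Process the pages in separate OS processes, keeping their order."""
    with Pool(processes=workers) as pool:
        return pool.map(render_page, pages)


if __name__ == "__main__":
    # Hypothetical data: 200 small dictionaries, 16 per page.
    data = [{"a": i, "b": 2 * i} for i in range(200)]
    pages = chunk(data, 16)
    results = render_all(pages)
    print(len(results), "pages processed")
```

Each page is pickled and sent to a worker process, so the worker function has to be defined at module level, and the `if __name__ == "__main__":` guard keeps child processes from re-running the startup code on platforms that spawn a fresh interpreter.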

answer May 18, 2013 by anonymous
Similar Questions
+1 vote

I want my thread to be killed when I receive a particular message from Master. (I want my thread to stop whatever it is doing and exit.)

I tried a few things, but they are not working:
1) thread_name.exit()
2) thread_name.daemon = True

Any other way?

+1 vote

I am working on the integration of multiple GUI (Tkinter) elements, such as a progress bar, label updates, etc., and through my research I discovered that I need to use the threading module to separate the calls for GUI updates from the processing of my main app.

My application is broken into multiple classes (modules), and I wanted to hear your thoughts on the best practices for implementing the threading model.

I was thinking of creating a new py-module that references all the other modules/classes and starts the threads with thread.start(): one would execute the GUI part, another would handle GUI updates, and another would do the processing.

Please share some of your thoughts on this approach, or suggest a more effective way.

0 votes

I have a pool of worker threads, created like this:

threads = [MyThread(*args) for i in range(numthreads)]
for t in threads:
 t.start()

I then block until the threads are all done:

while any(t.isAlive() for t in threads):
 pass

Is that the right way to wait for the threads to be done? Should I stick a call to time.sleep() inside the while loop? If so, how long should I sleep? That's probably an unanswerable question, but some guidelines on
choosing the sleep time will be appreciated.

...