Python ThreadingX

What is Python ThreadingX?

Python ThreadingX aims to make it easy to write multi-threaded applications in Python that can run across multiple cores and avoid issues with the Global Interpreter Lock ("GIL").

Specifically:

Benchmarking

Compared with Python classic threads, threadx threads are 'heavier': they use more memory (about 1MB per process) and take longer to set up (about 100 milliseconds). However, they can make use of multiple cores, so once they get going they are a lot faster, up to twice the speed on a dual-core machine, for example.

primes benchmark

Calculate prime numbers up to 400000 using a basic sieve of Eratosthenes. Do this four times, in four sub-processes.

The benchmark is in the 'examples' directory, under 'primes'.
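For readers unfamiliar with the algorithm, here is a minimal sketch of a basic sieve of Eratosthenes. It is not the benchmark's actual source (that lives in the 'examples' directory); the function name primes_up_to is made up for illustration.

```python
def primes_up_to(limit):
    """Return all primes <= limit using a basic sieve of Eratosthenes."""
    if limit < 2:
        return []
    # is_prime[i] stays True until i is crossed out as a multiple of a prime.
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    n = 2
    while n * n <= limit:
        if is_prime[n]:
            # Multiples below n * n were already crossed out by smaller primes.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
        n += 1
    return [i for i, flag in enumerate(is_prime) if flag]


# Same upper bound as the benchmark uses.
print(len(primes_up_to(400000)))
```

The work is pure CPU-bound Python, which is the kind of load where the global interpreter lock prevents classic threads from running in parallel.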

Results

Environment: eeepc 901, ubuntu jaunty, Intel Atom (two cores)

Classic python threads:

> time python primesclassicthreads.py

real	0m7.110s
user	0m5.828s
sys	0m2.040s

ThreadingX:

> time python threadxprimes.py 

real	0m3.626s
user	0m6.288s
sys	0m0.288s

Whilst the user time is slightly more for threadingx, because of the process setup overhead, the real time is nearly half, because threadingx can take advantage of the two available cores and is not blocked by Python's global interpreter lock.

md5 benchmark

Four processes are created, each of which:

The benchmark is in the 'examples' directory, under 'md5'.

Results

Environment: eeepc 901, ubuntu jaunty, Intel Atom (two cores)

Classic python threads:

> time python md5classicthreads.py 

real	0m8.357s
user	0m7.832s
sys	0m1.268s

ThreadingX:

> time python threadxmd5.py 

real	0m5.154s
user	0m8.989s
sys	0m0.212s

Again, whilst the user time is slightly more for threadingx, the real time is nearly 40% lower (about 5.2s against 8.4s).

FAQ

How does ThreadingX compare with Python classic threads?

Python classic threads are lighter-weight but are inhibited by the global interpreter lock ('GIL'), and will tend to run serially rather than in parallel even when there are multiple processor cores available.

How does ThreadingX compare with Stackless Python?

Stackless Python uses 'green' threads which are multitasked within a single operating system thread.

How does ThreadingX compare with Erlang?

Tutorial

The tutorial is in two parts:

Tutorial, part 1: installation, spawn a process, and communicate with it

Installation

Initializing threadingx

Create a file called 'main.py', and type in the following:

import sys
from threadingxlib import *

class MainService(object):
   def oninit(self, threadx ):
      self.threadx = threadx

      print "Press 'enter' to exit."
      sys.stdin.readline()

      self.threadx.shutdownnow()

threadingx.ThreadingX(MainService())

Creating an instance of the ThreadingX class will initialize the threadingx environment, and open a listening socket on an available port on your machine. We pass in an instance of MainService, and threadingx will automatically run the oninit method for us.

In the oninit method, we wait for the user to press 'enter', then we call threadx.shutdownnow(), which will cause threadingx to close all threads and exit.

Run the program, then press enter to exit.

If you are on Linux, before pressing enter, you can run 'lsof -i -n -P' to see the port opened by the python process. On Windows you can use TCPView.
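If you are curious how a program can listen "on an available port" without choosing the number itself, the usual approach is to bind to port 0 and let the operating system pick. The sketch below uses only the standard socket module; whether threadingx does exactly this internally is an assumption.

```python
import socket

# Binding to port 0 asks the operating system for any free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))
listener.listen(5)

# getsockname() reports the port that was actually assigned.
port = listener.getsockname()[1]
print('listening on port %d' % port)

listener.close()
```

While such a socket is open, it shows up in the 'lsof' or TCPView listing in the same way as the port opened by main.py.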

Spawn a child process

Let's create a child process. First we need to create a module for the child process. Let's create a new text file called 'child.py', in the same directory as 'main.py'. Type the following into child.py:

import sys
from threadingxlib import *

class ChildService(object):
   pass

threadx = threadingx.ThreadingX(ChildService())

This is a simple child module that will just run and wait for the main.py process to tell it to shut down.

In main.py, just before the line that prints "Press 'enter' to exit.", add the following line:

      child = self.threadx.spawn('child')

threadx.spawn will launch the child.py module as a new process. The returned child object contains a reference to the child process, termed a 'proxy'.
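To give a feel for what "launching a module as a new process" means, here is a rough sketch using the standard subprocess module. It is not threadingx's implementation: the spawn function below, its directory parameter and the throwaway child.py it writes are all made up for illustration, and it returns a plain process handle rather than a proxy.

```python
import os
import subprocess
import sys
import tempfile


def spawn(module_name, directory):
    """Start '<module_name>.py' from 'directory' in a new Python process."""
    script = os.path.join(directory, module_name + '.py')
    return subprocess.Popen([sys.executable, script], stdout=subprocess.PIPE)


# Write a throwaway child.py so the sketch is self-contained.
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, 'child.py'), 'w') as f:
    f.write("print('child started')\n")

process = spawn('child', workdir)
output, _ = process.communicate()   # wait for the child to finish
print(output.decode().strip())
```

Each spawned child is a separate operating system process with its own interpreter, which is why it does not share a global interpreter lock with the parent.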

You can run the application as before, by running 'python main.py'. If you want, you can verify that there are two processes running:

Communicate with a child process

Let's make a method in the child process, and show how easy it is to call from main.

In main.py, underneath the line 'child = self.threadx.spawn('child')', add the following line:

      child.sendMessage('hello from main')

That's it! That's all we have to do to call a function in the child process! The child object represents the child process, and we can call methods on it directly. Technically, the child object is a 'proxy'.

We need to create the sendMessage method in the child. In child.py, replace the ChildService class with:

class ChildService(object):
   def sendMessage( self, sender, message ):
      print "Child received message: " + message

Now the child is ready to receive 'sendMessage' function calls from the main process!

We simply print a message to the console, so that we can see that the child received the message OK.

Run 'python main.py', and you should see the message 'Child received message: hello from main' appear on the console.

Press enter to shut down the processes.

Send a reply to the main process

We'd like a way to get the child to communicate results back to the main process. We can do this using the same function-call semantics we just saw, just in the other direction.

For convenience, an additional parameter 'sender' is always passed in to the method on the child. This represents the calling process. We can simply call methods directly on this sender object.

Let's get the child to call the finished method when it receives a message from the main process. In child.py, at the end of the sendMessage method, simply add the line:

      sender.finished()

Easy, right? The 'sender' object is a proxy object that represents the sender process, so we can simply call methods directly on it like this.
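The 'sender' convention can be pictured with a small in-process sketch. Nothing here crosses a process boundary, and the deliver function and RecordingSender class are hypothetical names used only to show how the receiving side might add 'sender' as the first argument.

```python
class ChildService(object):
    def sendMessage(self, sender, message):
        # 'sender' is supplied by the framework, not by the caller.
        self.last_message = message
        sender.finished()


class RecordingSender(object):
    """Stands in for the proxy of the calling process."""

    def __init__(self):
        self.calls = []

    def finished(self):
        self.calls.append('finished')


def deliver(service, sender, method_name, args):
    # Look up the method by name and prepend the sender to the arguments.
    handler = getattr(service, method_name)
    return handler(sender, *args)


service = ChildService()
sender = RecordingSender()
# The caller only wrote: child.sendMessage('hello from main')
deliver(service, sender, 'sendMessage', ('hello from main',))
print(sender.calls)
```

This is why sendMessage is declared with two parameters after self, even though main.py passes only the message.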

Let's add the 'finished' method to the main module, for the child to call. In the main.py, replace the MainService class with:

class MainService(object):
   def oninit(self, threadx ):
      self.threadx = threadx
      child = self.threadx.spawn('child')
      child.sendMessage('hello from main')

   def finished( self, sender ):
      print 'Finished'
      self.threadx.shutdownnow()

We've added a 'finished' method for the child to call, which calls self.threadx.shutdownnow() for us. shutdownnow() requests threadingx to shut down cleanly. We've removed the lines from oninit that waited for a user to press a key.

We're left with two methods:

Run 'python main.py'. The child process should state that it received a message, then the main process should say that it is finished, and then they should both shut down.

Tutorial, part 2: using the registry process to register and lookup process names

In part 1 we looked at spawning a child process and communicating with it. What do we do if there are lots of child processes and they want to communicate with each other?

One possibility is to use the 'registry' process, which lets a process register a name and a process object.

< To be written >

In the meantime, you can look at the 'pingpong' example in the 'examples' directory for an example of using the registry process.

Technical Details

Each process is a full-blown Python process. This eliminates issues with the global interpreter lock. Communications between processes are over TCP/IP sockets. We could look at short-cutting the sockets in the future.
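As an illustration of a call travelling over a TCP/IP socket, the sketch below sends one pickled message across the loopback interface. Both ends live in a single script to keep it short, and the (method name, arguments) tuple is an assumed message layout, not threadingx's actual wire format.

```python
import pickle
import socket

# Receiving end: listen on any free loopback port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))
listener.listen(1)
port = listener.getsockname()[1]

# Sending end: connect, send one pickled call, then close the connection.
client = socket.create_connection(('127.0.0.1', port))
client.sendall(pickle.dumps(('sendMessage', ('hello from main',))))
client.close()

# Receiving end: read until the sender closes, then unpickle.
conn, _ = listener.accept()
chunks = []
while True:
    chunk = conn.recv(4096)
    if not chunk:
        break
    chunks.append(chunk)
conn.close()
listener.close()

method_name, args = pickle.loads(b''.join(chunks))
print('%s %r' % (method_name, args))
```

Note that unpickling data from a socket is only safe when every peer is trusted, as is the case for processes started by the same application on one machine.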

Child processes are represented by proxy objects of class ThreadingX.Proxy. The class overrides the __getattr__ method, so that method calls made on the proxy are redirected to the child process.
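The __getattr__ technique can be sketched as follows. This is a simplified stand-in, not the real ThreadingX.Proxy: the constructor arguments and the 'send' callable are assumptions made for the example.

```python
class Proxy(object):
    """Simplified stand-in for a proxy to a remote process."""

    def __init__(self, port, send):
        self._port = port   # port the remote process listens on
        self._send = send   # assumed callable: send(port, method_name, args)

    def __getattr__(self, name):
        # __getattr__ only runs when normal attribute lookup fails,
        # so _port and _send are found without coming through here.
        if name.startswith('_'):
            raise AttributeError(name)

        def remote_call(*args):
            # Forward the call instead of running anything locally.
            self._send(self._port, name, args)

        return remote_call


# Record outgoing calls in a list instead of using a real socket.
sent = []
child = Proxy(5001, lambda port, name, args: sent.append((port, name, args)))
child.sendMessage('hello from main')
print(sent)
```

Because any unknown attribute name becomes a remote call, the proxy needs no advance knowledge of which methods the child defines.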

Call marshalling uses the standard Python pickle module.

Proxy objects passed as parameters are marshalled as the port number of the process, and then re-wrapped as a proxy object on the other end.
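A minimal sketch of this marshalling scheme is shown below. The tagging of arguments as 'proxy' or 'value', and the function names marshal_call and unmarshal_call, are assumptions for illustration; only the use of pickle and the port-number substitution come from the description above.

```python
import pickle


class Proxy(object):
    """Simplified proxy that only remembers the remote port."""

    def __init__(self, port):
        self.port = port


def marshal_call(method_name, args):
    # Swap each proxy argument for its port number before pickling.
    wire_args = [('proxy', a.port) if isinstance(a, Proxy) else ('value', a)
                 for a in args]
    return pickle.dumps((method_name, wire_args))


def unmarshal_call(data):
    method_name, wire_args = pickle.loads(data)
    # Re-wrap port numbers as proxy objects on the receiving end.
    args = [Proxy(value) if tag == 'proxy' else value
            for tag, value in wire_args]
    return method_name, args


data = marshal_call('sendMessage', [Proxy(5001), 'hello from main'])
name, args = unmarshal_call(data)
print('%s %d %s' % (name, args[0].port, args[1]))
```

Sending only the port number keeps the pickled message small and avoids trying to pickle the socket state held by a live proxy.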

If the name of a called function is not found in the list of methods registered by the child, threadx holds the call in an internal queue until that method is made available.
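One way to picture this holding queue is the sketch below. The Dispatcher class and its method names are hypothetical, and the real library may track registered methods differently.

```python
from collections import deque


class Dispatcher(object):
    """Holds calls to unknown methods until a handler is registered."""

    def __init__(self):
        self._methods = {}
        self._pending = deque()

    def dispatch(self, name, args):
        handler = self._methods.get(name)
        if handler is None:
            # Method not available yet: keep the call for later.
            self._pending.append((name, args))
            return False
        handler(*args)
        return True

    def register(self, name, handler):
        self._methods[name] = handler
        # Replay held calls for this method in arrival order.
        still_waiting = deque()
        while self._pending:
            pending_name, pending_args = self._pending.popleft()
            if pending_name == name:
                handler(*pending_args)
            else:
                still_waiting.append((pending_name, pending_args))
        self._pending = still_waiting


received = []
dispatcher = Dispatcher()
delivered = dispatcher.dispatch('sendMessage', ('hello from main',))
dispatcher.register('sendMessage', received.append)
print(received)
```

Queuing rather than rejecting early calls means a caller does not have to wait for the child to finish starting up before sending it a message.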

A new registry process is created each time a process runs which is not a child process.

Contact / Community

Forums are available at Python-ThreadingX forums

About

ThreadingX was written and is maintained by Hugh Perkins. You can contact me on gmail, as the user 'hughperkins'.