multiprocess package documentation

multiprocess: better multiprocessing and multithreading in Python

About Multiprocess

multiprocess is a fork of multiprocessing. multiprocess extends multiprocessing to provide enhanced serialization, using dill. multiprocess leverages multiprocessing to support the spawning of processes using the API of the Python standard library’s threading module. multiprocessing has been distributed as part of the standard library since Python 2.6.

multiprocess is part of pathos, a Python framework for heterogeneous computing. multiprocess is in active development, so any user feedback, bug reports, comments, or suggestions are highly appreciated. A list of issues is located at https://github.com/uqfoundation/multiprocess/issues, with a legacy list maintained at https://uqfoundation.github.io/project/pathos/query.

Major Features

multiprocess enables:

  • objects to be transferred between processes using pipes or multi-producer/multi-consumer queues

  • objects to be shared between processes using a server process or (for simple data) shared memory

multiprocess provides:

  • equivalents of all the synchronization primitives in threading

  • a Pool class to facilitate submitting tasks to worker processes

  • enhanced serialization, using dill

Current Release

The latest released version of multiprocess is available from:

multiprocess is distributed under a 3-clause BSD license, and is a fork of multiprocessing.

Development Version

You can get the latest development version with all the shiny new features at:

If you have a new contribution, please submit a pull request.

Installation

multiprocess can be installed with pip:

$ pip install multiprocess

For Python 2, a C compiler is required to build the included extension module from source. Python 3 and binary installs do not require a C compiler.

Requirements

multiprocess requires:

  • python (or pypy), >=3.8

  • setuptools, >=42

  • dill, >=0.3.9

Basic Usage

The multiprocess.Process class follows the API of threading.Thread. For example

from multiprocess import Process, Queue

def f(q):
    q.put('hello world')

if __name__ == '__main__':
    q = Queue()
    p = Process(target=f, args=[q])
    p.start()
    print (q.get())
    p.join()

Synchronization primitives like locks, semaphores and conditions are available, for example

>>> from multiprocess import Condition
>>> c = Condition()
>>> print (c)
<Condition(<RLock(None, 0)>), 0>
>>> c.acquire()
True
>>> print (c)
<Condition(<RLock(MainProcess, 1)>), 0>

One can also use a manager to create shared objects either in shared memory or in a server process, for example

>>> from multiprocess import Manager
>>> manager = Manager()
>>> l = manager.list(range(10))
>>> l.reverse()
>>> print (l)
[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
>>> print (repr(l))
<Proxy[list] object at 0x00E1B3B0>

Tasks can be offloaded to a pool of worker processes in various ways, for example

>>> from multiprocess import Pool
>>> def f(x): return x*x
...
>>> p = Pool(4)
>>> result = p.map_async(f, range(10))
>>> print (result.get(timeout=1))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

When dill is installed, serialization is extended to most objects, for example

>>> from multiprocess import Pool
>>> p = Pool(4)
>>> print (p.map(lambda x: (lambda y:y**2)(x) + x, xrange(10)))
[0, 2, 6, 12, 20, 30, 42, 56, 72, 90]

More Information

Probably the best way to get started is to look at the documentation at http://multiprocess.rtfd.io. Also see multiprocess.tests for scripts that demonstrate how multiprocess can be used to leverge multiple processes to execute Python in parallel. You can run the test suite with python -m multiprocess.tests. As multiprocess conforms to the multiprocessing interface, the examples and documentation found at http://docs.python.org/library/multiprocessing.html also apply to multiprocess if one will import multiprocessing as multiprocess. See https://github.com/uqfoundation/multiprocess/tree/master/py3.12/examples for a set of examples that demonstrate some basic use cases and benchmarking for running Python code in parallel. Please feel free to submit a ticket on github, or ask a question on stackoverflow (@Mike McKerns). If you would like to share how you use multiprocess in your work, please send an email (to mmckerns at uqfoundation dot org).

Citation

If you use multiprocess to do research that leads to publication, we ask that you acknowledge use of multiprocess by citing the following in your publication:

M.M. McKerns, L. Strand, T. Sullivan, A. Fang, M.A.G. Aivazis,
"Building a framework for predictive science", Proceedings of
the 10th Python in Science Conference, 2011;
http://arxiv.org/pdf/1202.1056

Michael McKerns and Michael Aivazis,
"pathos: a framework for heterogeneous computing", 2010- ;
https://uqfoundation.github.io/project/pathos

Please see https://uqfoundation.github.io/project/pathos or http://arxiv.org/pdf/1202.1056 for further information.

Array(typecode_or_type, size_or_initializer, *, lock=True)

Returns a synchronized shared array

exception AuthenticationError

Bases: ProcessError

Barrier(parties, action=None, timeout=None)

Returns a barrier object

BoundedSemaphore(value=1)

Returns a bounded semaphore object

exception BufferTooShort

Bases: ProcessError

Condition(lock=None)

Returns a condition object

Event()

Returns an event object

JoinableQueue(maxsize=0)

Returns a queue object

Lock()

Returns a non-recursive lock object

Manager()

Returns a manager associated with a running server process

The managers methods such as Lock(), Condition() and Queue() can be used to create shared objects.

Pipe(duplex=True)

Returns two connection object connected by a pipe

Pool(processes=None, initializer=None, initargs=(), maxtasksperchild=None)

Returns a process pool object

class Process(group=None, target=None, name=None, args=(), kwargs={}, *, daemon=None)

Bases: BaseProcess

static _Popen(process_obj)
static _after_fork()
_start_method = None
exception ProcessError

Bases: Exception

Queue(maxsize=0)

Returns a queue object

RLock()

Returns a recursive lock object

RawArray(typecode_or_type, size_or_initializer)

Returns a shared array

RawValue(typecode_or_type, *args)

Returns a shared object

Semaphore(value=1)

Returns a semaphore object

SimpleQueue()

Returns a queue object

exception TimeoutError

Bases: ProcessError

Value(typecode_or_type, *args, lock=True)

Returns a synchronized shared object

active_children()

Return list of process objects corresponding to live child processes

allow_connection_pickling()

Install support for sending connections and sockets between processes

cpu_count()

Returns the number of CPUs in the system

current_process()

Return process object representing the current process

freeze_support()

Check whether this is a fake forked process in a frozen executable. If so then run code specified by commandline and exit.

get_all_start_methods()
get_context(method=None)
get_logger()

Return package logger – if it does not already exist then it is created.

get_start_method(allow_none=False)
log_to_stderr(level=None)

Turn on logging and add a handler which prints to stderr

parent_process()

Return process object representing the parent process

set_executable(executable)

Sets the path to a python.exe or pythonw.exe binary used to run child processes instead of sys.executable when using the ‘spawn’ start method. Useful for people embedding Python.

set_forkserver_preload(module_names)

Set list of module names to try to load in forkserver process. This is really just a hint.

set_start_method(method, force=False)

Indices and tables