multiprocess package documentation
multiprocess: better multiprocessing and multithreading in Python
About Multiprocess
multiprocess is a fork of multiprocessing. multiprocess extends multiprocessing to provide enhanced serialization, using dill. multiprocess leverages multiprocessing to support the spawning of processes using the API of the Python standard library's threading module. multiprocessing has been distributed as part of the standard library since Python 2.6.

multiprocess is part of pathos, a Python framework for heterogeneous computing. multiprocess is in active development, so any user feedback, bug reports, comments, or suggestions are highly appreciated. A list of issues is located at https://github.com/uqfoundation/multiprocess/issues, with a legacy list maintained at https://uqfoundation.github.io/project/pathos/query.
Major Features
multiprocess enables:

- objects to be transferred between processes using pipes or multi-producer/multi-consumer queues
- objects to be shared between processes using a server process or (for simple data) shared memory

multiprocess provides:

- equivalents of all the synchronization primitives in threading
- a Pool class to facilitate submitting tasks to worker processes
- enhanced serialization, using dill
Current Release
The latest released version of multiprocess is available from https://pypi.org/project/multiprocess.

multiprocess is distributed under a 3-clause BSD license, and is a fork of multiprocessing.
Development Version
You can get the latest development version, with all the shiny new features, from https://github.com/uqfoundation/multiprocess.
If you have a new contribution, please submit a pull request.
Installation
multiprocess can be installed with pip:
$ pip install multiprocess
For Python 2, a C compiler is required to build the included extension module from source. Python 3 and binary installs do not require a C compiler.
Requirements
multiprocess requires:

- python (or pypy), >=3.8
- setuptools, >=42
- dill, >=0.3.9
Basic Usage
The multiprocess.Process class follows the API of threading.Thread. For example:

    from multiprocess import Process, Queue

    def f(q):
        q.put('hello world')

    if __name__ == '__main__':
        q = Queue()
        p = Process(target=f, args=[q])
        p.start()
        print(q.get())
        p.join()
Synchronization primitives like locks, semaphores, and conditions are available, for example:

    >>> from multiprocess import Condition
    >>> c = Condition()
    >>> print(c)
    <Condition(<RLock(None, 0)>), 0>
    >>> c.acquire()
    True
    >>> print(c)
    <Condition(<RLock(MainProcess, 1)>), 0>
One can also use a manager to create shared objects either in shared memory or in a server process, for example:

    >>> from multiprocess import Manager
    >>> manager = Manager()
    >>> l = manager.list(range(10))
    >>> l.reverse()
    >>> print(l)
    [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
    >>> print(repr(l))
    <Proxy[list] object at 0x00E1B3B0>
Tasks can be offloaded to a pool of worker processes in various ways, for example:

    >>> from multiprocess import Pool
    >>> def f(x): return x*x
    ...
    >>> p = Pool(4)
    >>> result = p.map_async(f, range(10))
    >>> print(result.get(timeout=1))
    [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
When dill is installed, serialization is extended to most objects, including lambdas, for example:

    >>> from multiprocess import Pool
    >>> p = Pool(4)
    >>> print(p.map(lambda x: (lambda y: y**2)(x) + x, range(10)))
    [0, 2, 6, 12, 20, 30, 42, 56, 72, 90]
More Information
Probably the best way to get started is to look at the documentation at http://multiprocess.rtfd.io. Also see multiprocess.tests for scripts that demonstrate how multiprocess can be used to leverage multiple processes to execute Python in parallel. You can run the test suite with python -m multiprocess.tests. As multiprocess conforms to the multiprocessing interface, the examples and documentation found at http://docs.python.org/library/multiprocessing.html also apply to multiprocess if one will import multiprocessing as multiprocess.
See https://github.com/uqfoundation/multiprocess/tree/master/py3.12/examples for a set of examples that demonstrate some basic use cases and benchmarking for running Python code in parallel. Please feel free to submit a ticket on github, or ask a question on stackoverflow (@Mike McKerns). If you would like to share how you use multiprocess in your work, please send an email (to mmckerns at uqfoundation dot org).
Citation
If you use multiprocess to do research that leads to publication, we ask that you acknowledge use of multiprocess by citing the following in your publication:
M.M. McKerns, L. Strand, T. Sullivan, A. Fang, M.A.G. Aivazis,
"Building a framework for predictive science", Proceedings of
the 10th Python in Science Conference, 2011;
http://arxiv.org/pdf/1202.1056
Michael McKerns and Michael Aivazis,
"pathos: a framework for heterogeneous computing", 2010- ;
https://uqfoundation.github.io/project/pathos
Please see https://uqfoundation.github.io/project/pathos or http://arxiv.org/pdf/1202.1056 for further information.
- Array(typecode_or_type, size_or_initializer, *, lock=True)
Returns a synchronized shared array
- exception AuthenticationError
Bases:
ProcessError
- Barrier(parties, action=None, timeout=None)
Returns a barrier object
- BoundedSemaphore(value=1)
Returns a bounded semaphore object
- exception BufferTooShort
Bases:
ProcessError
- Condition(lock=None)
Returns a condition object
- Event()
Returns an event object
- JoinableQueue(maxsize=0)
Returns a queue object
- Lock()
Returns a non-recursive lock object
- Manager()
Returns a manager associated with a running server process
The manager's methods, such as Lock(), Condition() and Queue(), can be used to create shared objects.
- Pipe(duplex=True)
Returns two connection objects connected by a pipe
- Pool(processes=None, initializer=None, initargs=(), maxtasksperchild=None)
Returns a process pool object
- class Process(group=None, target=None, name=None, args=(), kwargs={}, *, daemon=None)
Bases:
BaseProcess
- static _Popen(process_obj)
- static _after_fork()
- _start_method = None
- Queue(maxsize=0)
Returns a queue object
- RLock()
Returns a recursive lock object
- RawArray(typecode_or_type, size_or_initializer)
Returns a shared array
- RawValue(typecode_or_type, *args)
Returns a shared object
- Semaphore(value=1)
Returns a semaphore object
- SimpleQueue()
Returns a queue object
- exception TimeoutError
Bases:
ProcessError
- Value(typecode_or_type, *args, lock=True)
Returns a synchronized shared object
- active_children()
Return list of process objects corresponding to live child processes
- allow_connection_pickling()
Install support for sending connections and sockets between processes
- cpu_count()
Returns the number of CPUs in the system
- current_process()
Return process object representing the current process
- freeze_support()
Check whether this is a fake forked process in a frozen executable. If so then run code specified by commandline and exit.
- get_all_start_methods()
- get_context(method=None)
- get_logger()
Return package logger – if it does not already exist then it is created.
- get_start_method(allow_none=False)
- log_to_stderr(level=None)
Turn on logging and add a handler which prints to stderr
- parent_process()
Return process object representing the parent process
- set_executable(executable)
Sets the path to a python.exe or pythonw.exe binary used to run child processes instead of sys.executable when using the ‘spawn’ start method. Useful for people embedding Python.
- set_forkserver_preload(module_names)
Set list of module names to try to load in forkserver process. This is really just a hint.
- set_start_method(method, force=False)