
python thread and multiprocessing

tags: python thread process profiling python performance asyncio

Python threads, the GIL, and ctypes
Describes the history of Python's GIL and three ways to get real multitasking :sweat_smile:
They are:

  • multiprocessing
  • ctypes
  • C extensions
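
As a rough illustration of the ctypes route: ctypes releases the GIL around foreign function calls made through `ctypes.CDLL`, so several threads can be inside C code at the same time. The sketch below assumes a POSIX system where `ctypes.CDLL(None)` exposes the C library's `usleep`; the helper names `c_sleep` and `run_in_threads` are made up for this example.

```python
import ctypes
import threading
import time

# Assumption: POSIX system. CDLL(None) exposes the symbols already loaded
# into the current process, which include the C library.
libc = ctypes.CDLL(None)
libc.usleep.argtypes = [ctypes.c_uint]
libc.usleep.restype = ctypes.c_int


def c_sleep(microseconds, results, index):
    # The GIL is released for the duration of this foreign call.
    results[index] = libc.usleep(microseconds)


def run_in_threads(n_threads=4, microseconds=200_000):
    """Run c_sleep in n_threads threads; return (return codes, elapsed seconds)."""
    results = [None] * n_threads
    threads = [
        threading.Thread(target=c_sleep, args=(microseconds, results, i))
        for i in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_in_threads()
    print("return codes:", results)
    print("elapsed: %.2fs" % elapsed)
```

If the calls overlap as expected, the elapsed time should be close to one sleep interval rather than four of them. A real C function doing CPU work would overlap the same way, which is what makes ctypes one way around the GIL.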

Official documentation

multiprocessing — Process-based “threading” interface

The following are supported only in Python 3.

2.7: not supported QQ

concurrent.futures — Launching parallel tasks

ThreadPoolExecutor
ProcessPoolExecutor
concurrent.futures
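
Both executors implement the same `Executor` interface, so switching between threads and processes can be a one-line change. A minimal sketch, where `square` and `run_with` are illustrative names that do not come from the original notes:

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def square(x):
    # Must be a module-level function so ProcessPoolExecutor can pickle it.
    return x * x


def run_with(executor_cls, values, max_workers=2):
    # Executor.map() yields results in the same order as the inputs.
    with executor_cls(max_workers=max_workers) as executor:
        return list(executor.map(square, values))


if __name__ == "__main__":
    print(run_with(ThreadPoolExecutor, range(5)))
    print(run_with(ProcessPoolExecutor, range(5)))
```

The `if __name__ == "__main__":` guard matters for the process pool: on platforms that start workers by re-importing the main module, code outside the guard would run again in every worker.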

process example

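#! /usr/bin/env python
# -*- coding: utf-8 -*-
# vim:fenc=utf-8
# Last modified: 2016-10-01 16:50:37
import sys
from multiprocessing import Event, Process


class testCount(Process):
    def __init__(self, num):
        super(testCount, self).__init__()
        self.num = num
        # An Event is shared between processes. A plain attribute changed in
        # the parent would never be seen by the child process.
        self._stop_event = Event()

    def printnum(self):
        print("num is:", self.num)

    def run(self):
        while not self._stop_event.is_set():
            self.num += 1
            self.printnum()
            # Wait up to 20 seconds, but return early once stopped() is called.
            self._stop_event.wait(20)

    def stopped(self):
        self._stop_event.set()


def read_fifo(filename):
    while True:
        with open(filename, "r") as fifo:
            yield fifo.readline()


if __name__ == "__main__":
    Test = testCount(1)
    Test.daemon = True  # must be set before start()
    Test.start()  # start() is required, otherwise run() never executes
    print("start reading fifo...")
    count = 3
    # The original called read_fifo() without a path; here it is taken
    # from the first command-line argument.
    reader = read_fifo(sys.argv[1])
    while count > 0:
        data = next(reader)
        print("data:", data)
        count -= 1
    Test.stopped()
    Test.join()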

multitasking python

https://github.com/bfortuner/ml-study/blob/master/multitasking_python.ipynb

Blocking API calls (like urlopen)

multiprocess:Should see good benefit
multithread:Should see good benefit

IO heavy

multithread:Should see good benefit
multiprocess:Should see good benefit

For I/O workloads, both approaches perform well when running many tasks.
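
A small sketch of why threads help here: a blocking call releases the GIL while it waits, so the waits overlap. The function `fake_blocking_call` uses `time.sleep` as a stand-in for something like urlopen; it and the other helper names are assumptions made for this example, not code from the linked notebook.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_blocking_call(task_id, delay=0.2):
    # Stand-in for a blocking call such as urlopen; sleeping releases the GIL.
    time.sleep(delay)
    return task_id


def timed(fn):
    """Call fn() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def run_serial(n):
    return [fake_blocking_call(i) for i in range(n)]


def run_threaded(n):
    # One worker per task, so all waits can overlap.
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(fake_blocking_call, range(n)))


if __name__ == "__main__":
    _, serial_time = timed(lambda: run_serial(5))
    _, threaded_time = timed(lambda: run_threaded(5))
    print("serial:   %.2fs" % serial_time)
    print("threaded: %.2fs" % threaded_time)
```

The serial run should take roughly the sum of the delays, while the threaded run should take roughly one delay.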

Numpy Functions

numpy addition

multithread:yes,a little
multiprocess:yes, a little

Dot Product

numpy:very good
multithread:No benefit
multiprocess:No benefit

NumPy functions: calling NumPy's own functions directly gives better results than adding threads or processes.

CPU Intensive

multithread:No benefit
multiprocess:very good

For CPU-intensive work, multiprocessing performs better.
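
One way to check this on your own machine is to run the same pure-Python CPU-bound function under both executors. This is only a sketch: `count_primes` and `run` are hypothetical helpers, the workload size is arbitrary, and the timings depend on the core count and on running a standard CPython build with the GIL.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately slow)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run(executor_cls, limits):
    """Map count_primes over limits; return (results, elapsed seconds)."""
    start = time.perf_counter()
    with executor_cls(max_workers=len(limits)) as pool:
        results = list(pool.map(count_primes, limits))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    limits = [200_000] * 4
    for cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        results, elapsed = run(cls, limits)
        print("%s: %.2fs" % (cls.__name__, elapsed))
```

With threads, only one worker can execute Python bytecode at a time, so the four tasks effectively run one after another. With processes, each worker has its own interpreter and its own GIL.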

Resize Images

Pillow(The friendly PIL fork (Python Imaging Library))
multithread:good
multiprocess:good

  • CPU Bound => Multi Processing
  • I/O Bound, Fast I/O, Limited Number of Connections => Multi Threading
  • I/O Bound, Slow I/O, Many connections => Asyncio
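
For the last case, the idea can be sketched with many simulated slow connections handled on a single thread. `fake_slow_connection` and `handle_many` are illustrative names, and `asyncio.sleep` stands in for waiting on a slow peer; a real program would use an asyncio-aware network library instead.

```python
import asyncio
import time


async def fake_slow_connection(conn_id, delay=0.5):
    # Stand-in for awaiting a slow network peer.
    await asyncio.sleep(delay)
    return conn_id


async def handle_many(n):
    # All n coroutines wait concurrently on one thread; gather() returns
    # their results in the order the coroutines were passed in.
    return await asyncio.gather(*(fake_slow_connection(i) for i in range(n)))


if __name__ == "__main__":
    start = time.perf_counter()
    results = asyncio.run(handle_many(1000))
    elapsed = time.perf_counter() - start
    print("handled %d connections in %.2fs" % (len(results), elapsed))
```

Each waiting coroutine costs far less than a thread or a process, which is why this model suits a large number of mostly idle connections.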

asyncio benchmark

https://github.com/python/asyncio/wiki/Benchmarks
https://docs.python.org/3/library/asyncio-task.html

import asyncio

async def compute(x, y):
    print("Compute %s + %s ..." % (x, y))
    await asyncio.sleep(1.0)
    return x + y

async def print_sum(x, y):
    result = await compute(x, y)
    print("%s + %s = %s" % (x, y, result))

# asyncio.run() (Python 3.7+) creates the event loop, runs the coroutine,
# and closes the loop afterwards.
asyncio.run(print_sum(1, 2))

Parallel execution of tasks

import asyncio

async def factorial(name, number):
    f = 1
    for i in range(2, number+1):
        print("Task %s: Compute factorial(%s)..." % (name, i))
        await asyncio.sleep(1)
        f *= i
    print("Task %s: factorial(%s) = %s" % (name, number, f))

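async def main():
    # Run the three coroutines concurrently and wait for all of them.
    await asyncio.gather(
        factorial("A", 2),
        factorial("B", 3),
        factorial("C", 4),
    )

# asyncio.run() (Python 3.7+) replaces the manual get_event_loop() /
# run_until_complete() / close() sequence.
asyncio.run(main())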
Output:

Task A: Compute factorial(2)...
Task B: Compute factorial(2)...
Task C: Compute factorial(2)...
Task A: factorial(2) = 2
Task B: Compute factorial(3)...
Task C: Compute factorial(3)...
Task B: factorial(3) = 6
Task C: Compute factorial(4)...
Task C: factorial(4) = 24