Problem with multiprocessing library in Python

KristaBel

I need to speed up the execution of my code. I wrote a synchronous version first and measured its execution time. Then I rewrote the code to use multiprocessing and measured it again. I have checked it many times and there seem to be no bugs. However, the speed gain is insignificant. I expected it to be several times faster, but it is not. What could be wrong?

Answers

PyAnto

Hi @KristaBel

Concurrency and multiprocessing are hard topics to debug and optimize. The first thing I would check is whether multiprocessing is the right library for your use case. Some tasks are better served by the asyncio or threading packages.

Regards.

KristaBel

My task is to speed up communication with an API and a database. I need to perform many requests over the network and many SQL queries against my local database. Is it OK to use multiprocessing for this task?

PyAnto

Hi @KristaBel, I am fairly confident that multiprocessing is not the right choice here. Try using asyncio, or at least threading.

KristaBel

OK, but could you please explain what the problem is with using multiprocessing here?

BB2Lon

Hi @KristaBel ! Actually, there are 2 types of tasks: CPU-bound and IO-bound.

  1. CPU-bound tasks are tasks where the processor spends most of the execution time performing computations. When a CPU finishes one task, it starts the next one, and so on. In this case, multiprocessing lets you distribute the work across several processors or cores. Each core works independently, without waiting for the others to finish their tasks, so the whole pool of tasks completes faster.
  2. IO-bound tasks, which is your case, are tasks that spend most of their time waiting for a response from the API or the database. Multiprocessing makes such a program only slightly faster, because each CPU core still spends most of its time waiting for responses. At the same time, starting several processes to run in parallel is itself a time-consuming step. Together, these give poor results in terms of execution speed. What you need here is to overlap the waiting times of several tasks: while one task waits for a response, other tasks do their work, and so on. That should give you the speed boost. The asyncio library is generally faster, but it can be a bit more complex to use than threading.
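
To make the overlapping idea concrete, here is a minimal sketch. It is not your actual code: fake_request, fake_request_async and the 0.2 second DELAY are hypothetical stand-ins for a real API call or SQL query, and they only sleep. It runs the same batch of waits sequentially, with a thread pool, and with asyncio.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

DELAY = 0.2  # assumed latency of one fake request, in seconds


def fake_request(i):
    """Hypothetical stand-in for an API call or SQL query: it only waits."""
    time.sleep(DELAY)
    return i * 2


async def fake_request_async(i):
    """The same stand-in written as a coroutine."""
    await asyncio.sleep(DELAY)
    return i * 2


def run_sequential(n):
    # Each request starts only after the previous one has finished.
    return [fake_request(i) for i in range(n)]


def run_threaded(n):
    # One thread per request, so all the waits overlap.
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(fake_request, range(n)))


async def _gather(n):
    # Schedule all coroutines at once and wait for all of them.
    return await asyncio.gather(*(fake_request_async(i) for i in range(n)))


def run_asyncio(n):
    return asyncio.run(_gather(n))


def timed(func, n):
    """Return the result of func(n) and the elapsed wall-clock time."""
    start = time.perf_counter()
    result = func(n)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    for func in (run_sequential, run_threaded, run_asyncio):
        _, elapsed = timed(func, 8)
        print(f"{func.__name__}: {elapsed:.2f}s")
```

With sleeps like these, the sequential run should take roughly n times DELAY, while the threaded and asyncio runs should take close to a single DELAY. Real numbers depend on your API, your database driver and any connection limits, so measure with your own workload.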

KristaBel

Thank you very much. I switched to threading and it gave me an incredible speedup! Now I want to learn asyncio and try to improve the performance even more!
