Python Multiprocessing
Chapter 38 🟡 Intermediate

Advanced Python Multiprocessing: Unlocking True CPU Parallelism

Hello there! Welcome back. Grab a seat, because I am incredibly excited for today's lesson.

Over our past few sessions we have covered some truly massive concepts. We learned that, as a software developer, managing multiple tasks at the exact same time is a constant balancing act. Honestly, it's very much like parenting!

We previously mastered I/O-bound tasks, which are tasks that spend most of their time just waiting around. We explored the legendary Requests library, a tool so popular that it pulls in around 300 million downloads every single week and is depended upon by over 4 million code repositories, and we built robust scripts that use custom metadata headers to gracefully bypass API bot blockers. We also practiced the safe deserialization of JSON data so our dashboards never crash when a server randomly returns an HTML error page [10, 13-22].

We also unlocked the magic of modern concurrency. We saw how coroutines use the await keyword to yield control back to the asyncio event loop while waiting. We even bridged the gap between our blocking web requests and our fast async code using the brilliant asyncio.to_thread() function. Because standard threads are a great fit for tasks that wait on the internet, we successfully solved the problem of our programs freezing [29, 38-40].

But today, we face a completely new challenge.

Concurrency Conundrum: The Problem with the GIL

What happens when your program is not waiting?

What if you want to build a heavy desktop application for parallel image processing? Or what if you need to run complex, heavy mathematical calculations?

This brings us to a famous concurrency conundrum. Python has a built-in architectural safety mechanism called the Global Interpreter Lock (GIL).

Think of the GIL as a single "talking stick" in a classroom. Even if you have 10 brilliant students (which represent your threads), only the person holding the stick is allowed to speak. Because of the GIL, if you try to use a ThreadPoolExecutor for heavy CPU-bound work, your threads cannot achieve true parallelism: the interpreter will stubbornly run only one thread's Python code at a time.
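To see the talking stick in action, here is a minimal sketch that runs the same CPU-bound job sequentially and then through a ThreadPoolExecutor. The count_primes function and the input sizes are made up for illustration, and the exact timings depend on your machine and Python build, so treat the printed numbers as a rough guide only.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit):
    """Pure-Python CPU-bound work: count the primes below `limit`."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_sequential(limits):
    return [count_primes(limit) for limit in limits]


def run_threaded(limits):
    # The threads take turns holding the GIL, so on a standard CPython build
    # this usually takes about as long as the sequential version.
    with ThreadPoolExecutor(max_workers=len(limits)) as executor:
        return list(executor.map(count_primes, limits))


if __name__ == "__main__":
    limits = [40_000] * 4

    start = time.perf_counter()
    sequential = run_sequential(limits)
    print(f"sequential: {time.perf_counter() - start:.2f}s")

    start = time.perf_counter()
    threaded = run_threaded(limits)
    print(f"threaded:   {time.perf_counter() - start:.2f}s")

    assert sequential == threaded  # same answers either way
```

On a standard CPython build you should expect the two timings to be close, because the four threads never compute at the same moment.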

Solution: Enter Python Multiprocessing

To completely bypass the GIL and unlock the true parallel power of your computer's CPU cores, we must use a different approach: Python Multiprocessing.

Instead of trying to put multiple threads inside one single program, multiprocessing starts brand new, independent processes, each running its own copy of the Python interpreter with its own memory.

In our classroom analogy, multiprocessing doesn't just add more students to the room. It builds entirely new classrooms! Each new classroom has its own teacher, its own students, and most importantly, its own talking stick (its own GIL). Because they're totally independent, they can all run at the exact same physical time.

Here is a visual map of how this powerful architecture works under the hood:

graph TD;
    A[Main Python Program] -->|Spawns New Process| B(Worker Process 1: Has its own GIL)
    A -->|Spawns New Process| C(Worker Process 2: Has its own GIL)
    A -->|Spawns New Process| D(Worker Process 3: Has its own GIL)

    B --> E[CPU Core 1]
    C --> F[CPU Core 2]
    D --> G[CPU Core 3]

    style A fill:#4CAF50,stroke:#333,stroke-width:2px,color:#fff;
    style B fill:#2196F3,stroke:#333,color:#fff;
    style C fill:#2196F3,stroke:#333,color:#fff;
    style D fill:#2196F3,stroke:#333,color:#fff;

Modern Architectural Patterns (2024-2025 Updates)

So how do professionals actually write this in the real world?

According to a January 2024 tutorial on parallel programming, one of the best tools you can use is the multiprocessing.Pool class. It's incredibly useful for parallelizing the execution of a function across a large list of input values.
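As a small sketch of that idea, the example below uses Pool.map to apply one function to every item in a list. The sum_of_squares function and the pool size of 2 are assumptions chosen to keep the example short.

```python
import multiprocessing


def sum_of_squares(n):
    """CPU-bound toy function: add up i*i for every i below n."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    inputs = [10, 100, 1_000, 10_000]

    with multiprocessing.Pool(processes=2) as pool:
        # map() splits the inputs across the workers and returns the
        # results in the same order as the inputs.
        results = pool.map(sum_of_squares, inputs)

    for n, total in zip(inputs, results):
        print(f"sum_of_squares({n}) = {total}")
```

Note that map passes exactly one argument per call, which is fine here because sum_of_squares only needs one.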

But let's look at some recent production examples. A December 30, 2024 deep dive explored how developers are running parallel processes inside larger loops, using the pool.starmap function to execute complex calculations, such as running the MetPy scientific package on 1D arrays. The starmap function is brilliant because it lets you pass multiple arguments into your parallel functions effortlessly.

Eventually, your workload might become so massive that one single computer cannot handle it. An October 23, 2024 guide by InfoWorld highlights seven leading Python frameworks that allow you to spread an existing application across multiple cores or even across multiple separate machines.
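Here is one way the starmap-inside-a-loop pattern mentioned above might look. This is a sketch under assumptions, not the code from that deep dive: scale_and_sum and build_tasks are hypothetical helpers, and plain Python lists stand in for real 1D arrays.

```python
import multiprocessing


def scale_and_sum(values, factor):
    """Hypothetical stand-in for a heavier calculation on a 1D array."""
    return sum(v * factor for v in values)


def build_tasks(step):
    """Build the (values, factor) argument tuples for one loop iteration."""
    return [(list(range(step + 1)), factor) for factor in (1, 2, 3)]


if __name__ == "__main__":
    # Create the pool once, outside the loop, so the worker processes
    # are started a single time and reused on every iteration.
    with multiprocessing.Pool(processes=2) as pool:
        for step in range(1, 4):
            tasks = build_tasks(step)
            # starmap unpacks each tuple into scale_and_sum(values, factor).
            results = pool.starmap(scale_and_sum, tasks)
            print(f"step {step}: {results}")
```

Keeping the pool outside the loop is a deliberate choice: starting processes is relatively expensive, so reusing the same workers avoids paying that cost on every iteration.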

Writing Production-Grade Code

Let's write some professional code. Imagine we are building that desktop application for image processing and we want to apply heavy graphical filters to four different images at the exact same time.

import multiprocessing
import time

def process_image(image_name, filter_type):
    """A heavy CPU-bound task simulating parallel image processing."""
    print(f"Processing {image_name} with the {filter_type} filter...")

    # time.sleep stands in for the heavy mathematical work here. A real filter
    # would keep the CPU busy instead of sleeping.
    time.sleep(2)

    return f"{image_name} DONE"

if __name__ == "__main__":
    # We have a list of tasks. Notice each task has multiple arguments!
    tasks = [
        ("photo1.jpg", "grayscale"),
        ("photo2.jpg", "blur"),
        ("photo3.jpg", "sharpen"),
        ("photo4.jpg", "sepia")
    ]

    # 1. We create a 'Pool' of worker processes based on our CPU cores
    with multiprocessing.Pool() as pool:

        # 2. We use 'starmap' to unpack each tuple into process_image's arguments
        results = pool.starmap(process_image, tasks)

    print("All parallel image processing is complete!")
    print("Final Results:", results)

Look at how clean that is! By using a Pool, Python automatically checks how many CPU cores your computer has, builds the separate "classrooms," and divides the heavy work among them.
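If you want to watch the cores actually work, here is a variation that swaps the sleep for real arithmetic and times both approaches. It is only a sketch: apply_filter is a hypothetical stand-in that burns CPU rather than editing pixels, and the speedup you see will depend on how many cores your machine has.

```python
import multiprocessing
import os
import time


def apply_filter(image_name, passes):
    """Stand-in for a real filter: burn CPU with pure-Python arithmetic."""
    checksum = 0
    for i in range(passes):
        checksum = (checksum + i * i) % 1_000_003
    return image_name, checksum


if __name__ == "__main__":
    tasks = [(f"photo{i}.jpg", 2_000_000) for i in range(1, 5)]
    print("CPU cores reported by the OS:", os.cpu_count())

    start = time.perf_counter()
    sequential = [apply_filter(*task) for task in tasks]
    print(f"sequential: {time.perf_counter() - start:.2f}s")

    start = time.perf_counter()
    # With no arguments, Pool picks its size from the CPU count.
    with multiprocessing.Pool() as pool:
        parallel = pool.starmap(apply_filter, tasks)
    print(f"pool:       {time.perf_counter() - start:.2f}s")

    assert sequential == parallel  # same results, computed in parallel
```

On a multi-core machine the pool run should usually finish noticeably faster than the sequential one, although starting the worker processes adds a small fixed cost.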

What's Next?

You did an amazing job today.

You now understand the profound difference between waiting for the internet (where we use asyncio and standard threads) and doing heavy brain work (where we use multiprocessing to bypass the GIL). You're truly starting to write code like an expert.

But as our parallel applications get larger and more complex, how do we make sure we don't accidentally pass the wrong type of data into these worker functions?

In our next chapter, we're going to dive into Python Type Hints (3.12+). I promise you, it'll make your code safer, smarter, and easier to read than ever before. See you there!
