Why does a Python script only use 6 out of 12 cores?

3 min read 28-10-2024

When running a Python script on a multi-core processor, you might wonder why it only utilizes a portion of its available cores. For instance, if your machine has 12 cores, you might see that your script is only making use of 6. This situation can be perplexing, especially when you expect a parallel workload to effectively utilize all available cores.

The Original Problem Scenario

To provide context, let's consider the following code snippet that represents a typical parallel workload in Python.

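import multiprocessing

def worker(num):
    """Process worker function: runs in its own child process."""
    print(f'Worker: {num}')

if __name__ == '__main__':
    jobs = []
    # Start 12 child processes, one per task.
    for i in range(12):
        p = multiprocessing.Process(target=worker, args=(i,))
        jobs.append(p)
        p.start()

    # Wait for every child to finish before the parent exits.
    for p in jobs:
        p.join()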

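In this code, we create 12 processes intended to run simultaneously. However, you may observe that only 6 cores are busy at any given time. Keep in mind that each worker here prints a single line and exits almost immediately, so a monitoring tool may never catch all 12 processes running at once.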

Analyzing Core Utilization

GIL and Python's Architecture

One of the most commonly cited reasons a Python script fails to use all available cores is the Global Interpreter Lock (GIL). In CPython, the GIL is a mutex that protects access to Python objects, preventing multiple threads from executing Python bytecode at the same time. Because of the GIL, threads within a single Python process cannot run Python code in parallel on multiple cores. The GIL is per process, however: each process started with multiprocessing has its own interpreter and its own GIL, so separate processes do not block one another through it. If a multiprocessing script still leaves cores idle, the cause is more likely one of the factors discussed below.
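To make the thread-versus-process difference concrete, here is a minimal sketch that runs the same CPU-bound function through a thread pool and a process pool and times both. The names cpu_task and timed_map, the worker count, and the workload size are illustrative assumptions, not part of the original script. On a standard CPython build with the GIL, the thread version typically takes about as long as running the tasks one after another, while the process version can spread across cores; actual timings depend on your machine.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def cpu_task(n):
    """Pure-Python CPU-bound work: sum of the squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_map(executor_cls, inputs, workers=4):
    """Run cpu_task over inputs with the given executor class.

    Returns (results, elapsed_seconds).
    """
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(cpu_task, inputs))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    inputs = [2_000_000] * 4  # hypothetical workload size; adjust as needed
    _, thread_seconds = timed_map(ThreadPoolExecutor, inputs)
    _, process_seconds = timed_map(ProcessPoolExecutor, inputs)
    print(f"threads:   {thread_seconds:.2f}s")
    print(f"processes: {process_seconds:.2f}s")
```

The process pool pays a startup and pickling cost, so for very small workloads the threads may still finish first.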

Multiprocessing vs. Multithreading

Using the multiprocessing module, as shown in the example above, allows Python scripts to bypass the GIL limitation by spawning separate processes instead of threads. However, keep in mind that the actual usage of all CPU cores depends on:

  1. CPU-bound vs. I/O-bound Tasks: If your tasks are CPU-bound, they require substantial processing power and can more effectively use multiple cores. Conversely, I/O-bound tasks may involve waiting for external resources (like disk access), leading to less CPU core utilization.

  2. Task Granularity: If your individual tasks are too lightweight, the overhead of process management might prevent effective utilization of all cores. Larger, more computationally intensive tasks are more likely to saturate the available CPU resources.

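  3. System Limits: The operating system may restrict which cores a process can run on, for example through CPU affinity masks, container CPU limits, or scheduling priorities. In addition, on CPUs with simultaneous multithreading (such as Intel Hyper-Threading), 12 logical cores may correspond to only 6 physical cores, so it is worth checking whether the count of 12 refers to logical or physical cores.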
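One quick way to check the system-limits point is to compare the number of logical CPUs the OS reports with the number this process is actually allowed to run on. The sketch below uses only the standard library; usable_cpu_count is a hypothetical helper name. It assumes os.sched_getaffinity is available (it is on Linux, not on Windows or macOS) and falls back to os.cpu_count() otherwise. Note that it reflects affinity masks only, not CPU-time quotas a container runtime may impose.

```python
import os


def usable_cpu_count():
    """Return how many logical CPUs this process may be scheduled on."""
    if hasattr(os, "sched_getaffinity"):
        # Linux: honors the affinity mask (e.g. one set with taskset).
        return len(os.sched_getaffinity(0))
    # Other platforms: fall back to the total logical CPU count.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("logical CPUs reported by the OS:", os.cpu_count())
    print("CPUs usable by this process:    ", usable_cpu_count())
```

If the second number is smaller than the first, the process is being confined to a subset of cores before any Python code runs.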

Practical Examples

Example of Efficient Core Utilization

To achieve better utilization of all cores, make sure each task does enough real CPU work. Note that time.sleep() is not a substitute for computation: a sleeping process uses almost no CPU, so it will not keep a core busy. For instance:

import multiprocessing

def heavy_computation(num):
    """Simulate a heavy computation with a CPU-bound loop"""
    total = 0
    for i in range(20_000_000):  # arbitrary size; adjust for your machine
        total += i * i
    print(f'Completed task: {num}')

if __name__ == '__main__':
    jobs = []
    for i in range(12):
        p = multiprocessing.Process(target=heavy_computation, args=(i,))
        jobs.append(p)
        p.start()

    # Wait for all processes to finish.
    for p in jobs:
        p.join()

In this revised example, each process performs a CPU-bound computation. When tasks are this resource-intensive, and nothing restricts the script to fewer cores, you are more likely to keep all 12 cores busy.
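An alternative to starting each process by hand is a worker pool sized to the core count, which also addresses the task-granularity point: the chunksize argument of Pool.map batches small tasks so that per-task overhead is amortized. The following is a minimal sketch under stated assumptions; run_in_pool, the task size, the task count, and the chunk size are illustrative choices, and the loop inside heavy_computation is a CPU-bound stand-in for real work.

```python
import multiprocessing
import os


def heavy_computation(n):
    """CPU-bound stand-in: sum of the squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_in_pool(inputs, processes=None, chunksize=1):
    """Map heavy_computation over inputs using a pool of worker processes.

    processes=None lets multiprocessing choose a default based on the
    machine's CPU count.
    """
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(heavy_computation, inputs, chunksize=chunksize)


if __name__ == "__main__":
    inputs = [200_000] * 48  # hypothetical: many medium-sized tasks
    results = run_in_pool(inputs, processes=os.cpu_count(), chunksize=4)
    print(len(results), "tasks completed")
```

With many more tasks than workers, a core that finishes early picks up another chunk instead of sitting idle.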

Monitoring Core Utilization

To monitor your script's CPU utilization, you can use tools such as:

  • top or htop (Linux): For real-time monitoring of CPU usage.
  • Task Manager (Windows): To check which processes are using CPU resources and how many cores are active.
  • psutil library in Python: To programmatically check CPU utilization within your scripts.
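As a rough illustration of the psutil option, the sketch below samples per-core utilization for one second and counts how many logical cores are busy. psutil is a third-party package and must be installed separately; busy_cores and the 50% threshold are hypothetical choices made for this example. If psutil is missing, the script only prints the logical CPU count from the standard library.

```python
import os

try:
    import psutil  # third-party package: pip install psutil
except ImportError:
    psutil = None


def busy_cores(per_core_percent, threshold=50.0):
    """Count cores whose utilization (percent) is at or above threshold."""
    return sum(1 for pct in per_core_percent if pct >= threshold)


if __name__ == "__main__":
    if psutil is None:
        print("psutil not installed; logical CPUs:", os.cpu_count())
    else:
        # Sample every logical CPU over a one-second window.
        usage = psutil.cpu_percent(interval=1.0, percpu=True)
        print("physical cores:", psutil.cpu_count(logical=False))
        print("logical cores: ", psutil.cpu_count(logical=True))
        print("busy cores (>= 50%):", busy_cores(usage))
```

Running this in a second terminal while your script is working shows whether a "6 out of 12" reading matches the physical core count.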

Conclusion

Understanding why your Python script may not be utilizing all available cores is crucial in optimizing performance. The GIL (for threaded code), task characteristics, and system limitations all play a role in core utilization. By giving each process enough CPU-bound work, checking how many cores the process is actually allowed to use, and employing the multiprocessing module thoughtfully, you can make more effective use of your system's resources.

By taking advantage of these insights, you can better harness the power of multi-core systems when executing Python scripts.