What is a Subprocess
A subprocess is a separate process that a program (the parent) starts and that runs alongside the parent's own process. In Python, subprocesses are commonly used to execute external commands or scripts, interact with other applications, and run work in parallel.
The subprocess Module
The subprocess module, introduced in Python 2.4 and improved in Python 3, is the recommended way to manage subprocesses in Python. It provides a simple and consistent interface for creating, interacting with, and controlling subprocesses. The run() function used throughout this article was added in Python 3.5. This article walks through the main functions and classes the module provides.
Running External Commands
The run() Function
The run() function is the most straightforward way to execute an external command using the subprocess module. It takes a list of command-line arguments, runs the specified command, and waits for it to complete. Here's an example:
import subprocess
result = subprocess.run(["echo", "Hello, World!"])
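The value returned by run() is a CompletedProcess object. As a minimal sketch, the example below inspects its args and returncode attributes. It assumes sys.executable (the running Python interpreter) as a stand-in for an external program, so that it does not depend on echo being installed.

```python
import subprocess
import sys

# sys.executable is the path of the running interpreter; it stands in for
# any external program here so the example does not depend on echo.
result = subprocess.run([sys.executable, "-c", "print('Hello, World!')"])

# The CompletedProcess records what was run and how it exited.
print(result.args)
print(result.returncode)  # 0 on success
```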
Command Arguments
When passing command-line arguments to the run() function, provide each argument as a separate item in the list. Each list item reaches the program as exactly one argument, so a path that contains spaces must stay in a single item and needs no extra quoting. For example:
import subprocess
# Incorrect: splitting the path makes ls receive two separate arguments
result = subprocess.run(["ls", "/path/to", "directory with spaces"])
# Correct: the whole path, spaces included, is a single list item
result = subprocess.run(["ls", "/path/to directory with spaces"])
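One way to see how a list is delivered is to have the child report how many arguments it received. The sketch below also uses shlex.split() from the standard library to turn a shell-style command string into a list. The child one-liner and the use of sys.executable are assumptions made for illustration.

```python
import shlex
import subprocess
import sys

# shlex.split() honours the quotes, so the path stays in one list item
args = shlex.split('ls "/path/to directory with spaces"')
print(args)  # ['ls', '/path/to directory with spaces']

# A child that prints how many arguments it was given
count_args = "import sys; print(len(sys.argv) - 1)"
result = subprocess.run(
    [sys.executable, "-c", count_args, "directory with spaces"],
    capture_output=True,
    text=True,
)
received = result.stdout.strip()
print(received)  # the single list item arrives as one argument
```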
Handling Errors and Exceptions
By default, the run() function does not raise an exception when the command returns a non-zero exit status; it only records the status in the returncode attribute of the result. To have a failure raise a CalledProcessError exception instead, set the check parameter to True:
import subprocess
try:
    result = subprocess.run(["false"], check=True)
except subprocess.CalledProcessError as e:
    print(f"The command failed with error: {e}")
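The difference between the two modes can be sketched as follows: without check=True the exit status is only stored on the result, and with check=True the same status arrives on the exception. The failing child (a Python one-liner that exits with status 3) is an assumption chosen so the example runs on any platform.

```python
import subprocess
import sys

# A child that always exits with status 3
failing = [sys.executable, "-c", "import sys; sys.exit(3)"]

# Without check=True, no exception is raised; inspect returncode yourself
result = subprocess.run(failing)
print(result.returncode)  # 3

# With check=True, the same status is reported through the exception
caught_code = None
try:
    subprocess.run(failing, check=True)
except subprocess.CalledProcessError as e:
    caught_code = e.returncode
    print(f"{e.cmd[0]} exited with status {e.returncode}")
```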
Capturing Command Output
Standard Output (stdout)
To capture the standard output (stdout) of a command, use the capture_output parameter with the run() function:
import subprocess
result = subprocess.run(["echo", "Hello, World!"], capture_output=True)
output = result.stdout.decode('utf-8')
print(output)
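If the output is text, passing text=True makes run() return str instead of bytes, so the manual decode() step is not needed. A minimal sketch, again assuming sys.executable as the command:

```python
import subprocess
import sys

result = subprocess.run(
    [sys.executable, "-c", "print('Hello, World!')"],
    capture_output=True,
    text=True,  # decode stdout and stderr to str
)
output = result.stdout
print(output.strip())  # Hello, World!
```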
Standard Error (stderr)
Similarly, to capture the standard error (stderr) of a command, set the capture_output parameter to True and access the stderr attribute of the returned object:
import subprocess
result = subprocess.run(["command_that_may_fail"], capture_output=True)
error_output = result.stderr.decode('utf-8')
print(error_output)
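Captured stderr is also available on the exception when check=True is used. The sketch below assumes a hypothetical failing child that writes one message to stderr and exits with status 2.

```python
import subprocess
import sys

# A child that reports an error on stderr and exits with status 2
child = [
    sys.executable,
    "-c",
    "import sys; print('something went wrong', file=sys.stderr); sys.exit(2)",
]

error_output = ""
try:
    subprocess.run(child, capture_output=True, text=True, check=True)
except subprocess.CalledProcessError as e:
    # With capture_output=True, the exception carries the captured streams
    error_output = e.stderr
    print(f"Exit status {e.returncode}: {error_output.strip()}")
```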
Redirecting Output
To redirect the output of a command to a file, use the stdout and stderr parameters with the run() function:
import subprocess
with open("output.txt", "w") as output_file:
    subprocess.run(["echo", "Hello, World!"], stdout=output_file)
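The two streams can be sent to different places. As a sketch, the example below writes stdout to a file and discards stderr with subprocess.DEVNULL. The temporary directory and the child one-liner are assumptions made so the example cleans up nothing outside its own directory.

```python
import os
import subprocess
import sys
import tempfile

# A child that writes one line to each stream
child = [
    sys.executable,
    "-c",
    "import sys; print('to stdout'); print('to stderr', file=sys.stderr)",
]

log_path = os.path.join(tempfile.mkdtemp(), "output.txt")
with open(log_path, "w") as log_file:
    # stdout goes to the file, stderr is thrown away
    subprocess.run(child, stdout=log_file, stderr=subprocess.DEVNULL)

with open(log_path) as log_file:
    saved = log_file.read()
print(saved.strip())  # to stdout
```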
Combining stdout and stderr
To capture both stdout and stderr in a single output, use the stderr parameter and set it to subprocess.STDOUT:
import subprocess
result = subprocess.run(["command_with_both_outputs"], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
combined_output = result.stdout.decode('utf-8')
print(combined_output)
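A runnable variant of this pattern, assuming a child that writes to both streams, might look like the sketch below. Because the streams are merged, the result's stderr attribute is None; the relative order of the two lines depends on how the child buffers its output, so the example does not rely on it.

```python
import subprocess
import sys

child = [
    sys.executable,
    "-c",
    "import sys; print('out'); print('err', file=sys.stderr)",
]

result = subprocess.run(
    child,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,  # merge stderr into the stdout pipe
    text=True,
)
combined_output = result.stdout
print(combined_output)
print(result.stderr)  # None, since stderr was redirected into stdout
```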
Advanced Subprocess Management
The Popen Class
The Popen class provides more control over subprocesses compared to the run() function. With Popen, you can interact with the running process, send data to its input, and read data from its output in real-time. Here's an example of using Popen to start a command:
import subprocess
process = subprocess.Popen(["command", "arg1", "arg2"])
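Unlike run(), the Popen constructor returns as soon as the child has started. A Popen object can also be used as a context manager, which waits for the child when the block exits. The sketch below assumes a child that simply sleeps for one second.

```python
import subprocess
import sys

sleeper = [sys.executable, "-c", "import time; time.sleep(1)"]

# Popen returns immediately; the child is still running inside the block
with subprocess.Popen(sleeper) as process:
    status_while_running = process.poll()  # None while the child runs

# Leaving the with block waits for the child to exit
status_after_exit = process.returncode
print(status_while_running, status_after_exit)
```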
Process Attributes and Methods
The Popen instance has several attributes and methods to help you manage the running process:
pid: The process ID of the running subprocess.
returncode: The return code of the process, if it has finished; otherwise, None.
wait(): Wait for the process to complete and return the return code.
poll(): Check if the process has completed; if so, return the return code, otherwise return None.
import subprocess
process = subprocess.Popen(["command", "arg1", "arg2"])
process.wait()
print(f"Process return code: {process.returncode}")
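Where wait() blocks, poll() lets the parent do other work while the child runs. A minimal polling loop, assuming a child that sleeps for half a second and a 0.1 second polling interval, could look like this:

```python
import subprocess
import sys
import time

process = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(0.5)"])
print(f"Started child with PID {process.pid}")

# poll() returns None until the child has exited
while process.poll() is None:
    # The parent is free to do other work here
    time.sleep(0.1)

print(f"Process return code: {process.returncode}")
```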
Process Interactions
With the Popen class, you can send data to the process's standard input (stdin) and read data from its standard output (stdout) and standard error (stderr). To do this, set the stdin, stdout, and stderr parameters to subprocess.PIPE:
import subprocess
process = subprocess.Popen(["command", "arg1", "arg2"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
# Send data to stdin, then close it so the process sees end-of-input
process.stdin.write(b"input data")
process.stdin.close()
# Read data from stdout and stderr
# Note: reading the pipes one after the other can block if the process fills the other pipe
output = process.stdout.read()
error_output = process.stderr.read()
# Wait for the process to finish
process.wait()
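Reading output as it is produced usually means iterating over the stdout pipe line by line. The sketch below assumes a child that prints three numbers and flushes after each one; real programs may buffer their output when writing to a pipe, in which case lines arrive in batches.

```python
import subprocess
import sys

# A child that prints three lines, flushing after each
child_code = "for i in range(3): print(i, flush=True)"
process = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    text=True,
)

lines = []
# Iterating over the pipe yields each line as soon as it is available
for line in process.stdout:
    lines.append(line.rstrip("\n"))
    print(f"Received: {lines[-1]}")

process.stdout.close()
return_code = process.wait()
```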
Timeouts and Process Termination
Setting Timeouts
To set a timeout for a subprocess, use the timeout parameter with the run() function or the wait() method of a Popen instance:
import subprocess
try:
    result = subprocess.run(["command", "arg1", "arg2"], timeout=5)
except subprocess.TimeoutExpired:
    # run() kills the child process before raising TimeoutExpired
    print("The command took too long to complete")
# Alternatively, with Popen
process = subprocess.Popen(["command", "arg1", "arg2"])
try:
    process.wait(timeout=5)
except subprocess.TimeoutExpired:
    # wait() leaves the process running after a timeout, so stop it explicitly
    print("The command took too long to complete")
    process.kill()
    process.wait()
Killing a Process
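To terminate a running subprocess, use the terminate() method of the Popen instance. On POSIX systems it sends SIGTERM, which the process may handle or ignore; the kill() method sends SIGKILL, which cannot be caught: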
import subprocess
process = subprocess.Popen(["command", "arg1", "arg2"])
process.terminate()
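A common pattern is to request termination first and force it only if the process does not exit within a grace period. The helper below is a hypothetical sketch of that pattern; the name stop_process and the two-second default are assumptions, not part of the subprocess module.

```python
import subprocess
import sys

def stop_process(process, grace_seconds=2.0):
    """Ask the process to exit, then force it if it does not comply."""
    process.terminate()
    try:
        return process.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        # The process ignored the request; stop it unconditionally
        process.kill()
        return process.wait()

process = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
exit_code = stop_process(process)
print(f"Process stopped with return code {exit_code}")
```

Calling wait() after terminate() or kill() also collects the exit status, so the finished child does not linger as a zombie process on POSIX systems.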
Communicating with a Process
The communicate() method of a Popen instance allows you to send data to the process's stdin and read data from its stdout and stderr. It also waits for the process to complete. This method is useful when you need to send and receive data without risking deadlocks:
import subprocess
process = subprocess.Popen(["command", "arg1", "arg2"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
output, error_output = process.communicate(input=b"input data")
print(f"Output: {output.decode('utf-8')}")
print(f"Error Output: {error_output.decode('utf-8')}")
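communicate() also accepts a timeout. The sketch below follows the pattern of killing the child on timeout and then calling communicate() again to collect whatever output remains. The child, which upper-cases its input, and the five-second limit are assumptions for illustration.

```python
import subprocess
import sys

# A child that reads all of stdin and writes it back in upper case
child_code = "import sys; sys.stdout.write(sys.stdin.read().upper())"
process = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)

try:
    output, error_output = process.communicate(input="input data", timeout=5)
except subprocess.TimeoutExpired:
    # Stop the child, then collect any output it produced before exiting
    process.kill()
    output, error_output = process.communicate()

print(f"Output: {output}")
```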