As a regular Linux user, I’ve been intrigued by the simplicity of the UNIX philosophy, which states, “Have one tool, and have that tool do its job well.” In this post, I will dive into how the
cp command works in Linux, followed by a basic implementation of the command in Python for a deeper understanding.
cp Command Works?
When you type
cp into your terminal, you’re invoking a program that’s part of the GNU Core Utilities, equipped to handle a variety of file system operations.
Here’s the typical workflow of the cp command:
- Argument Parsing: The command-line arguments, including any options (like
-rfor recursive copy or
-ifor interactive prompts), are parsed.
- Path Resolution: The command resolves the absolute or relative paths provided for the source and destination.
- Permission Checks: Before proceeding,
cpchecks if you have the read permission for the source file and the write permission for the destination directory.
- File Opening: Utilizing the
cpopens the source file for reading. If the destination file doesn’t exist, it’s created using the
creat()system call; otherwise, it’s opened with
- Data Transfer: Through a loop that uses
write()system calls, data is transferred from the source to the destination file in chunks. This way,
cpefficiently handles files of any size without consuming excessive memory.
- Metadata Copying: The command duplicates the source file’s metadata, such as permissions and timestamps, to the new file using system calls like
- Resource Cleanup: After the copy operation, cp closes both file descriptors using the
Now, let’s dive deep into the system level to understand how system calls are utilized at a lower level when using the
System Calls: Diving Deeper into the internals of
System calls provide the interface between a running process and the operating system. Here’s a closer look at the ones
open() system call is used by
cp to obtain a file descriptor for the source and destination files. This system call takes a
flags as arguments, determining how the file should be accessed. When copying a file,
cp opens the source file in read-only mode
(O_RDONLY) to ensure the file is not modified. If the destination file does not exist,
open() with the
O_CREAT flag to create it, and with
O_WRONLY to open it for writing.
After opening the source file,
cp uses the
read() system call to read data from the file into a buffer. This buffer temporarily stores the data as it’s being copied. The
read() function takes three arguments: the file descriptor, the buffer into which the data is read, and the number of bytes to read.
write() is the system call used to transfer data from the buffer to the destination file. It takes a file descriptor for the destination file, the buffer with the data, and the number of bytes to write from the buffer.
cp will repeatedly read from the source and write to the destination until all data is copied.
Once the copy operation is complete,
cp needs to release the file descriptors so they can be reused by the system. The
close() system call is used for this purpose, closing both the source and destination file descriptors.
Copying a file also involves duplicating its permissions. The
fchmod() system call is used by
cp to set the permissions of the destination file to match those of the source file. It requires the file descriptor of the open file and the mode (permission settings) to be applied.
futimens() system call allows
cp to preserve the timestamps of the source file, setting the access and modification times of the destination file to match. It takes a file descriptor and an array of timespec structures representing the new times.
creat() is worth mentioning as it’s often used as a shorthand for
open() with flags set to create a new file or rewrite an old one. It’s equivalent to
These system calls are the building blocks that allow
cp to function, orchestrating the process of duplicating file content, permissions, and timestamps from one location to another within linux.
cp in Python:
The script reads the source file in manageable chunks (1KB in this case) and writes these chunks to the destination file, ensuring that the content is preserved during the copy process.
If the destination path is a directory, the script appends the name of the source file to the destination path to maintain the original filename in the new location. This behavior mimics the standard
cp command in Linux when the target is a directory.
cp command is a great example of the Unix philosophy: simple tools that do one thing well. By understanding the system calls it leverages, we gain insight into the operating system’s inner workings. Moreover, by implementing its functionality in Python, we can appreciate the power and simplicity provided by high-level programming languages.
While our Python script does not cover all features of
cp, such as recursive copying or interaction with the user, it serves as an understanding of the
Finally, I hope you enjoyed reading this and had the opportunity to learn something new. If you have any feedback, please feel free to leave a comment below. If you prefer not to comment publicly, you can always send me an email.
If you’re interested in advancing your programming skills and would love to learn how to build cool projects like Docker, BitTorrent, or even understand the internals of your favorite tools such as
Git by recreating them in your preferred programming language, I highly recommend you join Code Crafters.
If you loved this post, you can always support my work by buying me a coffee. your support would mean the world to me! Also, if you end up sharing this on X, definitely tag me @muhammad_o7. Also follow me on LinkedIn
Note: If you like to be notified about the upcoming posts you can subscribe to the RSS or you can leave your email here