File handling is an essential part of programming. Files are used to store data permanently. Python provides built-in tools for storing program data in files and performing operations on them. Opening, writing, reading, closing, overwriting, and appending are a few examples of these operations.
In this article, you'll learn about various file-handling operations, focusing on how to overwrite a file in Python.
Before proceeding, let's look at the fundamentals!
What is a File in Python?
A file is a collection of data stored as a unit on the disk. It is identified by its file name and file extension. Almost everything on a computer is stored in the form of a file, be it Excel sheets, documents, or more.
We generally deal with two types of Files in Python:
- Binary File: As the name suggests, these files store binary data (like audio files, images, and videos). They are not human-readable and are meant to be interpreted by programs.
- Text File: These files store data in human-readable form, each new line ending with a newline character (\n). This is used to store 'character data'.
In this article, we'll focus on text files.
You must be wondering, what is the need for Files in Python?
Need for Files in Python
Here are a few reasons highlighting the need for files while programming:
- Storing data in a file preserves it even after the program terminates. A file can hold the program's input, intermediate computations, or output, as required.
- Reading input from files saves time when dealing with huge amounts of data.
- It is easy to relocate computational data through files.
Now that you've learned about files and why we need them, let's take a look at File Operations.
File Operations in Python
File operations are the operations that can be performed on a file. These include the operations a user carries out through Python commands (or through any other programming language).
A few fundamental file operations are listed below:
- Open: The first and most important operation on a file is to open it. Whether the file is new or already exists, you must open it before doing any further processing. Python offers a built-in open() function to open a file. The open() function returns a file object, also known as a handle, which is used to perform further operations.
- Read: As the name suggests, this operation reads the content of a file. Python provides various methods to read a file, the most common being the read() method. Note that in order to read a file, you'll need to open it in a mode that allows reading (such as 'read mode').
- Write: This operation is used to write information into a file. Several modes can be used for the write operation (we'll soon discuss the different modes).
- Close: After completing all operations, the file must be closed so that any buffered data is saved to disk. Closing also frees the resources used by the file while processing. Python has a close() method to close the file.
You're probably wondering why you need to manually close the file. Isn't Python's Garbage Collector capable of performing the task?
The garbage collector cleans up unreferenced objects, but you should not rely on it to close a file. There is no guarantee about when it will run, so data that is still buffered may be lost or an error may occur. To learn more about garbage collectors, see Delete a Variable in Python.
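One common way to make sure a file is closed is Python's with statement, which closes the file automatically when the block ends. Below is a minimal sketch comparing it with an explicit close(). The file name notes.txt and the temporary directory are assumptions made only for illustration.

```python
import os
import tempfile

# Work in a throwaway directory so the sketch doesn't touch real files.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.txt")  # hypothetical file name

    # Explicit close(): try/finally makes sure the file is closed
    # even if write() raises an exception.
    f = open(path, "w")
    try:
        f.write("saved with close()\n")
    finally:
        f.close()
    closed_manually = f.closed

    # The with statement closes the file for us when the block ends.
    with open(path, "a") as f:
        f.write("saved with a with block\n")
    closed_by_with = f.closed

    with open(path) as f:
        lines = f.read().splitlines()

print(closed_manually, closed_by_with)
print(lines)
```

In both cases the file is closed deterministically, without waiting for the garbage collector.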
Take a look at the below for a few other file operations:
File Access Modes in Python
File access modes specify how a file will be used after it is opened. They regulate the type of activities that can be performed on the file. They also determine the initial position of the "file handler".
A File Handler is like a pointer that indicates the position from where data should be read or written in a file. You can also assume it is a cursor, for better understanding.
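The idea of a handler position can be sketched with the tell() method, which reports where the handler currently is. This is only an illustration; the file name demo.txt and its content are assumptions.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("hello")                  # 5 characters, no newline

    with open(path, "r") as f:
        start_r = f.tell()                # read mode starts at the beginning
        first_two = f.read(2)             # reading moves the handler forward
        rest = f.read()                   # so the next read continues from there

    with open(path, "a") as f:
        start_a = f.tell()                # append mode starts at the end

print(start_r, start_a)
print(first_two, rest)
```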
File Access Modes are important to learn since they play a major role while dealing with files. They determine what you are allowed to do with the file during any file operation. So, before moving on to overwriting a file in Python, let's get a better understanding of file access modes in Python.
Python has six commonly used File Access Modes for text files. They are as follows:
|1.||Read Only ('r')||Default mode. Opens an existing file for reading. (Raises an I/O error, FileNotFoundError, if the file does not exist.)|
|2.||Read & Write ('r+')||Opens an existing file for both reading and writing. The existing content is not truncated, and the handler starts at the beginning of the file.|
|3.||Write Only ('w')||Opens a file for writing. Creates a new file if the file doesn't exist, and truncates (overwrites) the file if it does.|
|4.||Write & Read ('w+')||Opens a file for writing as well as reading. Like 'w', it creates the file if needed and truncates any existing content.|
|5.||Append Only ('a')||Opens a file to insert data at the end. The existing data won't get truncated, and the file is created if it doesn't exist.|
|6.||Append & Read ('a+')||Opens a file for writing (at the end) and reading. The file is created if it doesn't exist.|
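The difference between some of these modes can be sketched as follows. The file name modes.txt and the strings written are assumptions chosen for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "modes.txt")  # hypothetical file name

    with open(path, "w") as f:       # 'w' creates the file
        f.write("first")
    with open(path, "a") as f:       # 'a' keeps "first" and adds to the end
        f.write(" second")
    with open(path) as f:            # default mode is 'r'
        after_append = f.read()

    with open(path, "r+") as f:      # 'r+' does not truncate; writing starts at position 0
        f.write("FIRST")
    with open(path) as f:
        after_r_plus = f.read()

    with open(path, "w") as f:       # 'w' truncates the existing content
        f.write("third")
    with open(path) as f:
        after_write = f.read()

print(after_append)   # first second
print(after_r_plus)   # FIRST second
print(after_write)    # third
```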
The access mode is passed along with the file name in the open() function.
The syntax to open a file is:
f = open("FilePath", "access mode")
The file could be in the same directory as your script or in a different one, so you must take this into account while specifying the file location. The location can be either an Absolute Path (starting from the root directory) or a Relative Path (resolved from the current working directory, which is often the folder you run the script from).
Let's take an example of how to open a file in Python:
(Note: I've created a file "favtutor.txt" in the same directory as my Python file (temp.py))
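A minimal sketch of such an example is given here. To keep it self-contained, it first creates favtutor.txt inside a temporary directory; the file's content is an assumption, not the content of the original file.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "favtutor.txt")

    # Set-up only: create the file with some assumed sample content.
    with open(path, "w") as setup:
        setup.write("This is a sample File.\nIt has two lines.\n")

    # Open the file in read-only mode and read everything in it.
    f = open(path, "r")
    content = f.read()
    f.close()

print(content)
```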
Note that the file handler 'f' calls the read() method to read (obtain) the content of the file.
You'll notice that the file "favtutor.txt" was created before we called the open() function. Let's try opening another (non-existing) file in Python:
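A small sketch of this attempt follows. It catches the exception so that the script keeps running; the temporary directory is an assumption used to guarantee that abc.txt really does not exist.

```python
import os
import tempfile

error_name = None
with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "abc.txt")   # this file was never created

    try:
        f = open(missing, "r")               # read mode needs an existing file
    except FileNotFoundError as err:
        error_name = type(err).__name__
        print("Could not open the file:", err)

print(error_name)   # FileNotFoundError
```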
It produces an error indicating the absence of any file named 'abc.txt'.
Try opening the file using "write-only" mode. You'll notice the basic difference between read-only and write-only modes.
While read-only mode raises an error for a non-existing file, write-only mode creates the file when open() is called.
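This behaviour can be checked with a short sketch. The temporary directory is an assumption that keeps the example from touching real files.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "abc.txt")
    existed_before = os.path.exists(path)   # nothing there yet

    # Opening in 'w' mode creates the missing file instead of raising an error.
    f = open(path, "w")
    f.close()

    existed_after = os.path.exists(path)    # open() created the file
    size = os.path.getsize(path)            # the new file is empty

print(existed_before, existed_after, size)  # False True 0
```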
Now that we've learned some basics of File Handling in Python, let's move on to the file operation - "overwriting" a file in Python.
What is Overwriting a File in Python?
Overwriting is the process of replacing old data with new data. It involves altering the data already written in a file. Overwriting a file can be understood as deleting an existing file and replacing it with a new one with the same name.
Do not confuse overwriting a file with deleting it. They are different operations.
A deleted file can often be recovered (for example, from the recycle bin or with recovery tools), whereas an overwritten file usually cannot. This is because overwriting replaces the original content of the file, altering the stored data itself. As a result, the chances of retrieving the old data are much lower.
Refer to the image below for a better understanding of Overwriting a File in Python:
Before moving on to ways to overwrite a file in Python, remember that "overwriting a file in Python" is sometimes also taken to mean "replacing a few lines in a file". Either way, it is different from appending data to a file, which keeps the existing content.
How to Overwrite a File in Python?
The need to overwrite often arises when you want to completely alter a file in Python. The methods below include both ways to overwrite the complete file and ways to overwrite only a few lines.
So, let's get started!
01) Using write only ('w') File Access Mode
The 'write only' ('w') File Access Mode allows you to only write to the file. (Remember to specify the file access mode in the open() function.) File objects provide the write() method for writing to a file.
If the file contains any content, it will be completely overwritten by anything you write on the opened file. All the previous data in the file will be lost, and can't be recovered in many cases when you overwrite a file using this method.
Let's take an example to understand it better.
Consider overwriting a file called favtutor.txt. That file must be opened in write mode and assigned to the file handler 'f'. The file handler 'f' can then be used to perform the write() operation.
Here's a snapshot of the already existing file:
Overwriting the file in Python:
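A minimal sketch of this step is shown here. It creates favtutor.txt in a temporary directory first; both the old and the new content are assumptions made for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "favtutor.txt")

    # Set-up only: the already existing file with assumed old content.
    with open(path, "w") as setup:
        setup.write("This is the old content of the File.\n")

    # Opening in 'w' mode truncates the file; write() then stores the new data.
    f = open(path, "w")
    f.write("This file has been overwritten.\n")
    f.close()

    # Reopen in read mode to check the result.
    with open(path, "r") as check:
        content = check.read()

print(content)
```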
Since we cannot read the file (as it is in write-only mode), here's a snapshot of the file after overwriting:
Using the write-only access mode is the simplest way to overwrite a file in Python.
02) Using os.remove() function
This is another method to overwrite a complete file in Python. It involves deleting the existing file and creating a new file with the same name. While this indirectly overwrites the file, it isn't recommended to use this method often, because the new file may get a different inode number.
Inodes store information about files and directories (folders), such as file ownership, access mode (read, write, execute permissions), and file type. Each file is connected with an inode, which is identifiable by an integer known as an i-number or inode number.
Note that this requires importing the os module.
Let's try removing a file (abc.txt) in Python to overwrite it.
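A small sketch of this attempt follows. The exception is caught so that the script keeps running, and the temporary directory is an assumption that guarantees abc.txt is absent.

```python
import os
import tempfile

error_name = None
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "abc.txt")   # this file was never created

    try:
        os.remove(path)                   # removing a missing file raises an error
    except FileNotFoundError as err:
        error_name = type(err).__name__
        print("Could not remove the file:", err)

print(error_name)   # FileNotFoundError
```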
The file abc.txt doesn't exist, hence we encounter an error on trying to remove it. To avoid this, check whether the file exists before removing it. The os module provides the os.path.exists() function for this purpose.
(Again on a file that does not exist!) Example:
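A minimal sketch of such a check is given here. The messages are assumptions chosen for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "abc.txt")   # this file was never created

    if os.path.exists(path):
        os.remove(path)
        message = "File removed"
    else:
        message = "File does not exist"

print(message)   # File does not exist
```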
The if-else block saves us from encountering an error. We can also open the file using 'w+' or 'w' access modes in the else block. These modes create a new file if the file doesn't exist.
Now, let's take an example of overwriting an existing file in Python using os.remove() function:
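One way this example might look is sketched here. The file is created in a temporary directory first, and both the old and the new content are assumptions.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "favtutor.txt")

    # Set-up only: the existing file with assumed old content.
    with open(path, "w") as setup:
        setup.write("Old content of the file.")

    # Delete the existing file, then create a new one with the same name.
    if os.path.exists(path):
        os.remove(path)
    f = open(path, "w")
    f.write("Brand new content.")
    f.close()

    # Close and reopen in read mode to look at the result.
    f = open(path, "r")
    content = f.read()
    f.close()

print(content)   # Brand new content.
```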
I've opened the file at the end to read the content of the file. Note that you need to close and open a file again in order to change its access mode.
03) Using seek() and truncate() function
This method can be used to overwrite a file (completely or partially) in Python. It involves two functions:
- The seek() function: This function moves the handler (pointer) to a given position in the file. Calling seek(0) places it at the beginning of the file, so that new data is written from the start.
- The truncate() function: This function cuts the file off at the current handler position (or at a given size), removing all the data after that point.
After calling seek(0), you can add your data to the file using the write() function, and then call truncate() to remove whatever is left of the old content.
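These steps can be sketched as follows. The file is opened in 'r+' mode so that its content is not truncated on opening; the file name and both strings are assumptions for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "favtutor.txt")

    # Set-up only: an existing file whose old content is longer than the new one.
    with open(path, "w") as setup:
        setup.write("This is a fairly long line of old content.")

    with open(path, "r+") as f:
        f.seek(0)            # move the handler to the start of the file
        f.write("New data")  # overwrite the first characters
        f.truncate()         # cut off the rest of the old content

    with open(path, "r") as check:
        content = check.read()

print(content)   # New data
```

Without the truncate() call, the tail of the old line would remain after "New data".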
Note that the file is overwritten by new data.
04) Using replace() method
This method overwrites a specific phrase in an existing file. It partly relies on the 'write only' mode. Here, we store the data in a variable, make a few replacements in it using the replace() method, and then write that variable back to overwrite the file.
First, you need to read the file in order to store the data in a variable (here, a variable named content). Then the replace() method is applied to the stored data. This method looks for exact matches of the given substring and replaces each of them with the new one.
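A minimal sketch of this approach is shown here. The file content is an assumption, and "File" is replaced with "Data".

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "favtutor.txt")

    # Set-up only: an existing file with assumed content.
    with open(path, "w") as setup:
        setup.write("This File stores text. Keep the File safe.")

    # Step 1: read the whole file into a variable.
    with open(path, "r") as f:
        content = f.read()

    # Step 2: replace every exact occurrence of "File" with "Data".
    content = content.replace("File", "Data")

    # Step 3: write the modified text back, overwriting the file.
    with open(path, "w") as f:
        f.write(content)

    with open(path, "r") as check:
        result = check.read()

print(result)   # This Data stores text. Keep the Data safe.
```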
Note how the word "File" is 'overwritten' by "Data". The above method also answers the question: "How to replace a string in a file in Python?"
05) Using re.sub() function
This is another way of overwriting a file in Python. It is similar to replacing data in a file. This method requires the sub() function of the Regular Expression module (re module). Recall that the sub() function returns a string in which all matching occurrences of the given pattern are replaced by the replacement string. Like replace(), it scans the text sequentially to find matches.
You'll also encounter some new functions here:
- Path(): This class is imported from the pathlib module. It creates an object that represents the path of a directory/file.
- write_text(): This function opens a file in write mode, writes text in it (replacing any existing content), and closes the file.
- read_text(): This function opens a file, reads its content as a string, and closes it.
So, with the write_text() and read_text() functions, you don't need to open or close the file separately.
Let's take an example to overwrite a file in Python using re module:
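A minimal sketch of such an example follows. The file content and the replacement word "document" are assumptions chosen for illustration.

```python
import re
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "favtutor.txt"

    # Set-up only: an existing file with assumed content.
    path.write_text("A File is not a file until it is saved as a file.")

    # Read the text, replace every match of the pattern "file",
    # and write the result back over the old content.
    updated = re.sub("file", "document", path.read_text())
    path.write_text(updated)

    result = path.read_text()

print(result)   # A File is not a document until it is saved as a document.
```

If a case-insensitive match is wanted instead, re.sub() accepts flags=re.IGNORECASE.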
Note that the data (a string here) is case-sensitive. Since the sub() function matches the text exactly, it only overwrites 'file' and not 'File' in the above example.
In this article, we've talked about some of the easiest ways to overwrite a file in Python. Apart from the above methods, there is another method to overwrite a file in Python.
It requires relocating a file (say abc.txt) to a directory that already has a file of the same name (abc.txt, the file to be overwritten). This method uses the shutil module. However, it is not recommended, since it basically replaces one file with another by relying on how the operating system handles files with the same name.
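A rough sketch of this idea with shutil.move() is given here. The directory names source and target are assumptions. Note that when only the destination directory is passed and it already contains abc.txt, shutil.move() raises shutil.Error, so the sketch passes the full destination file path instead. The exact behaviour may also depend on the platform, so treat this as an illustration rather than a recommended recipe.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: source/abc.txt (new) and target/abc.txt (old).
    src_dir = os.path.join(tmp, "source")
    dst_dir = os.path.join(tmp, "target")
    os.mkdir(src_dir)
    os.mkdir(dst_dir)
    src = os.path.join(src_dir, "abc.txt")
    dst = os.path.join(dst_dir, "abc.txt")
    with open(src, "w") as f:
        f.write("new content")
    with open(dst, "w") as f:
        f.write("old content")

    # Passing only the directory is refused because abc.txt is already there.
    try:
        shutil.move(src, dst_dir)
        refused = False
    except shutil.Error:
        refused = True

    # Passing the full file path replaces the old file with the moved one.
    shutil.move(src, dst)

    with open(dst, "r") as f:
        result = f.read()
    source_still_exists = os.path.exists(src)

print(refused, result, source_still_exists)
```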
Though I've mostly used 'read only' and 'write only' modes, you should try these methods with other file access modes. You might get surprised by the output. Happy Coding!