Methods of Renaming Columns in a DataFrame
Renaming columns in a pandas DataFrame is a common operation in data analysis. Typical reasons are making column names easier to understand, following a naming convention, or replacing non-standard characters with standard ones. In this article, I will go through two primary methods to rename columns in a DataFrame.
Directly Modifying the DataFrame.columns Attribute
The first method involves directly assigning a new list of column names to the DataFrame.columns attribute. This can be done in the following way:
df.columns = ['new_colname1', 'new_colname2', ..., 'new_colnameN']
In this approach, a new list of column names is created and assigned to the DataFrame's columns attribute. The number of names in the list must match the number of columns in the DataFrame; otherwise pandas raises a ValueError. Names are assigned by position, so the list must follow the order of the existing columns. This method works best when all column names need to be changed and the number of columns is manageable.
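As a minimal sketch of this positional assignment, the example below uses a small hypothetical DataFrame; the data and the names a, b, c, id, score, and rank are illustrative assumptions, not part of the original article. It also shows what happens when the list length does not match.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# Positional assignment: the first new name replaces "a", the second "b", and so on.
df.columns = ["id", "score", "rank"]
renamed_columns = list(df.columns)

# A list of the wrong length is rejected with a ValueError.
try:
    df.columns = ["only_one_name"]
    length_mismatch_raised = False
except ValueError:
    length_mismatch_raised = True
```

Because the mapping is purely positional, it is worth checking the existing column order before assigning, especially when the DataFrame was built from a file whose layout may change.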
Using the DataFrame.rename() Method
For a more flexible approach, you can use the DataFrame.rename() method, which allows you to specify which columns you want to rename. This can be especially useful when you only need to change a few column names. Here's how to use this method:
df.rename(columns={'old_colname1': 'new_colname1', 'old_colname2': 'new_colname2'}, inplace=True)
In this code, a dictionary is passed to the columns parameter of the rename() method. Each key-value pair in the dictionary corresponds to an old column name and its new name. The inplace=True parameter means that the changes are applied directly to the original DataFrame. If inplace=False is used (which is the default), then the method returns a new DataFrame with the renamed columns, and the original DataFrame remains unchanged.
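The difference between the two inplace settings can be sketched as follows. The DataFrame contents and the extra column named other are assumptions made for illustration; only the rename() call itself comes from the text above.

```python
import pandas as pd

# Hypothetical sample data; "other" is an extra column that is never renamed.
df = pd.DataFrame({"old_colname1": [1, 2], "old_colname2": [3, 4], "other": [5, 6]})

# Default (inplace=False): a new DataFrame is returned and df is left untouched.
renamed = df.rename(columns={"old_colname1": "new_colname1"})
original_after_copy = list(df.columns)
renamed_cols = list(renamed.columns)

# inplace=True: df itself is modified, so there is no new DataFrame to capture.
df.rename(columns={"old_colname2": "new_colname2"}, inplace=True)
original_after_inplace = list(df.columns)
```

Columns that do not appear in the dictionary keep their names, which is what makes rename() convenient when only a few columns need to change.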