2022-11-18

Converting Pandas DataFrame to Dict

Introduction

In this article, I will discuss the process of converting a Pandas DataFrame to a dictionary.

to_dict Method

The main way to convert a Pandas DataFrame into a dictionary is the to_dict() method. Its signature is as follows:

python

dataframe.to_dict(orient='dict', into=dict)

The orient parameter controls the shape of the output, and the into parameter sets the mapping class used to build it.

Orient

The sections below cover six values of the orient parameter: dict, list, series, split, records, and index.

dict

Setting the orient parameter to dict (the default) creates a dictionary of dictionaries. The keys of the outer dictionary are the column names, and each inner dictionary maps index labels to the values in that column.

python

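import pandas as pd

# Small example DataFrame, reused in the following sections
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)

# Column name -> {index label -> value}
result = df.to_dict(orient='dict')
print(result)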

{'A': {0: 1, 1: 2, 2: 3}, 'B': {0: 4, 1: 5, 2: 6}}
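The inner keys are index labels rather than positions, which is easier to see with a non-default index. The following is a minimal sketch; the string labels 'x', 'y', and 'z' and the names df_labeled, nested, and value are made up for illustration.

```python
import pandas as pd

# Hypothetical string index, used only to show where the inner keys come from
df_labeled = pd.DataFrame(
    {'A': [1, 2, 3], 'B': [4, 5, 6]},
    index=['x', 'y', 'z'],
)

nested = df_labeled.to_dict(orient='dict')

# Outer key is the column name, inner key is the index label
value = nested['A']['y']
print(value)
```

With this layout, a single cell can be read as nested[column][index_label].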

list

When orient is set to list, the resulting dictionary will have the column names as keys and the column data as lists of values.

python

result = df.to_dict(orient='list')
print(result)

{'A': [1, 2, 3], 'B': [4, 5, 6]}
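Because a dictionary of equal-length lists is also valid input for the DataFrame constructor, this orientation can be turned back into a DataFrame. A small sketch, assuming the default integer index (the list orientation does not store index labels, so a custom index would be lost); as_lists and rebuilt are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
as_lists = df.to_dict(orient='list')

# Rebuild a DataFrame from the column -> list mapping.
# The index is not part of as_lists, so pandas assigns a fresh default index.
rebuilt = pd.DataFrame(as_lists)

print(rebuilt.equals(df))
```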

series

With orient set to series, the output will be a dictionary of Pandas Series objects, with the column names as keys.

python

result = df.to_dict(orient='series')
print(result)

{'A': 0    1
1    2
2    3
Name: A, dtype: int64, 'B': 0    4
1    5
2    6
Name: B, dtype: int64}

split

The split orientation generates a dictionary with three keys: index, columns, and data. The values for these keys are the index labels, column names, and data values, respectively.

python

result = df.to_dict(orient='split')
print(result)

{'index': [0, 1, 2], 'columns': ['A', 'B'], 'data': [[1, 4], [2, 5], [3, 6]]}
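The three keys match the data, index, and columns arguments of the DataFrame constructor, so a split dictionary keeps enough information to rebuild the frame. A minimal sketch of that round trip (parts and rebuilt are illustrative names):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
parts = df.to_dict(orient='split')

# Pass each piece of the split dictionary to the matching constructor argument
rebuilt = pd.DataFrame(
    data=parts['data'],
    index=parts['index'],
    columns=parts['columns'],
)

print(rebuilt)
```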

records

When orient is set to records, the output is a list of dictionaries, with each dictionary representing a row in the DataFrame. The keys in each dictionary correspond to the column names.

python

result = df.to_dict(orient='records')
print(result)

[{'A': 1, 'B': 4}, {'A': 2, 'B': 5}, {'A': 3, 'B': 6}]
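A list of row dictionaries has the same shape as a JSON array of objects, so the records orientation is a common intermediate step before serialization. The sketch below assumes every value is JSON-serializable, as the integers here are; columns holding timestamps or other special types would need converting first. The names rows and payload are illustrative.

```python
import json

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
rows = df.to_dict(orient='records')

# Serialize the list of row dictionaries with the standard library
payload = json.dumps(rows)
print(payload)
```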

index

Setting the orient parameter to index creates a dictionary of dictionaries. The keys of the outer dictionary are the index labels, and each inner dictionary maps column names to the values in that row.

python

result = df.to_dict(orient='index')
print(result)

{0: {'A': 1, 'B': 4}, 1: {'A': 2, 'B': 5}, 2: {'A': 3, 'B': 6}}
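One use for this orientation is building a lookup table keyed by a meaningful column. The sketch below uses made-up data (the people frame and its name, age, and city columns are hypothetical) and assumes the chosen key column has no duplicates, since pandas raises a ValueError for orient='index' when the index is not unique.

```python
import pandas as pd

# Hypothetical data: 'name' is used as the lookup key
people = pd.DataFrame({
    'name': ['alice', 'bob'],
    'age': [30, 25],
    'city': ['Tokyo', 'Osaka'],
})

# Move 'name' into the index, then key the outer dictionary by it
lookup = people.set_index('name').to_dict(orient='index')

print(lookup['bob'])
```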

Converting DataFrames into OrderedDict

By default, the to_dict() method returns a standard Python dictionary. To get a different mapping type, pass it through the into parameter, for example collections.OrderedDict. Regular dictionaries have preserved insertion order since Python 3.7, so OrderedDict is mainly useful when downstream code expects that type or relies on its order-sensitive equality comparison.

Let's take a look at an example of converting a DataFrame to an OrderedDict with the orient parameter set to dict.

python

import pandas as pd
from collections import OrderedDict

data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)

result = df.to_dict(orient='dict', into=OrderedDict)
print(result)

OrderedDict([('A', OrderedDict([(0, 1), (1, 2), (2, 3)])), ('B', OrderedDict([(0, 4), (1, 5), (2, 6)]))])

As you can see, the output is an OrderedDict with the column names ('A' and 'B') as keys. The into class is applied at both levels, so the inner mappings are OrderedDict objects as well. The key order is maintained in each OrderedDict.

You can also convert the DataFrame to an OrderedDict using other orient values. For example, let's convert the DataFrame with orient set to records:

python

result = df.to_dict(orient='records', into=OrderedDict)
print(result)

[OrderedDict([('A', 1), ('B', 4)]), OrderedDict([('A', 2), ('B', 5)]), OrderedDict([('A', 3), ('B', 6)])]

In this case, the output is a list of ordered dictionaries, with each dictionary representing a row in the DataFrame. The key order is preserved within each dictionary.
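The into parameter is not limited to OrderedDict. As a further sketch, collections.defaultdict also works, with one caveat: pandas expects an initialized defaultdict instance rather than the bare class. The names template, rows, and first, and the 'missing' key, are chosen here for illustration.

```python
from collections import defaultdict

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# An initialized defaultdict; its default_factory (list) is reused for each row
template = defaultdict(list)
rows = df.to_dict(orient='records', into=template)

first = rows[0]
print(first['A'])        # existing key from the first row
print(first['missing'])  # absent key falls back to an empty list
```

Each row is a separate defaultdict, so looking up an absent key on one row does not affect the others.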


Ryusei Kakujo
