Q:

Import multiple csv files into pandas and concatenate into one DataFrame


Importing CSV files in pandas is simple. The pandas.read_csv() function reads a CSV file and returns its contents as a DataFrame, ready for further analysis. Sometimes a single file does not contain all the required data; in that case we need to import multiple CSV files and concatenate them into one DataFrame.

Note: To import a CSV file, the file must exist on your computer at the path you pass to read_csv().

To import multiple CSV files, we first need a way to collect the paths of all the required files. The glob.glob() method returns a list of all the file paths that match the pattern passed to it as a parameter.

The glob.glob() Method

This method takes a single string: a path pattern. The pattern combines the path of the folder where the required files are located with a wildcard expression, such as *.csv, that identifies the required files.

Syntax:

req_files = glob.glob("C:/Users/hp/Desktop/Includehelp/*.csv")

Here, glob.glob() returns a list of the paths of all the files in the folder whose names end with ".csv". The list is not guaranteed to be in sorted order.
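To see the pattern matching in isolation, here is a minimal sketch. The helper name list_csv_files and the temporary demo folder are assumptions made for illustration; they are not part of the original example. The helper sorts the result because glob.glob() itself makes no promise about order.

```python
import glob
import os
import tempfile


def list_csv_files(folder):
    """Return the sorted paths of all files in `folder` whose names end in .csv."""
    pattern = os.path.join(folder, "*.csv")
    return sorted(glob.glob(pattern))


# Build a throwaway folder (hypothetical data) so the sketch runs anywhere
demo_dir = tempfile.mkdtemp()
for name in ("a.csv", "b.csv", "notes.txt"):
    with open(os.path.join(demo_dir, name), "w") as fh:
        fh.write("x\n1\n")

# Only the two .csv files match the pattern; notes.txt is skipped
csv_names = [os.path.basename(p) for p in list_csv_files(demo_dir)]
print(csv_names)
```

Building the pattern with os.path.join() keeps the sketch independent of the path separator used by the operating system.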

The pandas.concat() Method

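This method combines the DataFrames passed to it either along the rows (the default, axis=0) or along the columns (axis=1). The DataFrames, not their file paths, are passed as a list or any other iterable as the first parameter. Passing ignore_index=True gives the result a fresh index starting from 0.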
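A small sketch of both directions may help. The two DataFrames below are made-up data, not the CSV files used in this article.

```python
import pandas as pd

# Two small, hypothetical DataFrames with the same columns
first = pd.DataFrame({"name": ["A", "B"], "score": [1, 2]})
second = pd.DataFrame({"name": ["C"], "score": [3]})

# Along the rows (default axis=0); ignore_index=True renumbers the rows 0..n-1
stacked = pd.concat([first, second], ignore_index=True)

# Along the columns (axis=1); rows are aligned on the index, so the
# shorter DataFrame is padded with NaN
side_by_side = pd.concat([first, second], axis=1)

print(stacked)
print(side_by_side)
```

Without ignore_index=True the stacked result would keep the original row labels (0, 1, 0), which is rarely what is wanted when combining files.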

To work with pandas, we need to import the pandas library. Below is the syntax,

import pandas as pd

All Answers


Let us understand with the help of an example.

# Importing pandas package
import pandas as pd

# Importing glob library in order to get 
# a list of all the required files
import glob

# First, importing the two datasets separately.
# read_csv() already returns a DataFrame, so no
# extra pd.DataFrame() call is needed
df1 = pd.read_csv('C:/Users/hp/Desktop/Includehelp/mycsv.csv')
df2 = pd.read_csv('C:/Users/hp/Desktop/Includehelp/mycsv1.csv')

# Printing separate DataFrames
print("First DataFrame:\n",df1,"\n\n")
print("Second DataFrame:\n",df2,"\n\n")

# Requesting a list of all the csv files present 
# in a specified folder
req_files = glob.glob("C:/Users/hp/Desktop/Includehelp/*.csv")

# Importing multiple CSV files and
# concatenating them into one DataFrame
# (pd.concat() already returns a DataFrame)
df3 = pd.concat(map(pd.read_csv, req_files), ignore_index=True)

# Printing the combined DataFrame
print("Combined Dataframe:\n",df3)

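Output:

The two separate DataFrames are printed first, followed by the combined DataFrame, which contains the rows of every CSV file found in the folder.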
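After concatenation it is no longer obvious which file a row came from. One possible variation, sketched below, adds a column with the file name before concatenating. The function name read_folder, the column name source_file and the temporary demo files are all assumptions made for this sketch, not part of the original program.

```python
import glob
import os
import tempfile

import pandas as pd


def read_folder(folder):
    """Read every .csv file in `folder` into one DataFrame, tagging each row with its file name."""
    paths = sorted(glob.glob(os.path.join(folder, "*.csv")))
    if not paths:
        # pd.concat() raises ValueError on an empty list, so fail with a clearer message
        raise FileNotFoundError(f"no CSV files found in {folder}")
    frames = [
        pd.read_csv(path).assign(source_file=os.path.basename(path))
        for path in paths
    ]
    return pd.concat(frames, ignore_index=True)


# Hypothetical demo data written to a temporary folder
demo_dir = tempfile.mkdtemp()
pd.DataFrame({"id": [1, 2]}).to_csv(os.path.join(demo_dir, "part1.csv"), index=False)
pd.DataFrame({"id": [3]}).to_csv(os.path.join(demo_dir, "part2.csv"), index=False)

combined = read_folder(demo_dir)
print(combined)
```

Sorting the paths makes the row order reproducible between runs, since glob.glob() does not guarantee any particular order.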
