In this post we will create a Python script that selects multiple columns from a Pandas data frame. Pandas is a Python library for data analysis and manipulation.
In this example our data source will be a CSV file (Sample.csv). A sample of the data can be seen in the image below, with the column headings marked in bold text.
To load the CSV data into a Pandas data frame we can use the snippet of code below.
import pandas as pd
df = pd.read_csv('Sample.csv')
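If you want to confirm the column headings without opening the file, a minimal sketch along these lines may help. Sample.csv is not available here, so the sketch reads a small in-memory stand-in instead. Only the 'Name' and 'Platform' headings come from this post; the 'Rank' and 'Year' columns and all of the row values are assumptions made up for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for Sample.csv. Only 'Name' and 'Platform' are
# headings from the post; 'Rank', 'Year' and the rows are invented.
csv_text = """Rank,Name,Platform,Year
1,Game A,Wii,2006
2,Game B,NES,1985
"""

# read_csv accepts a file-like object as well as a file path
df = pd.read_csv(io.StringIO(csv_text))

# Print each column heading next to its numerical position
for position, heading in enumerate(df.columns):
    print(position, heading)
```

With your own file you would pass the file path to read_csv, as in the snippet above, and the loop would list whatever headings that file contains.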
Now that we have a data frame we can move on to selecting multiple columns. Since we know the column names (seen in bold in the image above), we can select the columns by name.
Below we retrieve the ‘Name’ and ‘Platform’ columns from our data frame by passing a list of column names inside the indexing brackets.
MultipleColumns = df[['Name', 'Platform']]
print(MultipleColumns)
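The double brackets matter here: the inner pair is a Python list of column names. As a rough illustration, the sketch below compares selecting with a list of names against selecting with a single name. The data frame is a hypothetical stand-in for Sample.csv; the 'Rank' column and the row values are assumptions.

```python
import pandas as pd

# Hypothetical sample data. Only the 'Name' and 'Platform' headings
# come from the post; everything else is invented for illustration.
df = pd.DataFrame({
    'Rank': [1, 2],
    'Name': ['Game A', 'Game B'],
    'Platform': ['Wii', 'NES'],
})

# A list of names inside the brackets returns a DataFrame
multiple_columns = df[['Name', 'Platform']]
print(type(multiple_columns).__name__)  # DataFrame

# A single name on its own returns a Series instead
single_column = df['Name']
print(type(single_column).__name__)  # Series
```

Asking for a name that is not in the data frame raises a KeyError, so it is worth checking the spelling and capitalisation of each heading.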
If for any reason you do not know the column headings, or your data frame does not have column headings, you can select the columns by their numerical position instead. The code below retrieves the same columns by position. Positions start at 0 and the end of the slice is excluded, so 1:3 selects the second and third columns.
MultipleColumns = df[df.columns[1:3]]
print(MultipleColumns)
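Another way to select by position is the iloc indexer, which takes row and column positions directly. The sketch below is one possible alternative rather than a replacement for the snippet above. It again uses a hypothetical stand-in for Sample.csv, where the 'Rank' and 'Year' columns and the row values are assumptions.

```python
import pandas as pd

# Hypothetical sample data. Only the 'Name' and 'Platform' headings
# come from the post; everything else is invented for illustration.
df = pd.DataFrame({
    'Rank': [1, 2],
    'Name': ['Game A', 'Game B'],
    'Platform': ['Wii', 'NES'],
    'Year': [2006, 1985],
})

# Selection by slicing the column headings, as in the snippet above
by_slice = df[df.columns[1:3]]

# The same selection with iloc: ':' keeps every row, 1:3 keeps the
# columns at positions 1 and 2
by_iloc = df.iloc[:, 1:3]
print(by_iloc.equals(by_slice))  # True

# A list of positions selects columns that are not next to each other
picked = df.iloc[:, [1, 3]]
print(list(picked.columns))  # ['Name', 'Year']
```

Selecting by position depends on the column order, so the result changes if columns are added or rearranged in the file.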