Load a CSV as a dataframe with Pandas

In days gone by, when it came to wrangling with tabular data, my first port of call would have been to load the data in Excel and slog it out for as long as it took. Now, I use Pandas to wrangle tabular data.

Having used Pandas for a while now, I've come to appreciate that dealing with larger quantities of data in Excel is made difficult by two things in particular. First, memory is an issue - try using Find and Replace to remove a few thousand commas and you'll notice that your computer begins to run out of puff. Second, Excel formulas aren't a patch on the Python language, the full power of which can be brought to bear on a Pandas dataframe.

Load a CSV as a dataframe

import pandas as pd

df = pd.read_csv('name_of_file.csv')
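
If you'd like to try this without a file to hand, here is a minimal sketch. The sample data and its column names (Name, Age, City) are made up for illustration, and io.StringIO simply lets a string stand in for a file on disk.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for name_of_file.csv
csv_text = "Name,Age,City\nAda,36,London\nGrace,45,New York\n"

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # (number of rows, number of columns)
print(df.dtypes)  # the type Pandas inferred for each column
print(df.head())  # the first few rows
```

Checking the shape and dtypes straight after loading is a quick way to confirm the file was parsed the way you expected.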

Load specific columns in the CSV as a dataframe

df = pd.read_csv('name_of_file.csv', usecols=['ColumnName', 'AnotherColumnName', 'Etcetera'])
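
One thing worth knowing about usecols is that the order of the names in the list is ignored, so the columns come back in the order they appear in the file. The sketch below shows this with made-up data (the Name, Age and City columns are assumptions for illustration) and reorders the columns afterwards by indexing with a list.

```python
import io
import pandas as pd

# Hypothetical sample data; the column names are for illustration only
csv_text = "Name,Age,City\nAda,36,London\nGrace,45,New York\n"

# Ask for City first, then Name
df = pd.read_csv(io.StringIO(csv_text), usecols=["City", "Name"])
print(list(df.columns))  # columns follow the file's order, not the list's

# Reorder explicitly if the order matters
df = df[["City", "Name"]]
print(list(df.columns))
```

If a name in usecols doesn't match a column in the file, read_csv raises a ValueError, which is a handy early warning about typos in column names.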

Drop rows containing NaN values from the dataframe

df = pd.read_csv('name_of_file.csv', usecols=['ColumnName', 'AnotherColumnName', 'Etcetera']).dropna()
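
By default dropna() removes every row that has a NaN in any column, which can throw away more than you intend. Here is a small sketch, using made-up data with gaps in it, that compares the default with the subset parameter, which only checks the columns you name.

```python
import io
import pandas as pd

# Hypothetical data: Grace has no Age and Alan has no City
csv_text = "Name,Age,City\nAda,36,London\nGrace,,New York\nAlan,41,\n"

df = pd.read_csv(io.StringIO(csv_text))

# Default: drop any row with at least one missing value
cleaned = df.dropna()

# Only drop rows where Age is missing
age_only = df.dropna(subset=["Age"])

print(len(df), len(cleaned), len(age_only))
```

Note that dropna() returns a new dataframe rather than changing the original, so the result needs to be assigned to a variable.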

Load a CSV as a dataframe and specify a column as the index 

df = pd.read_csv('name_of_file.csv', index_col='ColumnName')
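
The payoff of setting an index is that rows can then be looked up by label with .loc. A minimal sketch, again with made-up data and assuming the values in the index column (Name here) are unique:

```python
import io
import pandas as pd

# Hypothetical sample data; Name is assumed to be unique per row
csv_text = "Name,Age,City\nAda,36,London\nGrace,45,New York\n"

# Name becomes the index rather than an ordinary column
df = pd.read_csv(io.StringIO(csv_text), index_col="Name")

print(df.index.name)         # the column now used as the index
print(list(df.columns))      # the remaining columns
print(df.loc["Ada", "Age"])  # look up a single value by row label
```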