While working through Kaggle's 30 Days of ML course and IBM's Data Science Professional Certificate, I decided to explore Formula 1 data at the same time, so that I can apply and refine these machine learning and data science skills as I learn them.
In order to do this I have decided to follow this outline:
First, we must import the libraries that will be used in this project.
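import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Show up to 130 rows when a DataFrame is displayed in the notebook.
pd.set_option('display.max_rows', 130)

# IPython/Jupyter magic: render matplotlib figures inline (notebook only).
%matplotlib inline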
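To load our data, we define a small helper function that reads a CSV file into a pandas DataFrame and prints an error message if the file cannot be read.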
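def load_data(path):
    """Read the CSV file at `path` into a DataFrame; return None on a read error."""
    try:
        df = pd.read_csv(path)
        return df
    except IOError as e:
        # A missing or unreadable file ends up here.
        print("There was an error reading data: " + str(e))
        return None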
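As a quick sanity check, the sketch below exercises a helper with the same shape as load_data, once on a small temporary CSV and once on a path that does not exist. The file names and sample rows are made up for illustration and are not taken from the F1 dataset.

```python
import os
import tempfile

import pandas as pd


def load_data(path):
    """Read a CSV into a DataFrame; return None if the file cannot be read."""
    try:
        return pd.read_csv(path)
    except IOError as e:
        print("There was an error reading data: " + str(e))
        return None


# Hypothetical sample file, written to a temporary directory for this demo.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.csv")
    with open(sample_path, "w") as f:
        f.write("driverId,surname\n1,Example\n2,Placeholder\n")

    df_ok = load_data(sample_path)  # readable file: a DataFrame
    df_missing = load_data(os.path.join(tmp_dir, "does_not_exist.csv"))  # None

print(df_ok.shape)
print(df_missing is None)
```

Because the helper returns None on failure, it may be worth checking the result before indexing into it, since a None value would otherwise only fail later at the first DataFrame operation.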
# Folder that holds the F1 CSV files.
dataFolder_path = 'F1Data/F1RaceDataFrom1950to2021'

# Load and display the drivers table.
raw_pathDrivers = dataFolder_path + '/drivers.csv'
df_rawDrivers = load_data(raw_pathDrivers)
df_rawDrivers

# Load and display the races table.
raw_pathRaces = dataFolder_path + '/races.csv'
df_rawRaces = load_data(raw_pathRaces)
df_rawRaces

# Keep only the races from the 2021 season.
df_rawRaces21 = df_rawRaces.loc[df_rawRaces['year'] == 2021]
df_rawRaces21
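The 2021 filter works by building a boolean mask (one True/False per row) and passing it to .loc. Here is a minimal sketch of that pattern on a tiny, made-up DataFrame; the rows are invented for illustration, and only the 'year' column name comes from the code above. The query() call is shown as one possible alternative, not as the approach used in this project.

```python
import pandas as pd

# Made-up rows for illustration only; the real races.csv has more columns.
df_demo = pd.DataFrame({
    "raceId": [1, 2, 3, 4],
    "year": [2020, 2021, 2021, 2019],
})

# One boolean per row: True where the race took place in 2021.
mask = df_demo["year"] == 2021

# .loc keeps only the rows where the mask is True.
df_demo21 = df_demo.loc[mask]

# The same selection written with DataFrame.query.
df_demo21_q = df_demo.query("year == 2021")

print(df_demo21)
```

Both forms keep the original row index, so reset_index(drop=True) can be applied afterwards if a fresh 0-based index is preferred.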