Pandas Read_csv() Conditionally Skipping Header Row

I'm trying to read CSV files, but their formats differ: some have one format and some have another. I'm trying to add controls so that I will not need to edit my code or my inpu

Solution 1:

If the headers in your CSV files follow a similar pattern, you can do something simple like sniffing out the first line before determining whether to skip the first row or not.

import pandas as pd

filename = '/path/to/file.csv'
# The membership test is a bool; int() turns it into 1 (skip) or 0 (keep)
skiprows = int('Created in' in next(open(filename)))
df = pd.read_csv(filename, skiprows=skiprows)

Good practice would be to use a context manager, which closes the file for you, so you could also do this:

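filename = '/path/to/file.csv'
skiprows = 0
# Read-only mode is enough, since the file is only inspected here
with open(filename, 'r') as f:
    for line in f:
        # Assumes the 'Created ' line, if present, is the first line of the file
        if line.startswith('Created '):
            skiprows = 1
            break
df = pd.read_csv(filename, skiprows=skiprows)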
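
If you need this check for several files, it can be wrapped in a small helper. The following is only a sketch: the function name count_preamble_rows and its max_lines parameter are made up for illustration, and it assumes the extra line (or lines) always sit at the very top of the file and start with 'Created '.

```python
from itertools import islice


def count_preamble_rows(path, prefix="Created ", max_lines=1):
    """Count leading lines of `path` that start with `prefix`.

    Hypothetical helper (not part of pandas). Only the first `max_lines`
    lines are inspected, so a data row further down that happens to
    start with `prefix` is never counted.
    """
    skiprows = 0
    with open(path, "r", newline="") as f:
        for line in islice(f, max_lines):
            if not line.startswith(prefix):
                break  # first non-matching line ends the preamble
            skiprows += 1
    return skiprows
```

The result can then be passed straight to pandas, for example df = pd.read_csv(filename, skiprows=count_preamble_rows(filename)). Unlike the loop above, this never scans past the top of the file, so a 'Created ' line in the middle of the data does not cause the real header to be skipped.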

Solution 2:

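You can skip rows that start with a specific character by using the comment argument of pandas read_csv. Be aware that comment marks the rest of a line as not to be parsed wherever the character appears unquoted, not only at the start of a line, so this only works if that character never occurs in your column names or data. In your case, you can skip the lines that start with "C" using the following code: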

filename = '/path/to/file.csv'
df = pd.read_csv(filename, comment="C")
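
Since comment="C" also cuts off a data row at an unquoted "C" (for example in a value like Chicago), another option is to drop the unwanted lines yourself and hand pandas the remaining text. This is a minimal sketch: strip_prefixed_lines is a hypothetical helper, not a pandas function, and it assumes the file fits in memory and that no wanted line (including a continuation line of a quoted multi-line field) starts with 'Created '.

```python
import io


def strip_prefixed_lines(path, prefix="Created "):
    """Return an in-memory text buffer without lines starting with `prefix`.

    Hypothetical helper (not part of pandas). The whole file is read
    into memory, so this is meant for small to medium files.
    """
    with open(path, "r", newline="") as f:
        kept = [line for line in f if not line.startswith(prefix)]
    return io.StringIO("".join(kept))
```

Because read_csv accepts file-like objects, the buffer can be used in place of the filename: df = pd.read_csv(strip_prefixed_lines(filename)).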
