Python Pandas For Reading In File With Date

In the dataframe below, the 3rd line is the header, and the Y, M and D columns give the year, month and day, respectively. However, I am not able to read them in using this code: df

Solution 1:

The default separator in read_csv is a comma. Your file doesn't use commas as separators, so each line is read as a single field and you're only getting one big column:

>>> pd.read_csv(file_name, skiprows = 2)
       Y   M   D     PRCP     VWC1    
0   2006   1   1      0.0  0.17608E+00
1   2006   1   2      6.0  0.21377E+00
2   2006   1   3      0.1  0.22291E+00
3   2006   1   4      3.0  0.23460E+00
4   2006   1   5      6.7  0.26076E+00
>>> pd.read_csv(file_name, skiprows = 2).columns
Index([u'    Y   M   D     PRCP     VWC1    '], dtype='object')
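If you want to check for this yourself, comparing the column count under the default separator with the count under a whitespace separator makes the problem obvious. Here's a minimal sketch, assuming an in-memory stand-in for the file: the two preamble lines are placeholders I made up (the real ones aren't shown in the question), and `sample`, `one_col` and `split` are just illustrative names.

```python
import io

import pandas as pd

# Hypothetical stand-in for the file: two made-up preamble lines,
# followed by the header and the data rows shown above.
sample = """\
placeholder preamble line 1
placeholder preamble line 2
    Y   M   D     PRCP     VWC1
 2006   1   1      0.0  0.17608E+00
 2006   1   2      6.0  0.21377E+00
 2006   1   3      0.1  0.22291E+00
 2006   1   4      3.0  0.23460E+00
 2006   1   5      6.7  0.26076E+00
"""

# Default comma separator: no commas in the file, so every line is one field.
one_col = pd.read_csv(io.StringIO(sample), skiprows=2)

# Whitespace separator: runs of spaces split the line into separate fields.
split = pd.read_csv(io.StringIO(sample), skiprows=2, sep=r"\s+")

print(one_col.shape[1], "column(s) with the default separator")
print(split.shape[1], "column(s) with a whitespace separator")
print(list(split.columns))
```

A single column where you expected several is usually a sign that the separator is wrong.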

You should be able to use delim_whitespace=True (newer pandas versions deprecate it in favor of the equivalent sep=r"\s+"):

>>> df = pd.read_csv(file_name, skiprows = 2, delim_whitespace=True,
                     parse_dates={"datetime": [0,1,2]}, index_col="datetime")
>>> df
            PRCP     VWC1
datetime                 
2006-01-01   0.0  0.17608
2006-01-02   6.0  0.21377
2006-01-03   0.1  0.22291
2006-01-04   3.0  0.23460
2006-01-05   6.7  0.26076
>>> df.index
<class 'pandas.tseries.index.DatetimeIndex'>
[2006-01-01, ..., 2006-01-05]
Length: 5, Freq: None, Timezone: None

(I didn't specify the date_parser, because I'm lazy and this would be read correctly by default, but it's actually not a bad habit to be explicit.)
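For what being explicit could look like, here's a sketch that reads the columns as plain numbers and then assembles the dates with pd.to_datetime, which accepts a frame whose columns are named year, month and day. As far as I know, recent pandas releases deprecate date_parser, delim_whitespace and the nested parse_dates form, so this route may also age better, but check that against the version you have installed. The in-memory `sample` (including its two placeholder preamble lines) and the name `parts` are my own stand-ins, not something from the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for the file: two made-up preamble lines,
# followed by the header and the data rows shown above.
sample = """\
placeholder preamble line 1
placeholder preamble line 2
    Y   M   D     PRCP     VWC1
 2006   1   1      0.0  0.17608E+00
 2006   1   2      6.0  0.21377E+00
 2006   1   3      0.1  0.22291E+00
 2006   1   4      3.0  0.23460E+00
 2006   1   5      6.7  0.26076E+00
"""

# Read everything as ordinary columns, splitting on runs of whitespace.
df = pd.read_csv(io.StringIO(sample), skiprows=2, sep=r"\s+")

# pd.to_datetime can assemble dates from columns named year/month/day,
# so rename Y/M/D to those names first.
parts = df[["Y", "M", "D"]].rename(columns={"Y": "year", "M": "month", "D": "day"})
df["datetime"] = pd.to_datetime(parts)

# Drop the now-redundant pieces and index by the assembled dates.
df = df.drop(columns=["Y", "M", "D"]).set_index("datetime")

print(df)
print(type(df.index).__name__)
```

The date handling is spelled out in your own code here rather than left to read_csv's defaults, so a malformed year, month or day raises an error at the to_datetime step.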
