Cannot Interpolate Dataframe Even If Most Of The Data Is Filled

I tried to interpolate the NaN values in my DataFrame using the interpolate() method. However, the method failed with the error: Cannot interpolate with all NaNs. Here's the code: try: df

Solution 1:

Check that your DataFrame has numeric dtypes, not object dtypes. The TypeError: Cannot interpolate with all NaNs can occur if the DataFrame contains columns of object dtype. For example, if

import numpy as np
import pandas as pd

df = pd.DataFrame({'A':np.array([1,np.nan,30], dtype='O')}, 
                  index=['2016-01-21 20:06:22', '2016-01-21 20:06:23', 
                         '2016-01-21 20:06:24'])

then df.interpolate() raises the TypeError.

To check whether your DataFrame has columns with object dtype, look at df.dtypes:

In [92]: df.dtypes
Out[92]: 
A    object
dtype: object
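In a wider DataFrame it can be tedious to read df.dtypes by eye. A minimal sketch of listing only the object-dtype columns is below; the extra column B and the name object_cols are made up for illustration, while select_dtypes is the pandas method doing the filtering.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 'A' is stored as Python objects, 'B' as native floats.
df = pd.DataFrame({
    'A': np.array([1, np.nan, 30], dtype='O'),
    'B': np.array([1.0, np.nan, 3.0]),
})

# Keep only the columns whose dtype is object and list their names.
object_cols = df.select_dtypes(include=[object]).columns.tolist()
print(object_cols)
```

Any column named in that list is a candidate for conversion before calling interpolate().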

To fix the problem, you need to ensure the DataFrame has numeric columns with native NumPy dtypes. Obviously, it would be best to build the DataFrame correctly from the very beginning. So the best solution depends on how you are building the DataFrame.
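As one possible illustration of building the DataFrame correctly from the start, the sketch below creates the same example data with a float64 column and a DatetimeIndex up front. It assumes the values are already numeric when the DataFrame is created, which may not match how your data arrives.

```python
import numpy as np
import pandas as pd

# Same example values, but stored as float64 instead of object,
# and with the index parsed into a DatetimeIndex right away.
df = pd.DataFrame(
    {'A': np.array([1, np.nan, 30], dtype='float64')},
    index=pd.to_datetime(['2016-01-21 20:06:22',
                          '2016-01-21 20:06:23',
                          '2016-01-21 20:06:24']),
)

# No conversion step is needed before interpolating.
result = df.interpolate(method='time')
print(result)
```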

A less appealing patch-up fix would be to use pd.to_numeric to convert the object arrays to numeric arrays after-the-fact:

for col in df:
    df[col] = pd.to_numeric(df[col], errors='coerce')

With errors='coerce', any value that could not be converted to a number is converted to NaN. After calling pd.to_numeric on each column, notice that the dtype is now float64:

In [94]: df.dtypes
Out[94]: 
A    float64
dtype: object
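To see what errors='coerce' does with values that are not numbers, here is a small sketch on a standalone Series. The sample values and the names raw and converted are hypothetical and not taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical column mixing a number, a numeric string, and junk text.
raw = pd.Series([1, '3.5', 'n/a'], dtype=object)

# Numeric strings are parsed; anything unparseable becomes NaN.
converted = pd.to_numeric(raw, errors='coerce')
print(converted)
```

Note that coercion silently turns bad values into NaN, so it may be worth counting the NaNs before and after the conversion if the data source is not trusted.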

Once the DataFrame has numeric dtypes and a DatetimeIndex, df.interpolate(method='time') will work:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A':np.array([1,np.nan,30], dtype='O')}, 
                  index=['2016-01-21 20:06:22', '2016-01-21 20:06:23', 
                         '2016-01-21 20:06:24'])

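# Convert each object column to a numeric dtype; unparseable values become NaN.
for col in df:
    df[col] = pd.to_numeric(df[col], errors='coerce')
# method='time' requires a DatetimeIndex, so parse the string labels.
df.index = pd.DatetimeIndex(df.index)
df = df.interpolate(method='time')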
print(df)

yields

                        A
2016-01-21 20:06:22   1.0
2016-01-21 20:06:23  15.5
2016-01-21 20:06:24  30.0

Solution 2:

I had a similar problem. Recreating the DataFrame with an explicit float dtype (e.g. dtype='float32') fixed it.

df = pd.DataFrame(data=df.values, index=df.index, columns=df.columns, dtype='float32')
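A possibly simpler variant of the same idea is to cast in place with astype instead of rebuilding the DataFrame. This is a swapped-in alternative, not the method described in this answer, and it assumes every value in the object columns is already a number or NaN (astype will raise on text that cannot be converted).

```python
import numpy as np
import pandas as pd

# Example data from Solution 1, stored as object dtype.
df = pd.DataFrame({'A': np.array([1, np.nan, 30], dtype='O')},
                  index=['2016-01-21 20:06:22', '2016-01-21 20:06:23',
                         '2016-01-21 20:06:24'])

# Cast every column to float32; index and column labels are kept as-is.
df = df.astype('float32')

# Default linear interpolation now works on the numeric column.
filled = df.interpolate()
print(filled)
```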
