
Sort Dataframe Multiindex Level And By Column

Updated: pandas version 0.23.0 solves this problem with "Sorting by a combination of columns and index levels". I have struggled with this and I suspect there is a better way. How

Solution 1:

Here are some potential solutions for your needs:

Method-1:

 (df.sort_values('value_1', ascending=False)
    .sort_index(level=[0], ascending=[True]))

Method-2:

 (df.set_index('value_1', append=True)
    .sort_index(level=[0,2], ascending=[True,False])
    .reset_index('value_1'))

Tested on pandas 0.22.0, Python 3.6.4
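To try Method-2 end to end, here is a minimal sketch. The frame is rebuilt from the outputs shown elsewhere on this page, so its construction and the shuffle step are assumptions, not the asker's original data.

```python
import pandas as pd

# Hypothetical reconstruction of the example frame from the outputs on this page.
index = pd.MultiIndex.from_tuples(
    [(2, 6), (2, 4), (2, 2), (2, 10), (2, 18), (2, 5),
     (1, 11), (1, 1), (1, 7), (1, 9), (1, 3)],
    names=["idx_0", "idx_1"],
)
df = pd.DataFrame(
    {"MyName": list("BOSTONSCOTT"),
     "value_1": [11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]},
    index=index,
)
df = df.sample(frac=1, random_state=0)  # shuffle so the sort has work to do

# Method-2: move value_1 into the index as a third level (position 2),
# sort idx_0 ascending and value_1 descending, then restore the column.
result = (
    df.set_index("value_1", append=True)
      .sort_index(level=[0, 2], ascending=[True, False])
      .reset_index("value_1")
)
print(result)
```

Note that reset_index puts value_1 back as the first column, so the column order differs from the input frame.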

Solution 2:

Here is my ugly option:

In [139]: (df.assign(x=df.index.get_level_values(0)
                       * 10**np.ceil(np.log10(df.value_1.max())) - df.value_1)
             .sort_values('x')
             .drop(columns='x'))
Out[139]:
             MyName  value_1
idx_0 idx_1
1     11          S        5
      1           C        4
      7           O        3
      9           T        2
      3           T        1
2     6           B       11
      4           O       10
      2           S        9
      10          T        8
      18          O        7
      5           N        6

Some explanation:

In [140]: np.ceil(np.log10(df.value_1.max()))
Out[140]: 2.0

In [141]: df.assign(x=df.index.get_level_values(0)*10**np.ceil(np.log10(df.value_1.max()))-df.value_1)
Out[141]:
             MyName  value_1      x
idx_0 idx_1
2     6           B       11  189.0
      4           O       10  190.0
      2           S        9  191.0
      10          T        8  192.0
      18          O        7  193.0
      5           N        6  194.0
1     11          S        5   95.0
      1           C        4   96.0
      7           O        3   97.0
      9           T        2   98.0
      3           T        1   99.0
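The arithmetic behind the helper column x can be checked by hand. The sketch below uses plain Python lists in place of the frame (the list names are illustrative) and assumes value_1 is strictly positive. Otherwise keys from neighbouring idx_0 groups could overlap.

```python
import math

# Values copied from the table above: idx_0 and value_1 for each row.
idx_0 = [2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1]
value_1 = [11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

# Smallest power of ten that is >= the largest value_1 (here 10**2).
scale = 10 ** math.ceil(math.log10(max(value_1)))

# Composite key: idx_0 picks the "hundreds" band, and subtracting value_1
# makes larger values sort first inside that band.
keys = [g * scale - v for g, v in zip(idx_0, value_1)]

# Sorting ascending by the key gives idx_0 ascending, value_1 descending.
order = sorted(range(len(keys)), key=keys.__getitem__)
sorted_pairs = [(idx_0[i], value_1[i]) for i in order]
print(scale, keys)
print(sorted_pairs)
```

A single numeric key is compact, but it only works for numeric data within a known range, which is why the answer calls it ugly.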

Another option is to add idx_0 as a temporary column, sort by it and by value_1, and then drop that column:

In [142]: (df.assign(x=df.index.get_level_values(0))
             .sort_values(['x', 'value_1'], ascending=[True, False])
             .drop(columns='x'))
Out[142]:
             MyName  value_1
idx_0 idx_1
1     11          S        5
      1           C        4
      7           O        3
      9           T        2
      3           T        1
2     6           B       11
      4           O       10
      2           S        9
      10          T        8
      18          O        7
      5           N        6

Solution 3:

Update using pandas version 0.23.0

Sorting by a combination of columns and index levels

df.sort_values(by=['idx_0','value_1'], ascending=[True,False])

output:

             value_1 MyName
idx_0 idx_1
1     11           5      S
      1            4      C
      7            3      O
      9            2      T
      3            1      T
2     6           11      B
      4           10      O
      2            9      S
      10           8      T
      18           7      O
      5            6      N
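For a self-contained run of the 0.23.0 approach, here is a small sketch. It rebuilds the frame from the outputs on this page (the construction is an assumption) and compares the result with the temporary-column approach from Solution 2. It needs pandas 0.23.0 or later, because older versions do not accept index level names in by.

```python
import pandas as pd

# Hypothetical reconstruction of the example frame from the outputs on this page.
index = pd.MultiIndex.from_tuples(
    [(2, 6), (2, 4), (2, 2), (2, 10), (2, 18), (2, 5),
     (1, 11), (1, 1), (1, 7), (1, 9), (1, 3)],
    names=["idx_0", "idx_1"],
)
df = pd.DataFrame(
    {"MyName": list("BOSTONSCOTT"),
     "value_1": [11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]},
    index=index,
)

# Mix an index level name and a column name in a single sort.
result = df.sort_values(by=["idx_0", "value_1"], ascending=[True, False])

# Cross-check against the temporary-column approach.
alt = (
    df.assign(x=df.index.get_level_values("idx_0"))
      .sort_values(["x", "value_1"], ascending=[True, False])
      .drop(columns="x")
)
print(result)
print(result.equals(alt))
```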

Interestingly enough, @jxc pointed out a solution that I thought should work and that is almost identical to my first failed attempt.

df.sort_values('value_1', ascending=False)\
  .sort_index(level=0, ascending=[True])

Passing ascending as a list is what makes the above statement work as expected. I think that in pandas a scalar value and a one-element list should behave the same. However, in this case they do not.

I'll submit a bug report.
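One possibly related knob is the documented sort_remaining parameter of sort_index. It defaults to True, which also sorts the remaining index levels after the requested one and so can undo a prior sort by value_1. Whether this explains the scalar versus list difference described above is an assumption, not something the answer confirms. The sketch below only shows that turning it off keeps the earlier value_1 ordering within each idx_0 group. The frame is again rebuilt from the outputs on this page.

```python
import pandas as pd

# Hypothetical reconstruction of the example frame from the outputs on this page.
index = pd.MultiIndex.from_tuples(
    [(2, 6), (2, 4), (2, 2), (2, 10), (2, 18), (2, 5),
     (1, 11), (1, 1), (1, 7), (1, 9), (1, 3)],
    names=["idx_0", "idx_1"],
)
df = pd.DataFrame(
    {"MyName": list("BOSTONSCOTT"),
     "value_1": [11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]},
    index=index,
)

# Step 1: order all rows by value_1, largest first.
by_value = df.sort_values("value_1", ascending=False)

# Step 2: sort by idx_0 only; sort_remaining=False asks pandas not to
# sort idx_1 as well, so rows inside each group keep their step-1 order.
result = by_value.sort_index(level=0, sort_remaining=False)
print(result)
```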
