Pandas Series from dictionary/ summarized DataFrame
Here is some dummy data to show what my initial table looks like:
import pandas as pd
import numpy as np
df=pd.DataFrame(columns=["Col_A","Col_B","Col_C"])
df.loc['2'] = pd.Series({"Col_A":.001,"Col_C":.4566})
df.loc['3'] = pd.Series({"Col_A":.25, "Col_B":.46})
df.loc['4'] = pd.Series({"Col_A":1, "Col_C":2})
df.loc['5'] = pd.Series({"Col_C":5.5})
Comes out like this:
Col_A Col_B Col_C
2 0.001 NaN 0.4566
3 0.250 0.46 NaN
4 1.000 NaN 2.0000
5 NaN NaN 5.5000
I would like to create a new dataframe with a summary of the non Null fields like so:
Column
Index
2 'Col_A: 0.001 units; Col_C: 0.4566 units'
3 'Col_A: 0.250 units; Col_B: 0.46 units'
4 'Col_A: 1.000 units; Col_C: 2.0000 units'
5 'Col_C: 5.5000 units'
I got as far as making this:
{'3': [('Col_A', 0.25), ('Col_B', 0.46)], '2': [('Col_A', 0.001), ('Col_C', nan)], '5': [('Col_C', nan)], '4': [('Col_A', 1.0), ('Col_C', nan)]}
But I cant figure out how to put it in the form above.
Thank you for your help.
2 answers

Option 1
melt
your dataframe (widetolong conversion) Drop NaNs using
dropna
 combine column names with values using
agg
+str.join
 combine all values by index using
groupby
+agg
+str.join
(df.reset_index() .melt('index') .dropna() .set_index('index') .astype(str) .agg(': '.join, 1) .add(' units') .groupby(level=0) .agg('; '.join) ) index 2 Col_A: 0.001 units; Col_C: 0.4566 units 3 Col_A: 0.25 units; Col_B: 0.46 units 4 Col_A: 1.0 units; Col_C: 2.0 units 5 Col_C: 5.5 units dtype: object
Note that this returns a
Series
.
Option 2
Replacing the
melt
withstack
from Option 1;(df.stack() .astype(str) .reset_index(level=1) .agg(': '.join, 1) .add(' units') .groupby(level=0) .agg('; '.join) ) 2 Col_A: 0.001 units; Col_C: 0.4566 units 3 Col_A: 0.25 units; Col_B: 0.46 units 4 Col_A: 1.0 units; Col_C: 2.0 units 5 Col_C: 5.5 units dtype: object

Using
stack
df.apply(lambda x : x.name+":"+x.astype(str)+' units').mask(df.isnull()).stack().groupby(level=0).apply(';'.join) Out[856]: 2 Col_A:0.001 units;Col_C:0.4566 units 3 Col_A:0.25 units;Col_B:0.46 units 4 Col_A:1.0 units;Col_C:2.0 units 5 Col_C:5.5 units dtype: object