Aggregate Unique Values From Multiple Columns With Pandas GroupBy


Answer :

Cast the frame to strings with astype(str) so the values can be joined, then use groupby and agg, keeping only the unique values in each group by calling Series.unique:

    df.astype(str).groupby('prop1').agg(lambda x: ','.join(x.unique()))

                prop2       prop3      prop4
    prop1
    K20       12,1,66  travis,leo   10.0,4.0
    L30    3,54,11,10    bob,john  11.2,10.0

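By default groupby sorts the group keys. To keep the groups in the order they first appear, pass sort=False:

    df.astype(str).groupby('prop1', sort=False).agg(lambda x: ','.join(x.unique()))

                prop2       prop3      prop4
    prop1
    L30    3,54,11,10    bob,john  11.2,10.0
    K20       12,1,66  travis,leo   10.0,4.0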
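The question's input frame is not shown here, so the following is a minimal runnable sketch with hypothetical data chosen to reproduce the output above. The column values and the names sorted_result and ordered_result are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical input: the original DataFrame is not shown, so these rows
# are invented to match the aggregated output in the answer.
df = pd.DataFrame({
    "prop1": ["L30", "L30", "L30", "L30", "K20", "K20", "K20"],
    "prop2": [3, 54, 11, 10, 12, 1, 66],
    "prop3": ["bob", "john", "john", "john", "travis", "travis", "leo"],
    "prop4": [11.2, 10.0, 10.0, 10.0, 10.0, 10.0, 4.0],
})


def join_unique(values):
    # Series.unique keeps values in order of first appearance.
    return ",".join(values.unique())


# Default: group keys are sorted (K20 before L30).
sorted_result = df.astype(str).groupby("prop1").agg(join_unique)

# sort=False: groups keep the order in which they first appear (L30 first).
ordered_result = df.astype(str).groupby("prop1", sort=False).agg(join_unique)

print(sorted_result)
print(ordered_result)
```

A named function is used instead of the lambda only for readability; the behaviour is the same.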

If the data contains NaNs, astype(str) turns them into the literal string 'nan'. To avoid that, call fillna before the cast:

    import re

    df.fillna('').astype(str).groupby('prop1').agg(
        lambda x: re.sub(',+', ',', ','.join(x.unique()))
    )

                prop2       prop3      prop4
    prop1
    K20       12,1,66  travis,leo   10.0,4.0
    L30    3,54,11,10    bob,john  11.2,10.0
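One caveat with the regex approach: it collapses repeated commas, but if the empty string is the first or last unique value in a group, a leading or trailing comma remains. A possible variant, sketched here with hypothetical data and a hypothetical helper named join_unique_nonempty, skips the empty strings before joining, so no regex is needed.

```python
import re

import numpy as np
import pandas as pd

# Hypothetical input with missing values at the start and end of a group.
df = pd.DataFrame({
    "prop1": ["L30", "L30", "K20", "K20"],
    "prop3": [np.nan, "bob", "travis", np.nan],
})

strings = df.fillna("").astype(str)

# Regex approach from the answer: stray commas survive at the edges.
regex_result = strings.groupby("prop1").agg(
    lambda x: re.sub(",+", ",", ",".join(x.unique()))
)


def join_unique_nonempty(values):
    # Drop the empty strings that fillna('') introduced, then join.
    return ",".join(v for v in values.unique() if v != "")


clean_result = strings.groupby("prop1").agg(join_unique_nonempty)

print(regex_result)
print(clean_result)
```

This assumes that an empty string never occurs as a real value in the data; if it can, filter with Series.dropna before casting instead.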
