Pandas Dataframe to dictionary groupby index

I have a dataframe with 3 columns, all of which hold string values. It has this form:

Key Word    Synonym    Alternatives
   A        word1      NaN
   A        word2      NaN
   A        word3      word11
   B        word4      word12
   B        word5      NaN
   B        word6      word13
   C        word7      word14
   C        word8      NaN
   C        word9      NaN
   D        word10     word15

I want to convert it to a dictionary grouped by the Key Word column, so that each key word maps to all of its corresponding synonyms and alternative synonyms. For example, A should map to every value that appears in the Synonym and Alternatives columns for A, and so on. Is there a way to do this? Thank you in advance.

1 answer

  • answered 2017-09-19 11:54 jezrael

    I think you need stack to drop the NaNs, then groupby with list, and finally to_dict:

    d = df.set_index('Key Word').stack().groupby('Key Word').apply(list).to_dict()
    print(d)
    {'B': ['word4', 'word12', 'word5', 'word6', 'word13'], 
     'D': ['word10', 'word15'], 
     'C': ['word7', 'word14', 'word8', 'word9'], 
     'A': ['word1', 'word2', 'word3', 'word11']}
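
    For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same stack / groupby / to_dict chain. The column name 'Key Word' is taken from the question's table. The explicit dropna() is my own addition, not part of the one-liner above: as I understand it, newer pandas stack implementations can keep NaNs that were present in the input, so dropping them explicitly should be harmless on older versions and safer on newer ones. Grouping with level=0 is also just a choice here; it groups on the first index level, which is the key column after set_index.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the table in the question.
df = pd.DataFrame({
    "Key Word": list("AAABBBCCCD"),
    "Synonym": [f"word{i}" for i in range(1, 11)],
    "Alternatives": [np.nan, np.nan, "word11", "word12", np.nan,
                     "word13", "word14", np.nan, np.nan, "word15"],
})

# Move the key into the index, then stack the two value columns into
# a single Series indexed by (key, original column name).
stacked = df.set_index("Key Word").stack()

# Assumption: drop NaNs explicitly instead of relying on stack() to do it.
stacked = stacked.dropna()

# Collect the values for each key into a list, then convert to a dict.
d = stacked.groupby(level=0).apply(list).to_dict()
print(d)
```

    Each key should end up with the same words as in the output shown above, although the order in which the keys are printed depends on your Python and pandas versions.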