Apply a function with multiple arguments on an entire dataframe in Pandas

I have the following dataframe in pandas:

df = pd.DataFrame({'field_1' : ['a', 'b', np.nan, 'a', 'c'], 'field_2': ['c', 'b', 'a', np.nan, 'c']}, index=[1,2,3,4,5])

I want to apply the following function on the entire dataframe that replaces each value with something else.

For example:

def func_replace(value, n):
    if value == 'a':
        return 'This is a'*n
    elif value == 'b':
        return 'This is b'*n
    elif value == 'c':
        return 'This is c'*n
    elif str(value) == 'nan':
        return np.nan
    else:
        return 'The value is not included'

so that the final result looks like this (given n=1):

df = pd.DataFrame({'field_1' : ['This is a', 'This is b', np.nan, 'This is a', 'This is c'], 'field_2': ['This is c', 'This is b', 'This is a', np.nan, 'This is c']}, index=[1,2,3,4,5])

I tried the following:

df.apply(func_replace, args=(1), axis=1)

and a bunch of other options, but it always gives me an error.

I know that I can write a for loop that goes through every column and uses a lambda function to solve this problem, but I feel that there is an easier option.

I feel the solution is easier than I think, but I just can't figure out the correct syntax.

Any help would be really appreciated.

2 answers

  • answered 2018-03-20 18:08 sriramn

    Just modify your function to operate at the level of each value in a Series and use applymap.

    df = pd.DataFrame({'field_1' : ['a', 'b', np.nan, 'a', 'c'], 'field_2': ['c', 'b', 'a', np.nan, 'c']}, index=[1,2,3,4,5])
    
    df
    Out[35]: 
      field_1 field_2
    1       a       c
    2       b       b
    3     NaN       a
    4       a     NaN
    5       c       c
    

    Now, if we define the function as:

    def func_replace(value):
        if value == 'a':
            return 'This is a'
        elif value == 'b':
            return 'This is b'
        elif value == 'c':
            return 'This is c'
        elif str(value) == 'nan':
            return np.nan
        else:
            return 'The value is not included'
    

    Calling this function on each value on the DataFrame is very straightforward:

    df.applymap(func_replace)
    Out[42]: 
         field_1    field_2
    1  This is a  This is c
    2  This is b  This is b
    3        NaN  This is a
    4  This is a        NaN
    5  This is c  This is c
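
    The function above drops the extra argument n from the question. As a rough sketch of one way to keep it, the snippet below binds n with functools.partial before the element-wise call. In pandas 2.1 and later the same operation is available as DataFrame.map (applymap is deprecated there), so the sketch picks whichever method exists. The pd.isna check and the name elementwise are illustrative choices, not part of the original answer.

```python
from functools import partial

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"field_1": ["a", "b", np.nan, "a", "c"],
     "field_2": ["c", "b", "a", np.nan, "c"]},
    index=[1, 2, 3, 4, 5],
)


def func_replace(value, n):
    # Missing values pass through unchanged.
    if pd.isna(value):
        return np.nan
    if value in ("a", "b", "c"):
        return f"This is {value}" * n
    return "The value is not included"


# Pick the element-wise method that this pandas version provides.
elementwise = df.map if hasattr(df, "map") else df.applymap

# partial fixes n, so the function again takes a single value.
result = elementwise(partial(func_replace, n=1))
print(result)
```

    Binding n up front keeps the call independent of whether the installed pandas version forwards extra keyword arguments to the function.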
    

  • answered 2018-03-20 18:08 Lambda

    I think you need:

    def func_replace(df, n):
        df_temp = df.replace({r"[^abc]": "The value is not included"}, regex=True)
        return df_temp.replace(["a", "b", "c"], ["This is a " * n, "This is b " * n, "This is c " * n])
    
    df.apply(func_replace, args=(2,))
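
    The trailing comma in args=(2,) matters: (2) is just the integer 2, while (2,) is a one-element tuple, which is what apply expects. This is likely why args=(1) in the question raised an error. Without axis=1, apply also hands each column to the function as a Series rather than a single value. Below is a minimal sketch of the same column-wise idea using a dictionary lookup instead of regex replacement; the names replace_column and mapping are illustrative, and the output omits the trailing space used above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"field_1": ["a", "b", np.nan, "a", "c"],
     "field_2": ["c", "b", "a", np.nan, "c"]},
    index=[1, 2, 3, 4, 5],
)

# (1) is an int; only (1,) is a tuple.
assert isinstance((1), int)
assert isinstance((1,), tuple)


def replace_column(column, n):
    mapping = {key: f"This is {key}" * n for key in ("a", "b", "c")}
    # Values missing from the mapping (including NaN) come back as NaN.
    replaced = column.map(mapping)
    # Keep known values and original NaN; flag everything else.
    keep = column.isin(list(mapping)) | column.isna()
    return replaced.where(keep, "The value is not included")


# apply calls replace_column once per column, passing n via args.
result = df.apply(replace_column, args=(1,))
print(result)
```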