Sum based on a value in dataframe

I need to sum the "Runs" column over the rows where MatchN is x and B is between i and j.

MatchN  I   B   Runs
1000887 1   0.1 1
1000887 1   0.2 3
1000887 1   0.3 0
1000887 1   0.4 2
1000887 1   0.5 1

I tried using a for loop but have not been able to crack it so far. Any suggestions?

2 answers

  • answered 2017-06-17 18:07 Willem Van Onsem

    You can first apply a filter, and then sum up the Runs column, like:

    df[(df['MatchN'] == x) & (i <= df['B']) & (df['B'] <= j)]['Runs'].sum()
    #  \_________________________ _________________________/ \___ __/\_ __/
    #                            v                               v     v
    #                        filter part                     column sum part
    

    The filter part is the logical AND of three conditions:

    1. df['MatchN'] == x;
    2. i <= df['B']; and
    3. df['B'] <= j.

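    We use the & operator to combine the three conditions element-wise. Each condition needs its own parentheses, because & binds more tightly than the comparison operators. We then select the matching rows with df[<filter-condition>], where <filter-condition> is the combined filter described above.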

    Next we select the Runs column of the filtered rows, and then finally we calculate the .sum() of that column.
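
    The steps above can be tried end to end with a minimal sketch. It rebuilds the five sample rows from the question and assumes MatchN is stored as integers; the values chosen for x, i and j are illustrative, not taken from the question. Series.between is shown as an equivalent way to write the two bounds on B (inclusive on both ends by default).

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "MatchN": [1000887] * 5,
    "I": [1] * 5,
    "B": [0.1, 0.2, 0.3, 0.4, 0.5],
    "Runs": [1, 3, 0, 2, 1],
})

# Illustrative values (assumptions, not given in the question).
x, i, j = 1000887, 0.2, 0.4

# Filter part: element-wise AND of the three conditions.
mask = (df["MatchN"] == x) & (i <= df["B"]) & (df["B"] <= j)

# Column part and sum part: pick Runs from the matching rows and add it up.
total = df.loc[mask, "Runs"].sum()

# Equivalent filter using Series.between for the range check on B.
total_between = df.loc[(df["MatchN"] == x) & df["B"].between(i, j), "Runs"].sum()

print(total, total_between)
```

    Using df.loc[mask, "Runs"] selects rows and column in one step, which avoids building the intermediate filtered frame that df[mask]["Runs"] creates.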

  • answered 2017-06-17 18:07 Scott Boston

    You can use query:

    x = '1000887'  # assumes MatchN holds strings; use x = 1000887 if the column is integer
    i = 0.2
    j = 0.4
    df.query('MatchN == @x and @i <= B <= @j')['Runs'].sum()
    

    Output:

    5
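
    For a version that runs on its own, here is a small sketch of the same query against the sample rows from the question. It assumes MatchN is an integer column, so x is an int here rather than a string; the @ prefix refers to local Python variables inside the query string.

```python
import pandas as pd

# Sample rows copied from the question; MatchN assumed to be integers.
df = pd.DataFrame({
    "MatchN": [1000887] * 5,
    "I": [1] * 5,
    "B": [0.1, 0.2, 0.3, 0.4, 0.5],
    "Runs": [1, 3, 0, 2, 1],
})

x = 1000887
i = 0.2
j = 0.4

# Chained comparison keeps rows with i <= B <= j (both ends inclusive).
total = df.query("MatchN == @x and @i <= B <= @j")["Runs"].sum()
print(total)
```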