
Iterrows Performance

I'm working with Python 2.7 and pandas (version 0.18.1) data frames. I have to modify a column in the data frame based on several other columns in the same data frame. For that I have writt

Solution 1:

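Assuming your empty cells are NaN values, the expression below first masks every non-positive value to NaN (df[df>0]) and then back-fills along each row, so the first column of the result holds the first positive, non-NA value of each row for the group of columns you are interested in: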

df[df>0][columns1].bfill(axis=1).iloc[:,0]

0     NaN
1     NaN
2     NaN
3     NaN
4     NaN
5    20.0
6     NaN
7    20.0
8     NaN
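
To make the mechanics concrete, here is a minimal sketch on made-up data. The frame and the column names price_a and price_b are assumptions for illustration only; they are not taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data; column names are illustrative, not from the question.
df = pd.DataFrame({
    "price_a": [np.nan, 0.0, 20.0, np.nan],
    "price_b": [np.nan, 15.0, 30.0, -5.0],
})
columns1 = ["price_a", "price_b"]

# Indexing with the boolean frame df > 0 keeps positive cells and
# turns everything else (zeros, negatives, existing NaN) into NaN.
masked = df[df > 0][columns1]

# Back-fill along each row (axis=1): a NaN cell takes the next valid value
# to its right, so the first column ends up holding the first positive
# value of the row, or NaN if the row has none.
first_positive = masked.bfill(axis=1).iloc[:, 0]
print(first_positive)
```

Rows without any positive value in the selected columns stay NaN, which is why most rows in the output above are NaN.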

Applying the same expression to both groups of columns and subtracting the results gives you the abs(a-b) you're looking for:

res = (df[df>0][columns1].bfill(axis=1).iloc[:,0]
      -df[df>0][columns2].bfill(axis=1).iloc[:,0]).abs()
res

0        NaN
1        NaN
2        NaN
3        NaN
4        NaN
5    22977.5
6        NaN
7        NaN
8        NaN
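
Note that a row only gets a value when both column groups contain a positive entry, because subtracting NaN yields NaN. A small sketch of that behaviour follows; the helper name first_positive and the columns a1, a2, b1, b2 are hypothetical and used purely for illustration.

```python
import numpy as np
import pandas as pd

def first_positive(frame, cols):
    """Return the first positive value per row across cols (NaN if none)."""
    sub = frame[cols]
    return sub.where(sub > 0).bfill(axis=1).iloc[:, 0]

# Hypothetical data; column names are illustrative, not from the question.
df = pd.DataFrame({
    "a1": [np.nan, 20.0, 20.0],
    "a2": [np.nan, np.nan, 5.0],
    "b1": [np.nan, 50.5, np.nan],
    "b2": [10.0, np.nan, np.nan],
})

# Row 0 has no value in the "a" group and row 2 has none in the "b" group,
# so only row 1 produces a number.
res = (first_positive(df, ["a1", "a2"]) - first_positive(df, ["b1", "b2"])).abs()
print(res)
```

This would explain why a row can show a value in the first result but NaN in the difference: the other group has no positive entry for that row.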

You can either combine it with your initialized discount column, which keeps the computed values and falls back to the existing discount wherever res is NaN:

res.combine_first(df.discount)

or fill the remaining NaN values with 0:

res.fillna(0)
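
The two options differ in what ends up in the rows where res is NaN. A minimal sketch with made-up values (the series below are assumptions, not the question's data):

```python
import numpy as np
import pandas as pd

# Hypothetical computed result and pre-existing discount column.
res = pd.Series([np.nan, 30.5, np.nan])
discount = pd.Series([5.0, 1.0, np.nan])

# combine_first: take res where it is not NaN, otherwise the value from
# discount. Rows that are NaN in both stay NaN.
combined = res.combine_first(discount)

# fillna(0): ignore discount entirely and replace every NaN in res with 0.
filled = res.fillna(0)

print(combined)
print(filled)
```

Use combine_first if the existing discount values should be preserved, and fillna(0) if rows without a computed difference should simply be zero. Neither call modifies res in place, so assign the result back to the column you want to update.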
