Iterrows Performance
I'm working with Python 2.7 and pandas (version 0.18.1) data frames. I have to modify a column in the data frame based on several columns in the same data frame. For that I have writt
Solution 1:
Assuming your empty cells are NaN values, this gives you the first positive, non-NA value of each row for the group of columns you are interested in (df[df>0] turns every value that is not greater than 0 into NaN):
df[df>0][columns1].bfill(axis=1).iloc[:,0]
0     NaN
1     NaN
2     NaN
3     NaN
4     NaN
5    20.0
6     NaN
7    20.0
8     NaN
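To make the mask-and-back-fill step easier to follow, here is a minimal, self-contained sketch on made-up data. The column names a1 and a2 and their values are assumptions for illustration only; they are not the asker's data.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two candidate columns, empty cells are NaN.
df = pd.DataFrame({
    "a1": [np.nan, 5.0, np.nan, -1.0],
    "a2": [3.0, 7.0, np.nan, 4.0],
})
columns1 = ["a1", "a2"]  # assumed column group

# df[df > 0] keeps positive values and replaces everything else with NaN.
masked = df[df > 0][columns1]

# Back-filling along each row pulls the first non-NaN value into column 0.
first_valid = masked.bfill(axis=1).iloc[:, 0]
print(first_valid)
```

A row comes out as NaN only when none of its selected columns holds a positive value, which is why most rows in the output above are NaN.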
Thus, this will give you the abs(a-b) you're searching for:
res = (df[df>0][columns1].bfill(axis=1).iloc[:,0]
      -df[df>0][columns2].bfill(axis=1).iloc[:,0]).abs()
res
0        NaN
1        NaN
2        NaN
3        NaN
4        NaN
5    22977.5
6        NaN
7        NaN
8        NaN
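Note that row 7 is NaN here even though the first group had a value (20.0) for it: subtraction propagates NaN, so the result is only defined where both groups have a positive value. A small sketch of that behaviour, using hypothetical column names (a1, a2, b1, b2) and a helper first_positive that is not part of the original answer:

```python
import numpy as np
import pandas as pd

# Hypothetical data with two column groups.
df = pd.DataFrame({
    "a1": [10.0, np.nan, np.nan],
    "a2": [np.nan, 8.0, np.nan],
    "b1": [np.nan, 3.0, 6.0],
    "b2": [25.0, np.nan, np.nan],
})
columns1 = ["a1", "a2"]
columns2 = ["b1", "b2"]

def first_positive(frame, cols):
    """Return the first positive value per row among cols (NaN if none)."""
    return frame[frame > 0][cols].bfill(axis=1).iloc[:, 0]

# NaN in either operand yields NaN in the difference.
res = (first_positive(df, columns1) - first_positive(df, columns2)).abs()
print(res)
```

In this sketch the last row stays NaN because the first group has no value for it, even though the second group does.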
You can either combine it with your initialized discount column:
res.combine_first(df.discount)
or fill the blanks:
res.fillna(0)
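The two options differ in what ends up in the rows where res is NaN. A short sketch with made-up values (the discount series here is an assumption, standing in for your initialized column):

```python
import numpy as np
import pandas as pd

# Hypothetical computed result and pre-initialized discount column.
res = pd.Series([np.nan, 15.0, np.nan])
discount = pd.Series([1.0, 2.0, np.nan])

# combine_first keeps res where it is defined and falls back to discount.
combined = res.combine_first(discount)

# fillna(0) ignores discount and puts 0 wherever res is NaN.
filled = res.fillna(0)

print(combined.tolist())
print(filled.tolist())
```

Neither call modifies anything in place, so assign the result back, for example df['discount'] = res.combine_first(df.discount).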