r - Create New Column With Consecutive Count Of First Series Based on ID Column


I work in the healthcare industry, and I'm using machine learning algorithms to develop a model that predicts when patients will not show up for their appointments. I'm trying to create a new feature that is the sum of each patient's most recent consecutive no-shows. I've looked around a lot on Stack Overflow and other resources, but I cannot find what I'm looking for. For example, if a patient has no-showed for their past 2 most recent appointments, then every row of the new feature's column for that id would be filled in with 2's. If they no-showed 3 times but showed up for their most recent appointment, the new column would be filled in with 0's.

I tried using plyr's ddply with cumsum, but it did not give me the results I'm looking for. I used:

ddply(a, .(id), transform, consecutivenoshows = cumsum(noshow)) 

Here is an example data set ('1' signifies a no-show):

id  noshow
1   1
1   1
1   0
1   0
1   1
2   0
2   1
2   1
3   1
3   0
3   1
3   1
3   1

This is my desired outcome:

id  noshow  consecutivenoshows
1   1       2
1   1       2
1   0       2
1   0       2
1   1       2
2   0       0
2   1       0
2   1       0
3   1       1
3   0       1
3   1       1
3   1       1
3   1       1

I'll be grateful for any help. Thank you.

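The idea is to sum(), for each id, the number of noshow values that occur before the first 0 appears. This relies on the rows within each id being ordered with the most recent appointment first, as they are in the example data.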

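library(dplyr)

df %>%
  group_by(id) %>%
  # cumsum(noshow == 0) >= 1 is TRUE from the first 0 onward within each id.
  # `!` binds more loosely than `>=`, so it negates the whole comparison,
  # and sum() counts the rows that come before the first 0.
  mutate(consecutivenoshows = sum(!cumsum(noshow == 0) >= 1))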

which gives:

#Source: local data frame [13 x 3]
#Groups: id [3]
#
#      id noshow consecutivenoshows
#   <int>  <int>              <int>
#1      1      1                  2
#2      1      1                  2
#3      1      0                  2
#4      1      0                  2
#5      1      1                  2
#6      2      0                  0
#7      2      1                  0
#8      2      1                  0
#9      3      1                  1
#10     3      0                  1
#11     3      1                  1
#12     3      1                  1
#13     3      1                  1
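
For anyone who wants to reproduce this without dplyr, here is a minimal base R sketch of the same logic. It rebuilds the example data from the question and splits the one-liner into named steps. The helper name leading_noshows is hypothetical (it is not part of the answer above), and the sketch assumes rows within each id are sorted most recent first.

```r
# Example data from the question; rows within each id are assumed to be
# ordered with the most recent appointment first.
df <- data.frame(
  id     = c(1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3),
  noshow = c(1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1)
)

# Hypothetical helper: length of the leading run of 1s in a 0/1 vector.
leading_noshows <- function(x) {
  seen_show <- cumsum(x == 0) >= 1  # TRUE from the first 0 onward
  sum(!seen_show)                   # number of rows before the first 0
}

# ave() applies the helper within each id and recycles the single result
# across all of that id's rows.
df$consecutivenoshows <- ave(df$noshow, df$id, FUN = leading_noshows)

print(df)
```

Note that if a patient has no 0 at all, the helper returns the total number of rows for that id, since every appointment then counts toward the streak.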
