I am trying to import a table from a web page using the readHTMLTable function. The first few rows of the data look like this once loaded in R:
        Event     Athlete            Country  Result       Medal   Year
    1   100m Men  Tom Burke          USA      12.0         Gold    1896
    2             Fritz Hofmann      DEU      12.2 est.    Silver  1896
    3             Francis Lane       USA      12.6         Bronze  1896
    4             Alajos Szokolyi    HUN      12.6 est.    Bronze  1896
    5   400m Men  Tom Burke          USA      54.2         Gold    1896
    6             Herbert Jamison    USA      N/A          Silver  1896
    7             Charles Gmelin     GBR      N/A          Bronze  1896
    8   800m Men  Teddy Flack        AUS      2:11.0       Gold    1896
    9             Nándor Dáni        HUN      2:11.8 est.  Silver  1896
    10            Demitrios Golemis  GRE      N/A          Bronze  1896
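If you look at the Event column, you can see that some of the rows have an empty Event field, because that is the way the table is laid out on the website. I am looking for an efficient way to fill in the blanks, so that in the end it looks like this: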
        Event     Athlete            Country  Result       Medal   Year
    1   100m Men  Tom Burke          USA      12.0         Gold    1896
    2   100m Men  Fritz Hofmann      DEU      12.2 est.    Silver  1896
    3   100m Men  Francis Lane       USA      12.6         Bronze  1896
    4   100m Men  Alajos Szokolyi    HUN      12.6 est.    Bronze  1896
    5   400m Men  Tom Burke          USA      54.2         Gold    1896
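Basically, every time a field in the Event column is empty, I need to fill it with the last value that was not empty. The column is stored in R as a factor. I know that technically I can do this with a loop, but going over the vector elements one at a time is time-consuming, considering the fact that there are 300000 rows in the table. I was hoping for something more efficient.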
Here's a toy example of how the purrr package can be used to solve this problem, assuming the data is in a data.frame and the missing values are NA:
    library(purrr)

    df <- data.frame("event" = c(1, NA, 2, NA, 3, NA, 5), "other" = 1:7)
    df
    #   event other
    # 1     1     1
    # 2    NA     2
    # 3     2     3
    # 4    NA     4
    # 5     3     5
    # 6    NA     6
    # 7     5     7

    # Carry the previous value forward whenever the current one is NA
    df$event <- accumulate(.x = df$event, .f = function(x, y) {
      if (is.na(y)) x else y
    })
    df
    #   event other
    # 1     1     1
    # 2     1     2
    # 3     2     3
    # 4     2     4
    # 5     3     5
    # 6     3     6
    # 7     5     7
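The toy example above uses numeric values with NA, while the column in the question is a factor whose blanks are empty strings. As a rough sketch of how the same fill could be done in base R without stepping through the vector element by element, something like the following might work. The helper name `fill_down` and the sample `event` vector are made up for illustration, and the code assumes blanks appear as `""` or `NA`.

```r
# Hypothetical stand-in for the scraped Event column:
# a factor in which blank cells are empty strings.
event <- factor(c("100m Men", "", "", "", "400m Men", "", "", "800m Men", "", ""))

# fill_down is an illustrative helper, not part of any package.
fill_down <- function(x) {
  x <- as.character(x)               # work on the labels, not the factor codes
  filled <- !is.na(x) & x != ""      # TRUE where a real value is present
  idx <- cumsum(filled)              # position of the most recent real value (0 = none yet)
  # Prepend NA so that idx == 0 (blanks before the first value) maps to NA
  c(NA_character_, x[filled])[idx + 1]
}

event_filled <- factor(fill_down(event))
```

Because `cumsum` and the indexing are both vectorised, this avoids an explicit R-level loop, which is the concern with 300000 rows. If adding a dependency is acceptable, `zoo::na.locf()` or `tidyr::fill()` cover the same "last observation carried forward" case once the blanks have been converted to NA.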