Pulling score text data into R


I am trying to figure out how to pull the following data into R:

http://masseyratings.com/scores.php?s=285971&sub=14342&all=1&mode=3&sch=on&format=0

This works, but I want to eliminate the junk at the top and bottom, and keep just the scores.

read.fwf('http://masseyratings.com/scores.php?s=285971&sub=14342&all=1&mode=3&sch=on&format=0',
         widths = c(11, 26, 3, 26, 3, 4, 21),
         skip = 8)

First off, welcome to Stack Exchange! I changed some things in your code so that it needs only 6 widths; you had an extra column, so I got rid of that. When pulling in the data online I noticed that the first row was pretty strange, so I got rid of it and manually added it back later.

data <- read.fwf('http://masseyratings.com/scores.php?s=285971&sub=14342&all=1&mode=3&sch=on&format=0',
                 widths = c(10, 26, 3, 26, 3, 4), sep = "\t", header = FALSE, skip = 8)

# This line subsets the data so that you don't have the "junk" at the bottom,
# and deletes the first row, which contains the HTML tagging.
data <- data[2:2424, ]
data <- data.frame(data)

# Create a vector that has the column headers
names <- c("date", "team1", "runs", "team 2", "runs", "something")
colnames(data) <- names

# Re-create the first row of data that was deleted.
firstrow <- data.frame("2016-04-03", "@pirates", 4, "cardinals", 1, "")
colnames(firstrow) <- names

finaldata <- rbind.data.frame(firstrow, data)
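The hard-coded row range (2:2424) only fits this one page. A possible alternative, sketched here under assumptions, is to keep only the lines that begin with a date after stripping any HTML tags. The sample lines, their column widths, and the names `page_lines`, `clean`, and `scores` are made up for illustration and do not reproduce the real page layout; with the real page you would start from `readLines()` on the URL and use the widths from the answer.

```r
# Made-up stand-in for the downloaded page: junk on top, two game lines, junk below.
# The first game line shares its line with an HTML tag, as the answer describes.
game1 <- sprintf("%-10s %-8s%3d %-8s%3d", "2016-04-03", "@TeamA", 4L, "TeamB", 1L)
game2 <- sprintf("%-10s %-8s%3d %-8s%3d", "2016-04-04", "TeamC", 10L, "@TeamD", 2L)
page_lines <- c("<html><body>", paste0("<pre>", game1), game2,
                "", "Games: 2", "</pre></body></html>")

# Strip anything that looks like an HTML tag, then keep lines that start
# with a YYYY-MM-DD date (assumption: every game line starts that way).
clean <- gsub("<[^>]*>", "", page_lines)
is_game <- grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}", clean)

# Parse the surviving lines as fixed-width text; negative widths skip characters.
con <- textConnection(clean[is_game])
scores <- read.fwf(con, widths = c(10, -1, 8, 3, -1, 8, 3),
                   col.names = c("date", "team1", "runs1", "team2", "runs2"),
                   stringsAsFactors = FALSE, strip.white = TRUE)
close(con)

print(scores)
```

Filtering on content rather than on row positions means the same code should keep working if the number of games on the page changes, as long as the date assumption holds.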

For future reference, if you can post a screenshot of what you deem to be junk, it would be helpful for people attempting to answer the question.

Update

data <- read.fwf('http://masseyratings.com/scores.php?s=285971&sub=14342&all=1&mode=3&sch=on&format=0',
                 widths = c(10, 26, 3, 26, 3, 4), sep = "\t", header = FALSE, skip = 9)
data <- data.frame(data)

# Read only the first data row (n = 1), which shares its line with the HTML tagging.
# The negative widths skip over the leading characters.
firstrow <- read.fwf('http://masseyratings.com/scores.php?s=285971&sub=14342&all=1&mode=3&sch=on&format=0',
                     widths = c(-8, -1, -1, 9, 26, 3, 26, 3, 4), sep = "\t", header = FALSE, n = 1, skip = 8)
firstrow <- data.frame(firstrow, stringsAsFactors = FALSE)

# Restore the leading "2" of the date, which is missing from the first row.
firstrow[, 1] <- paste("2", firstrow[1, 1], sep = "")

# Create a vector that has the column headers
names <- c("date", "team1", "runs", "team 2", "runs", "something")
colnames(data) <- names
colnames(firstrow) <- names

finaldata <- rbind.data.frame(firstrow, data)

The negative values in the widths vector skip characters, which moves the data over. I played around until I worked out that what was missing in the first row was the "2". I paste in the "2" and use the rbind function to create the full data frame. Hope this helps out.
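As a small illustration of how negative widths behave in `read.fwf`, here is a sketch on two made-up lines rather than the real page. The sample text, the widths, and the names `sample_lines` and `parsed` are assumptions for the example only.

```r
# Two made-up fixed-width lines; each starts with 3 characters we want to skip.
sample_lines <- c("<p>2016-04-03 TeamA  4",
                  "   2016-04-04 TeamB 10")

con <- textConnection(sample_lines)
# -3 skips the first 3 characters, 10 reads the date, -1 skips one space,
# 5 reads the team name, and 3 reads the right-aligned score.
parsed <- read.fwf(con, widths = c(-3, 10, -1, 5, 3),
                   col.names = c("date", "team", "runs"),
                   stringsAsFactors = FALSE)
close(con)

print(parsed)
```

Skipped characters are simply dropped, so they never appear as a column in the result.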

I tested this on another page as well: http://masseyratings.com/scores.php?s=285971&sub=14342&all=1&mode=2&sch=on&format=0 and it worked as expected.

