r - Finding the mean of a value based on multiple variables


Plot of reads vs. gel scores: mean number of reads for each permutation of gel scores?

I work in a genetics lab at a university, doing data analysis in our computer lab. After running PCR, we scored our gels according to band, smear, primer dimer, and non-specific product. These variables were assigned values of 0, 1, or 2. I am trying to find the mean number of reads (sequencing results) returned for each combination of the 4 gel scores. Each variable has its own column in the datasheet.

datasheet: vial id, band, smear, primer.dimer, non.spec, reads

Ex.: the mean number of reads for gels with band=0, smear=0, primer.dimer=0, and non.spec=0.

Ex.: the mean number of reads for gels with band=0, smear=1, primer.dimer=1, and non.spec=2.

Etc.

Any suggestions are appreciated, thank you.

I can plot the data using the generic plot function. Although the mean bars are displayed, I cannot ascertain their values.

"plot(reads~as.factor(datasheet$band+(primer.dimer*10)+(smear*100)+(non.specific.product*1000)))"

You can do this using the dplyr and tidyr packages:

    library(dplyr)
    library(tidyr)

    # Simulated example data; replace with your own datasheet
    set.seed(14592)

    df <- data.frame(
      vial_id      = 1:10,
      band         = sample(0:2, 10, replace = TRUE),
      smear        = sample(0:2, 10, replace = TRUE),
      primer_dimer = sample(0:2, 10, replace = TRUE),
      non_spec     = sample(0:2, 10, replace = TRUE),
      reads        = rnorm(10)
    )

    df %>%
      # Paste the four scores into one ID per combination, e.g. "0_1_1_2"
      unite(group_id, band:non_spec, remove = FALSE) %>%
      group_by(group_id) %>%
      summarize(group_mean = mean(reads))

This uses tidyr's unite function to create a unique group ID for each combination of gel scores, then uses dplyr's group_by and summarize functions to find the mean reads for each group.
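
If you would rather not load any packages, the same grouping can probably be done with base R's aggregate. This is only a sketch: it assumes the columns are named as in the question's datasheet line, and the values in the small data frame below are made up for illustration.

```r
# Hypothetical example data; column names follow the question's datasheet,
# the values are invented for illustration only.
datasheet <- data.frame(
  vial.id      = 1:6,
  band         = c(0, 0, 0, 1, 1, 2),
  smear        = c(0, 0, 1, 1, 1, 0),
  primer.dimer = c(0, 0, 1, 0, 0, 2),
  non.spec     = c(0, 0, 2, 0, 0, 1),
  reads        = c(10, 20, 5, 8, 12, 30)
)

# Mean reads for each observed combination of the four gel scores.
# The result keeps one column per score plus a "reads" column holding the mean.
group_means <- aggregate(
  reads ~ band + smear + primer.dimer + non.spec,
  data = datasheet,
  FUN  = mean
)

print(group_means)
```

Only combinations that actually occur in the data get a row, and the four score columns stay separate, so you can filter on any one of them afterwards.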

