R - Is there a more efficient way to perform a function across sequential rows of a matrix?


I'm looking to calculate the variation of information between every row of a matrix and every other row of the same matrix. This distance metric isn't included in dist, so I have to iterate through the rows manually. Each row is a clustering, and each column is a sample. The values of the matrix are {1,0}, indicating whether or not the sample is a member of the cluster. Here is an example matrix and what I have now. It can take quite some time though; is there a more efficient way to perform this calculation?

    # subset of clusterings that meet the member-count threshold
    m <- 100
    n <- 70
    membership <- matrix(sample(0:1, m * n, replace = TRUE), m, n)

    # create the distance matrix and set the diagonal to 0
    dist.matrix <- matrix(NA_real_, nrow = m, ncol = m)
    diag(dist.matrix) <- 0

    # iterate through each row and calculate distances to subsequent rows,
    # filling in both halves of the distance matrix
    for (i in 1:m) {
        for (j in (i + 1):m) {
            # when i == m, (i + 1):m counts down from m + 1, so stop here
            if (j > m) break
            vi <- igraph::compare(membership[i, ], membership[j, ], method = "vi")
            dist.matrix[i, j] <- vi
            dist.matrix[j, i] <- vi
        }
    }

You can use expand.grid to define the combinations, apply to compute the values, and reshape to produce the final matrix:

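    # all (row, row) index pairs; expand.grid names the columns Var1 and Var2
    df_combs <- expand.grid(1:nrow(membership), 1:nrow(membership))

    # variation of information for each pair of rows
    df_combs$compare <- apply(df_combs, 1, function(x)
        igraph::compare(membership[x[1], ], membership[x[2], ], method = "vi"))

    # long to wide: one column per value of Var1, one row per value of Var2
    df_wide <- reshape(df_combs, direction = "wide", timevar = "Var1", idvar = "Var2")
    df_wide$Var2 <- NULL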

df_wide contains the same values as dist.matrix.
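Since every row is a 0/1 vector, another option is to skip the per-pair function call altogether. What follows is only a rough sketch, not the method from the answer above: the 2x2 contingency table for all pairs of rows can be built with four matrix products, and the variation of information then follows from VI(X, Y) = 2 H(X, Y) - H(X) - H(Y). The helper name vi_matrix is made up for illustration, and the sketch assumes natural logarithms. Whether it agrees with igraph::compare(..., method = "vi") is an assumption worth checking on a few pairs before relying on it.

```r
# Hypothetical helper (not part of igraph): pairwise variation of information
# between the rows of a 0/1 matrix, using matrix products instead of a loop.
vi_matrix <- function(membership) {
  n <- ncol(membership)
  inv <- 1 - membership

  # p * log(p), with the convention 0 * log(0) = 0 (natural log is an assumption)
  plogp <- function(p) ifelse(p > 0, p * log(p), 0)

  # Counts for each cell of the 2x2 contingency table, for every pair of rows
  joint <- list(
    tcrossprod(membership),       # both rows are 1
    tcrossprod(membership, inv),  # row i is 1, row j is 0
    tcrossprod(inv, membership),  # row i is 0, row j is 1
    tcrossprod(inv)               # both rows are 0
  )

  # Joint entropy H(X, Y) for every pair of rows
  h_joint <- -Reduce(`+`, lapply(joint, function(cnt) plogp(cnt / n)))

  # Entropy H(X) of each row on its own
  p1 <- rowMeans(membership)
  h_row <- -(plogp(p1) + plogp(1 - p1))

  # VI(X, Y) = 2 H(X, Y) - H(X) - H(Y)
  vi <- 2 * h_joint - outer(h_row, h_row, `+`)
  vi[vi < 0] <- 0  # clamp tiny negative values caused by floating-point rounding
  vi
}

# Same example dimensions as in the question
set.seed(1)
m <- 100
n <- 70
membership <- matrix(sample(0:1, m * n, replace = TRUE), m, n)
dist.matrix <- vi_matrix(membership)
```

The result is a full symmetric m-by-m matrix with a (numerically) zero diagonal, so as.dist(dist.matrix) should work if a dist object is needed downstream.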

