I'm looking to calculate the variation of information between every row of a matrix and every other row of the same matrix. This distance metric isn't included in dist, so I have to iterate through the rows manually. Each row is a clustering, and each column is a sample. The values of the matrix are {1,0}, indicating whether or not the sample is a member of the cluster. Here is an example matrix and what I have now. It can take quite some time though; is there a more efficient way to perform this calculation?
# Subset of clusterings that meet a threshold of member count
m <- 100
n <- 70
membership <- matrix(sample(0:1, m * n, replace = TRUE), m, n)

# Create the distance matrix and set the diagonal to 0
dist.matrix <- matrix(NA_real_, nrow = m, ncol = m)
diag(dist.matrix) <- 0

# Iterate through each row and calculate distances to subsequent rows,
# filling in the values on both sides of the diagonal
for (i in 1:m) {
  for (j in (i + 1):m) {
    if (j > m) break  # guards the last row, where (i + 1):m counts down
    vi <- igraph::compare(membership[i, ], membership[j, ], method = "vi")
    dist.matrix[i, j] <- vi
    dist.matrix[j, i] <- vi
  }
}
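For reference, the quantity computed for each pair of rows can be sketched in base R as VI = 2 H(X, Y) - H(X) - H(Y). The helper name vi_pair is hypothetical, and the use of natural logarithms is an assumption; it is not guaranteed to match the convention used by igraph::compare.

```r
# Variation of information between two label vectors, in nats.
# vi_pair is a hypothetical helper, not part of igraph.
vi_pair <- function(x, y) {
  stopifnot(length(x) == length(y))
  joint <- table(x, y) / length(x)  # joint distribution p(x, y)
  px <- rowSums(joint)              # marginal distribution of x
  py <- colSums(joint)              # marginal distribution of y
  entropy <- function(p) {
    p <- p[p > 0]                   # treat 0 * log(0) as 0
    -sum(p * log(p))
  }
  # VI = 2 * H(X, Y) - H(X) - H(Y)
  2 * entropy(joint) - entropy(px) - entropy(py)
}

a <- c(1, 1, 0, 0)
b <- c(1, 0, 1, 0)
vi_pair(a, a)  # identical clusterings
vi_pair(a, b)  # statistically independent clusterings
```

Identical clusterings give a distance of zero, which is why the diagonal of the distance matrix can be set to 0 up front.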
You can use expand.grid to define the combinations, apply to compute the values, and reshape to produce the final matrix:
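# All ordered pairs of row indices (columns are named Var1 and Var2)
df_combs <- expand.grid(1:nrow(membership), 1:nrow(membership))
# Variation of information for each pair
df_combs$compare <- apply(df_combs, 1, function(x) igraph::compare(membership[x[1], ], membership[x[2], ], method = "vi"))
# Spread the long table into one column per value of Var1
df_wide <- reshape(df_combs, direction = "wide", timevar = "Var1", idvar = "Var2")
df_wide$Var2 <- NULL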
df_wide then holds the same values as dist.matrix.
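Because the membership values are only 0 and 1, the joint counts for every pair of rows can also be obtained with matrix products, which avoids calling a function once per pair. The following is a minimal sketch of that idea, not a tested replacement: vi_matrix is a hypothetical helper, and it assumes natural logarithms, so it is worth checking a few entries against igraph::compare(..., method = "vi") before relying on it.

```r
# Pairwise variation of information between the rows of a 0/1 matrix.
# vi_matrix is a hypothetical helper; natural logs are an assumption.
vi_matrix <- function(M) {
  n <- ncol(M)
  # Joint counts for every pair of rows (i, j)
  n11 <- M %*% t(M)              # samples in both clusters
  n10 <- M %*% t(1 - M)          # in cluster i but not in cluster j
  n01 <- t(n10)                  # in cluster j but not in cluster i
  n00 <- (1 - M) %*% t(1 - M)    # in neither cluster
  # p * log(p), with 0 * log(0) treated as 0
  plogp <- function(p) ifelse(p > 0, p * log(p), 0)
  # Joint entropy H(X_i, X_j) for every pair of rows
  h_joint <- -(plogp(n11 / n) + plogp(n10 / n) + plogp(n01 / n) + plogp(n00 / n))
  # Marginal entropy H(X_i) of every row
  p1 <- rowMeans(M)
  h <- -(plogp(p1) + plogp(1 - p1))
  # VI = 2 * H(X_i, X_j) - H(X_i) - H(X_j)
  vi <- 2 * h_joint - outer(h, h, "+")
  vi[vi < 0] <- 0                # clamp tiny negative rounding errors
  diag(vi) <- 0
  vi
}

set.seed(42)  # arbitrary seed, only to make the example reproducible
membership <- matrix(sample(0:1, 100 * 70, replace = TRUE), 100, 70)
dist.matrix <- vi_matrix(membership)
dim(dist.matrix)
```

The trade-off is that this only works for binary membership rows like the ones in the question; clusterings with more than two labels would need the per-pair approach.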