R/igraph Notes
After trying out a few graph packages I have found igraph to be the most suitable to my needs for handling large graphs with an efficient memory management and high execution speed. Thanks to Gabor Csardi and all the developers that make this package available!
I used the igraph package as provided for the largest part of my development, but some functions need fast data structures and therefore are more efficient when written in C.
The igraph R source package contains the following relevant items:
| src/ |
C source files; various config and make files (Makevars; Makedeps) |
| R/ |
R methods used to invoke the algorithms (in C / R) |
| man/ |
Manual information about each R call |
| chm/ |
HTML automatically generated info from man info |
| Namespace |
File that exports calls for the igraph package to R |
Here is how I added the function igraph_visitor that returns a list of subgraphs built from the kth order neighborhood view collected by each node of a given graph.
Adding a C function to the igraph R package
1- Writing the C code:
I inserted the new function igraph_visitor into src/my_igraph_ext.c file (although it could have gone in components.c).
The code highly resembles the igraph_decompose function in components.c and looks as follows:
int igraph_visitor(const igraph_t *graf, igraph_vector_ptr_t *res, int order){
igraph_t *newg;
igraph_dqueue_t q=IGRAPH_DQUEUE_NULL;
igraph_vector_t tmp=IGRAPH_VECTOR_NULL;
igraph_vector_t view;
long int no_of_nodes=igraph_vcount(graf);
int steps=0;
long int i,j;
IGRAPH_CHECK(igraph_dqueue_init(&q, 100));
IGRAPH_VECTOR_INIT_FINALLY(&tmp, 0);
IGRAPH_VECTOR_INIT_FINALLY(&view,0);
igraph_vector_ptr_clear(res);
for(i=0;i<no_of_nodes;i++){
IGRAPH_ALLOW_INTERRUPTION();
igraph_vector_clear(&view);
IGRAPH_CHECK(igraph_dqueue_push(&q,i));
IGRAPH_CHECK(igraph_vector_push_back(&view, i));
steps=0;
while (!igraph_dqueue_empty(&q)) {
long int actnode=(long int)igraph_dqueue_pop(&q);
IGRAPH_CHECK(igraph_neighbors(graf, &tmp, actnode, IGRAPH_ALL));
for (j=0; j<igraph_vector_size(&tmp); j++) {
long int neighbor=VECTOR(tmp)[j];
IGRAPH_CHECK(igraph_vector_push_back(&view, neighbor));
if(steps<order){
IGRAPH_CHECK(igraph_dqueue_push(&q, neighbor));
}
}
steps++;
}
newg=calloc(1,sizeof(igraph_t));
if (newg==0) {
IGRAPH_ERROR("Cannot decompose graph", IGRAPH_ENOMEM);
}
IGRAPH_CHECK(igraph_vector_ptr_push_back(res, newg));
IGRAPH_FINALLY(igraph_destroy, newg);
IGRAPH_CHECK(igraph_subgraph(graf, newg,igraph_vss_vector(&view)));
IGRAPH_FINALLY_CLEAN(1);
}
igraph_dqueue_destroy(&q);
igraph_vector_destroy(&view);
igraph_vector_destroy(&tmp);
IGRAPH_FINALLY_CLEAN(2);
return 0;
} |
The code should be developed in a standalone file and tested independently in C to make sure that it works as expected, an example main function to perform this task is as follows:
int main(void){
igraph_t graph;
igraph_vector_ptr_t sg;
igraph_vector_t v;
igraph_integer_t diam;
igraph_vector_t invdeg;
int i,j;
long int no_of_nodes;
igraph_real_t edges[] = { 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8};
igraph_vector_init(&invdeg,0);
igraph_vector_ptr_init(&sg, 0);
IGRAPH_FINALLY(igraph_vector_ptr_destroy, &sg);
igraph_vector_view(&v, edges, sizeof(edges)/sizeof(double));
IGRAPH_CHECK(igraph_create(&graph, &v, 0, IGRAPH_UNDIRECTED));
igraph_visitor(&graph, &sg, 2);
for(i=0;i<igraph_vector_ptr_size(&sg);i++){
no_of_nodes=igraph_vcount((igraph_t *)VECTOR(sg)[i]);
printf("Vcount of subgraph %d: %d\n",i,no_of_nodes);
igraph_vector_clear(&invdeg);
IGRAPH_CHECK(igraph_degree((igraph_t *)VECTOR(sg)[i], &invdeg, igraph_vss_all(),
IGRAPH_ALL, IGRAPH_NO_LOOPS));
printf("Vertices: ");
for(j=0;j<no_of_nodes;j++){
printf("%d ",(int)VECTOR(invdeg)[j]);
}
printf("\n");
}
igraph_destroy(&graph);
igraph_vector_ptr_destroy(&sg);
igraph_vector_destroy(&invdeg);
IGRAPH_FINALLY_CLEAN(1);
return 0;
}
|
The new function has to be added to the header file src/ igraph.h
int igraph_visitor(const igraph_t *graph, igraph_vector_ptr_t *res, int order); |
2- Adding the R invocation:
igraph_visitor is invoked from R using the call visitor. This is made possible by :
Firstly specifying how igraph_visitor gets invoked in the file rinterface.c which again highly resembles decompose.graph :
SEXP R_igraph_visitor(SEXP graph, SEXP order){
igraph_t g;
igraph_integer_t o=REAL(order)[0];
igraph_vector_ptr_t comps;
SEXP result;
long int i;
R_igraph_before();
R_SEXP_to_igraph(graph, &g);
igraph_vector_ptr_init(&comps, 0);
IGRAPH_FINALLY(igraph_vector_ptr_destroy, &comps);
igraph_visitor(&g, &comps, o);
PROTECT(result=NEW_LIST(igraph_vector_ptr_size(&comps)));
for (i=0; i<igraph_vector_ptr_size(&comps); i++) {
SET_VECTOR_ELT(result, i, R_igraph_to_SEXP(VECTOR(comps)[i]));
igraph_destroy(VECTOR(comps)[i]);
igraph_free(VECTOR(comps)[i]);
}
igraph_vector_ptr_destroy(&comps);
IGRAPH_FINALLY_CLEAN(1);
R_igraph_after();
UNPROTECT(1);
return result;
} |
Notice the "SEXP order" as argument, as all parameters of functions have to be an SEXP and not basic datatypes (e.g. int, char, etc...).
Secondly specifying how R invokes R_igraph_visitor, in this case this is added to R/my_igraph_ext.R:
visitor <- function(graph, order) {
if (!is.igraph(graph)) {
stop("Not a graph object")
}
.Call("R_igraph_visitor", graph, order,
PACKAGE="igraph")
}
|
3- Exporting the R call to the igraph namespace:
To make visitor visible from the igraph namespace it has to be exported to R. Edit the file NAMESPACE by adding the visitor call to the exported function list, here's the section that exports structural properties functions:
# igraph extensions
export(visitor)
|
Comparing the R and the C implementation
My original R implementation of visitor was in R and looked as follows:
R_visitor=function(graf,order=2){
queue=list()
gr=list()
view=list()
steps=0
for(i in 1:vcount(graf)){
queue[[i]]=i-1
view[[i]]=i-1
steps=1
while(length(queue[[i]])!=0){
curr=queue[[i]][1]
neigs=neighbors(graf,curr,mode="all")
view[[i]]=append(neigs,view[[i]])
if(steps!=order){
queue[[i]]=append(queue[[i]],neigs)
steps=steps+1
}
queue[[i]][1]=NA
queue[[i]]=queue[[i]][which(queue[[i]]!='NA')]
}
}
for(i in 1:length(view)){
view[[i]]=unique(view[[i]])
gr[[i]]=subgraph(graf,view[[i]])
}
return(gr)
} |
I compared the C and the R version by using system.time and computing it on a series of random graphs from 100 to 3000 nodes by 500, the timing difference
| R implementation |
0.18 |
1.69 |
4.07 |
7.33 |
11.44 |
15.75 |
| C implementation |
0.02 |
0.71 |
2.17 |
4.71 |
7.97 |
12.30 |
|