Thursday Poster Symposium

Active learning strategies in community reconstruction

Tzu-Chi Yen

Abstract:

The stochastic block model (SBM) is a powerful tool for parameterizing the generative process of graphs with community structure. In semi-supervised inference, the model is learned from the known memberships of a subset of nodes in addition to the graph structure. Identifying the most informative nodes is nontrivial: no simple node-based criterion, such as degree centrality, can be used a priori to ascertain whether a node is informative by itself. To understand the relative importance of the features that characterize an informative subset, we take a data-centric approach and train a random forest classifier on custom-defined features. We assume the model parameters are known and perform inference with belief propagation. We focus our analysis on sparse SBMs, varying the number of communities, the signal-to-noise ratio, and the size of the disclosed subset. The results suggest that optimal subsets are structurally diverse and give some generic guidelines for their selection.
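
To make the setup concrete, the following is a minimal sketch, not the authors' implementation, of sampling a sparse SBM, disclosing a random subset of node memberships, and computing a few subset-level features of the kind a classifier could be trained on. The function names, the symmetric c_in/c_out parameterization, and the three features are assumptions chosen for illustration; the abstract does not specify the features used, and the belief propagation step is omitted.

```python
import numpy as np

def sample_sbm(n, q, c_in, c_out, rng):
    """Sample a sparse symmetric SBM with q groups (hypothetical helper).

    Edge probability is c_in/n within a group and c_out/n between groups,
    so the average degree stays bounded as n grows.
    """
    labels = rng.integers(q, size=n)
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, c_in / n, c_out / n)
    # Draw each unordered pair once, then symmetrize; no self-loops.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    adj = upper | upper.T
    return adj, labels

def reveal_subset(labels, fraction, rng):
    """Return a boolean mask marking nodes whose membership is disclosed."""
    n = labels.size
    k = int(round(fraction * n))
    mask = np.zeros(n, dtype=bool)
    mask[rng.choice(n, size=k, replace=False)] = True
    return mask

def subset_features(adj, mask):
    """Illustrative subset-level features (assumed, not from the abstract)."""
    degrees = adj.sum(axis=1)
    internal_edges = adj[np.ix_(mask, mask)].sum() // 2
    return {
        "size": int(mask.sum()),
        "mean_degree": float(degrees[mask].mean()),
        "internal_edges": int(internal_edges),
    }

rng = np.random.default_rng(0)
adj, labels = sample_sbm(n=200, q=2, c_in=8.0, c_out=2.0, rng=rng)
mask = reveal_subset(labels, fraction=0.1, rng=rng)
features = subset_features(adj, mask)
```

In a full pipeline one would repeat this over many disclosed subsets, score each subset by the accuracy of the resulting inference, and fit a classifier (for example a random forest) on the feature vectors to see which features separate informative subsets from uninformative ones.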