Current Grant Funding

My research is partially funded by the National Science Foundation through the grant NSF DMS-1830547: Spatio-Temporal Data Analysis with Dynamic Network Models (August 2018 to July 2021).

I have also received funding from the National Science Foundation for organizing the Data Institute Conference at the University of San Francisco through the grant NSF DMS-1841307: The Annual Data Institute Conference (March 2019).

Overview of Research

With historical roots in the social sciences, network analysis is now a central component of data science research, playing important roles in the academic, industry, and government sectors. Networks are, for example, widely used in the study of cognitive neuroscience, the analysis of genome-wide association, and the understanding of social dynamics. In government, networks are often used for analyzing international trade, conflict, and suspicious intelligence groups. Networks also play a key role in the development, analysis, and monetization of large social media sites like Twitter, LinkedIn, Facebook, and Google, as well as health monitoring technology for digital phenotyping on wearables like the Apple Watch and other cellular devices.

Motivated by the interdisciplinary nature of the field, I take part in a number of ongoing collaborations with researchers across academia, industry, and government, each of which provides (and has provided) exciting research opportunities. Generally speaking, my research focuses on developing scalable and interpretable techniques to analyze complex network data. My approach merges computational statistics, random graph theory, and machine learning to provide simple and interpretable machinery to model, explore, and analyze complex interacting systems. I am driven by the data: I seek to use my strengths in network analysis and statistical machine learning to make sense of complex data like that arising in functional connectivity and social media, while also demystifying complex models like contemporary deep learning models on networks.

My work generally falls into three scientific areas, which I summarize below. I list publications in each of these areas.

For more information about my code or publications, see my Google Scholar page or my GitHub page.

1. Network Modeling and Analysis of Functional Connectivity

Network analyses of the brain have revealed general organizing principles of the whole brain, including high modularity, a "rich-club" of interconnected hub regions, and topologies that demonstrate small-world structure. These findings have shown, for instance, that the regions of the brain not only exhibit strong clustering, but also enable efficient and robust transfer of information across regions. Networks have also advanced our understanding of neural processes, such as learning and memory, cognitive control, and emotion. Several large-scale projects have arisen from network neuroscience, such as the Human Connectome Project as well as the BRAIN initiative.

Despite the many successes of network neuroscience in understanding the structure and function of the brain, many important challenges remain that statistical network methods can address. I am motivated by three of these challenges:

(1) characterizing the generative mechanisms of functional connectivity

(2) analyzing functional connectivity data over populations of individuals

(3) inferring differences among people, diseases, and stages of disease

I have made major headway in this area through the development of new generative network models for functional connectivity as well as the development of scalable feature engineering techniques for multilayer network systems describing populations of functional connectivity scans. My work in this area is as follows:

  • Popa, A. M. and Wilson, J. D. (2019). Functional embeddings from resting state fMRI identify differences in schizophrenia. (Under Review).

  • Wilson, J. D., Cranmer, S. J., and Lu, Z.-L. (2019). A hierarchical latent space network model for population studies of functional connectivity. (Accepted, Computational Brain and Biology.)

  • Stillman, P. E., Wilson, J. D., Denny, M. J., Desmarais, B. A., Cranmer, S. J., and Lu, Z.-L. (2019). A consistent organizational structure across multiple functional subnetworks of the human brain. NeuroImage, 197:24-36.

  • Wilson, J. D., Baybay, M., Sankar, R., and Stillman, P. (2018). Fast embedding of multilayer networks: An algorithm and application to group fMRI. (Under Review, arXiv preprint arXiv:1809.06437)

  • Stillman, P. E., Wilson, J. D., Denny, M. J., Desmarais, B. A., Bhamidi, S., Cranmer, S. J., and Lu, Z.-L. (2017). Statistical modeling of the default mode brain network reveals a segregated highway structure. Scientific Reports, 7(1):11694.

  • Wilson, J. D., Denny, M. J., Bhamidi, S., Cranmer, S. J., and Desmarais, B. A. (2017). Stochastic weighted graphs: Flexible model specification and simulation. Social Networks, 49:37-47.

  • Wilson, J. D., Palowitch, J., Bhamidi, S., and Nobel, A. B. (2017). Community extraction in multilayer networks with heterogeneous community structure. The Journal of Machine Learning Research, 18(1):5458-5506.

2. Monitoring Dynamic Networks and the Effects of Strategic Influencers

Recent news has clearly established the influence of social media platforms like Facebook and Twitter on society, from the spread of the #MeToo movement to the motivation of political and industry leaders' actions on women's rights and gun control. Making sense of the rich but noisy information from social media platforms will lead to a better understanding of social interactions, views, and ideologies. In the last few years, strategic influencers, both known and unknown, have used social media platforms to alter the communication, ideologies, and actions of societies throughout the world. In this line of research, I aim to model and analyze the role of influencers on social media platforms through the use of network analysis and other machine learning techniques. This research has three major components:

(1) modeling social dynamics with dynamic network models

(2) monitoring and detecting change in dynamic networks

(3) analyzing the effect of strategic influencers on social dynamics

Much of my previous work has focused on the identification of change in a dynamic network system. My future research will focus on model selection for the joint actions of influencers and network systems, as well as the efficient identification of change in the joint dynamic system.
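To make the idea of detecting change concrete, here is a minimal sketch of one simple monitoring scheme: a Shewhart-style control chart applied to a single network summary statistic (edge density). It is an illustration only and is not the method of any paper listed in this section; the function names, the three-standard-deviation limits, and the density values are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def edge_density(adj):
    """Fraction of possible edges present in an undirected simple graph.

    `adj` is a symmetric 0/1 adjacency matrix (list of lists) with at
    least two nodes. Hypothetical helper for this sketch.
    """
    n = len(adj)
    edges = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    return edges / (n * (n - 1) / 2)


def first_alarm(stats, n_baseline, k=3.0):
    """Return the index of the first monitored statistic that falls
    outside mean +/- k * sd of the baseline window, or None.

    The first `n_baseline` values are treated as in-control data used
    to estimate the control limits (an assumption of this sketch).
    """
    baseline = stats[:n_baseline]
    center, spread = mean(baseline), stdev(baseline)
    lower, upper = center - k * spread, center + k * spread
    for t in range(n_baseline, len(stats)):
        if not lower <= stats[t] <= upper:
            return t
    return None


# Made-up density series: six in-control snapshots, then three
# monitored snapshots, the last of which jumps sharply.
densities = [0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.10, 0.11, 0.30]
alarm = first_alarm(densities, n_baseline=6)
print(alarm)  # 8
```

In practice, each entry of `densities` would come from applying `edge_density` (or a richer statistic, such as a fitted model parameter) to one snapshot of the dynamic network. Monitoring a single scalar keeps the sketch short, at the cost of missing changes that leave density untouched.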

My work in this area is as follows:

  • Yu, L., Zwetsloot, I. M., Stevens, N. T., Wilson, J. D., and Tsui, K. L. (2019). Monitoring dynamic networks: a simulation-based strategy for comparing monitoring methods and a comparative study. (Under Review; arXiv preprint arXiv:1905.10302.)

  • Wilson, J. D., Stevens, N. T., and Woodall, W. H. (2019). Modeling and detecting change in temporal networks via the degree corrected stochastic block model. Quality and Reliability Engineering International, 35(5):1363-1378.

  • Jeske, D. R., Stevens, N. T., Tartakovsky, A. G., and Wilson, J. D. (2018). Statistical methods for network surveillance. Applied Stochastic Models in Business and Industry, 34(4):425-445.

  • Sparks, R. and Wilson, J. D. (2018). Monitoring communication outbreaks among an unknown team of actors in dynamic networks. Journal of Quality Technology: 1-22.

  • Lee, J., Li, G., and Wilson, J. D. (2017). Varying-coefficient models for dynamic networks. (Under Review; arXiv preprint arXiv:1702.03632.)

  • Woodall, W. H., Zhao, M. J., Paynabar, K., Sparks, R., and Wilson, J. D. (2017). An overview and perspective on social network monitoring. IISE Transactions, 49(3):354-365.

3. Interpretable Learning on Networks

When choosing a supervised model that relates feature and response pairs, model interpretability is often at odds with predictive power. Indeed, these two objectives have traditionally forced a choice between an interpretable model and a predictive one. This choice has largely been divided between the machine learning and statistics cultures, where machine learning practitioners focus on predictive ability and statistical practitioners focus on interpretability and inference. Recently, however, this division has begun to shift as the machine learning community builds what are being called "interpretable machine learning" models. Interpretable machine learning models aim to get the best of both worlds by achieving high predictive power while ensuring that the predictions of the model can be easily interpreted.

Interpretability is essential for understanding the data to which a network method is applied. In functional connectivity, for example, interpretability may enable the detection of biomarkers for certain psychological disorders, something that "blind" machine learning methods cannot do. Generative models provide direct interpretations for the effect of motifs on network connectivity. As methods become more sophisticated, it is often the case that direct interpretability is no longer possible. In these situations, one can aim for indirect interpretability, wherein the features and implications of the algorithm itself are analyzed, often at an asymptotic level.

I aim to contribute to two major themes in this area:

(1) develop directly interpretable models and methods for complex network data

(2) analyze the features and asymptotic properties of widely-used machine learning algorithms on network data, and their relationship to interpretable models

In addition to my previously mentioned publications in community extraction and network embedding, examples of my past work include:

  • Parr, T. and Wilson, J. D. (2019). A stratification approach to partial dependence for codependent variables. (Under Review; arXiv preprint arXiv:1907.06698.)

  • MacMillan, K.* and Wilson, J.D. (2017) Topic supervised non-negative matrix factorization. (Technical Report)

  • Wilson, J. D., Wang, S., Mucha, P. J., Bhamidi, S., and Nobel, A. B. (2014). A testing based extraction algorithm for identifying significant communities in networks. The Annals of Applied Statistics, 8(3):1853-1891.

Other Analyses and Applications

In addition to the methodological developments of the papers above, I have also worked on several interdisciplinary applied projects, including the following:

  • Szekely, E., Pappa, I., Wilson, J.D., Bhamidi, S., Jaddoe, V., Verhulst, H.T., and Shaw, P. (2016) Childhood peer network characteristics: genetic influences and links with early mental health trajectories. Journal of Child Psychology and Psychiatry 57(6), 687-694.

  • Parker, K.S., Wilson, J.D., Marschall, J., Mucha, P.J., and Henderson, J.P. (2015) Network analysis reveals sex- and antibiotic resistance-associated antivirulence targets in clinical uropathogens. American Chemical Society: Infectious Diseases 1(11), 523-532.

  • Wilson, J.D. and Uminsky, D.T. The power of A/B Testing under Social Interference.