Overview of Research

I take part in a number of ongoing collaborations with researchers across academia, industry, and government – each of which provides exciting research opportunities. My research merges mathematical and computational statistics, random graph theory, and machine learning to provide scalable and interpretable machinery to model, explore, and analyze complex network, imaging, and healthcare systems. My aim is to provide reliable and interpretable strategies to make timely data-driven decisions in medicine, climate, and social dynamics. My work generally falls into three scientific themes:

Analysis of Brain Imaging Data Across Multiple Modalities, Scanners, and Studies
Modeling and Monitoring Social Dynamics and Influencers on Social Media
Interpretable Machine Learning

My research has been supported by the National Science Foundation, National Institutes of Health (NIH, NIMH, NIA), and the University of Pittsburgh Alzheimer’s Disease Research Center. For a full list of my publications, please go to my Google Scholar Page, or see my (mostly) up-to-date CV for publications and support. I discuss my primary research themes below along with select publications for each area.

Analysis of Brain Imaging Data Across Multiple Modalities, Scanners, and Studies

***Structural MRI scans on the same individual from four different manufacturers. Such technical variability makes analysis across scanners challenging.*** [Image from MISPEL: A supervised deep learning harmonization method for multi-scanner neuroimaging data]

Brain images provide rich information about the health and wellness of an individual. Structural scans from magnetic resonance imaging (MRI) or diffusion tensor imaging (DTI) can identify physical trauma, tumors, or other brain tissue abnormalities, while functional scans like functional MRI, electroencephalogram (EEG), or positron emission tomography (PET) can help assess or diagnose progressive diseases like dementia or Alzheimer’s disease via disruptions in blood oxygen flow or electric impulses throughout the brain. When images are combined, one can obtain an even more detailed view of the brain beyond what each scan provides separately, enabling early detection of progressive diseases and diagnosis of mental illness. Individual and technical variability of brain images, however, if not properly accounted for, can quickly degrade the inferential and predictive capabilities of jointly analyzing these data. Through the development of network- and imaging-based machine learning techniques, I am working to accomplish two complementary goals: (1) effectively analyze brain imaging data from multiple modalities while compensating for individual and technical heterogeneity, and (2) infer differences among people, diseases, and stages of disease through population analysis. Much of my research in this area has focused on understanding the relationships between clinical assessments and brain images to provide a deeper understanding of psychosis, schizophrenia, late-life depression, and Alzheimer’s disease. Example Past Support: NIA P30 AG066468: Network modeling of functional connectivity trajectories for Alzheimer’s disease (March, 2022 - February, 2023).

Select Publications:

Ferrarelli, F., Keihani, A., Donati, F., Janssen, S., Huston, C., Moon, C., Hetherington, H., Wilson, J.D., and Mayeli, A. Multimodal Evidence of Mediodorsal Thalamus-prefrontal Circuit Dysfunctions in Clinical High-risk for Psychosis: Findings from a Combined 7T fMRI, MRSI and Sleep Hd-EEG Study (2025) Accepted, Molecular Psychiatry.
Wilson, J.D., Gerlach, A., Aizenstein, H., and Andreescu, C. (2024) Sex matters: Acute functional connectivity changes as markers of remission in late-life depression differ by sex. Molecular Psychiatry 28 (12), 5228-5236.
Torbati, M.E., Minhas, D.S., Laymon, C.M., Maillard, P., Wilson, J.D., Chen, C-L., Crainiceanu, C.M., DeCarli, C.S., Hwang, S.J. and Tudorascu, D. (2023) MISPEL: A deep learning approach for harmonizing multi-scanner matched neuroimaging data. Medical Image Analysis 89, 102926.
Wilson, J.D., Baybay, M., Sankar, R., Stillman, P.E., and Popa, A.M. (2020) Analysis of population functional connectivity data via multilayer network embeddings. Network Science. 9(1), 99 - 122
Wilson, J.D., Cranmer, S., and Lu, Z.L. (2020) A hierarchical latent space network model for population studies of functional connectivity. Computational Brain and Behavior, 3, 394 - 399.

Modeling and Monitoring Social Dynamics and Influencers on Social Media

***Tweet trends of #Gamestop (Left) and the market impact on the GME stock (Right) during the Gamestop short squeeze of January 2021.***

Recent news has clearly established the influence of social media platforms like Meta, X, and Reddit on society – from the dissemination of the Black Lives Matter movement to the motivation of political and industry leaders’ actions on women’s rights and gun control. Strategic influencers, both known and unknown, use social media platforms to alter the communication, ideologies, and actions of societies throughout the world. For example, users of Reddit and Twitter strongly encouraged public investment in GME to counter the hedge funds shorting the stock during the Gamestop short squeeze of 2021 (see figure above). In examples like these, it is important to understand both the social dynamics of Reddit and Twitter and the effect that influencers on each platform had on motivating investment. I aim to model and analyze the role of influencers on social media platforms by focusing on three main areas: (1) modeling the social dynamics of social media platforms with dynamic network models, (2) monitoring and detecting changes in social media, and (3) identifying influencers on social media and assessing their effect on social dynamics. Example Funding: NSF DMS - 1830547: Spatio-Temporal Data Analysis with Dynamic Network Models (August, 2018 - July, 2021).

Select Publications:

Yu, L., Zwetsloot, I. M., Stevens, N. T., Wilson, J.D., and Tsui, K. (2022) Monitoring dynamic networks: a simulation-based strategy for comparing monitoring methods and a comparative study. Quality and Reliability Engineering International 38 (3), 1226-1250.
Lee, J., Li, G., and Wilson, J.D. (2020) Varying-coefficient models for dynamic networks. Computational Statistics and Data Analysis 152: 107052. Published online at DOI: https://doi.org/10.1016/j.csda.2020.107052.
Wilson, J.D., Stevens, N.T., and Woodall, W.H. (2019) Modeling and estimating change in temporal networks via a dynamic degree corrected stochastic block model. Quality and Reliability Engineering International 35(5), 1363 - 1378.
Sparks, R., and Wilson, J.D. (2019) Monitoring communication outbreaks among an unknown team of actors in dynamic networks. Journal of Quality Technology, 51 (4) 353 - 374.
Woodall, W.H., Zhao, M., Paynabar, K., Sparks, R., and Wilson, J.D. (2017) An overview and perspective on social network monitoring. IISE Transactions 49 (3), 354 - 365.

Interpretable Machine Learning

***Clustering of PCA embeddings of ARGO float ocean temperature profiles. The movement of these clusters closely tracks climate events like El Niño.*** [Image from https://agupubs.onlinelibrary.wiley.com/doi/pdfdirect/10.1029/2019JC015947]

Machine learning and artificial intelligence methods have become increasingly popular, due in large part to their ability to make highly accurate predictions. This gain in predictive ability, however, has come with a noticeable loss of model interpretability. Interpretability is essential for understanding the data to which a learning technique is applied. In medicine, for example, interpretable methods may enable the detection of biomarkers for certain psychological disorders where black box machine learning methods would not.

Similarly, to reliably forecast and prepare for atmospheric events, one must understand and compensate for changing temperatures in the ocean. Using data from NOAA’s ARGO floats that monitor ocean temperature profiles (see figure above), one can assess temperature fluctuations and localization in the ocean to more reliably forecast El Niño events. With applications like these, I aim to contribute to three major themes in this area: (1) develop interpretable models and methods for complex network, imaging, and temporal data, (2) assess the contribution of localized features in otherwise complicated machine learning techniques, and (3) relate the features and asymptotic properties of widely used machine learning algorithms to interpretable mathematical models.

Select Publications:

Parr, T., Hamrick, J., and Wilson, J.D. (2024) Nonparametric feature impact and importance. Information Sciences 653, 119563.
Parr, T., and Wilson, J.D. (2021) Partial dependence through stratification. Machine Learning with Applications 6, 100146.
Houghton, I.A. and Wilson, J.D. (2020) El Niño detection via unsupervised clustering of Argo temperature profiles. Journal of Geophysical Research - Oceans 125(9), e2019JC015947.
Wilson, J.D., Palowitch, J., Bhamidi, S., and Nobel, A.B. (2017) Community extraction in multilayer networks with heterogeneous community structure. Journal of Machine Learning Research 18(1), 5458 - 5506.
Wilson, J.D., Wang, S., Mucha, P.J., Bhamidi, S., and Nobel, A.B. (2014) A testing based extraction algorithm for identifying significant communities in networks. Annals of Applied Statistics 8(3), 1853 - 1891.

James D. Wilson
