Research Interests

My research focuses on developing and analyzing novel methods for finding structure in noisy, high-dimensional data using tools from probability, random matrix theory, graph theory, linear algebra, harmonic analysis, and machine learning. Modern data sets often have an enormous number of features, with each observation taking values in a high-dimensional Euclidean space, and yet the data contains an underlying structure that is low-dimensional. This low-dimensional structure may arise because all of the sample points lie on a low-dimensional subspace or manifold, or because the data is well separated into distinct clusters under some metric. My research involves representing this low-dimensional structure with an appropriate data model and then constructing algorithms that correctly extract it with high probability. This process requires careful analysis of noise, sampling, and the effects of the data dimension in order to quantify the regimes in which the low-dimensional structure can be recovered. To improve the state of the art in data analysis, these algorithms must be computationally efficient as well as accurate. An important component of my research is therefore developing fast numerical implementations that minimize the dependence on the ambient dimension and scale log-linearly with the sample size. Although I am a mathematician by training, I also pursue interdisciplinary collaborations in which I apply tools from machine learning to domain-specific areas, including cybersecurity and molecular biology.

Publications

An Analysis of Classical Multidimensional Scaling

A Little, Y Xie, Q Sun. Submitted to Biometrika. Pre-print at https://arxiv.org/pdf/1812.11954.pdf.

Path-Based Spectral Clustering: Guarantees, Robustness to Outliers, and Fast Algorithms

A Little, M Maggioni, J Murphy. Submitted to Journal of Machine Learning Research. Pre-print at https://arxiv.org/abs/1712.06206.

Feature Design for Protein Interface Hotspots using KFC2 and Rosetta

F Seeger, A Little, Y Chen, T Woolf, H Cheng and J Mitchell. To appear in Research in Data Science, Spring 2019.

Translating Evidence into Practice: Interpreting Measures of Risk

L Hart, A Little. The Nurse Practitioner, Vol. 42, No. 2, 2017.

S-STEM: Mathematics, Engineering, and Physics Scholars

LA Clements, H Wang, A Little, WB Lane, and H Duong. American Society for Engineering Education (ASEE) Annual Conference & Exposition, 2017.

Multiscale geometric methods for data sets I: Multiscale SVD, noise and curvature

A Little, M Maggioni, L Rosasco. Applied and Computational Harmonic Analysis (ACHA), 2016.

Spectral Clustering Technique for Classifying Network Attacks

A Little, X Mountrouidou, D Moseley. IEEE International Conference on Intelligent Data and Security (IDS), New York City, April 2016.

A Multiscale Spectral Method for Learning Number of Clusters

A Little, A Byrd. 14th IEEE International Conference on Machine Learning and Applications (ICMLA), Miami, Dec. 2015.

Multi-Resolution Geometric Analysis for Data in High Dimensions

G. Chen, A.V. Little, M. Maggioni. Excursions in Harmonic Analysis, Vol. 1, Editors T.D. Andrews et al., Birkhauser, 2013.

Multiscale Geometric Methods for Estimating Intrinsic Dimension

A Little, M Maggioni, L Rosasco. 9th International Conference on Sampling Theory and Applications (SampTA), Singapore, May 2011.

Some recent advances in the geometric analysis of point clouds

G Chen, A Little, M Maggioni. Wavelets and Multiscale Analysis: Theory and Applications, Editors J. Cohen and A. Zayed, Birkhauser, 2011.

Multiscale Estimation of Intrinsic Dimensionality of Data Sets

A Little, Y Jung, M Maggioni. Association for the Advancement of Artificial Intelligence (AAAI) Fall Symposium (FS-09-04), 2009.

Estimation of Intrinsic Dimensionality of Samples from Noisy Low-dimensional Manifolds in High Dimensions with Multiscale SVD

J Lee, A Little, Y Jung, M Maggioni. 15th IEEE Workshop on Statistical Signal Processing (SSP), Cardiff, 2009.

Positive Solutions to a Diffusive Logistic Equation with Constant Yield Harvesting

T Ladner, A Little, K Marks, A Russell. Rose-Hulman Undergraduate Math Journal, Vol. 6, Issue 1, 2005.