UCLA CAM REU topics summer 2022

AI with Community Partners

This project will use machine learning and AI techniques to assist several community partners with their data-driven projects. The mathematics will include learning about and implementing various ML and statistical learning methods such as neural nets, random forests, and nonnegative matrix and tensor factorizations. We will also design and develop new methods that address practical needs of our partners that existing approaches do not currently satisfy. We will carefully consider issues surrounding bias in all of our methods. Our partners include the California Innocence Project, a nonprofit legal team that works to free innocent people in prison; Homeboy Industries, the largest gang rehabilitation and re-entry program in the world; and CDTech / YLEAD, a partner of Public Allies whose mission is to build livable and economically viable communities in the low-income areas of Greater Los Angeles. One example is a Covid Outreach project with CDTech that aims to further inform and protect those communities through information and resource delivery. In addition to our mathematical contribution to these nonprofits, this team will also participate in outreach activities, meeting with the constituents and those they serve, in particular to encourage and help prepare them for higher education.

Genesis and evolution of scientific fields

This project uses machine learning to understand the genesis and evolution of scientific areas of research. The students will work with data from scientific publications and develop techniques for representing the structure of knowledge and collaboration such that the emergence of new research areas can be identified and understood. Students will use machine learning tools such as natural language processing, knowledge graphs, semantic embeddings, graph embeddings, etc. We are especially interested in techniques that can address multimodal dynamic data (e.g. text, graph structure and time).

Gang Reduction

We plan to analyze data collected by the city of Los Angeles as part of its gang reduction program. This data involves both a youth program and a crime reduction program. Recent work in this area by REU students includes natural language processing of text data and dynamic mode decomposition to study the evolution of the program using survey data.

Active Learning

Active learning combines two ideas. The first is a general method for semisupervised learning (SSL). The second is a method to strategically choose a small amount of unlabeled data to send to a "human in the loop" for ground truth classification. This project will involve graph-based multi-class SSL classifiers for high-dimensional data. Students will develop rigorous theory along with code for generalized graph-based Bayesian models for active learning, covering sequential active learning, batch learning, and multiple classes. Students will work on real-world data with asymmetric group sizes. Specific types of data we plan to study are hyperspectral and multimodal imagery and video. We plan to use data in the public domain.
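A minimal sketch of the two ingredients described above, under stated assumptions: the SSL step uses Laplace learning (harmonic label propagation) on a dense Gaussian similarity graph, and the query step uses smallest-margin uncertainty sampling, one point at a time. Both are simple stand-ins chosen for brevity, not the graph-based Bayesian models the project will develop. The function names, the `sigma` bandwidth, and the `oracle` callback (playing the role of the human in the loop) are hypothetical.

```python
import numpy as np


def gaussian_weights(X, sigma=1.0):
    """Dense similarity graph with w_ij = exp(-|x_i - x_j|^2 / (2 sigma^2)).

    Assumption: a dense graph is fine for small illustrative data only.
    """
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W


def laplace_learning(W, labeled_idx, labels, n_classes):
    """SSL step: solve L_uu F_u = W_ul Y_l for the unlabeled nodes."""
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W  # unnormalized graph Laplacian
    labeled_idx = np.asarray(labeled_idx)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    Y_l = np.eye(n_classes)[np.asarray(labels)]  # one-hot labels
    F = np.zeros((n, n_classes))
    F[labeled_idx] = Y_l
    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    W_ul = W[np.ix_(unlabeled_idx, labeled_idx)]
    F[unlabeled_idx] = np.linalg.solve(L_uu, W_ul @ Y_l)
    return F, unlabeled_idx


def smallest_margin_query(F, unlabeled_idx):
    """Query step: pick the unlabeled node whose top two class scores are closest."""
    top2 = np.sort(F[unlabeled_idx], axis=1)[:, -2:]
    margin = top2[:, 1] - top2[:, 0]
    return int(unlabeled_idx[np.argmin(margin)])


def active_learning_loop(X, oracle, initial_idx, n_classes, n_queries, sigma=1.0):
    """Sequential active learning: classify, query one label, repeat."""
    W = gaussian_weights(X, sigma)
    labeled = list(initial_idx)
    labels = [oracle(i) for i in labeled]  # each point is sent to the oracle once
    for _ in range(n_queries):
        F, unlabeled = laplace_learning(W, labeled, labels, n_classes)
        query = smallest_margin_query(F, unlabeled)
        labeled.append(query)
        labels.append(oracle(query))
    F, _ = laplace_learning(W, labeled, labels, n_classes)
    return F.argmax(axis=1), labeled
```

To try it, pass a feature matrix `X` of shape `(n_points, n_features)`, an `oracle` that maps a point index to an integer class label, and at least one initially labeled index. The sketch assumes the graph is connected and that there are at least two classes. Batch querying would replace the single `argmin` with a selection of several points per round.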

Knowledge Graphs

Knowledge graphs are data structures in which information is organized along the nodes and edges of a graph, with an underlying hierarchical structure defined through an "ontology". This project will build knowledge graphs from datasets involving narratives and social media data. Students will develop machine learning methods for these data structures.
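As a rough illustration of the data structure described above, the sketch below stores facts as (subject, predicate, object) triples and keeps the ontology as a simple class hierarchy. The class `KnowledgeGraph` and all of its method names are hypothetical, and a single-parent hierarchy is an assumption made for brevity; real ontologies are usually richer.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Toy knowledge graph: triples for facts, a class tree for the ontology."""

    def __init__(self):
        self.triples = set()          # (subject, predicate, object)
        self.parent = {}              # ontology: class name -> parent class or None
        self.instance_of = {}         # entity -> class name
        self._out = defaultdict(set)  # subject -> {(predicate, object)}

    def add_class(self, name, parent=None):
        """Register a class in the ontology, optionally under a parent class."""
        self.parent[name] = parent

    def add_entity(self, entity, cls):
        """Declare that an entity (a node) is an instance of a known class."""
        if cls not in self.parent:
            raise KeyError(f"unknown class: {cls}")
        self.instance_of[entity] = cls

    def add_triple(self, subject, predicate, obj):
        """Add a labeled, directed edge subject --predicate--> obj."""
        self.triples.add((subject, predicate, obj))
        self._out[subject].add((predicate, obj))

    def ancestors(self, cls):
        """Return the class followed by its parents, up to the root.

        Assumes the hierarchy has no cycles.
        """
        chain = []
        while cls is not None:
            chain.append(cls)
            cls = self.parent.get(cls)
        return chain

    def is_a(self, entity, cls):
        """True if the entity's class is cls or a descendant of cls."""
        return cls in self.ancestors(self.instance_of[entity])

    def neighbors(self, subject, predicate=None):
        """Objects reachable from subject, optionally filtered by predicate."""
        return sorted(
            o for p, o in self._out[subject] if predicate is None or p == predicate
        )
```

For social media data, for example, one might declare a class `Person` under `Agent`, add an entity `alice` of class `Person`, and record a triple such as `("alice", "authored", "post_1")`; these names are invented for illustration. The hierarchy then lets `is_a("alice", "Agent")` succeed without that fact being stored explicitly.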

Particle Laden Flow

This is a laboratory-based project involving measurement and modeling of slurries. Students will learn laboratory skills and how to take and record data. Students will also work with conservation law models from nonlinear PDE, so some background in partial differential equations is desired.
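For orientation, the generic form of a scalar conservation law in one space dimension is written out below. This is an illustrative textbook form only; the description does not state the specific slurry models used in the project, and the symbols $u$ and $f$ are introduced here for illustration.

```latex
% Generic scalar conservation law (illustrative, not the project's model):
% u(x,t) is the conserved density and f(u) is its flux.
\begin{equation}
  \frac{\partial u}{\partial t} + \frac{\partial}{\partial x} f(u) = 0 .
\end{equation}
% Integrating over an interval [a,b] gives the balance statement:
% the total amount in [a,b] changes only through the flux at the endpoints.
\begin{equation}
  \frac{d}{dt} \int_a^b u(x,t)\, dx = f\bigl(u(a,t)\bigr) - f\bigl(u(b,t)\bigr) .
\end{equation}
```

The equation is nonlinear whenever the flux $f$ is a nonlinear function of $u$; a standard example is Burgers' equation, with $f(u) = u^2/2$.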