Novel Machine Learning Methods for Characterizing Complex Datasets and Models

October 30, 2020 1:00 PM

Presenter

Dr. Velimir Vesselinov
Computational Earth Science
Los Alamos National Laboratory

Abstract

The integration of large datasets and powerful computational capabilities has resulted in the widespread use of machine learning (ML) in science, technology, and industry. However, most recent ML developments focus on supervised methods, which require large training sets. These supervised ML methods are poorly suited to science-driven data applications, where the availability of training sets is typically very limited. Supervised ML methods are also affected by adversarial problems, in which adding random noise to the training data can cause inaccurate ML predictions. For such data-analytics problems, unsupervised ML methods are generally preferred instead.

Recently, we have developed a series of novel unsupervised ML methods based on matrix and tensor factorizations. Our ML algorithms are open source and available at https://github.com/orgs/TensorDecompositions. More information about our recent work can be found at http://tensors.lanl.gov. These unsupervised ML methods can be applied to feature extraction, blind source separation, model diagnostics, detection of disruptions and anomalies, image recognition, and discovery of unknown dependencies and phenomena represented in the datasets, as well as to the development of physics and reduced-order models representing the data.
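
As a minimal illustration of the matrix-factorization idea behind blind source separation, the sketch below factors a non-negative data matrix X into two non-negative factors W and H using the textbook multiplicative-update rule. It is a generic example, not the presenter's algorithm or the code in the repositories above; the function name nmf, its parameters, and the synthetic data are assumptions made for illustration only.

```python
import numpy as np


def nmf(X, k, n_iter=1000, eps=1e-9, seed=0):
    """Approximate non-negative X (m x n) as W @ H with W (m x k), H (k x n).

    Hypothetical helper for illustration: plain multiplicative updates
    that reduce the Frobenius-norm reconstruction error.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    # Random strictly positive starting factors.
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Element-wise updates keep both factors non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic example (assumed data): 3 hidden sources mixed into 20 signals.
rng = np.random.default_rng(1)
W_true = rng.random((20, 3))   # mixing weights
H_true = rng.random((3, 50))   # hidden source signals
X = W_true @ H_true            # observed mixtures

W, H = nmf(X, k=3)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)

# For comparison, a fit with too few sources reconstructs the data less well.
W1, H1 = nmf(X, k=1)
rel_error_k1 = np.linalg.norm(X - W1 @ H1) / np.linalg.norm(X)
print(f"relative error, k=3: {rel_error:.4f}; k=1: {rel_error_k1:.4f}")
```

Here the rows of H play the role of the recovered source signals and W holds their mixing weights. In this sketch the number of sources k is supplied by hand; comparing reconstruction errors across several values of k, as in the last lines, is one simple way to judge how many sources the data support.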

Here, we demonstrate the applicability of novel unsupervised ML methods that incorporate physics information for characterizing complex datasets and models. The methods will be demonstrated on oil and gas, geothermal, and wildfire applications.