Document Type

Dissertation

Date of Degree

Fall 2014

Degree Name

PhD (Doctor of Philosophy)

Degree In

Chemistry

First Advisor

Gary W. Small

Abstract

Over the past decades, pattern recognition has become a fast-growing area of chemometrics. Accurate, user-friendly, and fast pattern recognition methods are needed to keep pace with the increased capacity of automated instruments to acquire large-scale data under complex conditions. Pattern recognition has found significant applications in diverse fields such as environmental monitoring and biomedical diagnostics. In this dissertation, the capabilities of pattern recognition methods are investigated in case studies related to environmental remote sensing and biomedical sensing.

For remote sensing applications, two types of airborne spectroscopic data, passive Fourier transform infrared (FTIR) and gamma-ray, are analyzed in order to develop automated classifiers for either ammonia vapor or the radioisotope cesium-137 in the open air. Support vector machine (SVM) classification is the primary pattern recognition method used in this work. To overcome the limited availability of representative patterns in airborne data, and to provide enough patterns representing the analyte-active class for the training set, a spectral simulation protocol is employed to generate abundant patterns bearing both the signature of the target analyte and the background spectral profile. Signal processing procedures, including segment selection and digital filtering, are further used to extract the information most relevant to the target analyte from the acquired raw data. In addition, to ease the computational demand of the SVM, an alternative pattern recognition method, piecewise linear discriminant analysis (PLDA), is applied to optimize the signal processing conditions for the final SVM classification. Process control techniques are applied to the SVM score profiles of prediction sets to improve pattern recognition performance by incorporating the probabilities associated with every SVM score. Ammonia classifiers developed with this methodology achieve high sensitivity and selectivity, and the cesium-137 classifiers developed from the same concepts exhibit excellent sensitivity to test data with very low signal strengths. In the case of ammonia classification, the relationship between the concentration profile of the active patterns in the training set and the limit of detection of the corresponding classifier is investigated. Classifiers built to detect low concentrations of ammonia are developed and tested through this work.
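
The spectral simulation idea described above can be illustrated with a minimal sketch. It assumes a purely additive model in which a scaled analyte signature is superimposed on each measured background spectrum; the dissertation's actual simulation protocol may differ. The function name, the toy spectra, and the scale values are all hypothetical and serve only as an illustration.

```python
import random


def simulate_active_patterns(backgrounds, signature, scales, seed=0):
    """Build analyte-active patterns from analyte-free background spectra.

    ASSUMPTION: an additive model (background + scale * signature).
    This is an illustrative simplification, not the author's protocol.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    patterns = []
    for background in backgrounds:
        if len(background) != len(signature):
            raise ValueError("background and signature must have equal length")
        scale = rng.choice(scales)  # random signal strength for this pattern
        patterns.append([b + scale * s for b, s in zip(background, signature)])
    return patterns


# Hypothetical toy data: two 4-point backgrounds and one analyte band.
backgrounds = [[1.0, 1.1, 1.2, 1.3], [0.9, 1.0, 1.1, 1.2]]
signature = [0.0, 0.5, 1.0, 0.0]
active = simulate_active_patterns(backgrounds, signature, scales=[0.1, 0.2])
```

Each simulated pattern keeps the background profile wherever the signature is zero and gains intensity only in the analyte band, which is the property a training set for the analyte-active class needs.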

For a glucose sensing application, studies are conducted to provide sound performance diagnostics for an established calibration model for glucose based on near-infrared spectroscopic data. Six-component aqueous matrices of glucose in the presence of five interfering species, all spanning physiological levels, serve as the samples to be analyzed. A novel residual modeling protocol is proposed to retrieve the residual glucose concentration, that is, the concentration not predicted by the calibration model, from the residual spectrum, that is, the portion of the raw spectrum not used by the calibration model. The glucose concentration recovered by residual modeling can be combined with process control techniques to evaluate the performance of the established calibration model. Several modeling techniques are used for residual modeling, including partial least-squares (PLS) regression, support vector regression (SVR), a hybrid method (PLS-aided SVR), and an amplified version of the hybrid (amplified PLS-aided SVR). Through this work, a calibration updating strategy is developed that provides an effective way to monitor the established calibration model.
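
The two residual quantities defined above can be sketched as follows. This sketch assumes a linear calibration model with a single regression vector and takes the portion of a spectrum "used" by the model to be its projection onto that vector. That choice is an assumption made for illustration; the dissertation may define the residual spectrum differently, for example through the PLS latent-variable space. All function names and numbers are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def residual_spectrum(spectrum, regression_vector):
    """Return the part of the spectrum orthogonal to the regression vector.

    ASSUMPTION: the model uses only the component of the spectrum that
    lies along its regression vector, so the residual is what remains.
    """
    norm_sq = dot(regression_vector, regression_vector)
    if norm_sq == 0:
        raise ValueError("regression vector must be nonzero")
    weight = dot(spectrum, regression_vector) / norm_sq
    return [x - weight * b for x, b in zip(spectrum, regression_vector)]


def residual_concentration(reference, predicted):
    """Concentration not accounted for by the calibration model."""
    return reference - predicted


# Hypothetical toy values: a 3-point spectrum and regression vector.
spectrum = [1.0, 2.0, 3.0]
b = [0.0, 1.0, 0.0]
x_res = residual_spectrum(spectrum, b)        # [1.0, 0.0, 3.0]
c_res = residual_concentration(5.5, 5.0)      # 0.5
```

A residual model would then be trained to map `x_res` to `c_res` across many samples; a residual concentration that drifts away from zero over time would suggest that the established calibration model is degrading.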

Public Abstract

The broad topic of this research is how to uncover the wealth of information about a chemical of interest, whether qualitative or quantitative, that is buried in a variety of large-scale spectral data. The set of tools used for such data-mining tasks is chemometrics. Chemometric methods such as pattern recognition, multivariate calibration, and signal processing were used together here to extract chemical information from spectral data. Such information can be used for chemical identification, quantification, or monitoring changes in a chemical over time. Four projects are presented, each posing a different challenge in either environmental monitoring or biomedical applications, and all of them come down to identifying or quantifying a chemical. The solutions to these challenges are showcased here. Based on airborne infrared spectra collected by remote sensing with naturally occurring sunlight as the light source, pattern recognition methods were able to establish an automated ammonia classifier that helps pinpoint hazardous emissions. A further refined classifier proved able to provide quantitative information, such as the limit of detection. A similar methodology was applied to airborne remote sensing gamma-ray spectra in an effort to develop an automated classifier for the radioactive isotope cesium-137. In the last project, glucose levels in aqueous samples were predicted from near-infrared spectra by a multivariate calibration model, and an innovative way to monitor the calibration model itself was developed using pattern recognition. The successful use of chemometrics to gain chemical information from spectra in four different settings supports the methodologies.

Pages

xv, 258 pages

Bibliography

Includes bibliographical references (pages 251-258).

Copyright

Copyright 2014 Hua Yu

Included in

Chemistry Commons
