Abstract
The rapid increase in both the quantity and complexity of data generated daily in the field of environmental science and engineering (ESE) demands corresponding advances in data analytics. Advanced data analysis approaches, such as machine learning (ML), have become indispensable tools for revealing hidden patterns or deducing correlations where conventional analytical methods face limitations. However, ML concepts and practices have not been widely adopted by researchers in ESE. This feature explores the potential of ML to revolutionize data analysis and modeling in the ESE field and covers the essential knowledge needed for such applications. First, we use five examples to illustrate how ML addresses complex ESE problems. We then summarize four major types of applications of ML in ESE: making predictions, extracting feature importance, detecting anomalies, and discovering new materials or chemicals. Next, we introduce the essential knowledge required for, and the current shortcomings of, ML applications in ESE, focusing on three important but often overlooked components of applying ML: correct model development, proper model interpretation, and sound applicability analysis. Finally, we discuss challenges and future opportunities in applying ML tools in ESE to highlight the potential of ML in this field.
Field | Value |
---|---|
Original language | English (US) |
Pages (from-to) | 12741-12754 |
Number of pages | 14 |
Journal | Environmental Science and Technology |
Volume | 55 |
Issue number | 19 |
State | Published - Oct 5 2021 |
Externally published | Yes |
All Science Journal Classification (ASJC) codes
- General Chemistry
- Environmental Chemistry
Keywords
- applicability domain
- artificial intelligence
- best practices
- feature importance
- machine learning modeling
- model applications
- model interpretation
- predictive modeling