TY - JOUR
T1 - A Critical Survey of Deconvolution Methods for Separating Cell Types in Complex Tissues
AU - Mohammadi, Shahin
AU - Zuckerman, Neta
AU - Goldsmith, Andrea
AU - Grama, Ananth
N1 - Funding Information: This work was supported by the Center for Science of Information (CSoI), an NSF Science and Technology Center, under Grant Agreement CCF-0939370, and by NSF, under Grant BIO 1124962. Publisher Copyright: © 1963-2012 IEEE.
PY - 2017/2
Y1 - 2017/2
N2 - Identifying properties and concentrations of components from an observed mixture, known as deconvolution, is a fundamental problem in signal processing. It has diverse applications in fields ranging from hyperspectral imaging to noise cancellation in audio recordings. This paper focuses on in-silico deconvolution of signals associated with complex tissues into their constitutive cell-type-specific components and a quantitative characterization of the cell types. Deconvolving mixed tissues/cell types is useful in the removal of contaminants (e.g., surrounding cells) from tumor biopsies, as well as in monitoring changes in the cell population in response to treatment or infection. In these contexts, the observed signal from the mixture of cell types is assumed to be a convolution, using a linear instantaneous (LI) mixing process, of the expression levels of genes in constitutive cell types. The goal is to use known signals corresponding to individual cell types and a model of the mixing process to cast the deconvolution problem as a suitable optimization problem. In this paper, we present a survey and in-depth analysis of models, methods, and assumptions underlying deconvolution techniques. We investigate the choice of the different loss functions for evaluating estimation error, constraints on solutions, preprocessing and data filtering, feature selection, and regularization to enhance the quality of solutions and the impact of these choices on the performance of commonly used regression-based methods for deconvolution. We assess different combinations of these factors and use detailed statistical measures to evaluate their effectiveness. Some of these combinations have been proposed in the literature, whereas others represent novel algorithmic choices for deconvolution. We identify shortcomings of current methods and avenues for further investigation. For many of the identified shortcomings, such as normalization issues and data filtering, we provide new solutions. We summarize our findings in a prescriptive step-by-step process, which can be applied to a wide range of deconvolution problems.
KW - Deconvolution
KW - feature selection
KW - gene expression
KW - linear regression
KW - loss function
KW - range filtering
KW - regularization
UR - http://www.scopus.com/inward/record.url?scp=85014894756&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85014894756&partnerID=8YFLogxK
U2 - 10.1109/JPROC.2016.2607121
DO - 10.1109/JPROC.2016.2607121
M3 - Article
AN - SCOPUS:85014894756
SN - 0018-9219
VL - 105
SP - 340
EP - 366
JO - Proceedings of the IEEE
JF - Proceedings of the IEEE
IS - 2
M1 - 7676285
ER -