Marginal regression for multitask learning

Mladen Kolar, Han Liu

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

Variable selection is an important and practical problem that arises in the analysis of many high-dimensional datasets. Convex optimization procedures obtained by relaxing the NP-hard subset selection problem, e.g., the Lasso or the Dantzig selector, have become the focus of intense theoretical investigation. Although many efficient algorithms exist for solving these problems, finding a solution when the number of variables is large, e.g., several hundred thousand in problems arising in genome-wide association analysis, remains computationally challenging. A practical alternative for such high-dimensional problems is marginal regression, where the output is regressed on each variable separately. We investigate the theoretical properties of marginal regression in a multitask framework. Our contributions include: i) a sharp analysis of marginal regression in a single-task setting with random design, ii) sufficient conditions under which multitask screening selects the relevant variables, and iii) a lower bound on the Hamming distance convergence for multitask variable selection problems. A simulation study further demonstrates the performance of marginal regression.
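The screening procedure the abstract describes — regressing the output on each variable separately and keeping the variables with the largest marginal coefficients, aggregated across tasks — can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact procedure; the aggregation rule (summing squared marginal coefficients over tasks) and all names and parameters here are assumptions for illustration.

```python
import numpy as np

def marginal_screening(X, Y, k):
    """Rank variables by marginal correlation aggregated across tasks.

    X : (n, p) design matrix; Y : (n, T) response matrix for T tasks.
    The score of variable j is sum_t (x_j^T y_t / n)^2, i.e. the squared
    marginal regression coefficients summed over tasks (an assumed
    aggregation rule for this sketch). Returns the indices of the k
    highest-scoring variables, sorted.
    """
    n = X.shape[0]
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize columns
    C = Xs.T @ Y / n                            # (p, T) marginal coefficients
    scores = (C ** 2).sum(axis=1)               # aggregate over tasks
    return np.sort(np.argsort(scores)[-k:])

# Toy simulation: two tasks sharing the support {0, 1, 2}.
rng = np.random.default_rng(0)
n, p, T = 200, 50, 2
X = rng.standard_normal((n, p))
B = np.zeros((p, T))
B[:3, :] = 3.0                                  # shared relevant variables
Y = X @ B + 0.5 * rng.standard_normal((n, T))
print(marginal_screening(X, Y, k=3))            # recovers [0 1 2]
```

Note that each score requires only one inner product per variable and task, which is why marginal screening scales to the hundreds of thousands of variables mentioned above, where solving a joint convex program is costly.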

Original language: English (US)
Pages (from-to): 647-655
Number of pages: 9
Journal: Journal of Machine Learning Research
Volume: 22
State: Published - Jan 1 2012
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Software
  • Control and Systems Engineering
  • Statistics and Probability
  • Artificial Intelligence

