Variable selection for ad prediction

Suma Bhat, Kenneth Church

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We consider the problem of predicting the probability of a click on an advertisement when the click or no-click outcome is described by a large set of variables. Many, if not most, of these variables are only weakly related to whether the ad is clicked. A traditional approach that treats every variable on an equal and blind footing therefore sacrifices interpretability in explaining the process underlying the outcome. Such an approach is also computationally expensive and may generalize poorly. We investigate the forward selection method for variable subset selection in the domain of advertisement click-through-rate prediction. Forward selection proceeds sequentially, rewarding a set of variables for the information it provides about the outcome while penalizing the set for the number of variables it contains. Concretely, we propose a logistic regression model for estimating the conditional expectation of the outcome given the ensemble of variables. The resulting model compares favorably with the one obtained by an exhaustive search through the model space. We also observe that the set of variables chosen by forward selection has better predictive power than the set chosen by considering the individual statistical significance of each variable. We thus show that forward selection for subset selection produces a good model for predicting ad click-through rates.
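The forward selection procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not name the exact scoring criterion, so this example assumes the Akaike information criterion (AIC) as one common way to reward fit while penalizing the number of variables. The function names (`fit_logistic`, `aic`, `forward_select`) and the Newton-method fit with a small ridge term are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def fit_logistic(X, y, n_iter=50, ridge=1e-6):
    """Fit logistic regression by Newton's method.

    Returns (weights, log-likelihood). The tiny ridge term is an
    assumption added only to keep the Hessian invertible.
    """
    Xb = np.column_stack([np.ones(len(y)), X])  # prepend an intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ w, -30, 30)))
        grad = Xb.T @ (y - p) - ridge * w
        hess = (Xb * (p * (1 - p))[:, None]).T @ Xb + ridge * np.eye(len(w))
        step = np.linalg.solve(hess, grad)
        w += step
        if np.max(np.abs(step)) < 1e-8:
            break
    p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ w, -30, 30)))
    p = np.clip(p, 1e-12, 1 - 1e-12)  # guard the logarithms
    loglik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    return w, loglik

def aic(X, y, cols):
    """AIC = 2k - 2 log L, where k counts the intercept plus selected variables."""
    _, loglik = fit_logistic(X[:, cols], y)
    return 2 * (len(cols) + 1) - 2 * loglik

def forward_select(X, y):
    """Greedily add the variable that lowers AIC most; stop when none does."""
    selected = []
    remaining = list(range(X.shape[1]))
    best = aic(X, y, selected)  # start from the intercept-only model
    while remaining:
        score, j = min((aic(X, y, selected + [j]), j) for j in remaining)
        if score >= best:
            break  # the penalty outweighs the gain in fit
        best = score
        selected.append(j)
        remaining.remove(j)
    return selected, best
```

Calling `forward_select(X, y)` with a feature matrix `X` of shape `(n_samples, n_variables)` and a 0/1 click vector `y` returns the selected column indices in the order they were added, together with the final AIC. Each step refits one model per remaining candidate, so the cost grows roughly quadratically with the number of variables, compared with the exponential cost of an exhaustive search.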

Original language: English (US)
Title of host publication: Proceedings of the 2nd International Workshop on Data Mining and Audience Intelligence for Advertising, ADKDD'08
Pages: 45-49
Number of pages: 5
State: Published - 2008
Externally published: Yes
Event: 2nd International Workshop on Data Mining and Audience Intelligence for Advertising, ADKDD'08 - Las Vegas, NV, United States
Duration: Aug 24 2008 → Aug 24 2008

Publication series

Name: Proceedings of the 2nd International Workshop on Data Mining and Audience Intelligence for Advertising, ADKDD'08

Conference

Conference: 2nd International Workshop on Data Mining and Audience Intelligence for Advertising, ADKDD'08
Country/Territory: United States
City: Las Vegas, NV
Period: 8/24/08 → 8/24/08

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Software

Keywords

  • Click-through-rate
  • Model selection
  • Variable selection
  • Web advertising
