## Abstract

We study the improper learning of multi-layer neural networks. Suppose that the neural network to be learned has k hidden layers and that the l_{1}-norm of the incoming weights of any neuron is bounded by L. We present a kernel-based method that, with probability at least 1-δ, learns a predictor whose generalization error is at most ϵ worse than that of the neural network. The sample complexity and the time complexity of the presented method are polynomial in the input dimension and in (1/ϵ, log(1/δ), F(k, L)), where F(k, L) is a function depending on (k, L) and on the activation function, independent of the number of neurons. The algorithm applies to both sigmoid-like and ReLU-like activation functions. It implies that any sufficiently sparse neural network is learnable in polynomial time.
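The idea of improper learning here is that the learner need not output a neural network at all: a kernel predictor whose RKHS is rich enough to approximate every l_{1}-bounded network suffices. The sketch below illustrates this with kernel ridge regression against a target one-hidden-layer network with l_{1}-normalized neurons. The kernel k(x, x') = 1/(2 - ⟨x, x'⟩), valid on the unit ball, is one kernel known to contain smooth compositions in its RKHS; it is used here for illustration and is not necessarily the paper's exact construction, and all variable names and constants are hypothetical.

```python
import numpy as np

# Illustrative sketch (not the paper's exact method): improperly learn a
# small one-hidden-layer network with a kernel predictor.

rng = np.random.default_rng(0)

d, n = 5, 200
X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)  # keep inputs on the unit sphere

# Target network: each neuron's incoming weights are l1-normalized (L = 1),
# matching the l1-bounded setting in the abstract.
W = rng.normal(size=(3, d))
W /= np.abs(W).sum(axis=1, keepdims=True)
v = np.array([0.5, -0.3, 0.2])
y = np.tanh(X @ W.T) @ v

def kernel(A, B):
    # k(x, x') = 1 / (2 - <x, x'>); well defined since <x, x'> <= 1
    # for unit-norm inputs.
    return 1.0 / (2.0 - A @ B.T)

# Kernel ridge regression: alpha = (K + lam * I)^{-1} y
lam = 1e-3
K = kernel(X, X)
alpha = np.linalg.solve(K + lam * np.eye(n), y)

# Evaluate the kernel predictor on fresh samples from the same network.
Xt = rng.normal(size=(100, d))
Xt /= np.linalg.norm(Xt, axis=1, keepdims=True)
yt = np.tanh(Xt @ W.T) @ v
pred = kernel(Xt, X) @ alpha
mse = np.mean((pred - yt) ** 2)
```

The learned predictor is a kernel expansion, not a network, which is exactly the "improper" part: the hypothesis class of the output differs from the class being learned, and the runtime depends on n and d rather than on the number of neurons.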

Original language | English (US)
---|---|
Title of host publication | 33rd International Conference on Machine Learning, ICML 2016
Editors | Kilian Q. Weinberger, Maria Florina Balcan
Publisher | International Machine Learning Society (IMLS)
Pages | 1555-1563
Number of pages | 9
ISBN (Electronic) | 9781510829008
State | Published - 2016
Externally published | Yes
Event | 33rd International Conference on Machine Learning, ICML 2016 - New York City, United States. Duration: Jun 19 2016 → Jun 24 2016

### Publication series

Name | 33rd International Conference on Machine Learning, ICML 2016
---|---|
Volume | 3

### Other

Other | 33rd International Conference on Machine Learning, ICML 2016
---|---|
Country/Territory | United States
City | New York City
Period | 6/19/16 → 6/24/16

## All Science Journal Classification (ASJC) codes

- Artificial Intelligence
- Software
- Computer Networks and Communications

## Fingerprint

The research topics of 'L_{1}-regularized neural networks are improperly learnable in polynomial time' together form a unique fingerprint.