### Abstract

Suppose we have many copies of an unknown n-qubit state ρ. We measure some copies of ρ using a known two-outcome measurement E_{1}, then other copies using a measurement E_{2}, and so on. At each stage t, we generate a current hypothesis ω_{t} about the state ρ, using the outcomes of the previous measurements. We show that it is possible to do this in a way that guarantees that |Tr (E_{i}ω_{t}) - Tr (E_{i}ρ)|, the error in our prediction for the next measurement, is at least ϵ on at most O(n/ϵ^{2}) occasions. Even in the 'non-realizable' setting, where there could be arbitrary noise in the measurement outcomes, we show how to output hypothesis states that incur at most O(√(Tn)) excess loss over the best possible state on the first T measurements. These results generalize a 2007 theorem by Aaronson on the PAC-learnability of quantum states to the online and regret-minimization settings. We give three different ways to prove our results (using convex optimization, quantum postselection, and sequential fat-shattering dimension), which have different advantages in terms of parameters and portability.
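As a rough illustration of the online protocol described above, the following sketch maintains a hypothesis state with a matrix-multiplicative-weights style update. This is one possible instance of the convex-optimization route, not necessarily the algorithm analyzed in the paper. The function names, the step size `eta`, the choice of absolute loss, and the maximally mixed starting state are all assumptions made for illustration.

```python
import numpy as np


def hermitian_exp_normalized(h):
    """Return exp(h) / Tr(exp(h)) for a Hermitian matrix h."""
    vals, vecs = np.linalg.eigh(h)
    weights = np.exp(vals - vals.max())  # shift eigenvalues for numerical stability
    weights /= weights.sum()             # normalize so the result has unit trace
    return (vecs * weights) @ vecs.conj().T  # V diag(weights) V^dagger


def online_learn(measurements, outcomes, eta=0.1):
    """Hypothetical online learner (illustrative sketch only).

    measurements: list of d x d Hermitian matrices E_t with 0 <= E_t <= I
    outcomes:     list of observed values b_t in [0, 1], e.g. estimates of Tr(E_t rho)
    eta:          step size (an assumed tuning parameter)

    Returns the predictions Tr(E_t omega_t), each made before b_t is used,
    together with the final hypothesis state.
    """
    d = measurements[0].shape[0]
    grad_sum = np.zeros((d, d), dtype=complex)
    omega = np.eye(d, dtype=complex) / d  # start from the maximally mixed state
    predictions = []
    for e, b in zip(measurements, outcomes):
        pred = np.trace(e @ omega).real
        predictions.append(pred)
        # A subgradient of the absolute loss |Tr(E omega) - b| with respect
        # to omega is sign(pred - b) * E.
        grad_sum += np.sign(pred - b) * e
        # The new hypothesis is a Gibbs-like state built from accumulated subgradients.
        omega = hermitian_exp_normalized(-eta * grad_sum)
    return predictions, omega
```

Every hypothesis produced this way is a valid density matrix (Hermitian, positive semidefinite, unit trace) by construction, so no separate projection step is needed in this sketch.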

Original language | English (US) |
---|---|
Article number | 124019 |
Journal | Journal of Statistical Mechanics: Theory and Experiment |
Volume | 2019 |
Issue number | 12 |
DOIs | https://doi.org/10.1088/1742-5468/ab3988 |
State | Published - Dec 20 2019 |

### All Science Journal Classification (ASJC) codes

- Statistical and Nonlinear Physics
- Statistics and Probability
- Statistics, Probability and Uncertainty

### Keywords

- machine learning

## Cite this

Online learning of quantum states. *Journal of Statistical Mechanics: Theory and Experiment*, *2019*(12), [124019]. https://doi.org/10.1088/1742-5468/ab3988