📝 Original Info
- Title: Quantum learning: optimal classification of qubit states
- ArXiv ID: 1004.2468
- Date: 2011-06-23
- Authors: Researchers from original ArXiv paper
📝 Abstract
Pattern recognition is a central topic in Learning Theory with numerous applications such as voice and text recognition, image analysis, and computer-aided diagnosis. The statistical set-up in classification is the following: we are given an i.i.d. training set $(X_{1},Y_{1}),\ldots,(X_{n},Y_{n})$ where $X_{i}$ represents a feature and $Y_{i}\in \{0,1\}$ is a label attached to that feature. The underlying joint distribution of $(X,Y)$ is unknown, but we can learn about it from the training set, and we aim at devising low-error classifiers $f:X\to Y$ used to predict the label of new incoming features. Here we solve a quantum analogue of this problem, namely the classification of two arbitrary unknown qubit states. Given a number of 'training' copies from each of the states, we would like to 'learn' about them by performing a measurement on the training set. The outcome is then used to design measurements for the classification of future systems with unknown labels. We find the asymptotically optimal classification strategy and show that typically, it performs strictly better than a plug-in strategy based on state estimation. The figure of merit is the excess risk, which is the difference between the probability of error and the probability of error of the optimal measurement when the states are known, that is, the Helstrom measurement. We show that the excess risk has rate $n^{-1}$ and compute the exact constant of the rate.
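In symbols (assuming equal prior probabilities for the two labels, a convention not spelled out in the abstract), the figure of merit compares the learned measurement $\hat{M}_n$ against the Helstrom benchmark:

$$
R_n \;:=\; \mathbb{E}\big[P_{\mathrm{err}}(\hat{M}_n)\big] \;-\; P_{\mathrm{err}}^{\mathrm{Hel}},
\qquad
P_{\mathrm{err}}^{\mathrm{Hel}} \;=\; \frac{1}{2}\Big(1 - \tfrac{1}{2}\,\|\rho_0 - \rho_1\|_1\Big),
$$

where $\rho_0,\rho_1$ are the two unknown qubit states and $\|\cdot\|_1$ is the trace norm. The paper's result is that $R_n$ scales as $n^{-1}$, with an explicitly computed constant.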
📄 Full Content
Statistical learning theory [1,2,3,4] is a broad research field stretching over statistics and computer science, whose general goal is to devise algorithms which have the ability to learn from data. One of the central learning problems is how to recognise patterns [5], with practical applications in speech and text recognition, image analysis, computer-aided diagnosis, and data mining. The paradigm of Quantum Information theory is that quantum systems carry a new type of information with potentially revolutionary applications such as faster computation and secure communication [6]. Motivated by these theoretical challenges, Quantum Engineering is developing new tools to control and accurately measure individual quantum systems [7]. In the process of engineering exotic quantum states, statistical validation has become a standard experimental procedure [8,9], and Quantum Statistical Inference has passed from its purely theoretical status in the 1970s [10,11] to a more practically oriented theory at the interface between the classical and quantum worlds [12,13,14,15]. In this paper we put forward a new type of quantum statistical problem inspired by learning theory, namely quantum state classification. Similar ideas have already appeared in the physics [16,17,18,19] and learning [20,21,22] literature, but here we emphasise the close connection with learning and we aim at going beyond the special models based on group symmetry and pure states. However, we limit ourselves to two-dimensional states, which could be regarded as a toy model from the viewpoint of learning theory, but we hope that more interesting applications will follow. Before explaining what quantum classification is, let us briefly mention the classical set-up we aim at generalising. In supervised learning the goal is to learn to predict an output $y \in Y$, given the input (object) $x \in X$, where input and output are assumed to be correlated and have an unknown joint distribution $P$ over $X \times Y$.
To do this, we are first provided with a set of $n$ previously observed inputs with known output variables (called training examples), i.e. independent random pairs $(X_i, Y_i)$, $i = 1, \ldots, n$, drawn from $P$. Using the training set, we construct a function $h_n : X \to Y$ to predict the output for future, yet unseen objects. When $Y = \{0, 1\}$, i.e. the output is a binary variable, this is called binary classification and is the typical set-up in pattern recognition. The input space is usually considered to be a subset of $p$-dimensional space $\mathbb{R}^p$, so that the object $x$ can be described by $p$ measurement values often called features. This description is very general as it allows one to handle, e.g., categorical (non-numerical) values (encoded as integer numbers), images (e.g. the measured brightness of each pixel corresponds to a separate feature), time series (features correspond to the values of the signal at given times), etc. In this paper, we consider the classification problem in which the objects to be classified are quantum states. Simply put, we have a quantum system prepared in either of two unknown quantum states and we want to know which one it is. As in the classical case, this only makes sense if we are also provided with training examples from both states, with their respective labels, from which we can learn about the two alternatives. How could such a scenario occur? Suppose we send one bit of information through a noisy quantum channel which is not known. To decode the information (the input in this case) we need to be able to classify the output states corresponding to the two inputs. Alternatively, the binary variable may be related to a coupling of the channel which we want to detect. Needless to say, quantum systems are intrinsically statistical and can be 'learned' only by repeated preparation, so that the problem is really the quantum extension of the classical classification problem.
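When the two states are *known*, deciding between them is textbook state discrimination, solved by the Helstrom measurement discussed next. A minimal numpy sketch (the function name and the equal-prior default are illustrative choices, not from the paper):

```python
import numpy as np

def helstrom(rho0, rho1, p0=0.5):
    """Optimal two-outcome measurement discriminating two *known* qubit states.

    Returns the projector P whose outcome is read as label 0, together with
    the minimal (Helstrom) error probability.  Priors are (p0, 1 - p0).
    """
    gamma = p0 * rho0 - (1 - p0) * rho1          # Helstrom matrix
    evals, evecs = np.linalg.eigh(gamma)
    pos = evecs[:, evals > 0]                    # positive eigenspace of gamma
    P = pos @ pos.conj().T                       # accept hypothesis 0 here
    d = rho0.shape[0]
    p_err = (p0 * np.trace((np.eye(d) - P) @ rho0)
             + (1 - p0) * np.trace(P @ rho1)).real
    return P, p_err
```

For orthogonal pure states the error probability is 0; for $|0\rangle$ versus $|+\rangle$ with equal priors it is $\tfrac{1}{2}(1 - 1/\sqrt{2}) \approx 0.146$, matching the standard trace-norm formula.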
On the other hand, this is related to the problem of state discrimination, which in the case of two hypotheses has an explicit solution known as the Helstrom measurement [11]. The point is that when the states are unknown, the Helstrom measurement is itself unknown and has to be learned from the training set. An intuitive solution would be a plug-in procedure: first estimate the two states, and then apply the Helstrom measurement corresponding to the estimates on any new to-be-classified state. This indeed gives a reasonable classification strategy, but as we will see, it is not the best one. The optimal strategy in the asymptotic framework is to directly estimate the Helstrom measurement without intermediate state estimation. The optimality is defined by the natural figure of merit called excess risk, which is the difference between the expected error probability and the error probability of the Helstrom measurement. We show that the excess risk converges to zero with the size of the training set as $n^{-1}$ and the ratio between the optimal and state estimation plug-in risk is a constant f
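The plug-in baseline described above can be sketched as a small simulation. This is only an illustration of the baseline, not the paper's optimal strategy; the example states, the axis-by-axis tomography scheme, and the sample sizes are assumptions made for the sketch:

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULI = [np.array([[0, 1], [1, 0]], complex),
         np.array([[0, -1j], [1j, 0]], complex),
         np.array([[1, 0], [0, -1]], complex)]

def rho(r):
    """Qubit density matrix with Bloch vector r."""
    return 0.5 * (I2 + sum(c * s for c, s in zip(r, PAULI)))

def error_prob(design0, design1, true0, true1):
    """Error probability (equal priors) of the Helstrom measurement built
    for (design0, design1) but applied to the *true* states."""
    evals, evecs = np.linalg.eigh(0.5 * (design0 - design1))
    pos = evecs[:, evals > 0]
    P = pos @ pos.conj().T                       # outcome read as label 0
    return (0.5 * np.trace((I2 - P) @ true0)
            + 0.5 * np.trace(P @ true1)).real

def plug_in_excess(r0, r1, n, trials, rng):
    """Mean excess risk of the plug-in strategy with n training copies per state."""
    opt = error_prob(rho(r0), rho(r1), rho(r0), rho(r1))
    def est(r):                                  # measure n//3 copies along x, y, z
        m = n // 3
        return [rng.choice([1, -1], m, p=[(1 + c) / 2, (1 - c) / 2]).mean()
                for c in r]
    return np.mean([error_prob(rho(est(r0)), rho(est(r1)), rho(r0), rho(r1)) - opt
                    for _ in range(trials)])

rng = np.random.default_rng(0)
r0, r1 = [0.0, 0.0, 0.8], [0.6, 0.0, -0.2]
small = plug_in_excess(r0, r1, 300, 100, rng)    # few training copies
large = plug_in_excess(r0, r1, 30000, 100, rng)  # 100x more copies
```

The excess risk is nonnegative by definition (the Helstrom measurement is optimal) and shrinks with the training-set size, consistent with the $n^{-1}$ rate. Note the estimated Bloch vector may have norm above 1, so the plug-in "state" need not be physical; the measurement built from it is still a valid projective measurement.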
…(Full text truncated)…
This content is AI-processed based on ArXiv data.