Hi all,
Our next meeting in the Information Theory and Applications seminar will take place on Monday, January 29 at 10:00, in room A500.
The speaker is Tomer Berg, who will tell us about how to test whether a stream of observed samples came from the uniform distribution, or from a distribution epsilon-far from uniform, using algorithms with limited memory.
See you there,
Or, Oron, Yuval and Alex
---------------------------------------------
Title: On The Memory Complexity of Uniformity Testing
Abstract: In the problem of uniformity testing with limited memory, we observe a sequence of independent identically distributed random variables drawn from a distribution p over [n], which is either uniform or is epsilon-far from uniform under the total variation distance, and our goal is to determine the correct hypothesis. At each time point we are allowed to update the state of a finite-memory machine with S states, where each state of the machine is assigned one of the hypotheses, and we are interested in obtaining an asymptotic probability of error at most delta<1/2 uniformly under both hypotheses.
We derive upper and lower bounds on the number of states S needed in order to achieve a constant error probability delta, as a function of n and epsilon, where our upper bound is O(n*log(n) /epsilon) and our lower bound is Omega (n+ 1/epsilon). Prior works in the field have almost exclusively used collision counting for upper bounds but, as it turns out, in the limited memory with unlimited samples setup, counting collisions is not the best one can do. Thus, different proof techniques are needed in order to attain our bounds.