Hi all,
Our next meeting in the Information Theory and Applications seminar will
take place on Monday, January 29 at 10:00, in room A500.
The speaker is Tomer Berg, who will tell us about how to test whether a
stream of observed samples came from the uniform distribution, or from a
distribution epsilon-far from uniform, using algorithms with limited memory.
See you there,
Or, Oron, Yuval and Alex
---------------------------------------------
*Title*: On The Memory Complexity of Uniformity Testing
*Abstract*: In the problem of uniformity testing with limited memory, we
observe a sequence of independent identically distributed random variables
drawn from a distribution p over [n], which is either uniform or is
epsilon-far from uniform under the total variation distance, and our goal
is to determine the correct hypothesis. At each time point we are allowed
to update the state of a finite-memory machine with S states, where each
state of the machine is assigned one of the hypotheses, and we are
interested in obtaining an asymptotic probability of error at most
delta<1/2 uniformly under both hypotheses.
We derive upper and lower bounds on the number of states S needed in
order to achieve a constant error probability delta, as a function of n and
epsilon, where our upper bound is O(n*log(n) /epsilon) and our lower bound
is Omega (n+ 1/epsilon). Prior works in the field have almost exclusively
used collision counting for upper bounds but, as it turns out, in the
limited memory with unlimited samples setup, counting collisions is not the
best one can do. Thus, different proof techniques are needed in order to
attain our bounds.