Abstract:
As the first stage for discovering association rules, frequent itemsets
mining is an important challenging task for large databases. Sampling provides an
efficient way to get approximating answers in much shorter time. Based on the
characteristics of frequent itemsets counting, a new bound for sampling is
proposed, with which less samples are necessary to achieve the required accuracy
and the efficiency is much improved over traditional Chernoff bounds.