Sparse signal representations over overcomplete dictionaries have been extensively investigated in recent research, leading to state-of-the-art results in signal, image and video restoration. One of the most important issues is selecting a proper dictionary size, yet guidelines for this choice have not been established. In this paper, we tackle this problem by proposing a so-called sub clustering K-SVD algorithm. This approach incorporates the subtractive clustering method into K-SVD to retain the most important atom candidates while removing redundant atoms, producing a well-trained dictionary. For a given dataset and approximation error bound, the proposed approach can deduce an optimized dictionary size that is much smaller than the one needed by the K-SVD algorithm.
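To make the atom-selection idea concrete, the following is a minimal sketch of the standard subtractive clustering procedure applied to a set of candidate atoms. It is an illustration only, not the algorithm proposed in the paper: how the clustering step is interleaved with the K-SVD updates is not specified here. The function name `subtractive_clustering`, the radii `r_a` and `r_b`, and the simplified stopping rule `stop_ratio` are all assumptions made for this example.

```python
import numpy as np


def subtractive_clustering(points, r_a=0.5, r_b=None, stop_ratio=0.15):
    """Return indices of points chosen as cluster centers.

    points: (n, d) array, e.g. one unit-norm dictionary atom per row.
    r_a: neighborhood radius used when computing potentials (assumed value).
    r_b: radius used when suppressing potentials near a chosen center;
         defaults to 1.5 * r_a, a common choice.
    stop_ratio: stop once the best remaining potential falls below this
         fraction of the first peak (a simplified stopping rule).
    """
    if r_b is None:
        r_b = 1.5 * r_a
    alpha = 4.0 / r_a**2
    beta = 4.0 / r_b**2

    # Squared Euclidean distance between every pair of points.
    diff = points[:, None, :] - points[None, :, :]
    d2 = np.sum(diff**2, axis=2)

    # Potential of a point: high when many other points lie close to it.
    potential = np.exp(-alpha * d2).sum(axis=1)
    first_peak = potential.max()

    centers = []
    while len(centers) < len(points):
        k = int(np.argmax(potential))
        peak = potential[k]
        if peak < stop_ratio * first_peak:
            break
        centers.append(k)
        # Suppress the potential of points near the newly chosen center,
        # so near-duplicate (redundant) atoms are not selected again.
        potential = potential - peak * np.exp(-beta * d2[:, k])
    return centers


# Hypothetical candidate atoms: two tight groups of near-duplicate directions.
atoms = np.array([
    [1.00, 0.00], [0.99, 0.05], [0.98, -0.04],
    [0.00, 1.00], [0.05, 0.99], [-0.04, 0.98],
])
atoms /= np.linalg.norm(atoms, axis=1, keepdims=True)
kept = subtractive_clustering(atoms)
pruned_dictionary = atoms[kept]
```

In this sketch the retained atoms are the selected cluster centers, so the size of the pruned dictionary is determined by the data and the radii rather than being fixed in advance. Treating an atom and its negation as equivalent, which is natural for dictionaries, would need a sign-invariant distance and is left out here.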