In behavior recognition applications, spatiotemporal invariant feature points can improve robustness to noise, illumination changes, and geometric distortions. In this paper, we develop a novel detection model for spatiotemporal invariant features by generalizing the notion of image phase congruency to video volume phase congruency. The proposed model detects feature points by measuring the spatiotemporal phase congruency of Fourier series components, together with their characteristic scale and principal orientation. Compared with other state-of-the-art methods, the key advantages of this interest point detector are invariance to contrast variations and more precise feature localization. Furthermore, an invariant feature descriptor is proposed based on the phase congruency map, resulting in enhanced discriminative power in classification tasks. Experimental results on the KTH human motion dataset demonstrate the validity and effectiveness of the extracted invariant features in a human behavior recognition scheme.
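
To make the underlying notion concrete, the following is a minimal one-dimensional sketch of classical phase congruency, computed as local energy divided by the sum of Fourier component amplitudes. It is an illustration under stated assumptions, not the spatiotemporal detector proposed in this paper: the function name `phase_congruency_1d`, the global FFT (in place of a multi-scale, multi-orientation filter bank), and the `eps` regularizer are all hypothetical choices made for the example.

```python
import numpy as np


def phase_congruency_1d(signal, eps=1e-12):
    """Illustrative 1D phase congruency (assumed form, not the paper's model).

    Returns local energy divided by the summed Fourier amplitudes,
    so values lie in [0, 1] and approach 1 where components are in phase.
    """
    x = np.asarray(signal, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x - x.mean())  # remove the DC component

    # Weights that keep only non-negative frequencies (analytic signal).
    weights = np.zeros(n)
    if n % 2 == 0:
        weights[1:n // 2] = 2.0
        weights[n // 2] = 1.0
    else:
        weights[1:(n + 1) // 2] = 2.0

    # Local energy is the magnitude of the analytic signal.
    analytic = np.fft.ifft(spectrum * weights)
    local_energy = np.abs(analytic)

    # Sum of the amplitudes of all Fourier components (constant over x).
    amplitude_sum = np.sum(np.abs(spectrum) * weights) / n

    return local_energy / (amplitude_sum + eps)


# Usage: a square pulse with steps between samples 63/64 and 191/192.
square = np.zeros(256)
square[64:192] = 1.0
pc = phase_congruency_1d(square)
peak = int(np.argmax(pc))
```

In this toy setting the measure is largest next to the steps, and because it is a ratio of energy to amplitude, rescaling the signal (for example `5.0 * square`) leaves it essentially unchanged, which illustrates the kind of contrast invariance the abstract refers to.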