Temporal action localization (TAL) is the task of detecting the start and end timestamps of action instances in an untrimmed video and classifying them. As the number of action categories per video increases, existing weakly-supervised TAL (W-TAL) methods with only video-level labels cannot provide sufficient supervision, so single-frame supervision has attracted the interest of researchers. Existing paradigms model single-frame annotations from the perspective of video snippet sequences, neglect the action discrimination of annotated frames, and do not pay sufficient attention to their correlations within the same category. Within a category, the annotated frames exhibit distinctive appearance characteristics or clear action patterns. Thus, a novel method is proposed that enhances action discrimination via category-specific frame clustering for W-TAL. Specifically, the K-means clustering algorithm is employed to aggregate the annotated discriminative frames of the same category, which are regarded as exemplars that exhibit the characteristics of that action category. Then, class activation scores are obtained by calculating the similarities between a frame and the exemplars of the various categories. Category-specific representation modeling provides complementary guidance to the snippet sequence modeling in the mainline. Accordingly, a convex combination fusion mechanism is presented for annotated frames and snippet sequences to enhance the consistency of action discrimination, which generates a robust class activation sequence for precise action classification and localization. Owing to this supplementary guidance for video snippet sequences, our method outperforms existing single-frame annotation-based methods.
Experiments conducted on three datasets (THUMOS14, GTEA, and BEOID) show that our method achieves high localization performance compared with state-of-the-art methods.
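The abstract above describes three steps: K-means clustering of annotated frames per category into exemplars, similarity-based class activation scores, and a convex combination fusion with snippet-level scores. The paper gives no implementation details, so the following is only a minimal sketch under assumed specifics (cosine similarity, max-pooling over exemplars, a fixed fusion weight `alpha`; all function names are hypothetical):

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    # Aggregate annotated frame features of ONE category into k exemplars
    # (plain k-means; the paper's exact settings are not specified).
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def class_activation_scores(frame, exemplars_per_class):
    # Score a frame against each category by its best cosine similarity
    # to that category's exemplars (max-pooling is an assumption here).
    scores = []
    for ex in exemplars_per_class:
        sims = ex @ frame / (np.linalg.norm(ex, axis=1) * np.linalg.norm(frame) + 1e-8)
        scores.append(sims.max())
    return np.array(scores)

def fuse(cas_snippet, cas_frame, alpha=0.5):
    # Convex combination of snippet-sequence scores and frame-exemplar scores.
    return alpha * cas_snippet + (1.0 - alpha) * cas_frame
```

A frame whose feature lies near one category's exemplars then receives a higher activation for that category, and the fused sequence inherits evidence from both branches.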
Automatic facial expression recognition (FER) from non-frontal views is a challenging research topic that has recently started to attract the attention of the research community. Pose variations are difficult to tackle, and many face analysis methods require sophisticated normalization and initialization procedures; thus, head-pose-invariant facial expression recognition continues to be an issue for traditional methods. In this paper, we propose a novel approach for pose-invariant FER based on pose-robust features learned by deep learning methods: a principal component analysis network (PCANet) and convolutional neural networks (CNN) (PRP-CNN). In the first stage, unlabeled frontal face images are used to learn features with PCANet. In the second stage, these features are used as the target of a CNN to learn a feature mapping between frontal and non-frontal faces. We then describe non-frontal face images using the novel descriptions generated by the learned maps, obtaining unified descriptors for arbitrary face images. Finally, the pose-robust features are used to train a single classifier for FER instead of training multiple models for each specific pose. Our method, on the whole, does not require pose or landmark annotation and can recognize facial expressions over a wide range of orientations. Extensive experiments on two public databases show that our framework yields dramatic improvements in facial expression analysis.
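The first stage above learns PCANet features from unlabeled frontal faces; the core of PCANet is that convolution filters are the top principal components of mean-removed image patches. As a rough illustration of that filter-learning step only (patch size, filter count, and the SVD route are assumed details, not the paper's configuration):

```python
import numpy as np

def learn_pca_filters(images, patch=5, n_filters=4):
    # PCANet-style stage 1: collect mean-removed patches from unlabeled
    # frontal face images and take the top principal components as filters.
    patches = []
    for img in images:
        for i in range(img.shape[0] - patch + 1):
            for j in range(img.shape[1] - patch + 1):
                p = img[i:i + patch, j:j + patch].ravel()
                patches.append(p - p.mean())
    P = np.stack(patches)
    # Right singular vectors of the patch matrix = eigenvectors of its
    # covariance; the leading ones become convolution filters.
    _, _, vt = np.linalg.svd(P, full_matrices=False)
    return vt[:n_filters].reshape(n_filters, patch, patch)
```

In the paper's second stage, responses of such filters on frontal faces would serve as regression targets for a CNN fed with non-frontal faces; that mapping network is omitted here.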
The growing popularity and application of Web services have led to increased attention regarding the vulnerability of software based on these services. Vulnerability testing examines the trustworthiness and reduces the security risks of software systems. This paper proposes a worst-input mutation approach for testing Web service vulnerability based on Simple Object Access Protocol (SOAP) messages. Based on the characteristics of SOAP messages, the proposed approach uses the farthest-neighbor concept to guide generation of the test suite. The corresponding automatic test case generation algorithm, namely Test Case generation based on the Farthest Neighbor (TCFN), is also presented. The method involves partitioning the input domain into sub-domains according to the number and type of SOAP message parameters, selecting the candidate test case whose distance is the farthest from all executed test cases, and applying it to test the Web service. We also implement and describe a prototype Web service vulnerability testing tool. The tool was applied to the testing of Web services on the Internet, and the experimental results show that the proposed approach can find more vulnerability faults than other related approaches.
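The selection step of TCFN, picking the candidate whose distance from all executed test cases is the farthest, can be read as a max-min rule: choose the candidate maximizing its distance to the nearest executed case. A minimal sketch, assuming test cases are encoded as numeric parameter tuples and Euclidean distance (the paper's actual distance over SOAP parameter types is not given here):

```python
def distance(a, b):
    # Euclidean distance between two test cases encoded as numeric
    # parameter tuples (an assumed stand-in for the SOAP-parameter metric).
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def select_farthest(candidates, executed):
    # TCFN-style selection: for each candidate, its "coverage" is the
    # distance to its nearest executed test case; pick the max-min one.
    return max(candidates, key=lambda c: min(distance(c, e) for e in executed))
```

Iterating this rule, executing the chosen case and adding it to the executed set, spreads test inputs toward unexplored regions of each sub-domain.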
Funding: supported by the National Natural Science Foundation of China (No. 61672268).
Funding: supported by the National Natural Science Foundation of China (Nos. 61202110 and 61063013) and the Natural Science Foundation of Jiangsu Province (No. BK2012284).