7 articles found
1. Gesture-based target acquisition in virtual and augmented reality (cited by: 3)
Authors: Yukang YAN, Xin YI, Chun YU, Yuanchun SHI. Virtual Reality & Intelligent Hardware, 2019, Issue 3, pp. 276-289 (14 pages)
Background: Gesture is a basic interaction channel that people frequently use to communicate in daily life. In this paper, we explore gesture-based approaches to target acquisition in virtual and augmented reality. A typical process of gesture-based target acquisition is as follows: when a user intends to acquire a target, she performs a gesture with her hands, head, or other parts of the body; the computer senses and recognizes the gesture and infers the most probable target. Methods: We build a mental model and a behavior model of the user to study two key parts of the interaction process. The mental model describes how the user thinks up a gesture for acquiring a target, and can be the intuitive mapping between gestures and targets. The behavior model describes how the user moves the body parts to perform the gestures, and the relationship between the gesture that the user intends to perform and the signals that the computer senses. Results: We present and discuss three pieces of research that focus on the mental model and behavior model of gesture-based target acquisition in VR and AR. Conclusions: We show that, by leveraging these two models, interaction experience and performance can be improved in VR and AR environments.
Keywords: gesture-based interaction; mental model; behavior model; virtual reality; augmented reality
2. FilterGNN: Image feature matching with cascaded outlier filters and linear attention
Authors: Jun-Xiong Cai, Tai-Jiang Mu, Yu-Kun Lai. Computational Visual Media (indexed in SCIE, EI, CSCD), 2024, Issue 5, pp. 873-884 (12 pages)
The cross-view matching of local image features is a fundamental task in visual localization and 3D reconstruction. This study proposes FilterGNN, a transformer-based graph neural network (GNN), aiming to improve the matching efficiency and accuracy of visual descriptors. Based on high matching sparseness and coarse-to-fine covisible area detection, FilterGNN utilizes cascaded optimal graph-matching filter modules to dynamically reject outlier matches. Moreover, we successfully adapted linear attention in FilterGNN with post-instance normalization support, which significantly reduces the complexity of complete graph learning from O(N^2) to O(N). Experiments show that FilterGNN requires only 6% of the time cost and 33.3% of the memory cost compared with SuperGlue under a large-scale input size and achieves a competitive performance in various tasks, such as pose estimation, visual localization, and sparse 3D reconstruction.
Keywords: image matching; transformer; linear attention; visual localization; sparse reconstruction
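The FilterGNN abstract states that linear attention lowers the cost of complete-graph learning from O(N^2) to O(N). The sketch below is not FilterGNN's implementation. It only illustrates the general kernelized linear-attention identity, assuming a feature map phi(x) = elu(x) + 1 (the abstract does not name one) and omitting the post-instance normalization it mentions. All function names are hypothetical.

```python
import math

def feature_map(x):
    # Assumed positive feature map phi(x) = elu(x) + 1 (not taken from the abstract).
    return x + 1.0 if x > 0 else math.exp(x)

def quadratic_attention(Q, K, V):
    """Kernelized attention with explicit pairwise weights: O(N^2) similarity evaluations."""
    out = []
    for q in Q:
        fq = [feature_map(x) for x in q]
        weights = [sum(a * feature_map(b) for a, b in zip(fq, k)) for k in K]
        total = sum(weights)
        out.append([sum(w * v[j] for w, v in zip(weights, V)) / total
                    for j in range(len(V[0]))])
    return out

def linear_attention(Q, K, V):
    """Same result via associativity, phi(Q) (phi(K)^T V): cost grows linearly with N."""
    d, dv = len(K[0]), len(V[0])
    kv = [[0.0] * dv for _ in range(d)]  # running sum of phi(k) v^T
    ksum = [0.0] * d                     # running sum of phi(k)
    for k, v in zip(K, V):
        fk = [feature_map(x) for x in k]
        for i in range(d):
            ksum[i] += fk[i]
            for j in range(dv):
                kv[i][j] += fk[i] * v[j]
    out = []
    for q in Q:
        fq = [feature_map(x) for x in q]
        norm = sum(a * b for a, b in zip(fq, ksum))
        out.append([sum(fq[i] * kv[i][j] for i in range(d)) / norm
                    for j in range(dv)])
    return out
```

Both functions return the same values up to floating-point rounding. The linear variant never builds the N-by-N weight matrix, which is where the saving in time and memory comes from.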
3. Grading the Severity of Mispronunciations in CAPT Based on Statistical Analysis and Computational Speech Perception
Authors: 贾珈, 梁伟俭, 吴育昊, 张秀龙, 王昊, 蔡莲红, 蒙美玲. Journal of Computer Science & Technology (indexed in SCIE, EI, CSCD), 2014, Issue 5, pp. 751-761 (11 pages)
Computer-aided pronunciation training (CAPT) technologies enable the use of automatic speech recognition to detect mispronunciations in second language (L2) learners' speech. In order to further facilitate learning, we aim to develop a principle-based method for generating a gradation of the severity of mispronunciations. This paper presents an approach towards gradation that is motivated by auditory perception. We have developed a computational method for generating a perceptual distance (PD) between two spoken phonemes. This is used to compute the auditory confusion of the native language (L1). PD is found to correlate well with the mispronunciations detected in a CAPT system for Chinese learners of English, i.e., L1 being Chinese (Mandarin and Cantonese) and L2 being US English. The results show that auditory confusion is indicative of pronunciation confusions in L2 learning. PD can also be used to help us grade the severity of errors (i.e., mispronunciations that confuse more distant phonemes are more severe) and accordingly prioritize the order of corrective feedback generated for the learners.
Keywords: second language learning; computer-aided pronunciation training; mispronunciation; computational speech perception
4. Dual relations in physical and cyber space (cited by: 4)
Authors: XU Guangyou, TAO Linmi, ZHANG David, SHI Yuanchun. Chinese Science Bulletin (indexed in SCIE, EI, CAS), 2006, Issue 1, pp. 121-128 (8 pages)
With the rapid development of computer, communication, and sensing technology, our living space has been transformed from physical space into a space shared by physical space and cyberspace. In light of this fact, and based on an analysis of the characteristics of physical space and cyberspace, respectively, this paper proposes that there are dual relations between physical space and cyberspace. The dual relations are established through two processes: the extraction, analysis, and structuring of information from physical space into cyberspace, and the provision of information services from cyberspace to physical space by inferring the intention, state, and demand of users. HCI (Human Cyberspace Interaction) in dual space means establishing the dual relations, which embodies human-centered HCI, i.e., the interaction is carried out in the way users are accustomed to and without distracting their attention.
Keywords: physical space; cyberspace; dual relations; sensing technology
5. Detecting human-object interaction with multi-level pairwise feature network (cited by: 3)
Authors: Hanchao Liu, Tai-Jiang Mu, Xiaolei Huan. Computational Visual Media (indexed in EI, CSCD), 2021, Issue 2, pp. 229-239 (11 pages)
Human–object interaction (HOI) detection is crucial for human-centric image understanding, which aims to infer (human, action, object) triplets within an image. Recent studies often exploit visual features and the spatial configuration of a human–object pair in order to learn the action linking the human and object in the pair. We argue that such a paradigm of pairwise feature extraction and action inference can be applied not only at the whole human and object instance level, but also at the part level, at which a body part interacts with an object, and at the semantic level, by considering the semantic label of an object along with human appearance and human–object spatial configuration, to infer the action. We thus propose a multi-level pairwise feature network (PFNet) for detecting human–object interactions. The network consists of three parallel streams that characterize HOI using pairwise features at the above three levels; the three streams are finally fused to give the action prediction. Extensive experiments show that our proposed PFNet outperforms other state-of-the-art methods on the VCOCO dataset and achieves comparable results to the state-of-the-art on the HICO-DET dataset.
Keywords: human–object interaction detection; pairwise feature network; deep learning; multi-level; object instance
6. A Chan–Vese Model Based on the Markov Chain for Unsupervised Medical Image Segmentation (cited by: 2)
Authors: Quanwei Huang, Yuezhi Zhou, Linmi Tao, Weikang Yu, Yaoxue Zhang, Li Huo, Zuoxiang He. Tsinghua Science and Technology (indexed in SCIE, EI, CAS, CSCD), 2021, Issue 6, pp. 833-844 (12 pages)
The accurate segmentation of medical images is crucial to medical care and research; however, many efficient supervised image segmentation methods require sufficient pixel-level labels. Such a requirement is difficult to meet in practice and even impossible in some cases, e.g., rare Pathoma images. Inspired by traditional unsupervised methods, we propose a novel Chan–Vese model based on the Markov chain for unsupervised medical image segmentation. It combines local information brought by superpixels with the global difference between the target tissue and the background. Based on the Chan–Vese model, we utilize weight maps generated by the Markov chain to model and solve the segmentation problem iteratively using the min-cut algorithm at the superpixel level. Our method exploits abundant boundary and local region information in segmentation and thus can handle images with intensity inhomogeneity and object sparsity. In our method, users can fine-tune parameters to achieve satisfactory results for each segmentation. By contrast, the result from deep-learning-based methods is rigid. The performance of our method is assessed using four Computerized Tomography (CT) datasets. Experimental results show that the proposed method outperforms traditional unsupervised segmentation techniques.
Keywords: medical image; unsupervised segmentation; Markov chain
7. Estimating Illumination Parameters Using Spherical Harmonics Coefficients in Frequency Space (cited by: 1)
Authors: 谢峰, 陶霖密, 徐光祐. Tsinghua Science and Technology (indexed in SCIE, EI, CAS), 2007, Issue 1, pp. 44-50 (7 pages)
An algorithm is presented for estimating the direction and strength of a point light together with the strength of ambient illumination. Existing approaches evaluate these illumination parameters directly in the high-dimensional image space, while we estimate the parameters in two steps: first by projecting the image onto an orthogonal linear subspace based on spherical harmonic basis functions and then by calculating the parameters in the low-dimensional subspace. The test results using the CMU PIE database and Yale Database B show the stability and effectiveness of the method. The resulting illumination information can be used to synthesize more realistic relighting images and to recognize objects under variable illumination.
Keywords: illumination parameters estimation; spherical harmonic; image relighting
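The two-step idea in the abstract above (project onto a low-order spherical harmonic subspace, then compute the parameters there) can be illustrated with a toy case. This is not the paper's algorithm. It assumes an idealized convex Lambertian object whose surface normals cover the whole sphere, with intensity I(n) = ambient + strength * max(0, n . l). Under that assumption, the order-0 and order-1 projections give mean(I) = ambient + strength / 4 and 3 * mean(I * n) = (strength / 2) * l. All function names are hypothetical.

```python
import math

def fibonacci_sphere(count):
    """Roughly uniform unit normals on the sphere (Fibonacci lattice)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(count):
        z = 1.0 - (2.0 * i + 1.0) / count
        r = math.sqrt(1.0 - z * z)
        points.append((r * math.cos(golden_angle * i),
                       r * math.sin(golden_angle * i), z))
    return points

def render(normals, light_dir, strength, ambient):
    """Assumed shading model: ambient + strength * max(0, n . l)."""
    return [ambient + strength * max(0.0, sum(a * b for a, b in zip(n, light_dir)))
            for n in normals]

def estimate_illumination(normals, intensities):
    """Step 1: project onto the basis {1, x, y, z}. Step 2: read off the parameters."""
    m = len(normals)
    c0 = sum(intensities) / m                       # order-0 coefficient
    c1 = [3.0 * sum(i * n[k] for i, n in zip(intensities, normals)) / m
          for k in range(3)]                        # order-1 coefficients
    norm = math.sqrt(sum(c * c for c in c1))
    direction = [c / norm for c in c1]
    strength = 2.0 * norm
    ambient = c0 - strength / 4.0
    return direction, strength, ambient
```

A call such as `estimate_illumination(normals, render(normals, (0.6, 0.0, 0.8), 2.0, 0.3))` recovers the inputs approximately, with accuracy limited by how densely the normals sample the sphere. Real images show only the visible hemisphere, so a practical method needs more than this sketch.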