11 articles found
Deep Scalogram Representations for Acoustic Scene Classification (cited by: 4)
1
Authors: Zhao Ren, Kun Qian, Zixing Zhang, Vedhas Pandit, Alice Baird, Bjorn Schuller. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2018, No. 3, pp. 662-669 (8 pages)
Spectrogram representations of acoustic scenes have achieved competitive performance for acoustic scene classification. Yet, the spectrogram alone does not take into account a substantial amount of time-frequency information. In this study, we present an approach for exploring the benefits of deep scalogram representations, extracted in segments from an audio stream. The approach first transforms the segmented acoustic scenes into bump and morse scalograms, as well as spectrograms; second, the spectrograms or scalograms are fed into pre-trained convolutional neural networks; third, the features extracted from a subsequent fully connected layer are fed into (bidirectional) gated recurrent neural networks, which are followed by a single highway layer and a softmax layer; finally, predictions from these three systems are fused by a margin sampling value strategy. We then evaluate the proposed approach using the acoustic scene classification data set of the 2017 IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE). On the evaluation set, an accuracy of 64.0% is obtained from bidirectional gated recurrent neural networks when fusing the spectrogram and the bump scalogram, an improvement on the 61.0% baseline result provided by the DCASE 2017 organisers. This result shows that the extracted bump scalograms are capable of improving the classification accuracy when fused with a spectrogram-based system.
Keywords: acoustic scene classification (ASC); (bidirectional) gated recurrent neural networks ((B)GRNNs); convolutional neural networks (CNNs); deep scalogram representation; spectrogram representation
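A minimal sketch of a margin sampling value (MSV) fusion step like the one named in the abstract above. It assumes that the MSV of a system is the gap between its two highest class probabilities and that, per sample, the system with the largest gap decides the label; the abstract does not spell this out, and the function names and the probabilities below are hypothetical.

```python
from typing import Dict, List, Tuple


def margin_sampling_value(probs: List[float]) -> float:
    """Gap between the largest and second-largest class probabilities."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def fuse_by_margin(system_probs: Dict[str, List[float]]) -> Tuple[str, int]:
    """Pick the most confident system (largest margin) and return its prediction.

    Returns (system name, predicted class index) for a single sample.
    """
    best_name = max(
        system_probs, key=lambda name: margin_sampling_value(system_probs[name])
    )
    probs = system_probs[best_name]
    return best_name, probs.index(max(probs))


# Hypothetical softmax outputs of two systems for one audio segment.
sample = {
    "spectrogram": [0.40, 0.35, 0.25],     # margin 0.05: nearly undecided
    "bump_scalogram": [0.10, 0.80, 0.10],  # margin 0.70: confident
}
chosen_system, predicted_class = fuse_by_margin(sample)
```

Here the scalogram-based system has the larger margin, so its prediction (class index 1) is used for this sample.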
Intelligent Deep Data Analytics Based Remote Sensing Scene Classification Model
2
Authors: Ahmed Althobaiti, Abdullah Alhumaidi Alotaibi, Sayed Abdel-Khalek, Suliman A. Alsuhibany, Romany F. Mansour. Computers, Materials & Continua (SCIE, EI), 2022, No. 7, pp. 1921-1938 (18 pages)
Recent advances in the integration of camera sensors pave the way for new unmanned aerial vehicle (UAV) applications, such as analyzing geographical (spatial) variations of earth science to mitigate harmful environmental impacts and climate change. UAVs have attracted significant attention as a remote sensing environment that captures high-resolution images of different scenes, such as land, forest fires, flooding threats, road collisions, and landslides, to enhance data analysis and decision making. Dynamic scene classification has attracted much attention in the examination of earth data captured by UAVs. This paper proposes a new multi-modal fusion based earth data classification (MMF-EDC) model. The MMF-EDC technique aims to identify the patterns that exist in the earth data and classify them into appropriate class labels. The MMF-EDC technique involves a fusion of histogram of gradients (HOG), local binary patterns (LBP), and residual network (ResNet) models. This fusion process integrates many feature vectors, and an entropy-based fusion process is carried out to enhance the classification performance. In addition, the quantum artificial flora optimization (QAFO) algorithm is applied as a hyperparameter optimization technique. The AFO algorithm, inspired by the reproduction and migration of flora, helps to decide the optimal parameters of the ResNet model, namely the learning rate, the number of hidden layers, and their number of neurons. Besides, a Variational Autoencoder (VAE) based classification model is applied to assign appropriate class labels for a useful set of feature vectors. The proposed MMF-EDC model has been tested using the UCM and WHU-RS datasets. It exhibits promising classification results on the applied remote sensing images, with accuracies of 0.989 and 0.994 on the UCM and WHU-RS test datasets, respectively.
Keywords: remote sensing; unmanned aerial vehicles; deep learning; artificial intelligence; scene classification
Adaptive Binary Coding for Scene Classification Based on Convolutional Networks
3
Authors: Shuai Wang, Xianyi Chen. Computers, Materials & Continua (SCIE, EI), 2020, No. 12, pp. 2065-2077 (13 pages)
With the rapid development of computer technology, millions of images are produced every day by different sources. How to efficiently process these images and accurately discern the scenes in them has become an important but tough task. In this paper, we propose a novel supervised learning framework based on a proposed adaptive binary coding for scene classification. Specifically, we first extract high-level features of the images under consideration using available models trained on public datasets. Then, we design a binary encoding method called one-hot encoding to make the feature representation more efficient. Benefiting from the proposed adaptive binary coding, our method requires no time to train or fine-tune the deep network and can effectively handle different applications. Experimental results on three public datasets, i.e., the UIUC sports event dataset, the MIT Indoor dataset, and the UC Merced dataset, with three different classifiers demonstrate that our method outperforms state-of-the-art methods by large margins.
Keywords: scene classification; convolutional neural network; one-hot encoding; supervised feature training
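The abstract above does not describe how the adaptive binary code is constructed, so the following only illustrates the generic one-hot step on real-valued features. It assumes, purely for illustration, that each feature is clipped to a known range and assigned to one of a few uniform bins; `binarize_features` and its parameters are hypothetical and not the paper's method.

```python
from typing import List


def one_hot(index: int, size: int) -> List[int]:
    """Binary vector of length `size` with a single 1 at position `index`."""
    if not 0 <= index < size:
        raise ValueError("index out of range")
    return [1 if i == index else 0 for i in range(size)]


def binarize_features(
    features: List[float], n_bins: int, lo: float, hi: float
) -> List[int]:
    """Concatenate a one-hot bin code for every feature value.

    Assumption: uniform bins over [lo, hi]; values outside are clipped.
    """
    code: List[int] = []
    for value in features:
        clipped = min(max(value, lo), hi)
        # Map [lo, hi] onto bin indices 0..n_bins-1; hi falls in the last bin.
        idx = min(int((clipped - lo) / (hi - lo) * n_bins), n_bins - 1)
        code.extend(one_hot(idx, n_bins))
    return code


# Three hypothetical CNN feature values, four bins each -> 12-bit code.
code = binarize_features([0.1, 0.9, 0.5], n_bins=4, lo=0.0, hi=1.0)
```

The resulting code contains only zeros and ones, so it can be passed to a conventional classifier without any further training of the feature extractor.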
Natural Scene Classification Inspired by Visual Perception and Cognition Mechanisms
4
Author: ZHANG Rui. Journal of Chongqing University of Technology (Natural Science) [重庆理工大学学报(自然科学)] (CAS), 2011, No. 7, pp. 24-43 (20 pages)
The process of human natural scene categorization consists of two correlated stages: visual perception and visual cognition of natural scenes. Inspired by this fact, we propose a biologically plausible approach for natural scene image classification. This approach consists of one visual perception model and two visual cognition models. The visual perception model, composed of two steps, is used to extract discriminative features from natural scene images. In the first step, we mimic the oriented and bandpass properties of the human primary visual cortex with a special complex wavelet transform, which can decompose a natural scene image into a series of 2D spatial structure signals. In the second step, a hybrid statistical feature extraction method is used to generate gist features from those 2D spatial structure signals. Then we design a cognitive feedback model to realize adaptive optimization of the visual perception model. Finally, we build a multiple-semantics-based cognition model to imitate the human cognitive mode in rapid natural scene categorization. Experiments on natural scene datasets show that the proposed method achieves high efficiency and accuracy for natural scene classification.
Keywords: natural scene classification; visual perception model; visual cognition model
Semi-supervised remote sensing image scene classification with prototype-based consistency
5
Authors: Yang LI, Zhang LI, Zi WANG, Kun WANG, Qifeng YU. Chinese Journal of Aeronautics (SCIE, EI, CAS, CSCD), 2024, No. 2, pp. 459-470 (12 pages)
Deep learning significantly improves the accuracy of remote sensing image scene classification, benefiting from large-scale datasets. However, annotating remote sensing images is time-consuming and difficult even for experts. Deep neural networks trained using a few labeled samples usually generalize poorly to new unseen images. In this paper, we propose a semi-supervised approach for remote sensing image scene classification based on prototype-based consistency, exploiting massive unlabeled images. To this end, we first propose a feature enhancement module to extract discriminative features. This is achieved by focusing the model on the foreground areas. Then, a prototype-based classifier is introduced to the framework, which is used to acquire consistent feature representations. We conduct a series of experiments on NWPU-RESISC45 and the Aerial Image Dataset (AID). Our method improves on the state-of-the-art (SOTA) method on NWPU-RESISC45 from 92.03% to 93.08% and on AID from 94.25% to 95.24% in terms of accuracy.
Keywords: semi-supervised learning; remote sensing; scene classification; prototype network; deep learning
A coupled multi-task feature boosting method for remote sensing scene classification (cited by: 1)
6
Authors: WANG TengFei, GU YanFeng, GAO GuoMing, ZENG XiaoPeng. Science China (Technological Sciences) (SCIE, EI, CAS, CSCD), 2023, No. 3, pp. 663-673 (11 pages)
Scene classification plays an essential role in processing very high resolution (VHR) images for understanding. Scene classification in remote sensing faces two difficulties: mismatched features caused by model overfitting, and the loss of semantic information. The multi-task method helps solve these problems by using the shared weights of multiple tasks. We propose a feature boosting method with a multi-task framework that combines the scene classification task and the semantic segmentation task to overcome the difficulties. Unlike the traditional multi-task learning method, the two tasks are coupled together via a weakly supervised learning method, so that no labelled semantic segmentation samples are required. First, we propose a weakly supervised segmentation method to create the interconnection between the segmentation task and the classification task, and with it we obtain a coarse segmentation result that is highly correlated with the classification. Second, according to the surface distribution of remote sensing, we propose a sparse surface constraint to obtain fine segmentation results. Fine features are obtained by constraining the shared weights of the weakly supervised segmentation method. Last, we classify the scenes using the fine features and conduct experiments on public remote sensing scene classification datasets. Experimental results demonstrate that the proposed coupled multi-task model outperforms the state-of-the-art methods on remote sensing scene classification.
Keywords: scene classification; coupled multi-task; feature boosting
TP-MobNet: A Two-pass Mobile Network for Low-complexity Classification of Acoustic Scene
7
Authors: Soonshin Seo, Junseok Oh, Eunsoo Cho, Hosung Park, Gyujin Kim, Ji-Hwan Kim. Computers, Materials & Continua (SCIE, EI), 2022, No. 11, pp. 3291-3303 (13 pages)
Acoustic scene classification (ASC) is a method of recognizing and classifying environments from acoustic signals. Various ASC approaches based on deep learning have been developed, with convolutional neural networks (CNNs) proving to be the most reliable and commonly utilized in ASC systems due to their suitability for constructing lightweight models. When using ASC systems in the real world, model complexity and device robustness are essential considerations. In this paper, we propose a two-pass mobile network for low-complexity classification of the acoustic scene, named TP-MobNet. With inverted residuals and linear bottlenecks, TP-MobNet is based on MobileNetV2, and coordinate attention and two-pass fusion approaches are applied after the mobile blocks. Long-range dependencies and precise position information in feature maps can be learned via coordinate attention. By capturing more diverse feature resolutions at the network's end sides, two-pass fusion can also improve generalization. In addition, the model size is reduced by applying weight quantization to the trained model. The TAU Urban Acoustic Scenes 2020 Mobile development set was used for all of the experiments. It has been confirmed that the proposed model, with a model size of 219.6 kB, achieves an accuracy of 73.94%.
Keywords: acoustic scene classification; low-complexity; device robustness; two-pass mobile network; coordinate attention; weight quantization
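The abstract above says only that weight quantization shrinks the trained model; it does not name the scheme. As a rough illustration of the idea, the sketch below uses symmetric 8-bit post-training quantization with a single scale per tensor, which is an assumption and not necessarily what TP-MobNet uses. All names and values are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map float weights to integers in [-127, 127] with one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integer codes."""
    return [q * scale for q in quantized]


# Hypothetical weights of one small layer.
weights = [0.5, -1.27, 0.0, 1.0]
q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)
```

Each weight is then stored in one byte instead of four (for 32-bit floats), at the cost of a rounding error of at most half a quantization step per weight.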
A More Efficient Approach for Remote Sensing Image Classification
8
Author: Huaxiang Song. Computers, Materials & Continua (SCIE, EI), 2023, No. 3, pp. 5741-5756 (16 pages)
Over the past decade, the significant growth of convolutional neural network (CNN) approaches based on deep learning (DL) has greatly improved the performance of machine learning (ML) algorithms on the semantic scene classification (SSC) of remote sensing images (RSI). However, unbalanced attention to classification accuracy and efficiency has partially eroded the advantages of DL-based algorithms, e.g., automation and simplicity. Traditional ML strategies (e.g., handcrafted features or indicators) and accuracy-aimed strategies with a high trade-off (e.g., multi-stage CNNs and ensembles of multiple CNNs) are widely used without any training efficiency optimization involved, which may result in suboptimal performance. To address this problem, we propose a fast and simple training CNN framework (named FST-EfficientNet) for RSI-SSC based on an EfficientNet version 2 small (EfficientNetV2-S) CNN model. The whole algorithm flow is completely one-stage and end-to-end, without any handcrafted features or discriminators introduced. In the implementation of training efficiency optimization, only several routine data augmentation tricks coupled with a fixed ratio of resolution or a gradually increasing resolution strategy are employed, so that the algorithm's trade-off is very cheap. The performance evaluation shows that our FST-EfficientNet achieves new state-of-the-art (SOTA) records in overall accuracy (OA), about 0.8% to 2.7% ahead of all earlier methods on the Aerial Image Dataset (AID) and the Northwestern Polytechnical University Remote Sensing Image Scene Classification 45 Dataset (NWPU-RESISC45D). Meanwhile, the results also demonstrate the importance and indispensability of training efficiency optimization strategies for RSI-SSC by DL. In fact, it is not necessary to gain better classification accuracy by relying completely on an excessive trade-off without efficiency. Ultimately, these findings are expected to contribute to the development of more efficient CNN-based approaches in RSI-SSC.
Keywords: FST-EfficientNet; efficient approach; scene classification; remote sensing; deep learning
Bridging the semantic gap with human perception based features for scene categorization
9
Authors: Padmavati Shrivastava, K. K. Bhoyar, A. S. Zadgaonkar. International Journal of Intelligent Computing and Cybernetics (EI), 2017, No. 3, pp. 387-406 (20 pages)
Purpose: The purpose of this paper is to build a classification system that mimics the perceptual ability of human vision in gathering knowledge about the structure, content and surrounding environment of a real-world natural scene quickly and accurately, at a glance. This paper proposes a set of novel features to determine the gist of a given scene based on dominant color, dominant direction, openness and roughness features. Design/methodology/approach: The classification system is designed at two different levels. At the first level, a set of low-level features is extracted for each semantic feature. At the second level, the extracted features are subjected to feature evaluation based on inter-class and intra-class distances. The most discriminating features are retained and used for training the support vector machine (SVM) classifier for two different data sets. Findings: The accuracy of the proposed system has been evaluated on two data sets: the well-known Oliva-Torralba data set and a customized image data set comprising high-resolution images of natural landscapes. Experiments on these two data sets with the proposed novel feature set and SVM classifier yielded 92.68 percent average classification accuracy using ten-fold cross validation. The proposed features efficiently represent visual information and are therefore capable of narrowing the semantic gap between low-level image representation and high-level human perception. Originality/value: The method presented in this paper represents a new approach for extracting low-level features of reduced dimensionality that is able to model human perception for the task of scene classification. The methods of mapping primitive features to high-level features are intuitive to the user and are capable of reducing the semantic gap. The proposed feature evaluation technique is general and can be applied across any domain.
Keywords: openness; roughness; human perception; JND; p-value; scene classification; semantic gap; feature elevation; dominant direction
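The abstract above mentions feature evaluation based on inter-class and intra-class distances but gives no formula. One plausible reading, shown here only as a sketch, is a Fisher-style score: the variance of the class means divided by the average within-class variance. The `separability` function and the example values are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def separability(values_by_class: Dict[str, List[float]]) -> float:
    """Ratio of inter-class spread to intra-class spread for one feature."""
    class_means = {c: mean(v) for c, v in values_by_class.items()}
    overall = mean(class_means.values())
    # Inter-class: how far the class means lie from their common mean.
    inter = mean((m - overall) ** 2 for m in class_means.values())
    # Intra-class: average variance of the samples around their own class mean.
    intra = mean(
        mean((x - class_means[c]) ** 2 for x in v)
        for c, v in values_by_class.items()
    )
    return inter / intra if intra > 0 else float("inf")


# Hypothetical feature values for two scene classes.
openness = {"coast": [0.8, 0.9], "forest": [0.1, 0.2]}
noise = {"coast": [0.4, 0.6], "forest": [0.5, 0.5]}
ranked = sorted(
    {"openness": openness, "noise": noise}.items(),
    key=lambda item: separability(item[1]),
    reverse=True,
)
```

Under this score, a feature whose class means are far apart relative to its within-class scatter ranks first, so the highest-ranked features would be the ones kept for classifier training.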
Innovative Analysis Ready Data (ARD) product and process requirements, software system design, algorithms and implementation at the midstream as necessary-but-not-sufficient precondition of the downstream in a new notion of Space Economy 4.0 - Part 1: Problem background in Artificial General Intelligence (AGI)
10
Authors: Andrea Baraldi, Luca D. Sapia, Dirk Tiede, Martin Sudmanns, Hannah L. Augustin, Stefan Lang. Big Earth Data (EI, CSCD), 2023, No. 3, pp. 455-693 (239 pages)
Aiming at the convergence between Earth observation (EO) Big Data and Artificial General Intelligence (AGI), this two-part paper identifies an innovative, but realistic EO optical sensory image-derived semantics-enriched Analysis Ready Data (ARD) product-pair and process gold standard as linchpin for success of a new notion of Space Economy 4.0. To be implemented in operational mode at the space segment and/or midstream segment by both public and private EO big data providers, it is regarded as necessary-but-not-sufficient “horizontal” (enabling) precondition for: (I) Transforming existing EO big raster-based data cubes at the midstream segment, typically affected by the so-called data-rich information-poor syndrome, into a new generation of semantics-enabled EO big raster-based numerical data and vector-based categorical (symbolic, semi-symbolic or subsymbolic) information cube management systems, eligible for semantic content-based image retrieval and semantics-enabled information/knowledge discovery. (II) Boosting the downstream segment in the development of an ever-increasing ensemble of “vertical” (deep and narrow, user-specific and domain-dependent) value-adding information products and services, suitable for a potentially huge worldwide market of institutional and private end-users of space technology. For the sake of readability, this paper consists of two parts. In the present Part 1, first, background notions in the remote sensing metascience domain are critically revised for harmonization across the multidisciplinary domain of cognitive science. In short, keyword “information” is disambiguated into the two complementary notions of quantitative/unequivocal information-as-thing and qualitative/equivocal/inherently ill-posed information-as-data-interpretation. Moreover, buzzword “artificial intelligence” is disambiguated into the two better-constrained notions of Artificial Narrow Intelligence as part-without-inheritance-of AGI. Second, based on a better-defined and better-understood vocabulary of multidisciplinary terms, existing EO optical sensory image-derived Level 2/ARD products and processes are investigated at the Marr five levels of understanding of an information processing system. To overcome their drawbacks, an innovative, but realistic EO optical sensory image-derived semantics-enriched ARD product-pair and process gold standard is proposed in the subsequent Part 2.
Keywords: Artificial Narrow Intelligence; big data; cognitive science; computer vision; Earth observation; essential climate variables; Global Earth Observation System of (component) Systems; inductive/deductive/hybrid inference; Scene Classification Map; Space Economy 4.0; radiometric corrections of optical imagery from atmospheric, topographic, adjacency and bidirectional reflectance distribution function effects; semantic content-based image retrieval; 2D spatial topology-preserving/retinotopic image mapping; world ontology (synonym for conceptual/mental/perceptual model of the world)
Innovative Analysis Ready Data (ARD) product and process requirements, software system design, algorithms and implementation at the midstream as necessary-but-not-sufficient precondition of the downstream in a new notion of Space Economy 4.0 - Part 2: Software developments
11
Authors: Andrea Baraldi, Luca D. Sapia, Dirk Tiede, Martin Sudmanns, Hannah Augustin, Stefan Lang. Big Earth Data (EI, CSCD), 2023, No. 3, pp. 694-811 (118 pages)
Aiming at the convergence between Earth observation (EO) Big Data and Artificial General Intelligence (AGI), this paper consists of two parts. In the previous Part 1, existing EO optical sensory image-derived Level 2/Analysis Ready Data (ARD) products and processes are critically compared, to overcome their lack of harmonization/standardization/interoperability and suitability in a new notion of Space Economy 4.0. In the present Part 2, original contributions comprise, at the Marr five levels of system understanding: (1) An innovative, but realistic EO optical sensory image-derived semantics-enriched ARD co-product pair requirements specification. First, in the pursuit of third-level semantic/ontological interoperability, a novel ARD symbolic (categorical and semantic) co-product, known as Scene Classification Map (SCM), adopts an augmented Cloud versus Not-Cloud taxonomy, whose Not-Cloud class legend complies with the standard fully-nested Land Cover Classification System's Dichotomous Phase taxonomy proposed by the United Nations Food and Agriculture Organization. Second, a novel ARD subsymbolic numerical co-product, specifically, a panchromatic or multispectral EO image whose dimensionless digital numbers are radiometrically calibrated into a physical unit of radiometric measure, ranging from top-of-atmosphere reflectance to surface reflectance and surface albedo values, in a five-stage radiometric correction sequence. (2) An original ARD process requirements specification. (3) An innovative ARD processing system design (architecture), where stepwise SCM generation and stepwise SCM-conditional EO optical image radiometric correction are alternated in sequence. (4) An original modular hierarchical hybrid (combined deductive and inductive) computer vision subsystem design, provided with feedback loops, where software solutions at the Marr two shallowest levels of system understanding, specifically, algorithm and implementation, are selected from the scientific literature, to benefit from their technology readiness level as proof of feasibility, required in addition to proven suitability. To be implemented in operational mode at the space segment and/or midstream segment by both public and private EO big data providers, the proposed EO optical sensory image-derived semantics-enriched ARD product-pair and process reference standard is highlighted as linchpin for success of a new notion of Space Economy 4.0.
Keywords: Analysis Ready Data; Artificial General Intelligence; Artificial Narrow Intelligence; big data; cognitive science; computer vision; Earth observation; essential climate variables; Global Earth Observation System of (component) Systems; inductive/deductive/hybrid inference; Scene Classification Map; Space Economy 4.0; radiometric corrections of optical imagery from atmospheric, topographic, adjacency and bidirectional reflectance distribution function effects; semantic content-based image retrieval; 2D spatial topology-preserving/retinotopic image mapping; world ontology (synonym for conceptual/mental/perceptual model of the world)