955 articles found
Efficiency of Two-Stage Adaptive Cluster Sampling Design in Estimating Fringe-Eared Oryx
1
Authors: Jesse Wachira Mwangi, Mohamed Esha Salim | Open Journal of Statistics, 2012, Issue 5, pp. 474-477 (4 pages)
Two-stage adaptive cluster sampling and two-stage conventional sampling designs were used to estimate the population total of Fringe-Eared Oryx, which are clustered and sparsely distributed. The study region was the Amboseli-West Kilimanjaro and Magadi-Natron cross-border landscape between Kenya and Tanzania. The study region was partitioned into different primary sampling units, each containing secondary sampling units of different sizes. Results show that the two-stage adaptive cluster sampling design is more efficient than simple random sampling and the conventional two-stage sampling design, and that it is less variable than the conventional two-stage sampling design.
Keywords: non-overlapping scheme; cluster sampling; Horvitz-Thompson estimator
Scaling up the DBSCAN Algorithm for Clustering Large Spatial Databases Based on Sampling Technique (cited by: 9)
2
Authors: Guan Ji hong 1, Zhou Shui geng 2, Bian Fu ling 3, He Yan xiang 1 | 1. School of Computer, Wuhan University, Wuhan 430072, China; 2. State Key Laboratory of Software Engineering, Wuhan University, Wuhan 430072, China; 3. College of Remote Sensin | Wuhan University Journal of Natural Sciences (CAS), 2001, Issue Z1, pp. 467-473 (7 pages)
Clustering, in data mining, is a useful technique for discovering interesting data distributions and patterns in the underlying data, and has many application fields, such as statistical data analysis, pattern recognition, and image processing. We combine a sampling technique with the DBSCAN algorithm to cluster large spatial databases, and two sampling-based DBSCAN (SDBSCAN) algorithms are developed. One algorithm introduces the sampling technique inside DBSCAN, and the other uses a sampling procedure outside DBSCAN. Experimental results demonstrate that our algorithms are effective and efficient in clustering large-scale spatial databases.
Keywords: spatial databases; data mining; clustering; sampling; DBSCAN algorithm
Efficiency of the Adaptive Cluster Sampling Designs in Estimation of Rare Populations
3
Authors: Charles Mwangi, Ali Islam, Luke Orawo | Open Journal of Statistics, 2014, Issue 5, pp. 412-418 (7 pages)
Adaptive cluster sampling (ACS) has been a very important tool for estimating population parameters of rare and clustered populations. The fundamental idea behind this sampling plan is to select an initial sample from a defined population and to keep sampling in the vicinity of those units in the initial sample that satisfy the condition that at least one characteristic of interest exists in the unit. Despite being an important tool for sampling rare and clustered populations, the adaptive cluster sampling design cannot control the final sample size when no prior knowledge of the population is available. Adaptive cluster sampling with a data-driven stopping rule (ACS’) was therefore proposed to control the final sample size when prior knowledge of the population structure is not available. This study examined the behavior of the HT and HH estimators under the ACS design and the ACS’ design using an artificial population designed to have all the characteristics of a rare and clustered population. The efficiencies of the HT and HH estimators were used to determine the most efficient design for estimating the population mean in a rare and clustered population. Results for both the simulated data and the real data show that adaptive cluster sampling with a stopping rule is more efficient for estimation in rare and clustered populations than ordinary adaptive cluster sampling.
Keywords: adaptive cluster sampling with stopping rule (ACS’); ordinary adaptive cluster sampling (ACS); Horvitz-Thompson estimator (HT); Hansen-Hurwitz estimator (HH); relative efficiency
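The HH and HT estimators named in this entry can be illustrated with a minimal sketch. It follows the standard textbook forms for ordinary ACS with an initial simple random sample drawn without replacement, not necessarily the exact variants or the stopping-rule design (ACS’) studied in the paper. The function names, the `network_of` mapping, and the toy population are assumptions made for illustration only.

```python
from math import comb

def hansen_hurwitz_mean(initial_sample, network_of, y):
    """Modified Hansen-Hurwitz estimate of the population mean under ACS.

    initial_sample: unit ids drawn in the initial sample (hypothetical input)
    network_of:     dict mapping each unit id to the frozenset of units in its network
    y:              dict mapping each unit id to its observed value
    """
    n = len(initial_sample)
    # Each initial unit contributes the mean of y over its network.
    network_means = [
        sum(y[u] for u in network_of[i]) / len(network_of[i])
        for i in initial_sample
    ]
    return sum(network_means) / n

def horvitz_thompson_mean(initial_sample, network_of, y, N):
    """Modified Horvitz-Thompson estimate of the population mean under ACS.

    N is the total number of units in the population.
    """
    n = len(initial_sample)
    distinct_networks = {network_of[i] for i in initial_sample}
    total = 0.0
    for net in distinct_networks:
        # Probability that the initial sample intersects this network.
        alpha = 1 - comb(N - len(net), n) / comb(N, n)
        total += sum(y[u] for u in net) / alpha
    return total / N

# Toy population of N = 5 units; units 2 and 3 form one network,
# every other unit is a network of size one.
y = {0: 0, 1: 0, 2: 3, 3: 5, 4: 0}
pair = frozenset({2, 3})
network_of = {0: frozenset({0}), 1: frozenset({1}), 2: pair, 3: pair, 4: frozenset({4})}
sample = [0, 2]

hh = hansen_hurwitz_mean(sample, network_of, y)
ht = horvitz_thompson_mean(sample, network_of, y, N=5)
print(hh, round(ht, 3))
```

Edge units that do not satisfy the condition are treated as networks of size one, which is why unit 0 contributes zero to both estimates here.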
Estimating a Finite Population Mean under Random Non-Response in Two Stage Cluster Sampling with Replacement
4
Authors: Nelson Kiprono Bii, Christopher Ouma Onyango, John Odhiambo | Open Journal of Statistics, 2017, Issue 5, pp. 834-848 (15 pages)
Non-response is a regular occurrence in sample surveys. Developing estimators when non-response exists may result in large biases when estimating population parameters. In this paper, a finite population mean is estimated when non-response occurs randomly under two-stage cluster sampling with replacement. It is assumed that non-response arises in the survey variable in the second stage of cluster sampling. A weighting method of compensating for non-response is applied. Asymptotic properties of the proposed estimator of the population mean are derived. Under mild assumptions, the estimator is shown to be asymptotically consistent.
Keywords: non-response; Nadaraya-Watson estimation; two-stage cluster sampling
A New Estimator Using Auxiliary Information in Stratified Adaptive Cluster Sampling
5
Authors: Nipaporn Chutiman, Monchaya Chiangpradit, Sujitta Suraphee | Open Journal of Statistics, 2013, Issue 4, pp. 278-282 (5 pages)
In this paper, we study estimators of the population mean in stratified adaptive cluster sampling that use the information of the auxiliary variable. Simulations showed that if the variable of interest (y) and the auxiliary variables (x, z) have a high positive correlation, then the estimated mean square error of the ratio estimators is less than the estimated mean square error of the product estimator. The estimators that use only one auxiliary variable were better than the estimators that use two auxiliary variables.
Keywords: stratified adaptive cluster sampling; auxiliary variable; ratio estimator; product estimator
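Clustering of Discrete Data Based on Dependency Structure and Gibbs Sampling
6
Authors: 王双成, 俞时权, 程新章 | 计算机工程 (Computer Engineering), CAS, CSCD, PKU Core, 2006, Issue 9, pp. 28-30 (3 pages)
A new clustering method for discrete data is established. It combines the dependency structure among variables with Gibbs sampling to cluster discrete data, which can significantly improve sampling efficiency and avoids the problems that arise when the EM algorithm is used for clustering. Experimental results show that the method clusters discrete data effectively.
Keywords: clustering; discrete data; dependency structure; Gibbs sampling; MDL criterion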
Over-sampling algorithm for imbalanced data classification (cited by: 5)
7
Authors: XU Xiaolong, CHEN Wen, SUN Yanfei | Journal of Systems Engineering and Electronics, SCIE, EI, CSCD, 2019, Issue 6, pp. 1182-1191 (10 pages)
For imbalanced datasets, the focus of classification is to identify samples of the minority class. The performance of current data mining algorithms is not good enough for processing imbalanced datasets. The synthetic minority over-sampling technique (SMOTE) is specifically designed for learning from imbalanced datasets, generating synthetic minority class examples by interpolating between nearby minority class examples. However, SMOTE encounters the overgeneralization problem. The density-based spatial clustering of applications with noise (DBSCAN) is not rigorous when dealing with samples near the borderline. We optimize the DBSCAN algorithm for this problem to make clustering more reasonable. This paper integrates the optimized DBSCAN and SMOTE, and proposes a density-based synthetic minority over-sampling technique (DSMOTE). First, the optimized DBSCAN is used to divide the samples of the minority class into three groups (core samples, borderline samples and noise samples), and then the noise samples of the minority class are removed to synthesize more effective samples. In order to make full use of the information of core samples and borderline samples, different strategies are used to over-sample core samples and borderline samples. Experiments show that DSMOTE achieves better results than SMOTE and Borderline-SMOTE in terms of precision, recall and F-value.
Keywords: imbalanced data; density-based spatial clustering of applications with noise (DBSCAN); synthetic minority over-sampling technique (SMOTE); over-sampling
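The interpolation step that this abstract attributes to SMOTE can be shown with a small sketch. It covers plain SMOTE only, not the DSMOTE method proposed in the paper: there is no DBSCAN grouping and no noise removal. The function name `smote_sketch`, the parameters, and the toy data are assumptions for illustration.

```python
import math
import random

def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority points by SMOTE-style interpolation.

    minority: list of equal-length numeric tuples (at least two points)
    k:        number of nearest minority neighbours to choose from
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x by Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        nn = rng.choice(neighbours)
        lam = rng.random()  # interpolation weight in [0, 1)
        # New point lies on the segment between x and the chosen neighbour.
        synthetic.append(tuple(a + lam * (b - a) for a, b in zip(x, nn)))
    return synthetic

minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(minority, n_new=10, k=2, seed=1)
print(len(new_points))
```

Because every synthetic point lies on a segment between two existing minority points, isolated noise points can pull new samples into majority regions, which is the overgeneralization issue the abstract mentions.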
Cross-classes domain inference with network sampling for natural resource inventory (cited by: 1)
8
Authors: Zhengyang Hou, Ronald E. McRoberts, Chunyu Zhang, Göran Ståhl, Xiuhai Zhao, Xuejun Wang, Bo Li, Qing Xu | Forest Ecosystems, SCIE, CSCD, 2022, Issue 3, pp. 311-322 (12 pages)
There are two distinct types of domains, design domains and cross-classes domains, with the former extensively studied under the topic of small-area estimation. In natural resource inventory, however, most classes listed in the condition tables of national inventory programs are characterized as cross-classes domains, such as vegetation type, productivity class, and age class. To date, challenges remain for inventorying cross-classes domains because these domains usually have an unknown sampling frame and spatial distribution, with the result that inference relies on population-level as opposed to domain-level sampling. Multiple challenges are noteworthy: (1) efficient sampling strategies are difficult to develop because there is little a priori information about the target domain; (2) domain inference relies on a sample designed for the population, so within-domain sample sizes could be too small to support precise estimation; and (3) increasing the sample size for the population does not ensure an increase for the domain, so the actual sample size for a target domain remains highly uncertain, particularly for small domains. In this paper, we introduce a design-based generalized systematic adaptive cluster sampling (GSACS) for inventorying cross-classes domains. Design-unbiased Hansen-Hurwitz and Horvitz-Thompson estimators are derived for domain totals and compared within GSACS and with systematic sampling (SYS). Comprehensive Monte Carlo simulations show that (1) the GSACS Hansen-Hurwitz and Horvitz-Thompson estimators are unbiased and equally efficient, whereas the latter outperforms the former for supporting a sample of size one; (2) SYS is a special case of GSACS, while the latter outperforms the former in terms of increased efficiency and reduced intensity; (3) the GSACS Horvitz-Thompson variance estimator is design-unbiased for a single SYS sample; and (4) rules of thumb summarized with respect to sampling design and spatial effect improve precision. Because inventorying a mini domain is analogous to inventorying a rare variable, alternative network sampling procedures are also readily available for inventorying cross-classes domains.
Keywords: cross-classes domain estimation; design-based inference; network sampling; generalized systematic adaptive cluster sampling; forest inventory
Defense Against Poisoning Attack via Evaluating Training Samples Using Multiple Spectral Clustering Aggregation Method (cited by: 1)
9
Authors: Wentao Zhao, Pan Li, Chengzhang Zhu, Dan Liu, Xiao Liu | Computers, Materials & Continua, SCIE, EI, 2019, Issue 6, pp. 817-832 (16 pages)
Defense techniques for machine learning are critical yet challenging because the number and types of attacks on widely applied machine learning algorithms are increasing significantly. Among these attacks, the poisoning attack, which disturbs machine learning algorithms by injecting poisoning samples, poses the greatest threat. In this paper, we focus on analyzing the characteristics of poisoning samples and propose a novel sample evaluation method, tailored to those characteristics, to defend against the poisoning attack. To capture the intrinsic data characteristics from heterogeneous aspects, we first evaluate training data by multiple criteria, each of which is reformulated from a spectral clustering. Then, we integrate the multiple evaluation scores generated by the multiple criteria through the proposed multiple spectral clustering aggregation (MSCA) method. Finally, we use the unified score as the indicator of poisoning attack samples. Experimental results on intrusion detection data sets show that MSCA significantly outperforms K-means outlier detection in terms of data legality evaluation and poisoning attack detection.
Keywords: poisoning attack; sample evaluation; spectral clustering; ensemble learning
Comparison of Survey Sampling Methods for Estimation of Vaccination Coverage in an Urban Setup of Assam, India
10
Authors: Dilip C. Nath, Bhushita Patowari | Health, 2015, Issue 11, pp. 1578-1590 (13 pages)
Background: Immunization protects a large number of children each year. The burden of vaccine-preventable diseases remains high in developing countries compared to developed countries. To overcome this burden, different types of immunization programs have been implemented. For better immunization coverage in developing countries, considerable progress is still needed to improve knowledge and awareness of the importance of vaccines. In this study, a comparative study of immunization coverage under two sampling methods has been performed. Methods: The variance and design effect of the proportion of children vaccinated with different types of vaccines (BCG, OPV, DPT, Hepatitis B, Hib, Measles and MMR) are estimated under two-stage (30 × 30) cluster sampling and systematic sampling in order to compare these two survey sampling methods. The homogeneity of clusters has also been tested using the chi-square test. Results: BCG, OPV and DPT vaccination coverage is more than 90%, whereas Hepatitis B, Measles, Hib and MMR vaccination coverage is only between 50% and 64%. Here systematic random sampling is more complicated than two-stage (30 × 30) cluster sampling. The results also show that the clusters are homogeneous with respect to the proportion of children vaccinated. Conclusion: There is no significant difference between the two survey methodologies regarding the point estimates of vaccination coverage, but the estimated variances of vaccination coverage are smaller under two-stage (30 × 30) cluster sampling than under systematic sampling. The clusters are also homogeneous. Very little improvement in full vaccination coverage has been observed compared with the previous study. From the study it can be said that two-stage (30 × 30) cluster sampling is to be preferred to systematic sampling and simple random sampling.
Keywords: vaccine coverage; cluster sampling; systematic sampling; design effect; Marascuilo procedure
Spatial distribution and sampling of larvae of Phthorimaea operculella (Zeller) and its harmfulness
11
Authors: 马继盛, 杨效文, 徐广, 赵世民, 董金川, 李三立 | 华北农学报 (Acta Agriculturae Boreali-Sinica), CSCD, PKU Core, 1994, Issue S2, pp. 61-66 (6 pages)
Studies were conducted in tobacco fields in 1993. The results showed that the distribution pattern of the larvae was aggregative, and that the aggregation did not change with larval population density. In terms of vertical distribution on tobacco plants, more larvae were found on the lower leaves than on the upper leaves. The difference in population density between tobacco fields at elevations of 490 meters and 900 meters was not significant. The required number of samples at different precision levels was given using a two-stage sampling technique. The average leaf area loss caused by the larvae in tobacco fields was 12.654 cm².
Keywords: Phthorimaea operculella (Zeller); spatial distribution; two-stage sampling; elevation; leaf area loss
Random Route and Quota Sampling: Do They Offer Any Advantage over Probability Sampling Methods?
12
Authors: Vidal Díaz de Rada, Valentín Martínez Martín | Open Journal of Statistics, 2014, Issue 5, pp. 391-401 (11 pages)
The aim of this paper is to compare sample quality across two probability samples and one sample that uses probabilistic cluster sampling combined with random route and quota sampling within the selected clusters to define the ultimate survey units. All of them use the face-to-face interview as the survey procedure. The hypothesis to be tested is that a combination of random route sampling and quota sampling (with substitution) can achieve the same degree of representativeness as household sampling (without substitution) based on the municipal register of inhabitants. We found marked differences in the age and gender distribution of the probability sampling, where the deviations exceed 6%. A different picture emerges when comparing the employment variables, where the quota sampling overestimates the economic activity rate (2.5%) and the unemployment rate (8%) and underestimates the employment rate (3.46%).
Keywords: sampling methods; random sampling; multistage cluster sampling; random route method; quota sampling
Two-Stage Negative Adaptive Cluster Sampling
13
Authors: R. V. Latpate, J. K. Kshirsagar | Communications in Mathematics and Statistics, SCIE, 2020, Issue 1, pp. 1-21 (21 pages)
If the population is rare and clustered, then simple random sampling gives a poor estimate of the population total. For such populations, adaptive cluster sampling is useful, but it loses control of the final sample size, so the cost of sampling increases substantially. To overcome this problem, surveyors often use auxiliary information that is easy and inexpensive to obtain. An attempt is made to control the final sample size through the auxiliary information. In this article, we propose a two-stage negative adaptive cluster sampling design. It is a new design that combines two-stage sampling and negative adaptive cluster sampling. In this design, we consider an auxiliary variable that is highly negatively correlated with the variable of interest and for which the auxiliary information is completely known. In the first stage of this design, an initial random sample is drawn using the auxiliary information. Further, using Thompson’s (J Am Stat Assoc 85:1050-1059, 1990) adaptive procedure, networks in the population are discovered. These networks serve as the primary-stage units (PSUs). In the second stage, random samples of unequal sizes are drawn from the PSUs to obtain the secondary-stage units (SSUs). The values of the auxiliary variable and the variable of interest are recorded for these SSUs. A regression estimator is proposed to estimate the population total of the variable of interest. A new estimator, the composite Horvitz-Thompson (CHT)-type estimator, is also proposed. It is based only on the information on the variable of interest. Variances of these two estimators, along with their unbiased estimators, are derived. Using the proposed methodology, a sample survey was conducted in the Western Ghats of Maharashtra, India. The performance of these estimators and of the methodology is presented and compared with other existing methods. A cost-benefit analysis is given.
Keywords: adaptive cluster sampling; two-stage cluster sampling; negative adaptive cluster sampling; two-stage NACS; regression estimator
Modifications on the Strand’s Sampling Method Applied to Stands of Pinus elliottii Engelm
14
Authors: Sylvio Péllico Netto, Doádi Antonio Brena, Angelo Augusto Ebling, Aurélio Lourenço Rodrigues | Journal of Applied Mathematics and Physics, 2014, Issue 7, pp. 593-602 (10 pages)
This work was carried out with the objective of proposing some changes to Strand’s sampling method, in which trees are selected in sampling units with probability proportional to their diameter for the calculation of stand density and basal area, and proportional to their height for the calculation of volume per hectare. Data used to evaluate the efficiency of Strand sampling in clusters were collected in stands of Pinus elliottii Engelm located in a National Forest in Rio Grande do Sul State, Brazil. In the course of this research it was proposed to convert the sampling unit into a cluster, which is structurally more efficient for obtaining consistent estimates of volume and dominant height, using volumetric equivalence, which results in a form factor equal to one for the final calculation of volume per hectare, together with an indirect method for obtaining Lorey’s mean height. The objectives of this study were achieved because, with this methodology, it is not necessary to measure tree heights in the sampling unit, except for one dominant height per cluster to evaluate sites. The development of independent estimators for basal area and volume gave rise to the proposition of an estimator for Lorey’s mean height that does not require measuring any tree height in the sample. The proposed methodology is an attractive solution for reducing costs in forest inventories, with the ability to provide greater accuracy and scope of information at the compartment level without increasing the cost of sampling compared with fixed-area units. The use of smaller permanent sampling units at higher intensity in the compartments before the final cut will substantially increase the precision of the estimators in these management units, which will make it possible to eliminate the pre-cut inventory in forest enterprises.
Keywords: cluster sampling; PPS sampling; forest inventory
Index-adaptive Triangle-Based Graph Local Clustering
15
Authors: Yuan Zhe, Wei Zhewei, Wen Ji-rong | Computers, Materials & Continua, SCIE, EI, 2023, Issue 6, pp. 5009-5026 (18 pages)
Motif-based graph local clustering (MGLC) algorithms are generally designed with a two-phase framework, which computes the motif weight for each edge beforehand and then runs the local clustering algorithm on the weighted graph to output the result. Although correct, this framework brings both practical and theoretical limitations and is less applicable in real interactive situations. This research develops a purely local and index-adaptive method, Index-adaptive Triangle-based Graph Local Clustering (TGLC+), to solve the MGLC problem w.r.t. triangles. TGLC+ adaptively combines the approximate Monte Carlo method Triangle-based Random Walk (TRW) and the deterministic brute-force method Triangle-based Forward Push (TFP) to estimate the Personalized PageRank (PPR) vector without calculating the exact triangle-weighted transition probability, and then outputs the clustering result by conducting the standard sweep procedure. This paper establishes the efficiency of TGLC+ through theoretical analysis and demonstrates its effectiveness through extensive experiments. To our knowledge, TGLC+ is the first method to solve the MGLC problem without computing the motif weight beforehand, thus achieving better efficiency with comparable effectiveness. TGLC+ is suitable for large-scale and interactive graph analysis tasks, including visualization, system optimization, and decision-making.
Keywords: graph local clustering; triangle motif; sampling method
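Active Learning Combined with Clustering Boundary Sampling
16
Authors: 胡峰, 李路正, 代劲, 刘群 | 智能系统学报 (CAAI Transactions on Intelligent Systems), CSCD, PKU Core, 2024, Issue 2, pp. 482-492 (11 pages)
Active learning is a machine learning approach that requires selecting the most valuable samples for labeling. In practice it currently faces several challenges: it relies on the prior assumptions of the classifier, which can easily cause unexpected drops in classifier performance, and it needs a certain number of samples as a start-up condition. Clustering can reduce the scale of the problem and is an effective tool for active learning. This work therefore studies active learning combined with density-clustering boundary sampling. For cluster boundary regions, where classification errors are likely, a density peak clustering boundary-point sampling method based on computed sample density is proposed. On this basis, density entropy is defined and used to search cluster boundary regions heuristically, yielding an active learning method based on clustering boundary sampling. Experimental results show that, compared with five active learning algorithms from the literature, the algorithm achieves equal or even higher classification performance with fewer labels and is an effective active learning algorithm. When the number of labels is below 20% of the total number of unlabeled samples, the algorithm achieves good results on metrics such as Accuracy and F-score.
Keywords: active learning; machine learning; clustering boundary; density peak clustering; geometric sampling; information entropy; version space; active clustering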
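A Weighted Reverse Nearest Neighbor Density Peak Clustering Algorithm for Data with Uneven Density Distribution
17
Authors: 吕莉, 陈威, 肖人彬, 韩龙哲, 谭德坤 | 智能系统学报 (CAAI Transactions on Intelligent Systems), CSCD, PKU Core, 2024, Issue 1, pp. 165-175 (11 pages)
For data with uneven density distribution, the density peak clustering algorithm tends to ignore differences in sparseness between the samples of different clusters, which leads to wrongly selected cluster centers, and its assignment strategy tends to assign samples in sparse regions to dense regions, which leads to poor clustering. To address these problems, this paper proposes a weighted reverse nearest neighbor density peak clustering algorithm for data with uneven density distribution. The algorithm first introduces a weight coefficient based on the sigmoid function into the local density formula to increase the weight of samples in sparse regions and, drawing on the reverse nearest neighbor idea, redefines the local density of a sample, which effectively improves the identification rate of cluster centers. Second, it introduces an improved sample similarity strategy that uses the reverse nearest neighbor and shared reverse nearest neighbor information between samples so that samples in the same cluster have higher similarity, which effectively reduces the misassignment of samples in sparse regions. Comparative experiments on datasets with uneven density distribution, datasets with complex shapes, and UCI datasets show that the proposed algorithm clusters better than the IDPC-FA, FNDPC, FKNN-DPC, DPC and DPCSA algorithms.
Keywords: density peak clustering; uneven density distribution; reverse nearest neighbor; shared reverse nearest neighbor; sample similarity; local density; assignment strategy; data mining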
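Small-Sample Bearing Fault Diagnosis Based on Fuzzy Clustering and an Improved Densenet Network
18
Authors: 魏文军, 张轩铭, 杨立本 | 哈尔滨工业大学学报 (Journal of Harbin Institute of Technology), EI, CAS, CSCD, PKU Core, 2024, Issue 3, pp. 154-163 (10 pages)
In practice, bearing fault data are scarce and cannot meet the large training-data requirements of deep learning models. Exploiting the ability of convolutional neural networks to extract fine features and the fact that fuzzy clustering can classify without training, a small-sample bearing fault diagnosis method based on fuzzy clustering and an improved Densenet network is proposed. First, the classification part of a pretrained, fine-tuned Densenet network is removed so that only the feature extraction layers remain, and a dimension-adaptive global average pooling (GAP) layer is designed to replace the fully connected (FC) layer. Second, fuzzy clustering replaces the softmax classification layer of the Densenet network, so classification is completed without training. Experimental results show that the algorithm trains the GAP parameters of the network with small-sample data, which greatly reduces the number of training samples the model needs. During diagnosis, bearing time-domain images are fed into the network, and the GAP layer outputs 1,920 feature values. The feature data of different fault states form a feature vector matrix, from which the fuzzy clustering method derives a fuzzy similarity matrix and a fuzzy equivalence matrix. As the confidence factor varies from large to small, a dynamic clustering diagram is obtained from the corresponding Boolean matrices, thereby classifying the bearing faults.
Keywords: small sample; global average pooling layer; transfer learning; fuzzy clustering; fault diagnosis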
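The fuzzy-clustering step in this abstract (similarity matrix, equivalence matrix, Boolean matrix at a chosen confidence factor) can be sketched with the classical transitive-closure approach. This is an assumption about the general technique, not the paper's exact procedure, and the 3 × 3 similarity matrix and function names below are hypothetical; no Densenet features are involved.

```python
def max_min_compose(a, b):
    """Max-min composition of two square fuzzy relation matrices."""
    n = len(a)
    return [
        [max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

def transitive_closure(r):
    """Square a reflexive, symmetric fuzzy similarity matrix until it stabilises.

    The fixed point is the fuzzy equivalence matrix.
    """
    while True:
        r2 = max_min_compose(r, r)
        if r2 == r:
            return r
        r = r2

def lambda_cut_clusters(r_star, lam):
    """Group samples whose equivalence degree is at least lam (the Boolean lambda-cut)."""
    n = len(r_star)
    clusters, seen = [], set()
    for i in range(n):
        if i in seen:
            continue
        group = [j for j in range(n) if r_star[i][j] >= lam]
        seen.update(group)
        clusters.append(group)
    return clusters

# Hypothetical similarity matrix for three samples.
similarity = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.4],
    [0.2, 0.4, 1.0],
]
equivalence = transitive_closure(similarity)

# Lowering the threshold merges clusters, which gives the dynamic clustering picture.
for lam in (1.0, 0.8, 0.4):
    print(lam, lambda_cut_clusters(equivalence, lam))
```

Max-min composition only selects values already present in the matrix, so the equality check used to detect the fixed point is exact even with floating-point entries.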
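Automatic Clustering and Batch Labeling of Dynamometer Cards of Pumping-Unit Wells Based on Unsupervised Learning
19
Authors: 王相, 邵志伟, 张雷, 张中慧, 肖姝 | 中国科技论文 (China Sciencepaper), CAS, 2024, Issue 1, pp. 63-69 (7 pages)
To make full use of large numbers of unlabeled samples and to save labor and time, a method for automatic clustering and batch labeling of dynamometer cards of pumping-unit wells based on unsupervised learning is proposed. First, the displacement and load data produced by the reciprocating motion of the pumping unit's horsehead are converted into dynamometer card image samples, with displacement on the horizontal axis and load on the vertical axis. Second, a convolutional neural network model pretrained on ImageNet, which carries a set of weight parameters and has strong feature extraction capability, is loaded. Third, the fully connected layers of the network are removed and the network is used to extract features from the dynamometer card images. Finally, the k-means clustering algorithm is applied to the extracted features, and cards with similar features are gathered into the same folder. The clustering results are then labeled quickly in batches, forming a dynamometer card sample set for fault diagnosis of pumping-unit wells. In the experiment, 20,000 dynamometer card records were randomly collected from 100 pumping-unit wells. The results show that the method is fast and accurate, provides an efficient way to label dynamometer card sample sets, and serves as a demonstration of how to exploit the value of oilfield big data.
Keywords: pumping unit; dynamometer card; fault diagnosis; k-means clustering; sample labeling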
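The final clustering step described in this abstract can be sketched with a plain Lloyd-style k-means. The two-dimensional points below stand in for CNN feature vectors and are purely hypothetical, as are the function name and parameters. The paper's feature extraction and folder-based batch labeling are not reproduced here.

```python
import math
import random

def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's k-means on a list of equal-length numeric tuples.

    Returns (labels, centroids). Initial centroids are k distinct input points.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Hypothetical "feature vectors": two well-separated groups.
features = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids = kmeans(features, k=2)
print(labels)
```

Samples sharing a label would then be reviewed and labeled together, which is the batch-labeling idea in the abstract. Cluster indices themselves are arbitrary and depend on the random initialization.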
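A Clustering Algorithm for Irregular Marine Geological Sample Test and Analysis Data Based on Spatial Interpolation
20
Authors: 邵长高, 严镔, 陈秋 | 热带海洋学报 (Journal of Tropical Oceanography), CAS, CSCD, PKU Core, 2024, Issue 2, pp. 166-172 (7 pages)
Marine geological surveys yield large volumes of test and analysis data from marine sediment core samples. Because samples are tested for different purposes, the sampling depths of the core data differ, so the geological sampling data appear as irregularly scattered points in three-dimensional space. Traditional clustering algorithms cannot cluster such irregular scattered data in three-dimensional space. To address this, the paper designs a clustering algorithm for irregular geological sample test and analysis data based on spatial interpolation, which reduces the three-dimensional scattered test data to two-dimensional data before clustering. The algorithm handles the imbalance of test data within geological bodies well and provides a basic technical method for marine geological big data analysis.
Keywords: geological sampling; experimental testing; clustering algorithm; spatial interpolation; three-dimensional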