Similar literature
1.
Microarray data contain a large number of genes (usually more than 1000) and a relatively small number of samples (usually fewer than 100). This presents problems for discriminant analysis of microarray data. One way to alleviate the problem is to reduce the dimensionality of the data by selecting genes that are important to the discriminant problem. Gene selection can be cast as a feature selection problem in the context of pattern classification. Feature selection approaches are broadly grouped into filter methods and wrapper methods. The wrapper method outperforms the filter method but at the cost of more intensive computation. In the present study, we proposed a wrapper-like gene selection algorithm based on the Regularization Network. Compared with the classical wrapper method, the computational cost of our gene selection algorithm is significantly reduced, because the evaluation criterion we proposed does not demand repeated training in the leave-one-out procedure.
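The abstract does not give the authors' evaluation criterion, so the following is only a sketch of why leave-one-out evaluation need not require retraining for regularized least-squares models of the Regularization Network kind. It relies on the standard identity that the leave-one-out residual equals the ordinary residual divided by (1 - H_ii), where H = K (K + lam I)^{-1}. The Gaussian kernel, the parameter names `lam` and `gamma`, and all function names are assumptions made for illustration; the brute-force version is included only to check the shortcut.

```python
import math

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian kernel between two feature vectors (an assumed kernel choice)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting.
    A is n x n, B is n x m (lists of lists); returns X as an n x m list."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]

def regularized_gram(X, lam, gamma):
    """Return the Gram matrix K and the regularized matrix K + lam * I."""
    n = len(X)
    K = [[rbf_kernel(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    A = [[K[i][j] + (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
    return K, A

def loo_residuals_closed_form(X, y, lam, gamma=1.0):
    """Leave-one-out residuals from a single fit on all samples."""
    n = len(X)
    K, A = regularized_gram(X, lam, gamma)
    # K and A commute, so A^{-1} K equals the hat matrix H = K A^{-1}.
    H = solve(A, K)
    fitted = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
    return [(y[i] - fitted[i]) / (1.0 - H[i][i]) for i in range(n)]

def loo_residuals_brute_force(X, y, lam, gamma=1.0):
    """Reference version: retrain n times, each time leaving one sample out."""
    residuals = []
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        _, A = regularized_gram(X_train, lam, gamma)
        alpha = solve(A, [[v] for v in y_train])
        pred = sum(alpha[a][0] * rbf_kernel(X_train[a], X[i], gamma)
                   for a in range(len(X_train)))
        residuals.append(y[i] - pred)
    return residuals
```

In a wrapper-like search, a candidate gene subset could be scored by the sum of squared closed-form residuals computed on the columns of that subset, at the cost of one linear solve per subset instead of one per left-out sample.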

2.
The paper discusses the possibility of implementing a minimum risk classifier using the learning machine approach. Necessary conditions on the choice of pairwise classification costs are imposed so that the minimum risk classifier can be implemented using pairwise class separating functions. Parameters of these functions are obtained using a two stage algorithm which minimizes a modified least squares criterion of class separation. In comparison to normal least squares objective function, this criterion increases the sensitivity of the learning scheme near the class separating surface and, consequently, allows for an improvement in the performance of the discriminant function decision making processor. Simplicity of the design procedure is achieved by partitioning the multimodal classes into unimodal subsets, since discriminant functions of unimodal classes can usually be implemented simply and with sufficient accuracy as low order polynomials. The proposed design approach is tested experimentally on an artificial pattern recognition problem.

3.
By examining the morphometry (i.e., length, width, length-to-width ratio, and volume) of pellets from three different categories (adult males, adult females, and yearlings) of mule deer (Odocoileus hemionus), we were able to distinguish the age and sex of these animals via discriminant function and fuzzy clustering analyses. To determine a priori the identity of the pellet samples and evaluate the accuracy of our methods, we obtained samples from individuals in captivity. The discriminant function allowed us to correctly assign 100% of adult males, 91.66% of adult females, and 75% of yearlings to an age class, using previous information. The fuzzy clustering method enabled us to correctly distinguish 100% of adult males, 83.3% of adult females, and 75% of yearlings. The methods are based upon different assumptions. An important assumption with the discriminant function method is that the membership of each pellet group must be established a priori. This may be a disadvantage in certain cases, such as when pellet samples are gathered for an indirect population assessment procedure. Despite this drawback, however, both methods appear to be highly accurate. Zoo Biol 23:139–146, 2004. © 2004 Wiley-Liss, Inc.

4.
Sexual differences in morphology, ranging from subtle to extravagant, occur commonly in many animal species. These differences can encompass overall body size (sexual size dimorphism, SSD) or the size and/or shape of specific body parts (sexual body component dimorphism, SBCD). Interacting forces of natural and sexual selection shape much of the expression of dimorphism we see, though non-adaptive processes may be involved. Differential scaling of individual features can result when selection favors either exaggerated (positive allometry) or reduced (negative allometry) size during growth. Studies of sexual dimorphism and character scaling rely on multivariate models that ideally use an unbiased reference character as an overall measure of body size. We explored several candidate reference characters in a cryptically dimorphic taxon, Hadrurus arizonensis. In this scorpion, essentially every body component among the 16 we examined could be interpreted as dimorphic, but identification of SSD and SBCD depended on which character was used as the reference (prosoma length, prosoma area, total length, principal component 1, or metasoma segment 1 width). Of these characters, discriminant function analysis suggested that metasoma segment 1 width was the most appropriate. The pattern of dimorphism in H. arizonensis mirrored that seen in other more obviously dimorphic scorpions, with static allometry trending towards isometry in most characters. Our findings are consistent with the conclusions of others that fecundity selection likely favors a larger prosoma in female scorpions, whereas sexual selection may favor other body parts being larger in males, especially the metasoma, pectines, and possibly the chela. For this scorpion and probably most other organisms, the choice of reference character profoundly affects interpretations of SSD, SBCD, and allometry. 
Thus, researchers need to broaden their consideration of an appropriate reference and exercise caution in interpreting findings. We highly recommend use of discriminant function analysis to identify the least-biased reference character.

5.
A new multiple trait strategy based on discriminant analysis was studied for efficient detection of linked QTL in outbred sib families, in comparison with a multivariate likelihood technique. The discriminant analysis technique describes the segregation of a linear combination of the traits in a univariate likelihood. This combination is calculated for each pair of positions depending on the inheritance of the pairs of QTL haplotypes in the progeny. The gains in power and accuracy for position estimations of multiple trait methods in grid searches were evaluated in reference to single trait detections of linked QTL. The methods were applied to simulated designs with two correlated traits submitted to various effects from the linked QTL. Multiple trait strategies were generally more powerful and accurate than the single trait technique. Linked QTL were distinguished when they were separated enough to identify informative recombinations: at least two genetic markers and 25 cM between the QTL under the simulated conditions. Except in a particular case, discriminant analysis was at least as powerful as the multivariate technique and its implementation was five times faster. Combining the advantages from both methodologies, we finally propose a complete strategy for rapid and efficient systematic multivariate detections in outbred populations.

6.
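Based on the quantitative traits of cotton, 17 parents were grouped into 4 classes by cluster analysis. Based on the comprehensive traits of cotton and the distances between traits, two schemes for selecting and pairing parents are given. Experimental results show that these schemes can improve the efficiency of parent selection in breeding work.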

7.
Abstract. The Northern Iberian Peninsula is dominated by various types of vegetation from deciduous oak and ash to evergreen oak woodlands. A recent vegetation map of Spain portrays vegetation series which are characterized in terms of their phytogeographic region or bioclimatic (altitudinal) belt. The aim of this paper is to determine whether the areas comprised by both phytogeographic regions (Eurosiberian and Mediterranean) in the study area, as established from the phytogeographic characterization of the vegetation, can be discriminated by climatic variables using multivariate methods, and to compare these with other conventional approaches. In addition, bioclimatic (altitudinal) belts and the main vegetation types were tested for discrimination by climatic variables. Conventional climatic criteria as well as discriminant and principal component analysis were applied to climatic data from 205 meteorological stations for which vegetation information had been taken from the vegetation map. Conventional criteria are good predictors of the phytogeographic division (Mediterranean and Eurosiberian regions) in the study area. Results were improved by multiple discriminant analysis based on climatic data of the dry period of the year (June to September). Both regions in the study area can be predicted with over 95 % accuracy. Using the same multivariate procedure and temperature data the bioclimatic (altitudinal) belts of the study area can be predicted with over 90 % accuracy. The main vegetation groups of the study area can also be predicted with over 80 % accuracy. Ordination analysis supported the results of the discriminant analysis. Empirical models have been generated to predict the phytogeographic- and belt character of any station in the area. The significance of the various periods of the year for discriminating regions and belts is evaluated. The responsiveness to climatic events during the year may be region specific. 
This study confirms the strong relationship between climate and vegetation in the Northern Iberian Peninsula, particularly regarding the Eurosiberian-Mediterranean boundary.

8.
Prediction error estimation: a comparison of resampling methods   Total citations: 1, self-citations: 0, citations by others: 1
MOTIVATION: In genomic studies, thousands of features are collected on relatively few samples. One of the goals of these studies is to build classifiers to predict the outcome of future observations. There are three inherent steps to this process: feature selection, model selection and prediction assessment. With a focus on prediction assessment, we compare several methods for estimating the 'true' prediction error of a prediction model in the presence of feature selection. RESULTS: For small studies where features are selected from thousands of candidates, the resubstitution and simple split-sample estimates are seriously biased. In these small samples, leave-one-out cross-validation (LOOCV), 10-fold cross-validation (CV) and the .632+ bootstrap have the smallest bias for diagonal discriminant analysis, nearest neighbor and classification trees. LOOCV and 10-fold CV have the smallest bias for linear discriminant analysis. Additionally, LOOCV, 5- and 10-fold CV, and the .632+ bootstrap have the lowest mean square error. The .632+ bootstrap is quite biased in small sample sizes with strong signal-to-noise ratios. Differences in performance among resampling methods are reduced as the number of specimens available increases. SUPPLEMENTARY INFORMATION: A complete compilation of results and R code for simulations and analyses are available in Molinaro et al. (2005) (http://linus.nci.nih.gov/brb/TechReport.htm).
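The distinction the abstract draws between resubstitution and cross-validated error "in the presence of feature selection" can be sketched as follows. This is not the paper's R code: the nearest-centroid classifier, the ranking of features by absolute difference of class means, the modulo-based fold assignment, and all function names are assumptions chosen to keep the example short. The point illustrated is only that feature selection is repeated inside every training fold, so held-out samples never influence which features are chosen.

```python
def class_means(X, y, label):
    """Per-feature mean of the rows whose class equals `label` (0 or 1).
    Assumes at least one row of that class is present."""
    rows = [x for x, t in zip(X, y) if t == label]
    return [sum(col) / len(rows) for col in zip(*rows)]

def select_features(X, y, k):
    """Rank features by absolute difference of class means; keep the top k."""
    m0, m1 = class_means(X, y, 0), class_means(X, y, 1)
    scores = [abs(a - b) for a, b in zip(m0, m1)]
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]

def fit_predict(X_train, y_train, X_test, k):
    """Select k features on the training data only, then classify each test
    row by its nearest class centroid in the selected feature space."""
    feats = select_features(X_train, y_train, k)
    m0 = class_means(X_train, y_train, 0)
    m1 = class_means(X_train, y_train, 1)
    preds = []
    for x in X_test:
        d0 = sum((x[j] - m0[j]) ** 2 for j in feats)
        d1 = sum((x[j] - m1[j]) ** 2 for j in feats)
        preds.append(0 if d0 <= d1 else 1)
    return preds

def resubstitution_error(X, y, k):
    """Error on the same samples used for selection and training (optimistic)."""
    preds = fit_predict(X, y, X, k)
    return sum(p != t for p, t in zip(preds, y)) / len(y)

def cv_error(X, y, k, n_folds):
    """n_folds-fold CV error with selection redone in each training fold.
    Setting n_folds = len(y) gives leave-one-out cross-validation."""
    errors = 0
    for fold in range(n_folds):
        test_idx = [i for i in range(len(y)) if i % n_folds == fold]
        train_idx = [i for i in range(len(y)) if i % n_folds != fold]
        preds = fit_predict([X[i] for i in train_idx],
                            [y[i] for i in train_idx],
                            [X[i] for i in test_idx], k)
        errors += sum(p != y[i] for p, i in zip(preds, test_idx))
    return errors / len(y)
```

Calling `cv_error(X, y, k, len(y))` yields the LOOCV estimate and `cv_error(X, y, k, 10)` the 10-fold estimate; the .632+ bootstrap discussed in the abstract is not covered by this sketch.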

9.
The paper deals with the optimal Bayes discriminant rule for qualitative variables. The performance of variable selection is investigated under strong assumptions like the restriction to dichotomous variables, which are assumed to be independent or dependent with fixed dependence structure, and all parameters known. Differences in comparison with normal variables in linear discriminant analysis can be shown. This is a further reason for applying special methods of discriminant analysis in the case of qualitative variables.

10.
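Bed-site selection is a behavioral adaptation of wild animals to their ecological environment. To explore seasonal variation in nocturnal bed-site selection by Père David's deer (Elaphurus davidianus), from November 2013 to December 2014 we used tracking surveys and direct observation to record ecological factors at 184 nocturnal bed-site plots and 184 control plots within the fenced area of the Hubei Shishou Milu National Nature Reserve. The results showed that in spring, autumn, and winter the deer bedded at night in woodland habitats with higher concealment, higher herbaceous cover, higher food abundance, and a shorter distance to cover (reeds or woods) (P < 0.05), and that in spring and autumn nocturnal selection did not differ significantly with distance to roads or distance to settlements (P > 0.05). In summer the deer bedded at night in mudflat habitats with lower herbaceous cover, lower food abundance, lower concealment, a shorter distance to cover, a greater distance to roads and settlements, and a shorter distance to water (P < 0.05). In winter they bedded at night in habitats with lower wind speed that were closer to roads and settlements. Discriminant analysis showed that a discriminant function composed of seven factors (herbaceous cover, food abundance, distance to roads, concealment, wind speed, distance to water, and distance to cover) could distinguish the nocturnal bed sites of different seasons, and that bed-site characteristics partially overlapped among seasons, which may be related to seasonal differences in food, water, temperature, and human disturbance. We recommend that the reserve expand the area of its forage base, retain concealed bedding environments for the deer, reduce human disturbance, and control the water level of the old Yangtze River channel.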

11.
Two new diphasmid vectors (lambda SK17 and SK22) and a novel procedure to construct linking libraries are described. A partial filling-in reaction provides counter-selection against false linking clones in the library, and obviates the need for supF selection. The diphasmid vectors, in combination with the novel selection procedure, have been used to construct a chromosome 3 specific NotI linking library from a human chromosome 3/mouse microcell hybrid cell line. The application of the new vectors and the strong biochemical and biological selections resulted in a library of 60,000 NotI linking clones. As practically all of them are real NotI linking clones (no false recombinants) the library represents approximately 3,000 human recombinants (equal to 10-15 genomic equivalents of chromosome 3). Previously published methods for construction of linking libraries are compared with the procedure described in the present paper. The advantages of the new vectors and the novel protocol are discussed.

12.
The aim of this study was to investigate whether the AgNOR technique could be helpful for the cytologic diagnosis of neoplastic and non-neoplastic urinary tract lesions. We analysed the AgNOR pattern in urinary cytology in samples from 70 patients. In every case the average number of silver precipitations per nucleus was counted and the range between the minimum and maximum AgNOR value calculated. Furthermore we noted whether the AgNOR precipitations had a homogeneous or heterogeneous distribution. The diseases were classified in three groups: non-neoplastic lesions, low grade and high grade carcinoma. Linear discriminant analysis (with jack-knife procedure) was performed with the AgNOR parameters as independent variables. The final diagnosis of each patient had been established by histological analysis of bladder biopsies. We obtained a correct classification in 84.3% of the cases. All patients with normal or reactive lesions were correctly classified and only two cases of low grade malignancy were erroneously diagnosed as non-malignant. Five high grade neoplasms had been classified as low grade and four low grade carcinomas had been over-diagnosed as high grade neoplasms. We conclude that a combined qualitative and quantitative AgNOR analysis can be useful in the differential diagnosis of urinary cytology.

13.
With the aim of reliably distinguishing these commercially important species on the basis of external characteristics alone, morphometric techniques were employed on a sample of the four species of Oreochromis (Nyasalapia) described from Lake Malawi: O. karongae, O. lidole, O. saka and O. squamipinnis. Univariate analysis of variance on the ratios of 23 variables to standard length indicated many differences among all species, but there was considerable individual variation, and consequent overlap. Residuals from a regression of each variable on length were employed for multivariate analysis. Cluster analysis on the means of the residuals was used to construct a phenogram which formed the basis for defining a series of dichotomous discriminant analyses. In each discriminant analysis, variables were successively eliminated in the reverse order of the magnitude of their correlation with the discriminant function. The combination of variables producing 95% accuracy of classification was selected, and the discriminant function equations for each step calculated. Some further variables were eliminated by checking for redundancy through analysis of correlations. The resulting equations enable O. karongae to be separated using eight measurements, O. lidole using 10, and O. saka and O. squamipinnis to be distinguished by a combination of 13 measurements.

14.
Man Jin, Yixin Fang. Biometrics, 2011, 67(1): 124-132
Summary: In family studies, canonical discriminant analysis can be used to find linear combinations of phenotypes that exhibit high ratios of between-family to within-family variabilities. But with large numbers of phenotypes, canonical discriminant analysis may overfit. To estimate the predicted ratios associated with the coefficients obtained from canonical discriminant analysis, two methods are developed; one is based on bias correction and the other on cross-validation. Because the cross-validation is computationally intensive, an approximation to the cross-validation is also developed. Furthermore, these methods can be applied to perform variable selection in canonical discriminant analysis. The proposed methods are illustrated with simulation studies and applications to two real examples.
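The quantity that canonical discriminant analysis maximizes here, the ratio of between-family to within-family variability of a linear combination of phenotypes, can be illustrated with a short sketch. The abstract does not define the ratio precisely, so the mean-square form below (between-family mean square over within-family mean square) is an assumption, as are the function name and the data layout. The sketch evaluates the ratio for a given coefficient vector; it does not search for the maximizing coefficients or implement the bias-corrected or cross-validated estimates.

```python
def variance_ratio(families, coef):
    """Ratio of between-family to within-family mean squares for the linear
    combination `coef` of phenotypes.

    families: list of families, each a list of phenotype vectors (lists).
    coef:     one coefficient per phenotype.
    Assumes at least two families and nonzero within-family variation.
    """
    # Project every family member onto the linear combination.
    scores = [[sum(c * v for c, v in zip(coef, member)) for member in fam]
              for fam in families]
    n = sum(len(fam) for fam in scores)   # total number of individuals
    g = len(scores)                       # number of families
    grand_mean = sum(sum(fam) for fam in scores) / n
    fam_means = [sum(fam) / len(fam) for fam in scores]
    ss_between = sum(len(fam) * (m - grand_mean) ** 2
                     for fam, m in zip(scores, fam_means))
    ss_within = sum((s - m) ** 2
                    for fam, m in zip(scores, fam_means) for s in fam)
    return (ss_between / (g - 1)) / (ss_within / (n - g))
```

Evaluating coefficients on the same families used to choose them tends to overstate the ratio when there are many phenotypes, which is the overfitting the abstract describes; a cross-validated estimate would choose `coef` on some families and call `variance_ratio` on the held-out ones.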

15.
The grand game of metazoan phylogeny: rules and strategies   Total citations: 4, self-citations: 0, citations by others: 4
Many cladistic analyses of animal phylogeny have been published by authors arguing that their results are well supported. Comparison of these analyses indicates that there can be as yet no general consensus about the evolution of the animal phyla. We show that the various cladistic studies published to date differ significantly in methods of character selection, character coding, scoring and weighting, ground-pattern reconstructions, and taxa selection. These methodological differences are seldom made explicit, which hinders comparison of different studies and makes it impossible to assess a particular phylogeny outside its own scope. The effects of these methodological differences must be considered before we can hope to reach a morphological reference framework needed for effective comparison and combination with the evidence obtained from molecular and developmental genetic studies.

16.
Many animals possess multiple ornaments or behaviours that seem to have evolved via sexual selection. A complete understanding of sexual selection requires an explanation for such multiple traits. The dabbling ducks (Tribe: Anatini) exhibit considerable variation among species in the number of displays in the male courtship repertoire. I tested five hypotheses concerning the evolution of the variation in display repertoire size of dabbling ducks: (1) species recognition, (2) courtship habitat, (3) sexual selection intensity, (4) display media tradeoff and (5) time constraints on pair formation. I tested these hypotheses, using an explicit phylogenetic hypothesis developed from DNA sequences for the dabbling ducks, with two types of statistical comparative methods (discrete and continuous character). The variation observed in male courtship display repertoire size in dabbling ducks was consistent with the courtship habitat and sexual selection intensity hypotheses. Specifically, the size of the display repertoire was larger in species that exhibit courtship exclusively on water and larger in species with dimorphic plumage. These results suggest that ecological (habitat) as well as social (sexual selection) factors may be important in driving the evolution of displays in the dabbling ducks.

17.
Each holotype specimen provides the only objective link to a particular Linnean binomen. Sequence information from them is increasingly valuable due to the growing usage of DNA barcodes in taxonomy. As type specimens are often old, it may only be possible to recover fragmentary sequence information from them. We tested the efficacy of short sequences from type specimens in the resolution of a challenging taxonomic puzzle: the Elachista dispunctella complex, which includes 64 described species with minuscule morphological differences. We applied a multistep procedure to resolve the taxonomy of this species complex. First, we sequenced a large number of newly collected specimens and as many holotypes as possible. Second, we used all sequences >400 bp to examine species boundaries. We employed three unsupervised methods (BIN, ABGD, GMYC) with specified criteria on how to handle discordant results and examined diagnostic bases from each delineated putative species (operational taxonomic units, OTUs). Third, we evaluated the morphological characters of each OTU. Finally, we associated short barcodes from types with the delineated OTUs. In this step, we employed various supervised methods, including distance-based, tree-based and character-based. We recovered 658 bp barcode sequences from 194 of 215 fresh specimens and recovered an average of 141 bp from 33 of 42 holotypes. We observed strong congruence among all methods and good correspondence with morphology. We demonstrate potential pitfalls with tree-, distance- and character-based approaches when associating sequences of varied length. Our results suggest that sequences as short as 56 bp can often provide valuable taxonomic information. The results support significant taxonomic oversplitting of species in the Elachista dispunctella complex.

18.
We contrast three methods for measuring selection at sequential fitness components (here called the additive, changing variance, and independent methods). The independent method (Koenig and Albano, 1987; Conner, 1988) describes the relationship between a phenotypic character and one fitness component independent of other components. This method is appropriate when the question is whether or not a character has fitness consequences independent of selection at other stages. The additive (Arnold and Wade, 1984a) and changing variance (Kalisz, 1986; Koenig and Albano, 1987) methods measure selection via one component of fitness, taking into consideration constraints imposed by selection via earlier components in the sequence. These methods therefore more accurately track selection over a sequence of fitness components. Of these latter two methods, the changing variance method yields erratic results in simulation studies and is not recommended in its unmodified form. The additive method (equivalent to the changing variance method weighted as described in Wade and Kalisz [1989]) explicitly partitions selection into additive components and is useful for measuring selection taking into account the constraints imposed by selection acting via prior fitness components. The methods often yield very different estimates of the relative degree to which the mean of a character is changed by selection acting via a particular component of fitness (the “strength” of selection). However, neither the additive nor independent method is inherently superior to the other; rather, these measures are complementary.

19.
A new method for choosing the variables with the greatest discriminatory power in the location model for mixed variable discriminant analysis is presented in the paper. The procedure, based on a multivariate discriminatory measure, enables a simultaneous reduction of the number of discrete and continuous variables. The introduced criterion can be used for either optimal or stepwise selection of a variable subset. As an example, the results of stepwise variable selection for some medical data are presented.

20.
Abstract The evolution of premating isolation after secondary contact is primarily considered in the guise of reinforcement, which relies on low hybrid fitness as the driving force for mating preference divergence. Here I consider two additional forces that may play a substantial role in the adaptive evolution of premating isolation, direct selection on preferences and indirect selection against postmating, prezygotic incompatibilities. First, I argue that a combination of ecological character displacement and sensory bias can cause direct selection on preferences that results in the pattern of reproductive character displacement. Both analytical and numerical methods are then used to demonstrate that, as expected from work in single populations, such direct selection will easily overwhelm indirect selection due to low hybrid fitness as the primary determinant of preference evolution. Second, postmating, prezygotic incompatibilities are presented as a driving force in the evolution of premating isolation. Two classes of these mechanisms, those increasing female mortality after mating but before producing offspring and those reducing female fertility, are shown to be identical in their effects on preference divergence. Analytical and numerical techniques are then used to demonstrate that postmating, prezygotic factors may place strong selection on preference divergence. These selective forces are shown to be comparable if not greater than those produced by the low fitness of hybrids.
