Similar articles
Found 20 similar articles (search time: 15 ms)
1.
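Using 555 accessions of Miscanthus sinensis germplasm and data on 26 phenotypic traits, the accessions were grouped by geographic origin, by floristic region, and by single trait. The number of accessions to sample within each group was set by the simple proportion method, the square-root method, or the diversity index method, and individuals within groups were then selected either by clustering or at random. These schemes yielded 19 representative candidate primary core collections of M. sinensis. The 19 schemes were compared on four indicators: mean similarity coefficient, trait coincidence rate, coefficient of variation of quantitative traits, and genetic diversity index. The best scheme for building the primary core collection was "grouping by floristic region + sample size by diversity index + selection of individuals by clustering". The primary core collection built this way contains 83 accessions, 14.95% of the total, and its trait coincidence rate with the full collection reaches 100%.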
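The three within-group allocation rules named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the group names, group sizes, and diversity values are made up, the helper `allocate` is hypothetical, and the diversity index rule is assumed to mean allocation in proportion to each group's diversity index.

```python
import math

def allocate(weights, core_size):
    """Split core_size among groups in proportion to weights.

    Largest-remainder rounding keeps the counts summing to core_size.
    """
    total = sum(weights.values())
    raw = {g: core_size * w / total for g, w in weights.items()}
    counts = {g: math.floor(r) for g, r in raw.items()}
    leftover = core_size - sum(counts.values())
    by_remainder = sorted(raw, key=lambda g: raw[g] - counts[g], reverse=True)
    for g in by_remainder[:leftover]:
        counts[g] += 1
    return counts

# Hypothetical groups: accession counts and diversity indices (assumed values).
sizes = {"A": 400, "B": 100, "C": 55}
diversity = {"A": 1.2, "B": 1.5, "C": 1.8}

proportional = allocate(sizes, 83)                                      # simple proportion
square_root = allocate({g: math.sqrt(n) for g, n in sizes.items()}, 83)  # square root
by_diversity = allocate(diversity, 83)                                  # diversity index
```

Compared with simple proportion, the square-root and diversity-index rules shift sampling effort toward small or diverse groups, which is the usual motivation for trying them.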

2.
A strategy was proposed for constructing core collections by least distance stepwise sampling (LDSS) based on genotypic values. At each clustering step, sampling is performed in the subgroup with the least distance in the dendrogram. Mean difference percentage (MD), variance difference percentage (VD), coincidence rate of range (CR) and variable rate of coefficient of variation (VR) were used to evaluate the representativeness of core collections constructed by this strategy. A cotton germplasm collection of 1,547 accessions with 18 quantitative traits was used to construct core collections. Genotypic values of all quantitative traits of the cotton collection were unbiasedly predicted with a mixed linear model approach. Three sampling percentages (10, 20 and 30%) and four genetic distances (city block distance, Euclidean distance, standardized Euclidean distance and Mahalanobis distance), combined with four hierarchical cluster methods (nearest distance method, furthest distance method, unweighted pair-group average method and Ward's method), were used to evaluate the properties of this strategy. Simulations were conducted in order to obtain consistent, stable and reproducible results. Principal components analysis was performed to validate the strategy. The results showed that core collections constructed by the LDSS strategy represented the initial collection well. Compared with the control strategy (stepwise clustering with random sampling), the LDSS strategy constructed more representative core collections. For the LDSS strategy, the cluster method did not need to be considered because all hierarchical cluster methods gave identical results. The results also suggested that standardized Euclidean distance was an appropriate genetic distance for constructing core collections with this strategy.
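Two of the four representativeness measures named in this abstract, CR and VR, can be sketched as below. The formulas are an assumption: they follow the definitions commonly used in the core-collection literature (per-trait ratios of core to initial collection, averaged over traits and expressed as percentages), not text from the abstract. MD and VD rely on per-trait significance tests and are left out. The function names and data are hypothetical.

```python
import statistics

def coincidence_rate_of_range(initial, core):
    """CR (%): mean over traits of range(core) / range(initial) * 100."""
    ratios = [(max(c) - min(c)) / (max(i) - min(i)) for i, c in zip(initial, core)]
    return 100.0 * statistics.mean(ratios)

def variable_rate_of_cv(initial, core):
    """VR (%): mean over traits of CV(core) / CV(initial) * 100."""
    def cv(values):
        return statistics.stdev(values) / statistics.mean(values)
    ratios = [cv(c) / cv(i) for i, c in zip(initial, core)]
    return 100.0 * statistics.mean(ratios)

# Each inner list holds one trait's values across accessions (made-up numbers).
initial_traits = [[1.0, 2.0, 3.0, 4.0, 5.0], [10.0, 12.0, 14.0, 16.0, 18.0]]
core_traits = [[1.0, 3.0, 5.0], [10.0, 14.0, 18.0]]

cr = coincidence_rate_of_range(initial_traits, core_traits)
vr = variable_rate_of_cv(initial_traits, core_traits)
```

A core that keeps the extremes of every trait gives a CR of 100%, and a VR above 100% means the core is relatively more variable than the initial collection.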

3.
The aim of the present investigation was to develop a novel dosage form of rifampicin and isoniazid to minimize degradation of rifampicin in acidic medium and to modulate the release of rifampicin in the stomach and isoniazid in the intestine. Gastroretentive tablets of rifampicin (150 mg) were prepared by the wet granulation method using hydroxypropyl methylcellulose, calcium carbonate, and polyethylene glycol 4000. The granules and tablets of rifampicin were characterized. Hard gelatin capsules (size 4) containing a compacted mass of isoniazid (150 mg) and dicalcium phosphate (75 mg) were enteric coated. Two tablets of rifampicin and 1 capsule (size 4) of isoniazid were put into a hard gelatin capsule (size 00). The in vitro drug release and in vitro drug degradation studies were performed. Rifampicin was released over 4 hours by zero-order kinetics from the novel dosage form. More than 90% of isoniazid was released in alkaline medium in 30 minutes. The results of dissolution studies with the US Pharmacopeia XXIII method revealed that a substantial amount of rifampicin was degraded from the immediate release capsule containing rifampicin and isoniazid powder owing to drug accumulation in the dissolution vessel and also to the presence of isoniazid. The degradation of rifampicin to 3-formyl rifampicin SV (3FRSV) was arrested (3.6%–4.8% degradation of rifampicin at 4 hours) because of the minimization of physical contact between the 2 drugs and controlled release of rifampicin in acidic medium in the modified Rossett-Rice apparatus. This study concludes that the problem of rifampicin degradation can be alleviated to a certain extent by this novel dosage form. Published: August 24, 2007

4.
A near-infrared (NIR) spectroscopic method has been developed to determine the content uniformity of a large, thick tablet, using an approach that could facilitate future validations. A CT ibuprofen 800-mg tablet weighs about 1150 mg and is about 18.6 mm wide and 7.6 mm thick. The FT NIR spectrometer was optimized for transmission spectra of the tablets by moving the detector to the sample compartment and placing it immediately behind the tablet. In spite of this dedicated setup, the transmission spectra obtained were very poor, indicating that the NIR radiation was not reaching the detector. The spectra of the tablet improved with a simple sample preparation in which a flat-face die applies a pressure of 20 000 psi to the tablet, reducing its thickness from 7.6 mm to 3.6 mm. A calibration model was developed for tablets with drug content ranging from 70% to 130% of label. The calibration model was tested using a validation set of tablets with drug contents of 752, 800, and 848 mg. The results obtained were within 1.5% of the known drug content of the validation set tablets. Even with the sample preparation, the content uniformity of 10 tablets could be determined using this method in less than 1 hour. The approach described in this article could also be used to validate NIR content uniformity methods for other formulations. Published: July 12, 2001.

5.
New probability matrices for identification of Streptomyces   Total citations: 3, self-citations: 0, citations by others: 3
The character state data obtained for clusters defined in a previous phenetic classification were used to construct two probabilistic matrices for Streptomyces species. These superseded an original published identification matrix by exclusion of other genera and the inclusion of more Streptomyces species. Separate matrices were constructed for major and minor clusters. The minimum number of diagnostic characters for each matrix was selected by computer programs for determination of character separation indices (CHARSEP) and a selection of group diagnostic properties (DIACHAR). The resulting matrices consisted of 26 phena × 50 characters (major clusters) and 28 phena × 39 characters (minor clusters). Cluster overlap (OVERMAT program) was small in both matrices. Identification scores were used to evaluate both matrices. The theoretically best scores for the most typical example of each cluster (MOSTTYP program) were all satisfactory. Input of test data for randomly selected cluster representatives resulted in correct identification with high scores. The major cluster matrix was shown to be practically sound by its application to 35 unknown soil isolates, 77% of which were clearly identified. The minor cluster matrix provides tentative probabilistic identifications as the small number of strains in each cluster reduces its ability to withstand test variation. A diagnostic table for single-membered clusters, constructed using the CHARSEP and DIACHAR programs, was also produced.

6.
Classification is a data mining task the goal of which is to learn a model, from a training dataset, that can predict the class of a new data instance, while clustering aims to discover natural instance-groupings within a given dataset. Learning cluster-based classification systems involves partitioning a training set into data subsets (clusters) and building a local classification model for each data cluster. The class of a new instance is predicted by first assigning the instance to its nearest cluster and then using that cluster's local classification model to predict the instance's class. In this paper, we present an ant colony optimization (ACO) approach to building cluster-based classification systems. Our ACO approach optimizes the number of clusters, the positioning of the clusters, and the choice of classification algorithm to use as the local classifier for each cluster. We also present an ensemble approach that allows the system to decide on the class of a given instance by considering the predictions of all local classifiers, employing a weighted voting mechanism based on the fuzzy degree of membership in each cluster. Our experimental evaluation employs five widely used classification algorithms: naïve Bayes, nearest neighbour, Ripper, C4.5, and support vector machines, and results are reported on a suite of 54 popular UCI benchmark datasets.
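The two prediction modes described in this abstract (nearest-cluster prediction and fuzzy-weighted voting over all local classifiers) can be sketched as follows. This is an illustration under assumptions: the abstract does not give the membership formula, so the fuzzy c-means style membership with fuzzifier `m` is a stand-in, and the centroids and constant "classifiers" are toy placeholders. The ACO search itself is not shown.

```python
import math
from collections import defaultdict

def fuzzy_memberships(x, centroids, m=2.0):
    """Fuzzy c-means style memberships of x in each cluster (assumed formula)."""
    dists = [math.dist(x, c) for c in centroids]
    zero_hits = [d == 0.0 for d in dists]
    if any(zero_hits):
        # x coincides with one or more centroids: split full membership among them.
        return [1.0 / sum(zero_hits) if hit else 0.0 for hit in zero_hits]
    inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
    total = sum(inv)
    return [v / total for v in inv]

def predict_nearest(x, centroids, classifiers):
    """Use only the local classifier of the nearest cluster."""
    idx = min(range(len(centroids)), key=lambda i: math.dist(x, centroids[i]))
    return classifiers[idx](x)

def predict_weighted_vote(x, centroids, classifiers, m=2.0):
    """Let every local classifier vote, weighted by fuzzy membership."""
    votes = defaultdict(float)
    for weight, clf in zip(fuzzy_memberships(x, centroids, m), classifiers):
        votes[clf(x)] += weight
    return max(votes, key=votes.get)

# Toy setup: three clusters whose local classifiers always answer "a", "b", "b".
centroids = [(0.0, 0.0), (10.0, 0.0), (10.0, 1.0)]
classifiers = [lambda x: "a", lambda x: "b", lambda x: "b"]
point = (4.5, 0.0)
```

For `point`, the nearest cluster answers "a", while the two slightly more distant clusters together outvote it, so the ensemble answers "b". This is the kind of case where the two modes disagree.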

7.
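Constructing core collections of crop germplasm by multiple clustering based on genotypic values   Total citations: 22, self-citations: 2, citations by others: 20
An appropriate genetic model is used to obtain unbiased predictions of genotypic values, and cluster analysis is carried out on these values. Genetic distances between genetic materials are computed as Mahalanobis distances, and clustering uses the unweighted pair-group average method (UPGMA). Based on the dendrogram, one material is chosen at random from each group of two materials with similar genetic variation; if a group contains only one material, that material is chosen. All chosen materials are then clustered and sampled again, until the number of chosen materials is 20% to 30% of the total. These materials form the core collection. The core collection is evaluated with tests of homogeneity of variance and t-tests of means …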

8.
MOTIVATION: The increasing use of DNA microarray-based tumor gene expression profiles for cancer diagnosis requires mathematical methods with high accuracy for solving clustering, feature selection and classification problems of gene expression data. RESULTS: New algorithms are developed for solving clustering, feature selection and classification problems of gene expression data. The clustering algorithm is based on optimization techniques and allows the calculation of clusters step-by-step. This approach allows us to find as many clusters as a data set contains with respect to some tolerance. Feature selection is crucial for a gene expression database. Our feature selection algorithm is based on calculating overlaps of different genes. The database used contains over 16 000 genes, and this number is considerably reduced by feature selection. We propose a classification algorithm where each tissue sample is considered as the center of a cluster which is a ball. The results of numerical experiments confirm that the classification algorithm in combination with the feature selection algorithm performs slightly better than the published results for multi-class classifiers based on support vector machines for this data set. AVAILABILITY: Available on request from the authors.

9.
A transmission near infrared (NIR) spectroscopic method has been developed for the nondestructive determination of drug content in tablets containing less than 1% weight of active ingredient per weight of formulation (m/m). Tablets were manufactured with drug concentrations of ∼0.5%, 0.7%, and 1.0% (m/m), ranging in drug content from 0.71 to 2.51 mg per tablet. Transmission NIR spectra were obtained for 110 tablets that constituted the training set for the calibration model developed with partial least squares regression. The reference method for the calibration model was a validated UV spectrophotometric method. Several data preprocessing methods were used to reduce the effect of scattering on the NIR spectra and to base the calibration model on spectral changes related to changes in drug concentration. The final calibration model included the spectral range from 11 216 to 8662 cm−1, the standard normal variate (SNV), and first derivative spectral pretreatments. This model was used to predict an independent set of 48 tablets with a root mean square error of prediction (RMSEP) of 0.14 mg and a bias of only −0.05 mg per tablet. The study showed that transmission NIR spectroscopy is a viable alternative for nondestructive testing of low drug content tablets, available for the analysis of large numbers of tablets during process development and as a tool to detect drug agglomeration and evaluate process improvement efforts. Published: March 24, 2006
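The pretreatments and error measures named in this abstract can be sketched in a few lines. This is a sketch under assumptions, not the authors' pipeline: the first derivative is taken as a plain finite difference (the paper may well have used a smoothing derivative filter), bias is taken as predicted minus reference, and all function names are hypothetical. The partial least squares step is not shown.

```python
import math
import statistics

def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own mean and SD."""
    mean = statistics.mean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(v - mean) / sd for v in spectrum]

def first_difference(spectrum):
    """Crude first derivative: difference between neighbouring points."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]

def rmsep(predicted, reference):
    """Root mean square error of prediction over an independent test set."""
    return math.sqrt(statistics.mean([(p - r) ** 2 for p, r in zip(predicted, reference)]))

def bias(predicted, reference):
    """Mean signed error (predicted minus reference)."""
    return statistics.mean([p - r for p, r in zip(predicted, reference)])
```

SNV removes per-spectrum offset and scale differences, which is why it is commonly used against scattering effects, and the derivative suppresses baseline drift before regression.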

10.
11.
Linseed is one of the most important oilseed crops in the central highlands of Ethiopia. Yield enhancement is its major breeding objective, and genotypic variability is important for selection in any breeding program. However, a shortage of improved varieties that provide optimum seed yield is one of the major constraints of the crop. This study was therefore carried out to assess genetic variability and associations among quantitative traits of 36 linseed genotypes. The experiment was conducted in the 2018 main cropping season using a simple lattice design. The analysis of variance revealed highly significant differences among the genotypes for most of the traits considered. High phenotypic and genotypic coefficients of variation were recorded for tillers per plant, harvest index, oil yield (kg ha−1), seed yield (t ha−1), and number of capsules per plant. High heritability together with genetic advance was observed for seed yield (t ha−1), oil yield (kg ha−1), and harvest index, which indicates that selection for these traits in early generations would be effective. Oil yield (kg ha−1), harvest index, and number of capsules per plant showed highly significant positive correlations with seed yield (t ha−1). Cluster analysis grouped the 36 linseed genotypes into two clusters, with four genotypes remaining ungrouped. The maximum inter-cluster distance was observed between cluster II and the local check. The data set was reduced to four significant principal components (PCs) that together account for 80% of the variance. The first PC accounted for 34% of the variance, the largest share explained by any single component. The traits that contributed most to PC1 were seed yield per plant, primary branches per plant, secondary branches per plant, and plant height; these showed positive associations with, and positive direct effects on, seed yield. This indicates that any improvement of oil yield and harvest index would result in a substantial increase in seed yield.
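The variability parameters reported in this abstract (genotypic and phenotypic coefficients of variation, heritability, genetic advance) are usually derived from the ANOVA mean squares. The sketch below uses the textbook formulas as an assumption, since the abstract does not state which ones the authors used. Phenotypic variance is taken on a plot basis, `k = 2.06` is the standard selection differential at 5% selection intensity, and the input numbers are made up.

```python
import math

def genetic_parameters(ms_genotype, ms_error, replications, grand_mean, k=2.06):
    """Textbook variability estimates from ANOVA mean squares (assumed formulas)."""
    var_g = (ms_genotype - ms_error) / replications   # genotypic variance
    var_p = var_g + ms_error                          # phenotypic variance (plot basis)
    h2 = var_g / var_p                                # broad-sense heritability
    ga = k * math.sqrt(var_p) * h2                    # expected genetic advance
    return {
        "GCV": 100.0 * math.sqrt(var_g) / grand_mean,
        "PCV": 100.0 * math.sqrt(var_p) / grand_mean,
        "H2": h2,
        "GA": ga,
        "GAM": 100.0 * ga / grand_mean,               # genetic advance as % of mean
    }

# Made-up mean squares for one trait, two replications (a simple lattice has two).
params = genetic_parameters(ms_genotype=30.0, ms_error=10.0, replications=2, grand_mean=20.0)
```

The function assumes the genotype mean square exceeds the error mean square. Otherwise the genotypic variance estimate is not positive and the square root is undefined.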

12.
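Wang Xiaopeng, Zhao Chengzhang, Wang Jiwei, Zhao Lianchun, Wen Jun. Acta Ecologica Sinica, 2018, 38(11): 3943-3951
Multi-scale aggregation and changes in clump characteristics of plant populations result from plants' coordinated adaptation to the environment. Using the pair correlation function and null models, four plots were set up according to the distribution of soil salinity across the microtopography. Within 2 m × 2 m quadrats, 400 small cells were laid out to record plant counts and collect soil samples. The intrinsic characteristics of the aggregated spatial pattern of the Suaeda corniculata population were analyzed along a soil salinity gradient in the Qinwangchuan salt marsh wetland, Lanzhou. The results showed that, within the range of aggregated distribution, the S. corniculata population aggregated at two critical scales: small clumps aggregated or overlapped to form composite large-scale aggregations, giving an overall nested double-cluster distribution. As soil salinity decreased, the size of large clumps tended to increase while the size of small clumps did not differ noticeably; the number of large clumps decreased markedly, and the number of small clumps varied with the number of plants in the population, showing a decreasing trend over the whole gradient; the mean number of individuals per small clump gradually decreased, and the mean number of small clumps contained in each composite large clump tended to decrease. Against the background of soil heterogeneity in inland salt marsh wetlands, the gradient change in clump characteristics within the aggregated distribution of the S. corniculata population results from adjustment and adaptation of individual plant morphology and composition, and from gradient shifts in positive and negative ecological interactions within the population and in shelter and self-thinning effects.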

13.
It is becoming increasingly common for the design of a clinical study to involve cluster samples. Very few studies have investigated the appropriate number of clusters, and none of them treats cluster size and the number of clusters as random variables. In reality, clusters cannot all be recruited at one time, and cluster sizes are usually random. The longer the recruitment takes, the higher the total study cost will be. This paper provides a strategy for sequential recruitment of clusters that can minimize the total study cost. By treating the number of additional observational subjects required at each time point as a Markov chain, we derive an iterative procedure for the optimal strategy and study the properties of this strategy, especially the duration of cluster recruitment. The strategy is also extended to search for an optimal number of centers in a multi-center clinical trial.

14.
The Cytophaga-Flavobacterium group is known to be abundant in aquatic ecosystems and to have a potentially unique role in the utilization of organic material. However, relatively little is known about the diversity and abundance of uncultured members of this bacterial group, in part because they are underrepresented in clone libraries of 16S rRNA genes. To circumvent a suspected bias in PCR, a primer set was designed to amplify 16S rRNA genes from the Cytophaga-Flavobacterium group and was used to construct a library of these genes from the Delaware Estuary. This library had several novel Cytophaga-like 16S rRNA genes, of which about 40% could be grouped together into two clusters (DE clusters 1 and 2) defined by sequences initially observed only in the Delaware library; the other 16S rRNA genes were classified into an additional four clades containing sequences from other environments. An oligonucleotide probe was designed for the cluster with the most clones (DE cluster 2) and was used in fluorescence in situ hybridization assays. Bacteria in DE cluster 2 accounted for about 10% of the total prokaryotic abundance in the Delaware Estuary and in a depth profile of the Chukchi Sea (Arctic Ocean). The presence of DE cluster 2 in the Arctic Ocean was confirmed by results from 16S rRNA clone libraries. The contribution of this cluster to the total bacterial biomass is probably larger than is indicated by the abundance of its members, because the average cell volume of bacteria in DE cluster 2 was larger than those of other bacteria and prokaryotes in the Delaware Estuary and Chukchi Sea. DE cluster 2 may be one of the more abundant bacterial groups in the Delaware Estuary and possibly other marine environments.

15.
PCR was used to amplify DNA-dependent RNA polymerase gene sequences specifically from the cyanobacterial population in a seawater sample from the Sargasso Sea. Sequencing and analysis of the cloned fragments suggest that the population in the sample consisted of two distinct clusters of Prochlorococcus-like cyanobacteria and four clusters of Synechococcus-like cyanobacteria. The diversity within these clusters was significantly different, however. Clones within each Synechococcus-like cluster were 99 to 100% identical, while each Prochlorococcus-like cluster was only 91% identical at the nucleotide level. One Prochlorococcus-like cluster was significantly more closely related to a Mediterranean Sea (surface) Prochlorococcus isolate than to the other cluster, showing the highly divergent nature of this group even in one sample. The approach described here can be used as a general method for examining cyanobacterial diversity, while an oligotrophic ocean ecosystem such as the Sargasso Sea may be an ideal model for examining diversity in relation to environmental parameters.

16.
17.
MOTIVATION: Bioinformatics clustering tools are useful at all levels of proteomic data analysis. Proteomics studies can provide a wealth of information and rapidly generate large quantities of data from the analysis of biological specimens. The high dimensionality of data generated from these studies requires the development of improved bioinformatics tools for efficient and accurate data analyses. For proteome profiling of a particular system or organism, a number of specialized software tools are needed. Indeed, significant advances in the informatics and software tools necessary to support the analysis and management of these massive amounts of data are needed. Clustering algorithms based on probabilistic and Bayesian models provide an alternative to heuristic algorithms. The choice of the number of clusters (diseased and non-diseased groups) reduces to the choice of the number of components in a mixture of underlying probability distributions. The Bayesian approach is a tool for including information from the data in the analysis. It offers an estimation of the uncertainties of the data and the parameters involved. RESULTS: We present novel algorithms that can organize, cluster and derive meaningful patterns of expression from large-scale proteomics experiments. We processed raw data using a graphical-based algorithm by transforming it from a real space data-expression to a complex space data-expression using the discrete Fourier transformation; then we used a thresholding approach to denoise and reduce the length of each spectrum. Bayesian clustering was applied to the reconstructed data. In comparison with several other algorithms used in this study, including K-means, Kohonen self-organizing map (SOM), and linear discriminant analysis, the Bayesian-Fourier model-based approach consistently displayed superior performance in selecting the correct model and the number of clusters, thus providing a novel approach for accurate diagnosis of the disease.
Using this approach, we were able to successfully denoise proteomic spectra and reach up to a 99% total reduction of the number of peaks compared to the original data. In addition, the Bayesian-based approach generated a better classification rate in comparison with other classification algorithms. This new finding will allow us to apply the Fourier transformation for the selection of the protein profile for each sample, and to develop a novel bioinformatic strategy based on Bayesian clustering for biomarker discovery and optimal diagnosis.
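The "Fourier transform, then threshold" denoising step described in this abstract can be illustrated with a tiny, dependency-free example. It is a sketch under assumptions: the relative cutoff rule and all names are hypothetical, the naive O(n²) transform is only suitable for short toy signals, and the Bayesian clustering stage is not shown.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (fine for short toy signals)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(coeffs):
    """Inverse transform, returning the real part of the reconstruction."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def fourier_threshold(x, rel_cutoff=0.1):
    """Zero every coefficient smaller than rel_cutoff times the largest one."""
    coeffs = dft(x)
    cutoff = rel_cutoff * max(abs(c) for c in coeffs)
    kept = [c if abs(c) >= cutoff else 0j for c in coeffs]
    return idft(kept), sum(1 for c in kept if c != 0)

# Toy "spectrum": one cosine plus a small alternating disturbance.
n = 16
clean = [math.cos(2 * math.pi * 2 * t / n) for t in range(n)]
noisy = [v + 0.01 * (-1) ** t for t, v in enumerate(clean)]
denoised, n_kept = fourier_threshold(noisy)
```

Here only the two coefficients carrying the cosine survive the cutoff, so the reconstruction matches the clean signal while the number of retained coefficients drops from 16 to 2.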

18.
Multiple test procedures are usually compared on various aspects of error control and power. Power is measured as some function of the number of false hypotheses correctly identified as false. However, given equal numbers of rejected false hypotheses, the pattern of rejections, i.e. the particular set of false hypotheses identified, may be crucial in interpreting the results for potential application. In an important area of application, comparisons among a set of treatments based on random samples from populations, two different approaches, cluster analysis and model selection, deal implicitly with such patterns, while traditional multiple testing procedures generally focus on the outcomes of subset and pairwise equality hypothesis tests, without considering the overall pattern of results in comparing methods. An important feature of the pattern of rejections is its relevance for dividing the treatments into distinct subsets based on some parameter of interest, for example their means. This paper introduces some new measures relating to the potential of methods for achieving such divisions. Following Hartley (1955), sets of treatments with equal parameter values will be called clusters. Because it is necessary to distinguish between clusters in the populations and clustering in sample outcomes, the population clusters will be referred to as P-clusters; any related concepts defined in terms of the sample outcome will be referred to with the prefix outcome. Outcomes of multiple comparison procedures will be studied in terms of their probabilities of leading to separation of treatments into outcome clusters, with various measures relating to the number of such outcome clusters and the proportion of true vs. false outcome clusters.
The definitions of true and false outcome clusters and related concepts, and the approach taken here, are in the tradition of hypothesis testing with attention to overall error control and power, but with added consideration of cluster separation potential. The pattern approach will be illustrated by comparing two methods with apparent FDR control but with different ways of ordering outcomes for potential significance: the original Benjamini-Hochberg (1995) procedure (BH), and the Newman-Keuls (Newman, 1939; Keuls, 1952) procedure (NK).
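For readers unfamiliar with the BH procedure mentioned in this abstract, a minimal sketch of the standard step-up rule follows. It operates on a plain list of p-values. The function name and the example values are illustrative, and the paper's cluster-based measures and the Newman-Keuls procedure are not shown.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns one boolean per input p-value: True where the hypothesis is rejected.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0  # largest rank whose p-value is at or below rank * q / m
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            k = rank
    rejected = set(order[:k])
    return [i in rejected for i in range(m)]

example = benjamini_hochberg([0.039, 0.001, 0.2, 0.008, 0.041])
step_up = benjamini_hochberg([0.01, 0.04, 0.03, 0.049])
```

The second call shows the step-up behaviour: the second and third smallest p-values miss their own thresholds, but they are still rejected because the largest p-value meets its threshold.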

19.
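Principal component analysis and cluster analysis of the main morphological traits of cauliflower inbred lines   Total citations: 6, self-citations: 0, citations by others: 6
Principal component analysis and cluster analysis were carried out on 54 cauliflower inbred lines to support the selection of parents in cauliflower breeding. In the principal component analysis, the first six principal components, with a cumulative variance contribution of 70.024%, were selected to evaluate the inbred-line resources. Curd initiation stage, harvest stage, leaf length, leaf width, plant height, curd vertical diameter, curd horizontal diameter, curd weight, curd shape, curd compactness, leaf color, wax bloom, number of inner leaves, first flowering stage, and plant spread were the main morphological indicators for parent selection. Hierarchical clustering then divided the 54 inbred lines into three groups: group I was early maturing, with small plant spread, narrow leaves, little wax bloom, and medium curd weight and compactness; group II was medium maturing, with medium plant spread, gray-green leaves, thick wax bloom, and semicircular, compact, heavy curds; group III was late maturing, with large plant spread, tall plants, broad leaves, medium wax bloom, and oblate curds. The differences in traits among the three groups were fairly distinct, which facilitates the selection of parental materials for cross breeding.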

20.
In comparison to conventional marker-assisted selection (MAS), which utilizes only a subset of genetic markers associated with a trait to predict breeding values (BVs), genome-wide selection (GWS) improves prediction accuracies by incorporating all markers into a model simultaneously. This strategy avoids the risk of missing quantitative trait loci (QTL) with small effects. Here, we evaluated the accuracy of prediction for three maize flowering traits (days to silking, days to anthesis, and anthesis-silking interval) with GWS, based on cross-validation experiments using a large data set of 25 nested association mapping populations in maize (Zea mays). We found that GWS via ridge regression-best linear unbiased prediction (RR-BLUP) gave significantly higher prediction accuracies than MAS utilizing composite interval mapping (CIM). The CIM method may be selected over multiple linear regression to decrease over-estimations of the efficiency of GWS over a MAS strategy. The RR-BLUP method was the preferred method for estimating marker effects in GWS, with prediction accuracies comparable to or greater than BayesA and BayesB. The accuracy with RR-BLUP increased with training sample proportion, marker density, and heritability until it reached a plateau. In general, gains in accuracy with RR-BLUP over CIM increased with decreases of these factors. Compared to training sample proportion, the accuracy of prediction with RR-BLUP was relatively insensitive to marker density.

