Similar literature
1.
At the Wageningen Laboratory of Plant Breeding, a software package has been developed to query a simple structured database with variety pedigree data. The package, called Peditree, creates a tree-shaped representation of pedigree information and has several visualization and lookup options. Estimates of the inbreeding coefficient within a pedigree, or of coefficients of coancestry among pedigrees, can be obtained. Furthermore, trait data, if available, can be linked, displayed within the pedigree tree, and used to highlight pedigree entries that comply with set criteria.
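The abstract does not say how Peditree computes these coefficients. As a minimal sketch of the standard recursive definition, assuming a toy pedigree (`PEDIGREE`, with hypothetical individuals A to E) in which unknown parents are treated as unrelated, non-inbred founders, the coancestry of two individuals and the inbreeding coefficient of an individual could be obtained as follows:

```python
from functools import lru_cache

# Hypothetical toy pedigree: individual -> (sire, dam); None marks an unknown parent.
# Individuals are listed so that parents always appear before their offspring.
PEDIGREE = {
    "A": (None, None),
    "B": (None, None),
    "C": ("A", "B"),
    "D": ("A", "B"),
    "E": ("C", "D"),  # offspring of a full-sib mating
}
ORDER = {name: i for i, name in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def coancestry(x, y):
    """Coefficient of coancestry (kinship) between individuals x and y."""
    if x is None or y is None:
        return 0.0  # assumption: unknown parents are unrelated founders
    if x == y:
        sire, dam = PEDIGREE[x]
        return 0.5 * (1.0 + coancestry(sire, dam))
    # Recurse through the parents of the younger individual,
    # so that we never step past a possible common ancestor.
    if ORDER[x] < ORDER[y]:
        x, y = y, x
    sire, dam = PEDIGREE[x]
    return 0.5 * (coancestry(sire, y) + coancestry(dam, y))


def inbreeding(x):
    """Inbreeding coefficient of x: the coancestry of its two parents."""
    sire, dam = PEDIGREE[x]
    return coancestry(sire, dam)
```

In this toy pedigree, `inbreeding("E")` evaluates to 0.25, the textbook value for the offspring of full sibs.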

2.
Topic models and neural networks can discover meaningful low-dimensional latent representations of text corpora; as such, they have become a key technology for document representation. However, such models presume all documents are non-discriminatory, resulting in latent representations that depend on all other documents and an inability to provide discriminative document representations. To address this problem, we propose a semi-supervised manifold-inspired autoencoder to extract meaningful latent representations of documents, taking the local perspective that the latent representations of nearby documents should be correlated. We first determine the set of discriminative neighbors using Euclidean distance in the observation space. Then, the autoencoder is trained by jointly minimizing the Bernoulli cross-entropy error between input and output and the sum of squared errors between the neighbors of the input and the output. Results on two widely used corpora show that our method yields at least a 15% improvement in document clustering and a nearly 7% improvement in classification tasks compared with competing methods. The evidence demonstrates that our method can readily capture more discriminative latent representations of new documents. Moreover, some meaningful combinations of words can be efficiently discovered by activating features that promote the comprehensibility of the latent representation.
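One plausible way to write the joint objective described above is sketched here. All notation is an assumption introduced for illustration and does not come from the abstract: $x_i \in [0,1]^V$ is the input vector of document $i$, $\hat{x}_i$ its reconstruction, $N(i)$ its set of discriminative neighbors, and $\lambda \ge 0$ a weight balancing the two terms.

```latex
\mathcal{L}
  = \sum_{i} \Bigg[
      -\sum_{v=1}^{V} \Big( x_{iv}\,\log \hat{x}_{iv}
                          + (1 - x_{iv})\,\log\big(1 - \hat{x}_{iv}\big) \Big)
      \;+\; \lambda \sum_{j \in N(i)} \big\lVert x_j - \hat{x}_i \big\rVert_2^2
    \Bigg]
```

The first term is the Bernoulli cross-entropy between input and output; the second penalizes the squared distance between the output and each neighbor of the input. Whether the paper uses a weighting factor, or normalizes either term, is not stated in the abstract.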

3.
Pasekov VP. Genetika. 2000;36(2):257-265
A method for calculating the inbreeding coefficient F in a numerical pedigree, with no reference to its graphic representation, is suggested. For calculation of F, a formula that does not take into account inbreeding coefficients of common ancestors and admits intersections in a loop is used. An advantage of this method is that it automatically finds all loops formed by paths to common ancestors. Detecting these loops by tracing them in a graphic pedigree with intersecting lines of descent is error-prone. A criterion for the existence of at least one common link between two numerical paths is presented. It enables one to exclude pairs of paths to common ancestors that do not form loops. The methods considered for computing F in a given pedigree give exact values of the inbreeding coefficient for autosomal and sex-linked loci and generalize the known approximate approaches. The methods are illustrated by examples.
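For background, the classical path-counting formula of Wright, which is not the variant used in this paper, expresses the inbreeding coefficient of an individual $X$ as a sum over common ancestors $A$ of its parents and over the loops through each $A$. Here $n_1$ and $n_2$ are the numbers of generations from the sire and from the dam back to $A$, and $F_A$ is the inbreeding coefficient of $A$:

```latex
F_X = \sum_{A} \; \sum_{\text{loops through } A}
      \left(\tfrac{1}{2}\right)^{n_1 + n_2 + 1} \left(1 + F_A\right)
```

In the classical formula the two paths of a loop may share no individual other than $A$. The abstract states that the paper's formula drops the dependence on $F_A$ and admits intersecting paths; its exact form is not given here, so the expression above should be read only as the reference point being generalized.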

4.
We describe a pedigree of 71 individuals from the Republic of Cameroon in which at least 33 individuals have a clinical diagnosis of persistent stuttering. The high concentration of stuttering individuals suggests that the pedigree either contains a single highly penetrant gene variant or that assortative mating led to multiple stuttering-associated variants being transmitted in different parts of the pedigree. No single locus displayed significant linkage to stuttering in initial genome-wide scans with microsatellite and SNP markers. By dividing the pedigree into five subpedigrees, we found evidence for linkage to previously reported loci on 3q and 15q, and to novel loci on 2p, 3p, 14q, and a different region of 15q. Using the two-locus mode of Superlink, we showed that combining the recessive locus on 2p and a single-locus additive representation of the 15q loci is sufficient to achieve a two-locus score over 6 on the entire pedigree. For this 2p + 15q analysis, we show LOD scores ranging from 4.69 to 6.57, and the scores are sensitive to which marker is chosen for 15q. Our findings provide strong evidence for linkage at several loci.

5.

Background

Plant breeders use an increasingly diverse range of data types to identify lines with desirable characteristics suitable to be taken forward in plant breeding programmes. There are a number of key morphological and physiological traits, such as disease resistance and yield, that need to be maintained and improved upon if a commercial variety is to be successful. Computational tools that can integrate and visualize these data together with pedigree structure will enable breeders to make better decisions on the lines that are used in crossings to meet both the demands for increased yield/production and adaptation to climate change.

Results

We have used a large and unique set of experimental barley (H. vulgare) data to develop a prototype pedigree visualization system. We then used this prototype to perform a subjective user evaluation with domain experts to guide and direct the development of an interactive pedigree visualization tool called Helium.

Conclusions

We show that Helium allows users to easily integrate a number of data types along with large plant pedigrees to offer an integrated environment in which they can explore pedigree data. We have also verified that users were happy with the abstract representation of pedigrees that we have used in our visualization tool.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-259) contains supplementary material, which is available to authorized users.

6.
SUMMARY: We developed a collaborative pedigree environment called CoPE. This environment includes a Java program for drawing pedigrees and a standardized system for pedigree storage. Unlike other existing pedigree programs, this software is particularly intended for epidemiologists in the sense that it allows customized automatic drawing of large numbers of pedigrees and remote and distributed consultation of pedigrees. AVAILABILITY: At http://www.infobiogen.fr/services/CoPE

7.
Identifying marker typing incompatibilities in linkage analysis.
A common problem encountered in linkage analyses is that execution of the computer program is halted because of genotypes in the data that are inconsistent with Mendelian inheritance. Such inconsistencies may arise because of pedigree errors or errors in typing. In some cases, the source of the inconsistencies is easily identified by examining the pedigree. In others, the error is not obvious, and substantial time and effort are required to identify the responsible genotypes. We have developed two methods for automatically identifying those individuals whose genotypes are most likely the cause of the inconsistencies. First, we calculate the posterior probability of genotyping error for each member of the pedigree, given the marker data on all pedigree members and allowing anyone in the pedigree to have an error. Second, we identify those individuals whose genotypes could be solely responsible for the inconsistency in the pedigree. We illustrate these methods with two examples: one a pedigree error, the second a genotyping error. These methods have been implemented as a module of the pedigree analysis program package MENDEL.
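The two methods in the abstract (posterior error probabilities and identification of solely responsible individuals) are not reproduced here. As a much simpler illustration of what "inconsistent with Mendelian inheritance" means at a single marker, the following sketch flags parent-offspring trios in which the child's genotype cannot be formed from one paternal and one maternal allele. The data layout and function names are assumptions made for this example and are unrelated to the MENDEL package.

```python
from itertools import product


def mendelian_consistent(child, father, mother):
    """True if the child's unordered genotype can be formed from
    one allele of the father and one allele of the mother."""
    target = sorted(child)
    return any(sorted((p, m)) == target for p, m in product(father, mother))


def inconsistent_children(parents, genotypes):
    """Return the children whose genotypes conflict with both typed parents.

    parents:   dict child_id -> (father_id, mother_id)   (hypothetical layout)
    genotypes: dict individual_id -> (allele, allele); untyped individuals are absent
    """
    flagged = []
    for child, (father, mother) in parents.items():
        if child in genotypes and father in genotypes and mother in genotypes:
            if not mendelian_consistent(genotypes[child], genotypes[father], genotypes[mother]):
                flagged.append(child)
    return flagged
```

A trio check of this kind only detects that an inconsistency exists. It cannot say which of the three genotypes, or which pedigree link, is wrong, which is the problem the abstract's methods address.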

8.
Tasmanian devils have experienced an 85% population decline since the emergence of an infectious cancer. In response, a captive insurance population was established in 2006 with a subpopulation later introduced onto Maria Island, Tasmania. We aimed to (1) examine the genetic parameters of the Maria Island population as a stand-alone site and within its broader metapopulation context, (2) assess the efficacy of assisted colonisations, and (3) inform future translocations. This study reconstructs the pedigree of 86 island-born devils using 31 polymorphic microsatellite loci. Combined molecular and pedigree analysis was used to monitor change in population genetic parameters in the 4 years since colonisation. Molecular analysis alone revealed no significant change in genetic diversity, while DNA-reconstructed pedigree analysis revealed a statistically significant increase in inbreeding due to skewed founder representation. Pedigree modelling predicted that gene diversity would only be maintained above the threshold of 95% for a further 2 years, dropping to 77.1% after 40 years. Modelling alternative supplementation strategies revealed that introducing eight new founders every 3 years will enable the population to retain 95% gene diversity until 2056, provided the translocated animals breed; to ensure this we recommend introducing ten new females every 3 years. We highlight the value of combining pedigree analyses with molecular data, from both a single-site and metapopulation viewpoint, for analysing changes in genetic parameters within populations of conservation concern. The importance of post-release genetic monitoring in an established population is emphasised, given how quickly inbreeding can accumulate and gene diversity be lost.

9.
MOTIVATION: Network-centered studies in systems biology attempt to integrate the topological properties of biological networks with experimental data in order to make predictions and posit hypotheses. For any topology-based prediction, it is necessary to first assess the significance of the analyzed property in a biologically meaningful context. Therefore, devising network null models, carefully tailored to the topological and biochemical constraints imposed on the network, remains an important computational problem. RESULTS: We first review the shortcomings of the existing generic sampling scheme, switch randomization, and explain its unsuitability for application to metabolic networks. We then devise a novel polynomial-time algorithm for randomizing metabolic networks under the (bio)chemical constraint of mass balance. The tractability of our method follows from the concept of mass equivalence classes, defined on the representation of compounds in the vector space over chemical elements. We finally demonstrate the uniformity of the proposed method on seven genome-scale metabolic networks, and empirically validate the theoretical findings. The proposed method allows a biologically meaningful estimation of significance for metabolic network properties.
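For readers unfamiliar with the generic scheme that the abstract criticizes, the following is a minimal sketch of switch randomization (degree-preserving edge swapping) on a simple directed graph. It is not the mass-balanced algorithm the paper proposes, and the function names, the edge-list input format, and the fixed seed are assumptions made for this example.

```python
import random
from collections import Counter


def switch_randomize(edges, n_swaps, seed=0):
    """Randomize a simple directed graph by repeated edge switches.

    Each accepted switch replaces (a, b) and (c, d) with (a, d) and (c, b),
    which leaves every node's in-degree and out-degree unchanged.
    """
    rng = random.Random(seed)
    edge_list = list(edges)
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        # Reject switches that would create a self-loop or a duplicate edge.
        if a == d or c == b or (a, d) in edge_set or (c, b) in edge_set:
            continue
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_list


def degree_sequences(edges):
    """Return (out-degree, in-degree) counters for an edge list."""
    out_deg = Counter(src for src, _ in edges)
    in_deg = Counter(dst for _, dst in edges)
    return out_deg, in_deg
```

The switch preserves only degrees. Applied to a metabolic network it can pair substrates and products whose elemental composition does not balance, which is consistent with the unsuitability the abstract describes.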

10.
Lones MA, Tyrrell AM. Bio Systems. 2004;76(1-3):229-238
This paper describes recent insights into the role of implicit context within the representations of evolving artefacts and specifically within the program representation used by enzyme genetic programming. Implicit context occurs within self-organising systems where a component's connectivity is both determined implicitly by its own definition and is specified in terms of the behavioural context of other components. This paper argues that implicit context is an important source of evolvability and presents experimental evidence that supports this assertion. In particular, it introduces the notion of variation filtering, suggesting that the use of implicit context within representations leads to meaningful variation filtering whereby inappropriate change is ignored and meaningful change is encouraged during evolution.

11.
Captive breeding programmes aim to provide an insurance against extinction in the wild and a source for re-introductions, making it essential to minimise genetic threats and maximise representation of wild adaptive genetic diversity. As such, genetic assessments of captive breeding programmes are increasingly common. However, these rarely include comprehensive comparisons with wild populations, and typically neutral, rather than adaptive, genetic diversity is assayed. Moreover, genetic data are rarely integrated with studbook information, which enables the most robust assessments. Here we use the European captive African wild dog (Lycaon pictus) population to demonstrate the utility of this combined approach. Specifically, we combined studbook pedigree information with genetic assessments of captive and wild samples at both neutral markers and a locus thought to be important for adaptation (a gene at the Major Histocompatibility Complex, MHC). With these data we were able to evaluate founder origin and representation, as well as the distribution and origin of genetic variation within the captive population. We found discrepancies between diversity metrics derived from neutral and adaptive markers, and between pedigree-derived and genetically derived inbreeding estimates. Overall, however, we found a large proportion of genetic diversity from wild populations to be conserved in the captive population, much of which can be attributed to recent imports from outside of the European breeding programme. Nonetheless, we also found a high incidence of inbreeding and very skewed founder contributions. Based on these results, we proposed and implemented a genetic management plan to prevent further losses of diversity and reduce inbreeding.

12.
Cullis BR, Smith AB, Beeck CP, Cowling WA. Génome. 2010;53(11):1002-1016
Exploring and exploiting variety by environment (V × E) interaction is one of the major challenges facing plant breeders. In paper I of this series, we presented an approach to modelling V × E interaction in the analysis of complex multi-environment trials using factor analytic models. In this paper, we develop a range of statistical tools which explore V × E interaction in this context. These tools include graphical displays such as heat-maps of genetic correlation matrices as well as so-called E-scaled uniplots that are a more informative alternative to the classical biplot for large plant breeding multi-environment trials. We also present a new approach to prediction for multi-environment trials that include pedigree information. This approach allows meaningful selection indices to be formed either for potential new varieties or potential parents.

13.
14.
OBJECTIVES: This study was aimed at performing a segregation analysis of total serum immunoglobulin E (tIgE) in an isolated population using the maximal genealogical information permitted by current software and computer capacities, while assessing the reliability of the best-fitting model of inheritance for tIgE through simulations. METHODS: All current Tangier Island, VA, residents (n = 664) belonged to one large extended pedigree (n = 3,501) spanning 13 generations, with an average inbreeding coefficient of 0.009. Phenotype data were obtained on 453 (68.2%) of the residents using a population-based recruitment scheme. Due to computational limitations resulting from the extremely complex pedigree structure, analysis on only two pedigree reconstructions was feasible: a reduced pedigree retaining all phenotyped individuals and their parents as 57 distinct families, and 922 nuclear families. RESULTS: Familial correlations and heritability calculations reveal a significant genetic component to tIgE in these data (heritability = 26%). The most parsimonious model to explain tIgE distribution indicated by the reduced pedigree structure was a two-distribution Mendelian model. However, larger and non-genetic models could not be rejected. Simulations over 200 replicates, performed to evaluate the reliability of this model, indicated that using restricted genealogical information had minimal impact on the results of the segregation analyses performed here.

15.
MOTIVATION: Haplotype reconstruction is an essential step in genetic linkage and association studies. Although many methods have been developed to estimate haplotype frequencies and reconstruct haplotypes for a sample of unrelated individuals, haplotype reconstruction in large pedigrees with a large number of genetic markers remains a challenging problem. METHODS: We have developed an efficient computer program, HAPLORE (HAPLOtype REconstruction), to identify all haplotype sets that are compatible with the observed genotypes in a pedigree for tightly linked genetic markers. HAPLORE consists of three steps that can serve different needs in applications. In the first step, a set of logic rules is used to reduce the number of compatible haplotypes of each individual in the pedigree as much as possible. After this step, the haplotypes of all individuals in the pedigree can be completely or partially determined. These logic rules are applicable to completely linked markers and they can be used to impute missing data and check genotyping errors. In the second step, a haplotype-elimination algorithm similar to the genotype-elimination algorithms used in linkage analysis is applied to delete incompatible haplotypes derived from the first step. All superfluous haplotypes of the pedigree members will be excluded after this step. In the third step, the expectation-maximization (EM) algorithm combined with the partition and ligation technique is used to estimate haplotype frequencies based on the inferred haplotype configurations through the first two steps. Only compatible haplotype configurations with haplotypes having frequencies greater than a threshold are retained. RESULTS: We test the effectiveness and the efficiency of HAPLORE using both simulated and real datasets. Our results show that the rule-based algorithm is very efficient for completely genotyped pedigrees. In this case, almost all of the families have one unique haplotype configuration. In the presence of missing data, the number of compatible haplotypes can be substantially reduced by HAPLORE, and the program will provide all possible haplotype configurations of a pedigree under different circumstances, if such multiple configurations exist. These inferred haplotype configurations, as well as the haplotype frequencies estimated by the EM algorithm, can be used in genetic linkage and association studies. AVAILABILITY: The program can be downloaded from http://bioinformatics.med.yale.edu.
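HAPLORE's logic rules and elimination algorithm are not spelled out in the abstract, so the following is only a sketch of the underlying idea under a stated assumption of no recombination between the markers. It enumerates every haplotype pair compatible with an unphased genotype, then discards a child's pairs that cannot be formed from one paternal and one maternal haplotype. All names and the data layout are hypothetical.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Enumerate the unordered haplotype pairs consistent with an unphased genotype.

    genotype: list of (allele, allele) tuples, one tuple per marker.
    """
    pairs = set()
    # At each marker, choose which of the two alleles goes on the first haplotype.
    for choice in product((0, 1), repeat=len(genotype)):
        h1 = tuple(g[c] for g, c in zip(genotype, choice))
        h2 = tuple(g[1 - c] for g, c in zip(genotype, choice))
        pairs.add(tuple(sorted((h1, h2))))
    return sorted(pairs)


def consistent_with_parents(child_pairs, father_pairs, mother_pairs):
    """Keep the child's haplotype pairs that take one haplotype from each parent
    (assumes the markers are completely linked, i.e. no recombination)."""
    father_haps = {h for pair in father_pairs for h in pair}
    mother_haps = {h for pair in mother_pairs for h in pair}
    return [
        (h1, h2)
        for h1, h2 in child_pairs
        if (h1 in father_haps and h2 in mother_haps)
        or (h2 in father_haps and h1 in mother_haps)
    ]
```

A genotype with k heterozygous markers has 2^(k-1) compatible pairs when k is at least 1, so the brute-force enumeration grows quickly. Rule-based reduction of the kind the abstract describes aims to avoid enumerating most of them.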

16.
Pasekov VP. Genetika. 2000;36(2):249-256
A method for collecting genealogical data with respect to an individual, a family, and members of the whole population is suggested. The essence of vertical pedigree construction consists of the same type of steps for filling in data (in a fixed order that excludes skips in the enumeration of lines of descent) about the father and the mother of the next ancestor. Each number in the resulting ordered list of ancestors uniquely determines a path (line of descent) to the given pedigree member. The path is explicitly described by a sequence of digits 0 and 1 (corresponding to the sequence of fathers and mothers in the line of descent) in the binary notation of this number. As a result, a pedigree is presented as a set of numbered rows that contain information which uniquely identifies direct ancestors as individual persons. Results of joining separate pedigrees are recorded as a family list that contains lists of children for each parental pair. A pair of parents (more exactly, pointers to their families in the previous generation and numbers of pair members in their families) plays the role of the family "heading." Such a family list permits one to trace lines of descent and relationships for any population members presented in the list. It contains all genealogical information within the bounds of the study in a compact form. Here the process of collection requires considerably less time than traditional graphic representation of pedigrees. In addition, due to repeated checks of data during accumulation of material, error is minimized. Using pedigrees that have been collected, it is possible to calculate the coefficient of inbreeding manually. In connection with the wide prevalence of personal computers at present, it is also important that the data received are in fact ready for direct input to a computer for further automated data processing.
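The abstract does not fix the exact numbering convention, so the sketch below assumes an Ahnentafel-style scheme: the proband is number 1, the father of ancestor n is 2n, and the mother is 2n + 1. Under that assumption, the binary digits after the leading 1 spell out the line of descent, with 0 for a father and 1 for a mother. The function names are hypothetical.

```python
def line_of_descent(number):
    """Decode an ancestor number into a list of 'father'/'mother' steps
    leading from the proband to that ancestor."""
    if number < 1:
        raise ValueError("ancestor numbers start at 1 (the proband)")
    # bin() gives e.g. '0b101'; drop '0b' and the leading 1 that marks the proband.
    bits = bin(number)[3:]
    return ["father" if bit == "0" else "mother" for bit in bits]


def ancestor_number(path):
    """Encode a list of 'father'/'mother' steps back into an ancestor number."""
    number = 1
    for step in path:
        number = 2 * number + (0 if step == "father" else 1)
    return number
```

For instance, 5 is `101` in binary, so `line_of_descent(5)` gives `['father', 'mother']`, the proband's paternal grandmother. Because the encoding is a bijection between numbers and paths, two rows naming the same person under different numbers indicate that the person is reached by more than one line of descent, which is the situation that produces inbreeding loops.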

17.
The contractile system is a highly non-ideal solution. The activities of its components must be determined in order to achieve a meaningful representation of cross-bridge kinetics and of chemio-mechanical transduction. Osmotic techniques may help in this respect. A few examples are presented. Protein osmotic pressure influences cross-bridges by determining (1) their free energy minimum, (2) their stiffness and (3) their contractile force.

18.
19.
A C May. Proteins. 1999;37(1):20-29
Recently, several hierarchical classifications of protein three-dimensional (3D) structures have been published. However, none of them provides any assessment of the validity of a hierarchical representation or tests the individual clusters contained within. In fact, testing here of published trees reveals that they vary in meaning. Protein structure similarity measures are then assessed in terms of the robustness of the resulting trees for 24 protein families. A meaningful tree is defined as one in which all the clusters are found to be reliable according to a jackknife test. With the use of this criterion, a previously published similarity measure described as a "better RMS" is shown in fact to be usually less suited to protein fold classification than normal RMS after superposition. Here the "best" protein structure similarity measure for hierarchical classification (that is, the one which after clustering produces the highest number of meaningful trees, 20, for the 24 families) is found to be a new one. This measure includes information on the relationship of a distance at a given aligned position in a pair to the rest of the unique distances at that position in a protein family. There are only 2 families of the 24 tested, the globins (3 trees) and Kazal-type serine proteinase inhibitors (21 trees), in which the topology (branching order) of the meaningful 3D structure-based trees is constant. Thus, a new view of protein family sequence-structure relationships is afforded by comparing meaningful trees for each family. More generally, there is a need for care in interpretation of the results of those molecular biology algorithms that force a tree structure on data without assessing its applicability. Proteins 1999;37:20-29.

20.
Recent advances in genomics resources and tools are facilitating quantitative trait locus mapping. We developed a crossbreed pedigree for mapping quantitative trait loci for hip dysplasia in dogs by crossing dysplastic Labrador Retrievers and normal Greyhounds. We show that one advantage to using a crossbreed pedigree is the increased marker informativeness in the backcross/F2 population relative to the founder populations. We also discuss three factors that affect the detection power in the context of this crossbreed pedigree: being able to detect and correct genotyping errors, increasing marker density for chromosomes with a sparse coverage, and adding individuals to the mapping population as soon as they become available.
