Similar articles
20 similar articles found (search time: 31 ms)
1.
Computational constraints currently limit exact multipoint linkage analysis to pedigrees of moderate size. We introduce new algorithms that allow analysis of larger pedigrees by reducing the time and memory requirements of the computation. We use the observed pedigree genotypes to reduce the number of inheritance patterns that need to be considered. The algorithms are implemented in a new version (version 2.1) of the software package GENEHUNTER. Performance gains depend on marker heterozygosity and on the number of pedigree members available for genotyping, but typically are 10-1,000-fold, compared with the performance of the previous release (version 2.0). As a result, families with up to 30 bits of inheritance information have been analyzed, and further increases in family size are feasible. In addition to computation of linkage statistics and haplotype determination, GENEHUNTER can also perform single-locus and multilocus transmission/disequilibrium tests. We describe and implement a set of permutation tests that allow determination of empirical significance levels in the presence of linkage disequilibrium among marker loci.
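
The "bits of inheritance information" figure can be made concrete with a small sketch. It assumes the convention commonly used with Lander-Green style algorithms, in which a pedigree with n non-founders and f founders needs 2n - f bits; the abstract itself does not spell this formula out, and the function names and the example pedigree are purely illustrative.

```python
def inheritance_bits(n_nonfounders: int, n_founders: int) -> int:
    """Return 2n - f, the assumed bit count of an inheritance vector."""
    if n_nonfounders < 0 or n_founders < 0:
        raise ValueError("counts must be non-negative")
    return 2 * n_nonfounders - n_founders


def n_inheritance_patterns(bits: int) -> int:
    """Number of distinct inheritance patterns for a given bit count."""
    return 2 ** bits


# Hypothetical pedigree: 20 non-founders and 10 founders.
bits = inheritance_bits(20, 10)
print(bits, n_inheritance_patterns(bits))  # 30 1073741824
```

Under this assumption, each additional bit doubles the number of patterns to enumerate, which is why pruning patterns that are incompatible with the observed genotypes matters so much for larger families.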

2.
In complex disease studies, it is crucial to perform multipoint linkage analysis with many markers and to use robust nonparametric methods that take account of all pedigree information. Currently available methods fall short in both regards. In this paper, we describe how to extract complete multipoint inheritance information from general pedigrees of moderate size. This information is captured in the multipoint inheritance distribution, which provides a framework for a unified approach to both parametric and nonparametric methods of linkage analysis. Specifically, the approach includes the following: (1) Rapid exact computation of multipoint LOD scores involving dozens of highly polymorphic markers, even in the presence of loops and missing data. (2) Non-parametric linkage (NPL) analysis, a powerful new approach to pedigree analysis. We show that NPL is robust to uncertainty about mode of inheritance, is much more powerful than commonly used nonparametric methods, and loses little power relative to parametric linkage analysis. NPL thus appears to be the method of choice for pedigree studies of complex traits. (3) Information-content mapping, which measures the fraction of the total inheritance information extracted by the available marker data and points out the regions in which typing additional markers is most useful. (4) Maximum-likelihood reconstruction of many-marker haplotypes, even in pedigrees with missing data. We have implemented NPL analysis, LOD-score computation, information-content mapping, and haplotype reconstruction in a new computer package, GENEHUNTER. The package allows efficient multipoint analysis of pedigree data to be performed rapidly in a single user-friendly environment.

3.
Wang J. Genetics, 2001, 157(2): 867-874
An approach to the optimal utilization of marker and pedigree information in minimizing the rates of inbreeding and genetic drift at the average locus of the genome (not just the marked loci) in a small diploid population is proposed, and its efficiency is investigated by stochastic simulations. The approach is based on estimating the expected pedigree of each chromosome by using marker and individual pedigree information and minimizing the average coancestry of selected chromosomes by quadratic integer programming. It is shown that the approach is much more effective and much less computer demanding in implementation than previous ones. For pigs with 10 offspring per mother genotyped for two markers (each with four alleles at equal initial frequency) per chromosome of 100 cM, the approach can increase the average effective size for the whole genome by approximately 40 and 55% if mating ratios (the number of females mated with a male) are 3 and 12, respectively, compared with the corresponding values obtained by optimizing between-family selection using pedigree information only. The efficiency of the marker-assisted selection method increases with increasing amount of marker information (number of markers per chromosome, heterozygosity per marker) and family size, but decreases with increasing genome size. For less prolific species, the approach is still effective if the mating ratio is large so that a high marker-assisted selection pressure on the rarer sex can be maintained.

4.
Assessing the genetic diversity in small farm animal populations   Total citations: 1 (self-citations: 0; citations by others: 1)
Genetic variation is vital for populations to adapt to varying environments and to respond to artificial selection; therefore, any conservation and development scheme should start from assessing the state of variation in the population. There are several marker-based and pedigree-based parameters to describe genetic variation. The most suitable ones are rate of inbreeding and effective population size, because they are not dependent on the amount of pedigree records. The acceptable level for effective population size can be considered from different angles, leading to the conclusion that it should be at least 50 to 100. The estimates for the effective population size can be computed from the genealogical records or from demographic and marker information when pedigree data are not available. Marker information could also be used for paternity analysis and for estimation of coancestries. Sufficient accuracy in marker-based parameters would require typing thousands of markers. Diversity across breeds is an important source of variation to rescue problematic populations and to introgress new variants. Consideration of adaptive variation brings new aspects to the estimation of the variation between populations.
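
The link between the two recommended parameters can be sketched with the standard textbook relations, which the abstract itself does not state: the rate of inbreeding per generation is ΔF = (F_t - F_{t-1}) / (1 - F_{t-1}), and the effective population size is Ne = 1 / (2 ΔF). The function names and input values below are illustrative assumptions only.

```python
def rate_of_inbreeding(f_prev: float, f_curr: float) -> float:
    """Delta F between two successive generations, from mean inbreeding coefficients."""
    if not 0.0 <= f_prev < 1.0:
        raise ValueError("f_prev must lie in [0, 1)")
    return (f_curr - f_prev) / (1.0 - f_prev)


def effective_population_size(delta_f: float) -> float:
    """Ne = 1 / (2 * Delta F); only defined for a positive rate of inbreeding."""
    if delta_f <= 0.0:
        raise ValueError("delta_f must be positive")
    return 1.0 / (2.0 * delta_f)


# Hypothetical rates of inbreeding of 1% and 0.5% per generation.
for delta_f in (0.01, 0.005):
    print(delta_f, round(effective_population_size(delta_f)))
```

Read this way, the suggested minimum effective size of 50 to 100 corresponds to a rate of inbreeding of roughly 1% to 0.5% per generation.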

5.
Inherited diseases commonly emerge within pedigree dog populations, often due to use of repeatedly bred carrier sire(s) within a small gene pool. Accurate family records are usually available, making linkage analysis possible. However, collecting DNA and collating pedigree information from a large canine population is intrinsically difficult in many respects. The keys to a successful DNA collection program include (1) establishing and maintaining support from the pedigree breed clubs and pet owners; (2) having committed individual(s) who can devote the considerable amount of time and energy to coordinating sample collection and communicating with breeders and clubs; and (3) providing means by which genotypic and phenotypic information can be easily collected and stored. In this article we describe the clinical characteristics of inherited occipital hypoplasia/syringomyelia (Chiari type I malformation) in the cavalier King Charles spaniel and our experiences in establishing a pedigree and DNA database to study the disease.

6.
The prediction of identity by descent (IBD) probabilities is essential for all methods that map quantitative trait loci (QTL). The IBD probabilities may be predicted from marker genotypes and/or pedigree information. Here, a method is presented that predicts IBD probabilities at a given chromosomal location given data on a haplotype of markers spanning that position. The method is based on a simplification of the coalescence process, and assumes that the number of generations since the base population and the effective population size are known, although effective size may be estimated from the data. The probability that two gametes are IBD at a particular locus increases as the number of markers surrounding the locus with identical alleles increases. This effect is more pronounced when effective population size is high. Hence, as effective population size increases, the IBD probabilities become more sensitive to the marker data, which should favour finer scale mapping of the QTL. The IBD probability prediction method was developed for the situation where the pedigree of the animals was unknown (i.e. all information came from the marker genotypes), and the situation where, say, T generations of unknown pedigree are followed by some generations where pedigree and marker genotypes are known.

7.
Conditional probability methods for haplotyping in pedigrees   Total citations: 3 (self-citations: 0; citations by others: 3)
Gao G, Hoeschele I, Sorensen P, Du F. Genetics, 2004, 167(4): 2055-2065
Efficient haplotyping in pedigrees is important for the fine mapping of quantitative trait loci (QTL) or complex disease genes. To reconstruct haplotypes efficiently for a large pedigree with a large number of linked loci, two algorithms based on conditional probabilities and likelihood computations are presented. The first algorithm (the conditional probability method) produces a single, approximately optimal haplotype configuration, with computing time increasing linearly in the number of linked loci and the pedigree size. The other algorithm (the conditional enumeration method) identifies a set of haplotype configurations with high probabilities conditional on the observed genotype data for a pedigree. Its computing time increases less than exponentially with the size of a subset of the set of person-loci with unordered genotypes and linearly with its complement. The size of the subset is controlled by a threshold parameter. The set of identified haplotype configurations can be used to estimate the identity-by-descent (IBD) matrix at a map position for a pedigree. The algorithms have been tested on published and simulated data sets. The new haplotyping methods are much faster and provide more information than several existing stochastic and rule-based methods. The accuracies of the new methods are equivalent to or better than those of these existing methods.

8.
Here, we report genotyping conditions for 434 new polymorphic pig microsatellite markers containing trinucleotide and tetranucleotide repeat motifs. Microsatellite sequences were detected in silico from bacterial artificial chromosome (BAC) clone end sequences and mapped to the pig genome. A set of 22 microsatellites is described, which can be separated in a simultaneous electrophoresis by multiplexing across a large size range, in combination with 4-colour labelling. Marker information content and false pedigree exclusion probabilities are documented in five purebred populations, allowing assessment of this panel in pig parentage testing applications. Combined exclusion probabilities >99.7% were achieved in all pedigree test cases.
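
A combined exclusion probability is usually obtained from per-marker values under the assumption that the markers are independent: P = 1 - ∏(1 - PE_i). The sketch below uses that standard relation; the abstract does not report per-marker values, so the inputs here are hypothetical and chosen only to show how a 22-marker panel can exceed 99.7%.

```python
from math import prod
from typing import Iterable


def combined_exclusion(per_marker: Iterable[float]) -> float:
    """Combine per-marker exclusion probabilities, assuming independent markers."""
    values = list(per_marker)
    if any(not 0.0 <= p <= 1.0 for p in values):
        raise ValueError("each exclusion probability must lie in [0, 1]")
    # A false parent escapes exclusion only if every single marker fails to exclude.
    return 1.0 - prod(1.0 - p for p in values)


# Hypothetical panel: 22 markers, each excluding a false parent 25% of the time.
panel = [0.25] * 22
print(combined_exclusion(panel) > 0.997)  # True
```

The independence assumption is the main caveat: closely linked markers would make the real combined value lower than this product suggests.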

9.
The problem of ascertainment in segregation analysis arises when families are selected for study through ascertainment of affected individuals. In this case, ascertainment must be corrected for in data analysis. However, methods for ascertainment correction are not available for many common sampling schemes, e.g., sequential sampling of extended pedigrees (except in the case of "single" selection). Concerns about whether ascertainment correction is even required for large pedigrees, about whether and how multiple probands in the same pedigree can be taken into account properly, and about how to apply sequential sampling strategies have occupied many investigators in recent years. We address these concerns by reconsidering a central issue, namely, how to handle pedigree structure (including size). We introduce a new distinction, between sampling in such a way that observed pedigree structure does not depend on which pedigree members are probands (proband-independent [PI] sampling) and sampling in such a way that observed pedigree structure does depend on who are the probands (proband-dependent [PD] sampling). This distinction corresponds roughly (but not exactly) to the distinction between fixed-structure and sequential sampling. We show that conditioning on observed pedigree structure in ascertained data sets obtained under PD sampling is not in general correct (with the exception of "single" selection), while PI sampling of pedigree structures larger than simple sibships is generally not possible. Yet, in practice one has little choice but to condition on observed pedigree structure. We conclude that the problem of genetic modeling in ascertained data sets is, in most situations, literally intractable. We recommend that future efforts focus on the development of robust approximate approaches to the problem.

10.
Captive breeding programs are an important tool for the conservation of endangered species. These programs are commonly managed using pedigrees containing information about the history of each individual's family, such as breeding pairs and parentage. However, there are some species that are kept in groups where it is hard to distinguish between particular individuals within the group, making it very difficult to record any information at an individual level. Currently, software and methods commonly used for registering and analyzing pedigrees to help manage populations at an individual level are not adequate for managing these group-living species. Therefore, there is a need to further develop these tools and methodologies for pedigree analysis to better manage group-living species. PMx is a program used for the management of ex situ populations in zoos and aquariums. We adapted the pedigree analysis method implemented in PMx to analyze pedigrees (records of descendant lineages) of group-living species. In addition, we developed a group pedigree data entry sheet and group2PMx, a converter program that enables group datasets to be imported into PMx. We show how pedigree analysis of a group-living species can be used for population management using the studbook of the endangered Texas blind cave salamander Eurycea rathbuni. Such analyses of the pedigree of groups can improve the management of group-living species in ex situ breeding programs. Firstly, it enables better management decisions based on more accurate genetic measures between groups, allowing for greater control of inbreeding. Secondly, it can improve the conditions in which group-living species are held by adapting husbandry practices to better reflect conditions of these species living in the wild. The use of the spreadsheet and group2PMx extends the application of PMx, allowing conservation managers and other institutions outside the zoo and aquarium community to easily import and analyze their pedigree data.

11.
Wijsman EM. Human Genetics, 2012, 131(10): 1555-1563
Rare variation is the current frontier in human genetics. The large pedigree design is practical, efficient, and well-suited for investigating rare variation. In large pedigrees, specific rare variants that co-segregate with a trait will occur in sufficient numbers so that effects can be measured, and evidence for association can be evaluated, by making use of methods that fully use the pedigree information. Evidence from linkage analysis can focus investigation, both reducing the multiple testing burden and expanding the variants that can be evaluated and followed up, as recent studies have shown. The large pedigree design requires only a small fraction of the sample size needed to identify rare variants of interest in population-based designs, and many highly suitable, well-understood, and available statistical and computational tools already exist. Samples consisting of large pedigrees with existing rich phenotype and genome scan data should be prime candidates for high-throughput sequencing in the search for the determinants of complex traits.

12.
To examine constraints on evolution of larger body size in two stunted populations of brook charr (Salvelinus fontinalis) from a single river in Cape Race, Newfoundland, Canada, we measured viability selection acting on length-at-age traits, and estimated quantitative genetic parameters in situ (following reconstruction of pedigree information from microsatellite data). Furthermore, we tested for phenotypic differentiation between the populations, and for the association of high juvenile growth with early maturity that is predicted by life history theory. Within each population, selection differentials and estimates of heritabilities for length-at-age traits suggested that evolution of larger size is prevented by both selective and genetic constraints. Between the populations, phenotypic differentiation was found in length-at-age and age of maturation traits, whereas early maturation was associated with increased juvenile growth (relative to adult growth) both within and between populations. The results suggest an adaptive plastic response in age of maturation to juvenile growth rates that have a largely environmental basis of determination.

13.
Two functions for pedigree-drawing available in R (http://www.r-project.org): plot.pedigree in kinship and pedtodot in gap are described. The latter requires graphviz (http://www.graphviz.org). They can produce many pedigree diagrams quickly into a single file, serving as alternatives to programs that only offer interactive use. Availability: Packages kinship and gap are available from http://cran.r-project.org.

14.
Estimates of population size are critical for conservation and management, but accurate estimates are difficult to obtain for many species. Noninvasive genetic methods are increasingly used to estimate population size, particularly in elusive species such as large carnivores, which are difficult to count by most other methods. In most such studies, genotypes are treated simply as unique individual identifiers. Here, we develop a new estimator of population size based on pedigree reconstruction. The estimator accounts for individuals that were directly sampled, individuals that were not sampled but whose genotype could be inferred by pedigree reconstruction, and individuals that were not detected by either of these methods. Monte Carlo simulations show that the population estimate is unbiased and precise if sampling is of sufficient intensity and duration. Simulations also identified sampling conditions that can cause the method to overestimate or underestimate true population size; we present and discuss methods to correct these potential biases. The method detected 2–21% more individuals than were directly sampled across a broad range of simulated sampling schemes. Genotypes are more than unique identifiers, and the information about relationships in a set of genotypes can improve estimates of population size.

15.

Background

Genotype imputation can help reduce genotyping costs, particularly for implementation of genomic selection. In applications entailing large populations, recovering the genotypes of untyped loci using information from reference individuals that were genotyped with a higher density panel is computationally challenging. Popular imputation methods are based upon the hidden Markov model and have computational constraints due to an intensive sampling process. A fast, deterministic approach, which makes use of both family and population information, is presented here. All individuals are related and, therefore, share haplotypes which may differ in length and frequency based on their relationships. The method starts with family imputation if pedigree information is available, and then exploits close relationships by searching for long haplotype matches in the reference group using overlapping sliding windows. The search continues as the window size is shrunk in each chromosome sweep in order to capture more distant relationships.

Results

The proposed method gave imputation accuracy higher than or similar to that of Beagle and Impute2 in cattle data sets when all available information was used. When close relatives of target individuals were present in the reference group, the method resulted in higher accuracy compared to the other two methods even when the pedigree was not used. Rare variants were also imputed with higher accuracy. Finally, computing requirements were considerably lower than those of Beagle and Impute2. The presented method took 28 minutes to impute from 6 k to 50 k genotypes for 2,000 individuals with a reference size of 64,429 individuals.

Conclusions

The proposed method efficiently makes use of information from close and distant relatives for accurate genotype imputation. In addition to its high imputation accuracy, the method is fast, owing to its deterministic nature and, therefore, it can easily be used in large data sets where the use of other methods is impractical.

16.
Lee SH, Van der Werf JH, Tier B. Genetics, 2005, 171(4): 2063-2072
Linkage analysis to find inheritance states and haplotype configurations is an essential step in linkage and association mapping. The linkage analysis is routinely based upon observed pedigree information and marker genotypes for individuals in the pedigree. It is not feasible for exact methods to use all such information for a large complex pedigree, especially when many genotypic data are missing. Proposed Markov chain Monte Carlo approaches such as a single-site Gibbs sampler or the meiosis Gibbs sampler are able to handle a complex pedigree with sparse genotypic data; however, they often have reducibility problems, causing biased estimates. We present a combined method, applying the random walk approach to the reducible sites in the meiosis sampler. Therefore, one can efficiently obtain reliable estimates such as identity-by-descent coefficients between individuals based on inheritance states or haplotype configurations, and a wider range of data can be used for mapping of quantitative trait loci within a reasonable time.

17.
With the widespread availability of SNP genotype data, there is great interest in analyzing pedigree haplotype data. Intermarker linkage disequilibrium for microsatellite markers is usually low due to their physical distance; however, for dense maps of SNP markers, there can be strong linkage disequilibrium between marker loci. Linkage analysis (parametric and nonparametric) and family-based association studies are currently being carried out using dense maps of SNP marker loci. Monte Carlo methods are often used for both linkage and association studies; however, to date there are no programs available which can generate haplotype and/or genotype data consisting of a large number of loci for pedigree structures. SimPed is a program that quickly generates haplotype and/or genotype data for pedigrees of virtually any size and complexity. Marker data either in linkage disequilibrium or equilibrium can be generated for more than 20,000 diallelic or multiallelic marker loci. Haplotypes and/or genotypes are generated for pedigree structures using specified genetic map distances and haplotype and/or allele frequencies. The simulated data generated by SimPed are useful for a variety of purposes, including evaluating methods that estimate haplotype frequencies for pedigree data, evaluating type I error due to intermarker linkage disequilibrium, and estimating empirical p values for linkage and family-based association studies.

18.
Fan R, Jung J. Human Heredity, 2003, 56(4): 166-187
This paper proposes variance component models for high resolution joint linkage disequilibrium (LD) and linkage mapping of quantitative trait loci (QTL) based on sibship data; this can include population data if independent individuals are treated as single sibships. One application of these models is late onset complex disease gene mapping, when parental data are not available. The models simultaneously incorporate both LD and linkage information. The LD information is contained in mean coefficients of sibship data. The linkage information is contained in the variance-covariance matrices of trait values for sibships with at least two siblings. We derive formulas for calculating the probability of sharing two trait alleles identical by descent (IBD) for sibpairs in interval mapping of QTL; this is the coefficient of dominant variance of the trait covariance of sibpairs on major QTL. To investigate the performance of the formulas, we calculate the numerical values via the formulas and get satisfactory approximations. We compare the power and sample sizes for both LD and linkage mapping. By simulation and theoretical analysis, we compare the results with those of the Fulker and Abecasis "AbAw" approach. It is well known that the resolution of linkage analysis can be low for complex disease gene mapping. LD mapping, on the other hand, can increase mapping precision and is useful in high resolution mapping. Linkage analysis is less sensitive to population subdivisions and admixtures. The level of LD is sensitive to population stratification, which may easily lead to spurious association. Performing a joint analysis of LD and linkage mapping can help to overcome the limits of both approaches. Moreover, the advantages of the two complementary strategies can be utilized maximally. In practice, linkage analysis may be performed using pedigree data to identify suggestive linkage between markers and trait loci based on a sparse marker map. In the presence of linkage, joint LD and linkage mapping can be carried out to do fine gene mapping based on a dense genetic map using both pedigree and population data. Population and pedigree data of any type can be combined to perform a joint analysis of high resolution LD and linkage mapping of QTL by generalizing the method.

19.
The dog is a valuable model species for the genetic analysis of complex traits, and the use of genotype imputation in dogs will be an important tool for future studies. It is of particular interest to analyse the effect of factors like single nucleotide polymorphism (SNP) density of genotyping arrays and relatedness between dogs on imputation accuracy due to the acknowledged genetic and pedigree structure of dog breeds. In this study, we simulated different genotyping strategies based on data from 1179 Labrador Retriever dogs. The study involved 5826 SNPs on chromosome 1 representing the high density (HighD) array; the low-density (LowD) array was simulated by masking different proportions of SNPs on the HighD array. The correlations between true and imputed genotypes for a realistic masking level of 87.5% ranged from 0.92 to 0.97, depending on the scenario used. A correlation of 0.92 was found for a likely scenario (10% of dogs genotyped using HighD, 87.5% of HighD SNPs masked in the LowD array), which indicates that genotype imputation in Labrador Retrievers can be a valuable tool to reduce experimental costs while increasing sample size. Furthermore, we show that genotype imputation can be performed successfully even without pedigree information and with low relatedness between dogs in the reference and validation sets. Based on these results, the impact of genotype imputation was evaluated in a genome-wide association analysis and genomic prediction in Labrador Retrievers.
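
The evaluation design described here (mask a fraction of high-density SNPs, impute them, then correlate true and imputed genotypes) can be sketched as follows. This is only an illustration of the masking and scoring steps, not of the imputation itself; the random choice of retained SNPs, the function names, and the toy genotype vectors are assumptions, and the study may have selected its low-density SNPs differently.

```python
import math
import random
from typing import List, Sequence


def low_density_panel(n_snps: int, masked_fraction: float, seed: int = 0) -> List[int]:
    """Indices of SNPs kept on a simulated low-density array (random choice assumed)."""
    n_keep = round(n_snps * (1.0 - masked_fraction))
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_snps), n_keep))


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between true and imputed genotype codes (0, 1, 2)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# 5826 SNPs with 87.5% masked leaves 728 SNPs on the simulated LowD array.
kept = low_density_panel(5826, 0.875)
print(len(kept))  # 728

# Toy genotypes for one masked SNP across six dogs (hypothetical values).
true_genotypes = [0, 1, 2, 1, 0, 2]
imputed_genotypes = [0, 1, 2, 2, 0, 2]
print(round(pearson(true_genotypes, imputed_genotypes), 2))
```

Correlation is a common accuracy measure in this setting because, unlike a simple concordance rate, it is less flattering for SNPs with a rare allele.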

20.
Quantitative trait loci (QTL) affecting the phenotype of interest can be detected using linkage analysis (LA), linkage disequilibrium (LD) mapping or a combination of both (LDLA). The LA approach uses information from recombination events within the observed pedigree and LD mapping from the historical recombinations within the unobserved pedigree. We propose the Bayesian variable selection approach for combined LDLA analysis for single-nucleotide polymorphism (SNP) data. The novel approach uses both sources of information simultaneously as is commonly done in plant and animal genetics, but it makes fewer assumptions about population demography than previous LDLA methods. This differs from approaches in human genetics, where LDLA methods use LA information conditional on LD information or the other way round. We argue that the multilocus LDLA model is more powerful for the detection of phenotype–genotype associations than single-locus LDLA analysis. To illustrate the performance of the Bayesian multilocus LDLA method, we analyzed simulation replicates based on real SNP genotype data from small three-generational CEPH families and compared the results with the commonly used quantitative transmission disequilibrium test (QTDT). This paper is intended to be conceptual in the sense that it is not meant to be a practical method for analyzing high-density SNP data, which is more common. Our aim was to test whether this approach can function in principle.
