Similar literature
1.
Genetic markers are important resources for individual identification and parentage assessment. Although short tandem repeats (STRs) have been the traditional DNA marker, technological advances have led to single nucleotide polymorphisms (SNPs) becoming an attractive alternative. SNPs can be highly multiplexed and automatically scored, which allows for easier standardization and sharing among laboratories. Equine parentage is currently assessed using STRs. We obtained a publicly available SNP dataset of 729 horses representing 32 diverse breeds. A proposed set of 101 SNPs was analyzed for DNA typing suitability. The overall minor allele frequency of the panel was 0.376 (range 0.304–0.419), with per breed probability of identities ranging from 5.6 × 10^−35 to 1.86 × 10^−42. When one parent was available, exclusion probabilities ranged from 0.9998 to 0.999996, although when both parents were available, all breeds had exclusion probabilities greater than 0.9999999. A set of 388 horses from 35 breeds was genotyped to evaluate marker performance on known families. The set included 107 parent–offspring pairs and 101 full trios. No horses shared identical genotypes across all markers, indicating that the selected set was sufficient for individual identification. All pairwise comparisons were classified using ISAG rules, with one or two excluding markers considered an accepted parent–offspring pair, two or three excluding markers considered doubtful and four or more excluding markers rejecting parentage. The panel had an overall accuracy of 99.9% for identifying true parent–offspring pairs. Our developed marker set is both present on current generation SNP chips and can be highly multiplexed in standalone panels and thus is a promising resource for SNP‐based DNA typing.
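The exclusion-count classification described in this abstract can be illustrated with a minimal sketch. It assumes genotypes are coded as 0, 1 or 2 copies of one allele (None for a missing call) and treats a marker as "excluding" when parent and offspring are opposite homozygotes. The function names are hypothetical, and the cut-offs are left as required parameters rather than fixed, so the caller supplies whichever ISAG thresholds apply.

```python
from typing import Optional, Sequence

# Assumed coding: 0, 1 or 2 copies of one allele; None means no genotype call.
Genotype = Optional[int]


def count_exclusions(parent: Sequence[Genotype], offspring: Sequence[Genotype]) -> int:
    """Count markers at which parent and offspring are opposite homozygotes."""
    if len(parent) != len(offspring):
        raise ValueError("genotype vectors must cover the same markers")
    exclusions = 0
    for p, o in zip(parent, offspring):
        if p is None or o is None:
            continue  # a missing call cannot exclude anything
        if {p, o} == {0, 2}:
            exclusions += 1
    return exclusions


def classify_pair(exclusions: int, accept_max: int, doubtful_max: int) -> str:
    """Map an exclusion count to a verdict using caller-supplied cut-offs."""
    if exclusions <= accept_max:
        return "accepted"
    if exclusions <= doubtful_max:
        return "doubtful"
    return "rejected"
```

For example, `classify_pair(count_exclusions(parent, offspring), accept_max=1, doubtful_max=3)` would apply one possible set of cut-offs; those two values are illustrative only.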

2.
In this study, we identified porcine single nucleotide polymorphisms (SNPs) by aligning eight sequences generated with two approaches: amplification of 665 intronic regions using one sample from each of eight breeds, including three East Asian pigs, and amplification of 289 3'-UTR regions using two samples from each of four major commercial breeds. The 1,760 and 599 SNPs were validated using two 384-sample DNA panels by matrix-assisted laser desorption ionization time-of-flight mass spectrometry. The phylogenetic tree and Structure analyses classified the pigs into two large clusters: Euro-American and East Asian populations. The membership proportions, however, differed between inferred clusters for K = 2 generated by the two approaches. With intronic SNPs, Euro-American breeds constituted about 100% of the Euro-American cluster, but with 3'-UTR SNPs, about 17% of the East Asian cluster comprised five Euro-American breeds. The differences in the SNP discovery panels may affect population structure found in study panels of large samples.

3.
Single nucleotide polymorphisms (SNPs) have become an important type of marker for commercial diagnostic and parentage genotyping applications as automated genotyping systems have been developed that yield accurate genotypes. Unfortunately, allele frequencies for public SNP markers in commercial pig populations have not been available. To fulfil this need, SNP markers previously mapped in the USMARC swine reference population were tested in a panel of 155 boars that were representative of US purebred Duroc, Hampshire, Landrace and Yorkshire populations. Multiplex assay groups of 5-7 SNP assays/group were designed and genotypes were determined using Sequenom's MassARRAY system. Of 80 SNPs that were evaluated, 60 SNPs with minor allele frequencies >0.15 were selected for the final panel of markers. Overall identity power across breeds was 4.6 × 10^−23, but within-breed values ranged from 4.3 × 10^−14 (Hampshire) to 2.6 × 10^−22 (Yorkshire). Parentage exclusion probability with only one sampled parent was 0.9974 (all data) and ranged from 0.9594 (Hampshire) to 0.9963 (Yorkshire) within breeds. Sire exclusion probability when the dam's genotype was known was 0.99998 (all data) and ranged from 0.99868 (Hampshire) to 0.99997 (Yorkshire) within breeds. Power of exclusion was compared between the 60 SNP and 10 microsatellite markers. The parental exclusion probabilities for SNP and microsatellite marker panels were similar, but the SNP panel was much more sensitive for individual identification. This panel of SNP markers is theoretically sufficient for individual identification of any pig in the world and is publicly available.
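As a rough illustration of how panel-level identity and exclusion figures like those above arise from allele frequencies, the sketch below uses textbook formulas for a biallelic locus under Hardy-Weinberg equilibrium with independent loci. These are assumptions for illustration and not necessarily the formulas the authors used; the function names are hypothetical. Per-locus identity is the sum of squared genotype frequencies, and per-locus exclusion (one candidate parent, other parent unknown) is the chance that a random individual and the offspring are opposite homozygotes.

```python
from math import prod
from typing import Iterable


def locus_identity(maf: float) -> float:
    """P(two random individuals share a genotype) at one biallelic locus (HWE)."""
    p, q = maf, 1.0 - maf
    return p**4 + (2 * p * q) ** 2 + q**4


def locus_exclusion_one_parent(maf: float) -> float:
    """P(a random non-parent is an opposite homozygote to the offspring)."""
    p, q = maf, 1.0 - maf
    return 2 * p**2 * q**2


def panel_identity(mafs: Iterable[float]) -> float:
    """Combined probability of identity, assuming independent loci."""
    return prod(locus_identity(m) for m in mafs)


def panel_exclusion(mafs: Iterable[float]) -> float:
    """Combined exclusion probability: at least one locus excludes."""
    return 1.0 - prod(1.0 - locus_exclusion_one_parent(m) for m in mafs)
```

Calling `panel_identity` and `panel_exclusion` on a list of per-SNP minor allele frequencies shows why adding markers with intermediate frequencies quickly drives identity towards zero and exclusion towards one.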

4.
Commercial single nucleotide polymorphism (SNP) arrays have been recently developed for several species and can be used to identify informative markers to differentiate breeds or populations for several downstream applications. To identify the most discriminating genetic markers among thousands of genotyped SNPs, a few statistical approaches have been proposed. In this work, we compared several SNP preselection methods (Delta, Fst and principal component analyses (PCA)) in addition to Random Forest classifications to analyse SNP data from six dairy cattle breeds, including cosmopolitan (Holstein, Brown and Simmental) and autochthonous Italian breeds raised in two different regions and subjected to limited or no breeding programmes (Cinisara and Modicana, raised only in Sicily, and Reggiana, raised only in Emilia Romagna). From these classifications, two panels of 96 and 48 SNPs that contain the most discriminant SNPs were created for each preselection method. These panels were evaluated in terms of the ability to discriminate as a whole and breed-by-breed, as well as linkage disequilibrium within each panel. The obtained results showed that for the 48-SNP panel, the error rate increased mainly for autochthonous breeds, probably as a consequence of their admixed origin, lower selection pressure and ascertainment bias in the construction of the SNP chip. The 96-SNP panels were generally better able to discriminate all breeds. The panel derived by PCA-chrom (obtained by a preselection chromosome by chromosome) could identify informative SNPs that were particularly useful for the assignment of minor breeds, which reached the lowest value of Out Of Bag error even in the Cinisara, whose value was quite high in all other panels. Moreover, this panel also contained the lowest number of SNPs in linkage disequilibrium. Several selected SNPs are located near genes affecting breed-specific phenotypic traits (coat colour and stature) or associated with production traits.
In general, our results demonstrated the usefulness of Random Forest in combination with other reduction techniques to identify population informative SNPs.

5.
The International Society of Animal Genetics (ISAG) has chosen nine microsatellites (international marker set) as a standard that should be included in all cattle parentage studies. They are BM1824, BM2113, INRA023, SPS115, TGLA122, TGLA126, TGLA227, ETH10, and ETH225. We decided to ascertain whether this microsatellite set could be used to determine ancestral proportions in individual animals of synthetic breeds produced by crossing zebu and taurine cattle. Since the genotypes of these markers are routinely available, this would constitute a practical and cost-free method to estimate the ancestry of synthetic breed animals. Genotypes of 100 Gir and 100 Holstein animals were examined for this ISAG marker set. As expected, there were very significant allele frequency differences between the two breeds at most loci. We also typed 20 Girolando animals for which there was complete genealogical information. "Structure" software easily distinguished Holstein and Gir animals based on their microsatellite genotypes; it also attributed the genomic proportion of zebu and taurine of each of the 20 Girolando animals. The proportion of Holstein ancestry was then regressed on the genealogical data; there was a highly significant correlation (r = 0.84, P < 0.0001). The nine microsatellites that compose the ISAG international marker set were capable of estimating the ancestral Gir and Holstein genomic proportions in individual Girolando animals within narrow confidence limits. This microsatellite set might also be useful for estimating the proportions of taurine and zebu origins in commercial meat products.

6.
7.
The use of high‐throughput, low‐density sequencing approaches has dramatically increased in recent years in studies of eco‐evolutionary processes in wild populations and domestication in commercial aquaculture. Most of these studies focus on identifying panels of SNP loci for a single downstream application, whereas there have been few studies examining the trade‐offs for selecting panels of markers for use in multiple applications. Here, we detail the use of a bioinformatic workflow for the development of a dual‐purpose SNP panel for parentage and population assignment, which included identifying putative SNP loci, filtering for the most informative loci for the two tasks, designing effective multiplex PCR primers, optimizing the SNP panel for performance, and performing quality control steps for downstream applications. We applied this workflow to two adjacent Alaskan Sockeye Salmon populations and identified a GTseq panel of 142 SNP loci for parentage and 35 SNP loci for population assignment. Only 50–75 panel loci were necessary for >95% accurate parentage, whereas population assignment success, with all 172 panel loci, ranged from 93.9% to 96.2%. Finally, we discuss the trade‐offs and complexities of the decision‐making process that drives SNP panel development, optimization, and testing.

8.
The Korean Hanwoo cattle have been intensively selected for production traits, especially high intramuscular fat content. It is believed that ancient crossings between different breeds contributed to forming the Hanwoo, but little is known about the genomic differences and similarities between other cattle breeds and the Hanwoo. In this work, cattle breeds were grouped by origin into four types and used for comparisons: the Europeans (represented by six breeds), zebu (Nelore), African taurine (N'Dama) and Hanwoo. All animals had genotypes for around 680 000 SNPs after quality control of genotypes. Average heterozygosity was lower in Nelore and N'Dama (0.22 and 0.21 respectively) than in Europeans (0.26–0.31, with Shorthorn as outlier at 0.24) and Hanwoo (0.29). Pairwise FST analyses demonstrated that Hanwoo are more related to European cattle than to Nelore, with N'Dama in an intermediate position. This finding was corroborated by principal components and unsupervised hierarchical clustering. Using genome‐wide smoothed FST, 55 genomic regions potentially under positive selection in Hanwoo were identified. Among these, 29 were regions also detected in previous studies. Twenty‐four regions were exclusive to Hanwoo, and a number of other regions were shared with one or two of the other groups. These regions overlap a number of genes that are related to immune, reproduction and fatty acid metabolism pathways. Further analyses are needed to better characterize the ancestry of the Hanwoo cattle and to define the genes responsible for the identified selection peaks.

9.
We propose the use of single nucleotide polymorphisms (SNPs) instead of polymorphic microsatellite markers for individual identification and parentage control in cattle. To this end, we present an initial set of 37 SNP markers together with a gender-specific SNP for identity control and parentage testing in the Holstein, Fleckvieh and Braunvieh breeds. To obtain suitable SNPs, a total of 91.13 kb of random genomic DNA was screened yielding 531 SNPs. These, and 43 previously identified SNPs, were subjected to the following selection criteria: (1) the frequency of the minor allele must be larger than 0.1 in at least two of the three examined breeds, and (2) markers should not be linked closely. Allele frequencies were estimated by analysing sequencing traces of pooled DNA or by genotyping individual DNA samples. The selected SNP loci were physically mapped by radiation hybrid mapping or by fluorescence in situ hybridization, and tested against the neutral mutation hypothesis. The presented marker set theoretically allows probabilities of identity less than 10^−13 for individual verification and exclusion powers exceeding 99.99% for parentage testing.
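Selection criterion (1) in this abstract lends itself to a short sketch: keep a SNP when its minor allele frequency exceeds 0.1 in at least two of the examined breeds. The data layout (a dictionary of per-breed frequencies keyed by SNP name), the function name and the SNP identifiers are assumptions for illustration. Criterion (2), avoiding closely linked markers, needs map positions and is not covered here.

```python
from typing import Dict, List


def select_snps(
    maf_by_snp: Dict[str, Dict[str, float]],
    min_maf: float = 0.1,
    min_breeds: int = 2,
) -> List[str]:
    """Return SNPs whose MAF exceeds min_maf in at least min_breeds breeds."""
    selected = []
    for snp, breed_mafs in maf_by_snp.items():
        passing = sum(1 for maf in breed_mafs.values() if maf > min_maf)
        if passing >= min_breeds:
            selected.append(snp)
    return selected


# Hypothetical frequencies for three candidate SNPs in the three breeds.
example = {
    "snp_a": {"Holstein": 0.32, "Fleckvieh": 0.05, "Braunvieh": 0.21},
    "snp_b": {"Holstein": 0.08, "Fleckvieh": 0.04, "Braunvieh": 0.40},
    "snp_c": {"Holstein": 0.15, "Fleckvieh": 0.12, "Braunvieh": 0.11},
}
print(select_snps(example))
```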

10.
Knowledge about genetic diversity and population structure is useful for designing effective strategies to improve the production, management and conservation of farm animal genetic resources. Here, we present a comprehensive genome-wide analysis of genetic diversity, population structure and admixture based on 244 animals sampled from 10 cattle populations in Asia and Africa and genotyped for 69 903 autosomal single-nucleotide polymorphisms (SNPs) mainly derived from the indicine breed. Principal component analysis, STRUCTURE and distance analysis from high-density SNP data clearly revealed that the largest genetic difference occurred between the two domestic lineages (taurine and indicine), whereas Ethiopian cattle populations represent a mosaic of the humped zebu and taurine. Estimation of the genetic influence of zebu and taurine revealed that Ethiopian cattle were characterized by considerable levels of introgression from South Asian zebu, whereas Bangladeshi populations shared very low taurine ancestry. The relationships among Ethiopian cattle populations reflect their history of origin and admixture rather than phenotype-based distinctions. The high within-individual genetic variability observed in Ethiopian cattle represents an untapped opportunity for adaptation to changing environments and for implementation of within-breed genetic improvement schemes. Our results provide a basis for future applications of genome-wide SNP data to exploit the unique genetic makeup of indigenous cattle breeds and to facilitate their improvement and conservation.

11.
Humped African cattle, which are differentiated into zebu and sanga types, have traditionally been classified as Bos indicus. This paper discusses existing evidence and presents new evidence supporting the classification of southern African sangas as Bos taurus and East African zebus as 'taurindicus'. Classification is based on karyotype, frequencies of DNA markers and protein polymorphisms. The Boran, an East African zebu, has an acrocentric Y chromosome typical of Bos indicus. The southern African sanga breeds have a submetacentric Y chromosome typical of Bos taurus. Frequencies of four DNA markers support the hypothesis that the Tuli, a southern African sanga, had taurine ancestors and the Boran had both taurine and indicine ancestors. Frequencies for several protein polymorphisms strongly suggest that southern African sangas have more in common with taurine than with indicine breeds, while East African zebus are an admixture of African taurine and Asian indicine breeds.

12.

Background

Parentage control is moving from short tandem repeat (STR) to single nucleotide polymorphism (SNP) systems. For SNP-based parentage control in cattle, the ISAG-ICAR Committee proposes a set of 100/200 SNPs but quality criteria are lacking. For German Holstein-Friesian cattle, with only a limited number of evaluated individuals, the exclusion probability is not well-defined. We propose a statistical procedure for excluding single SNPs from parentage control, based on case-by-case evaluation of the GenCall score, to minimize parentage exclusions caused by miscalled genotypes. Exclusion power of the ISAG-ICAR SNPs used for the German Holstein-Friesian population was adjusted based on the results of more than 25 000 individuals.

Results

Experimental data were derived from routine genomic selection analyses of the German Holstein-Friesian population using the Illumina BovineSNP50 v2 BeadChip (20 000 individuals) or the EuroG10K variant (7000 individuals). Averages and standard deviations of GenCall scores for the 200 SNPs of the ISAG-ICAR recommended panel were calculated and used to calculate the downward Z-value. Based on minor allelic frequencies in the Holstein-Friesian population, one minus exclusion probability was equal to 1.4 × 10^−10 and 7.2 × 10^−26, with one and two parents, respectively. Two monomorphic SNPs from the 100-SNP ISAG-ICAR core-panel did not contribute. Simulation of 10 000 parentage control combinations, using the GenCall score data from both BeadChips, showed that with a Z-value greater than 3.66 only about 2.5% parentages were excluded, based on the ISAG-ICAR recommendations (core-panel: ≥ 90 SNPs for one, ≥ 85 SNPs for two parents). When applied to real data from 1750 single parentage assessments, the optimal threshold was determined to be Z = 5.0, with only 34 censored cases and reduction to four (0.2%) doubtful parentages. About 70 parentage exclusions due to weak genotype calls were avoided, whereas true exclusions (n = 34) were unaffected.

Conclusions

Using SNPs for parentage evaluation provides a high exclusion power also for parent identification. SNPs with a low GenCall score show a high tendency towards intra-molecular secondary structures and substantially contribute to false exclusion of parentages. We propose a method that controls this error without excluding too many parent combinations from the evaluation.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-014-0085-1) contains supplementary material, which is available to authorized users.
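The GenCall-based censoring described in the Results of this entry can be sketched roughly as follows. This is one reading of the abstract, not the authors' implementation: the downward Z-value is taken to be the number of standard deviations by which an individual call's GenCall score falls below that SNP's population mean, and calls whose Z exceeds a threshold are masked before parentage comparison. The function names and data layout are hypothetical; the default threshold of 5.0 is the value reported above.

```python
from typing import List, Optional, Sequence


def downward_z(score: float, mean: float, sd: float) -> float:
    """Standard deviations by which a GenCall score lies below the SNP mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (mean - score) / sd


def censor_weak_calls(
    genotypes: Sequence[Optional[int]],
    scores: Sequence[float],
    means: Sequence[float],
    sds: Sequence[float],
    z_threshold: float = 5.0,
) -> List[Optional[int]]:
    """Replace genotype calls with unusually low GenCall scores by None."""
    censored: List[Optional[int]] = []
    for genotype, score, mean, sd in zip(genotypes, scores, means, sds):
        if downward_z(score, mean, sd) > z_threshold:
            censored.append(None)  # weak call: ignore this SNP for this animal
        else:
            censored.append(genotype)
    return censored
```

Masked calls are simply skipped when counting exclusions, so a single weak genotype call cannot by itself turn a true parentage into a doubtful or rejected one.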

13.
DNA analysis of microsatellite markers has become a common tool for verifying parentage in breed registries and identifying individual animals that are linked to a database or owner. Panels of markers have been developed in canines, but their utility across and within a wide range of breeds has not been reported. The American Kennel Club (AKC) authorized a study to determine the power to exclude non-parents and identify individuals using DNA genotypes of 17 microsatellite markers in two panels. Cheek swab samples were voluntarily collected at Parent Breed Club National Specialty dog shows and 9561 samples representing 108 breeds were collected, averaging 88.5 dogs per breed. The primary panel of 10 markers exceeded 99% power of exclusion for canine parentage verification of 61% of the breeds. In combination with the secondary panel of seven markers, 100% of the tested breeds exceeded 99% power of exclusion. The minimum probability match rate of the first panel was 3.6 × 10^−5 averaged across breeds, and with the addition of the second panel, the probability match rate was 3.2 × 10^−8; thus the probability of another random, unrelated dog with the same genotype is very low. The results of this analysis indicated that, on average, the primary panel meets the AKC's needs for routine parentage testing, but that a combination of 10-15 genetic markers from the two panels could yield a universal canine panel with enhanced processing efficiency, reliability and informativeness.

14.
Although genomic selection offers the prospect of improving the rate of genetic gain in meat, wool and dairy sheep breeding programs, the key constraint is likely to be the cost of genotyping. Potentially, this constraint can be overcome by genotyping selection candidates for a low density (low cost) panel of SNPs with sparse genotype coverage, imputing a much higher density of SNP genotypes using a densely genotyped reference population. These imputed genotypes would then be used with a prediction equation to produce genomic estimated breeding values. In the future, it may also be desirable to impute very dense marker genotypes or even whole genome re‐sequence data from moderate density SNP panels. Such a strategy could lead to an accurate prediction of genomic estimated breeding values across breeds, for example. We used genotypes from 48 640 (50K) SNPs genotyped in four sheep breeds to investigate both the accuracy of imputation of the 50K SNPs from low density SNP panels, as well as prospects for imputing very dense or whole genome re‐sequence data from the 50K SNPs (by leaving out a small number of the 50K SNPs at random). Accuracy of imputation was low if the sparse panel had less than 5000 (5K) markers. Across breeds, it was clear that the accuracy of imputing from sparse marker panels to 50K was higher if the genetic diversity within a breed was lower, such that relationships among animals in that breed were higher. The accuracy of imputation from sparse genotypes to 50K genotypes was higher when the imputation was performed within breed rather than when pooling all the data, despite the fact that the pooled reference set was much larger. For Border Leicesters, Poll Dorsets and White Suffolks, 5K sparse genotypes were sufficient to impute 50K with 80% accuracy. For Merinos, the accuracy of imputing 50K from 5K was lower at 71%, despite a large number of animals with full genotypes (2215) being used as a reference. 
For all breeds, the relationship of individuals to the reference explained up to 64% of the variation in accuracy of imputation, demonstrating that accuracy of imputation can be increased if sires and other ancestors of the individuals to be imputed are included in the reference population. The accuracy of imputation could also be increased if pedigree information was available and was used in tracking inheritance of large chromosome segments within families. In our study, we only considered methods of imputation based on population‐wide linkage disequilibrium (largely because the pedigree for some of the populations was incomplete). Finally, in the scenarios designed to mimic imputation of high density or whole genome re‐sequence data from the 50K panel, the accuracy of imputation was much higher (86–96%). This is promising, suggesting that in silico genome re‐sequencing is possible in sheep if a suitable pool of key ancestors is sequenced for each breed.

15.
The genus Agapornis, or lovebirds, are popular pet parrots worldwide. Currently, breeders are dependent on pedigree records as a selection tool as no molecular parentage verification test is available for any of the nine species. The A. roseicollis reference genome was recently assembled. This was followed by the sequencing of the whole genomes of the parents of the reference genome individual at 30× coverage. The parents’ reads were mapped against the reference genome to identify SNPs. Over 1.6 million SNPs, shared between the parents, were discovered using the Genome Analysis Toolkit pipeline. SNPs were filtered to a panel of 480 SNPs based on Genome Analysis Toolkit parameters. The panel of 480 SNPs was genotyped in a population of 960 lovebirds across seven species. A panel of 262 SNPs was compiled that included SNPs successfully amplified across all species. The 262‐SNP panel was reduced based on the observed heterozygosity (HO) and minor allele frequency (MAF) values per SNP to include the lowest number of SNPs with the highest exclusion power for parentage verification. Two smaller panels consisting of 195 SNPs with MAF and HO values >0.1 and 40 SNPs with MAF and HO values >0.3, were constructed. The panels were verified using 43 families from different species with known relationships to evaluate the exclusion power of each panel. The 195 SNP panel with an average exclusion probability of 99.9% and MAF and HO values >0.1 was proposed as the routine Agapornis parentage verification panel.

16.
Over the last 30 years, Hanwoo has been selectively bred to improve economically important traits. Hanwoo is currently the representative Korean native beef cattle breed, and it is believed that it shared an ancestor with a Chinese breed, Yanbian cattle, until the last century. However, these two breeds have experienced different selection pressures during recent decades. Here, we whole-genome sequenced 10 animals each of Hanwoo and Yanbian cattle (20 total) using the Illumina HiSeq 2000 sequencer. A total of approximately 3.12 and 3.07 billion sequence reads were mapped to the bovine reference sequence assembly (UMD 3.1) at an average of approximately 10.71- and 10.53-fold coverage for Hanwoo and Yanbian cattle, respectively. A total of 17,936,399 single nucleotide polymorphisms (SNPs) were identified, of which 22.3% were found to be novel. By annotating the SNPs, we further retrieved numerous nonsynonymous SNPs that may be associated with traits of interest in cattle. Furthermore, we performed whole-genome screening to detect signatures of selection throughout the genome. We located several promising selective sweeps that are potentially responsible for economically important traits in cattle; the PPP1R12A gene is an example of a gene that potentially affects intramuscular fat content. These discoveries provide valuable genomic information regarding potential genomic markers that could predict traits of interest for breeding programs of these cattle breeds.

17.

Background

The Bovine HapMap Consortium has generated assay panels to genotype ~30,000 single nucleotide polymorphisms (SNPs) from 501 animals sampled from 19 worldwide taurine and indicine breeds, plus two outgroup species (Anoa and Water Buffalo). Within the larger set of SNPs we targeted 101 high density regions spanning up to 7.6 Mb with an average density of approximately one SNP per 4 kb, and characterized the linkage disequilibrium (LD) and haplotype block structure within individual breeds and groups of breeds in relation to their geographic origin and use.

Results

From the 101 targeted high-density regions on bovine chromosomes 6, 14, and 25, between 57 and 95% of the SNPs were informative in the individual breeds. The regions of high LD extend up to ~100 kb and the size of haplotype blocks ranges between 30 bases and 75 kb (10.3 kb average). On the scale from 1–100 kb the extent of LD and haplotype block structure in cattle has high similarity to humans. The estimation of effective population sizes over the previous 10,000 generations conforms to two main events in cattle history: the initiation of cattle domestication (~12,000 years ago), and the intensification of population isolation and current population bottleneck that breeds have experienced worldwide within the last ~700 years. Haplotype block density correlation, block boundary discordances, and haplotype sharing analyses were consistent in revealing unexpected similarities between some beef and dairy breeds, making them non-differentiable. Clustering techniques permitted grouping of breeds into different clades given their similarities and dissimilarities in genetic structure.

Conclusion

This work presents the first high-resolution analysis of haplotype block structure in worldwide cattle samples. Several novel results were obtained. First, cattle and human share a high similarity in LD and haplotype block structure on the scale of 1–100 kb. Second, unexpected similarities in haplotype block structure between dairy and beef breeds make them non-differentiable. Finally, our findings suggest that ~30,000 uniformly distributed SNPs would be necessary to construct a complete genome LD map in Bos taurus breeds, and ~580,000 SNPs would be necessary to characterize the haplotype block structure across the complete cattle genome.

18.
Genomics, 2020, 112(2): 1726-1733
The cost of SNP genotyping to screen different breeds and to estimate the exact proportion of ancestry is quite high, which can be offset by deriving a small panel of ancestry informative markers (AIMs). Hence, we carried out the present study to provide an insight into the ancestry level inferred from a panel of informative markers in the crossbred Vrindavani population developed at ICAR-IVRI, India. We applied discriminant analysis of principal components (DAPC) to the Vrindavani cattle dataset for the first time. To confirm our method, we also performed DAPC on two other well-known crossbred cattle, i.e., Frieswal and Beefmaster. Three panel sizes (500, 1000 and 2000 markers) were tested for clustering of individuals. Among all the panels, we found that the 1000-marker panel selected by the DAPC-based contribution method was the smallest panel with comparatively the highest accuracy.

19.
The recent release of the Bovine HapMap dataset represents the most detailed survey of bovine genetic diversity to date, providing an important resource for the design and development of livestock production. We studied this dataset, comprising more than 30,000 Single Nucleotide Polymorphisms (SNPs) for 19 breeds (13 taurine, three zebu, and three hybrid breeds), seeking to identify small panels of genetic markers that can be used to trace the breed of unknown cattle samples. Taking advantage of the power of Principal Components Analysis and algorithms that we have recently described for the selection of Ancestry Informative Markers from genomewide datasets, we present a decision-tree which can be used to accurately infer the origin of individual cattle. In doing so, we present a thorough examination of population genetic structure in modern bovine breeds. Performing extensive cross-validation experiments, we demonstrate that 250-500 carefully selected SNPs suffice in order to achieve close to 100% prediction accuracy of individual ancestry, when this particular set of 19 breeds is considered. Our methods, coupled with the dense genotypic data that is becoming increasingly available, have the potential to become a valuable tool and have considerable impact in worldwide livestock production. They can be used to inform the design of studies of the genetic basis of economically important traits in cattle, as well as breeding programs and efforts to conserve biodiversity. Furthermore, the SNPs that we have identified can provide a reliable solution for the traceability of breed-specific branded products.

20.
Single nucleotide polymorphisms (SNPs) able to describe population differences can be used for important applications in livestock, including breed assignment of individual animals, authentication of mono-breed products and parentage verification among several other applications. To identify the most discriminating SNPs among thousands of markers in the available commercial SNP chip tools, several methods have been used. Random forest (RF) is a machine learning technique that has been proposed for this purpose. In this study, we used RF to analyse PorcineSNP60 BeadChip array genotyping data obtained from a total of 2737 pigs of 7 Italian pig breeds (3 cosmopolitan-derived breeds: Italian Large White, Italian Duroc and Italian Landrace, and 4 autochthonous breeds: Apulo-Calabrese, Casertana, Cinta Senese and Nero Siciliano) to identify breed informative and reduced SNP panels using the mean decrease in the Gini Index and the Mean Decrease in Accuracy parameters with stability evaluation. Other reduced informative SNP panels were obtained using Delta, Fixation index and principal component analysis statistics, and their performances were compared with those obtained using the RF-defined panels using the RF classification method and its derived Out Of Bag rates and correct prediction proportions. Therefore, the performances of a total of six reduced panels were evaluated. Correct assignment of the animals to their breed was close to 100% for all tested approaches. Porcine chromosome 8 harboured the largest number of selected SNPs across all panels. Many SNPs were included in genomic regions in which previous studies identified signatures of selection or genes (e.g. ESR1, KITL and LCORL) that could contribute to explain, at least in part, phenotypically or economically relevant traits that might differentiate cosmopolitan and autochthonous pig breeds.
Random forest used as a preselection statistic highlighted informative SNPs that were not the same as those identified by other methods. This might be due to specific features of this machine learning methodology. It will be interesting to explore whether the adaptation of RF methods for the identification of selection signature regions could describe population-specific features that are not captured by other approaches.
