Similar literature
 20 similar documents found; search time 15 ms
1.
Seber GA, Huakau JT, Simmons D. Biometrics 2000, 56(4): 1227-1232
In recent years, capture-recapture methods for closed populations have been extensively applied to epidemiology. For example, suppose we have several incomplete lists of diabetics and we wish to estimate the total number of diabetics by estimating the number missing from all the lists. A major problem is that the information about individuals on the lists may have been given incorrectly or the information may have been typed incorrectly so that some list matches are missed. Using the concept of tag loss borrowed from animal population studies, we consider methods for estimating both the probabilities of making list errors and the population size for just two independent lists. The effect of heterogeneity on the errors is examined. The methods are applied to a large data set of diabetic persons consisting of a list obtained from a survey and a list obtained from doctors' records. It was found that the error rates were high and that ignoring the errors led to a gross overestimate of the total number of diabetic persons.
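As a rough illustration of the two-list setting in the abstract above, the sketch below uses the classical Lincoln-Petersen and Chapman estimators. These are standard textbook formulas, not the tag-loss-adjusted method of the paper, and the function names and counts are hypothetical.

```python
def lincoln_petersen(n1, n2, m):
    """Classical two-list estimate of population size.

    n1, n2: number of individuals on list 1 and list 2
    m:      number of individuals matched on both lists
    Assumes independent lists and a closed population.
    """
    if m <= 0:
        raise ValueError("at least one match is needed")
    return n1 * n2 / m


def chapman(n1, n2, m):
    """Chapman's bias-reduced variant of the same estimator."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Hypothetical counts: 400 true matches between lists of 1000 and 800.
true_estimate = lincoln_petersen(1000, 800, 400)

# If 20% of the true matches are missed because of list errors,
# only 320 matches are observed and the estimate is inflated.
naive_estimate = lincoln_petersen(1000, 800, 320)

print(true_estimate, naive_estimate)
```

Under these made-up counts, missed matches shrink the denominator, so the uncorrected estimate is pushed upward. That is the direction of bias the abstract reports.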

2.
There is an urgent need to develop simple and effective methods for monitoring bird populations that are cheap to deploy in resource-poor countries. This paper describes a newly developed system, provisionally referred to as Wordbirds, that will provide a platform for the collection, storage and retrieval of new and existing data from bird observations recorded worldwide. This Internet-based global network of databases will capture field lists and ad hoc sightings routinely gathered by individuals observing birds recreationally and professionally. Huge numbers of lists are collected annually and could provide information on population trends spanning many years. By collecting these records, a valuable resource will be secured with the potential to map and monitor bird distributions and estimate trends in species abundance. An erratum to this article is available at .

3.
Industrial epidemiology is a specialized discipline concerned with the study of disease occurrence in specific subgroups of the general population, i.e., of relatively healthy members of the work force for whom adequate records are available. Although the ultimate purpose of industrial epidemiology--the prevention of disease--is a logical extension of programs of industrial medicine and occupational and community health, epidemiologic methods must draw on interdisciplinary skills. The existence of centralized records kept in the course of business may make it easier to collect information about industrial populations than to gather data relative to other population subgroups. Many deficiencies in epidemiologic studies of worker groups, however, can be related to poor methods of data-gathering, inadequate record keeping, and an incomplete data base. Sources of information for epidemiologic studies of worker groups may include personnel and medical records, government reports, insurance files, production records, industrial hygiene measurements, surveys and questionnaires, and an organized follow-up program. In some cases, the ready availability of multiple sources of information may lead to differential information bias, and this should be avoided.

4.
Electronic data linkage is increasingly being used by researchers and health professionals in the birth defects field as a tool for enhancing both research and service/care. However, in many cases, a common pre-existing ID number does not exist across different datasets, and common identifiers, such as names or dates of birth, which could be used to match records, may be known to contain errors or even legitimate differences over time. In such situations, probabilistic matching, which does not require that all identifying fields exactly agree in order for one to conclude that two records belong to the same individual, can be a valuable tool for improving data linkage. However, probabilistic matching is computationally complex and demanding, and not well understood by many who may wish to apply it in their work. Therefore, the purpose of this article is to provide an overview of one approach to probabilistic matching, including the step-by-step procedures involved in the calculation of indices corresponding to the likelihood that two records are a correct match. In addition, the use of multiple iterative protocols, in which several different matching strategies are used in order to maximize the number of linked records, is discussed. Finally, issues related to deduplication and verification of internal consistency in the linked data set are also reviewed.
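The match indices mentioned in the abstract above can be illustrated with one common formulation, the Fellegi-Sunter log-likelihood-ratio weight. This is a minimal sketch and may differ from the approach the article describes. The field names, the m and u probabilities, and the function name are all hypothetical.

```python
import math

# Hypothetical per-field probabilities (illustrative values only):
#   m = P(field agrees | the two records belong to the same person)
#   u = P(field agrees | the two records belong to different people)
FIELD_PROBS = {
    "surname": (0.95, 0.01),
    "first_name": (0.90, 0.05),
    "birth_date": (0.97, 0.003),
}


def match_weight(rec_a, rec_b, field_probs=FIELD_PROBS):
    """Sum of per-field log2 likelihood ratios for a pair of records.

    Agreement on a field adds log2(m/u); disagreement adds
    log2((1-m)/(1-u)). Fields missing from either record are skipped.
    """
    total = 0.0
    for field, (m, u) in field_probs.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # no evidence either way
        if a == b:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


rec1 = {"surname": "Smith", "first_name": "Ann", "birth_date": "1990-01-02"}
rec2 = {"surname": "Smith", "first_name": "Anne", "birth_date": "1990-01-02"}
rec3 = {"surname": "Jones", "first_name": "Bob", "birth_date": "1985-07-30"}

print(match_weight(rec1, rec1), match_weight(rec1, rec2), match_weight(rec1, rec3))
```

A pair that disagrees on one field (rec1 and rec2) still receives a high positive weight, which is the point of probabilistic matching: exact agreement on every field is not required. In practice, pairs above an upper threshold would be accepted, pairs below a lower threshold rejected, and the rest reviewed by hand.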

5.
Abstract. Numbers of plant species were recorded in species‐rich meadows in the Bílé Karpaty Mts., SE Czech Republic, with the aim to evaluate the sampling error made by well‐trained observers. Five observers recorded vascular plants in seven plots ranging from 9.8 cm² to 4 m² independently and were not time‐limited. In larger plots a discrepancy of 10–20% was found between individual estimates, in smaller plots discrepancy increased to 33%, on average. The gain in observed species richness by combining records of individual observers (in comparison with the mean numbers estimated by single observers) decreased from the smallest plot (27–82% for two to five observers) to the largest one (13–25%). However, after misidentified and suspicious records were eliminated, the gain was much lower and became scale‐independent; two observers added 12% species, on average, and the increase by combining species lists made by three or more observers was negligible (3% more on average). It is concluded that most discrepancies between individual observers were caused by misidentification of rare seedlings and young plants. We suggest that in species‐rich meadows plants should be recorded by at least three observers together and that they should consult all problematic plant specimens together in the field, to minimize errors.

6.
Capture-recapture, epidemiology, and list mismatches: several lists (total citations: 1; self-citations: 0; citations by others: 1)
Lee AJ, Seber GA, Holden JK, Huakau JT. Biometrics 2001, 57(3): 707-713
In applying capture-recapture methods for closed populations to epidemiology, e.g., in the estimation of the size of a diabetes population, one comes up against the problem of list errors due to mistyping or misinformation. This problem has been studied for just two lists by Seber, Huakau, and Simmons (2000, Biometrics 56, 1227-1232) using the concept of tag loss borrowed from animal population studies. In this article, we discuss a similar method that can be extended to an arbitrary number of lists. The methods are applied to an example.

7.
W. R. P. Bourne. Ibis 1967, 109(2): 141-167
The records of long-distance vagrancy in the Procellariiformes are listed and re-examined. Some are clearly valid, more are obviously doubtful, many are difficult to confirm. Some records from the last century have failed to be repeated in this, but others have been repeated, sometimes more frequently. Some of them suggest hitherto unrecognized migrations or post-juvenile or post-breeding dispersal. Otherwise in general it appears that the more migratory albatrosses and much less often perhaps the southern fulmars may cross the equator into the opposite hemisphere, shearwaters and storm-petrels may go astray on migration into the wrong ocean, and the gadfly petrels of the genus Pterodroma, though rarely recorded anywhere near land away from the breeding stations, are occasionally capable of prodigious feats of wandering across several oceans and continents. These last records are hard to explain, though some at least must be genuine; the best explanation appears to be that the birds are first displaced from their range by storms, and then have vast powers of endurance so that they are able to wander even further afield in attempting to return to it. They then often come to grief in circumstances where it may be very hard indeed to prove the manner of the appearance beyond all reasonable doubt, and there will always be a large element of uncertainty about many older records, including a number on current national lists, although many cannot be ignored entirely.

8.
The use of genetic information is now fundamental in parasite taxonomy and systematics, for resolving parasite phylogenies, discovering cryptic species, and elucidating patterns of gene flow among parasite populations. The accumulation of available gene sequences per geographical area or per parasite taxonomic group is likely proportional to species richness, but not without some biases. Certain areas and certain taxonomic groups receive more research effort than others, possibly causing a deficit in the relative number of parasite species being characterized genetically in some areas or taxonomic groups. Here, we use data on the number of parasite records per country or helminth family from the London Natural History Museum host-parasite database, and matching data on the number of gene sequences available from the National Center for Biotechnology Information (NCBI) GenBank database, to determine how available gene sequences scale with species richness across countries or parasitic helminth families. Our quantitative analysis identified countries/regions of the world and helminth families that have received the most effort in genetic research. More importantly, it allowed us to generate lists (based on residuals from the statistical model) of the 20 countries/regions and the 20 helminth families with the largest deficit in available gene sequences relative to their helminth species richness. We propose these lists as useful guides toward future allocation of effort to maximise advances in parasite biodiscovery, systematics and population structure.

9.
In epidemiology, capture–recapture models are commonly used to estimate the size of an unknown population based on several incomplete lists of individuals. The method operates under two main assumptions: independence between the lists (local independence) and homogeneity of capture probabilities of individuals. In practice, these assumptions are rarely satisfied. We introduce a multinomial latent class model that can account for both list dependence and heterogeneity. Parameter estimation is performed by maximizing the conditional likelihood function with the use of the EM algorithm. In addition, a new approach for evaluating the standard errors of the parameter estimates is discussed, which considerably reduces the computational burden associated with the evaluation of the variance of the population size estimate.

10.
Regal RR, Hook EB. Biometrics 1999, 55(4): 1241-1246
An exact conditional test for an M-way log-linear interaction in a fully observed 2^M contingency table is formulated. From this is derived a procedure for interval estimation of the total count N in a 2^M contingency table, one of whose entries is unobserved. This procedure has an immediate application to interval estimation of the size of a closed population from incomplete, overlapping lists of records, as in capture-recapture analysis of epidemiological data. Data on the prevalence of spina bifida in live births in upstate New York in 1969-1974 illustrate this application.

11.
Abstract. In a review of the horseflies of the Tabanus mandarinus species group in Korea, six species are recognized. Among them, Tabanus nipponicus is newly recorded from Korea. Keys, annotated checklists of domestic records, collection data, and photographs of T. nipponicus are provided.

12.
Multilist population estimation with incomplete and partial stratification (total citations: 2; self-citations: 0; citations by others: 2)
Multilist capture-recapture methods are commonly used to estimate the size of elusive populations. In many situations, lists are stratified by distinguishing features, such as age or sex. Stratification has often been used to reduce biases caused by heterogeneity in the probability of list membership among members of the population; however, it is increasingly common to find lists that are structurally not active in all strata. We develop a general method to deal with cases when not all lists are active in all strata using an expectation maximization (EM) algorithm. We use a flexible log-linear modeling framework that allows for list dependencies and differential probabilities of ascertainment in each list. Finally, we apply our method of estimating population size to two examples.

13.
Multiple primary malignant neoplasms in England and Wales, 1971-1981 (total citations: 1; self-citations: 0; citations by others: 1)
In the period 1971-81, more than 1.9 million persons were registered with a malignant neoplasm among the 49.2 million population of England and Wales. For 63,536 people, two or more tumor registrations (multiple tumor records) have arisen in that period. Because of the structure of the National Cancer Registration scheme, some errors in registration are inevitable, particularly duplicate registration of a single tumor by adjacent regional cancer registries. A pilot study showed that 61 percent of multiple records would represent true multiple primary malignancy, and that these records could be readily separated from registration errors. After abstraction of identifying codes from each tumor, 129,047 tumors involved in 63,536 multiple records were matched to the national cancer file, and the full data set extracted for successfully matched tumors. Person-years data were extracted for the 1.8 million tumors not involved in a multiple record. Eleven percent of multiple records were not completely matched, and a further 16 percent were excluded on SEER criteria, or as probable registration errors, leaving 46,155 multiple primary tumors for further analysis. Over 3 million person-years at risk of a second tumor were accrued. The overall risk of a second tumor at any site before age 85 was 0.77 for males and 0.80 for females, after exclusion of second tumors observed within 12 months of the first. The risk of a new primary apparently decreased with increasing duration of survival, a trend which may be due in part to under-registration of second tumors in the early 1970s and an improvement in linkage since 1971.

14.
Median ranked set sampling may be combined with size-biased probability of selection. A two-phase sample is assumed. In the first phase, units are selected with probability proportional to their size. In the second phase, units are selected using median ranked set sampling to increase the efficiency of the estimators relative to simple random sampling. There is also an increase in the efficiency relative to ranked set sampling (for some probability distribution functions). There will be a loss in efficiency depending on the number of errors in ranking the units; median ranked set sampling can be used to reduce the errors in ranking the units selected from the population. Estimators of the population mean and the population size are considered. Median ranked set sampling with probability proportional to size and with errors in ranking is considered and compared with ranked set sampling with errors in ranking. Computer simulation results for some probability distributions are also given.
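The second-phase selection step in the abstract above can be sketched as follows. This follows the usual definition of median ranked set sampling (n sets of n units each; the median-ranked unit is taken from each set, with the two middle ranks split across the sets when n is even). It assumes perfect ranking on the measured values and omits the size-biased first phase. All function names are hypothetical.

```python
import random


def median_ranked_set_pick(sets):
    """Pick one unit per set according to median ranked set sampling.

    `sets` must contain n sets of n units each. For odd n the middle
    ranked unit of every set is taken. For even n the (n/2)-th smallest
    is taken from the first n/2 sets and the (n/2 + 1)-th smallest from
    the remaining sets.
    """
    n = len(sets)
    if any(len(s) != n for s in sets):
        raise ValueError("need n sets of exactly n units each")
    picked = []
    for i, units in enumerate(sets):
        ranked = sorted(units)  # perfect ranking (an assumption)
        if n % 2 == 1:
            idx = (n - 1) // 2
        elif i < n // 2:
            idx = n // 2 - 1
        else:
            idx = n // 2
        picked.append(ranked[idx])
    return picked


def mrss_sample(population, n, cycles, rng):
    """Draw `cycles` rounds of median ranked set samples of set size n."""
    sample = []
    for _ in range(cycles):
        sets = [rng.sample(population, n) for _ in range(n)]
        sample.extend(median_ranked_set_pick(sets))
    return sample


rng = random.Random(0)
population = [rng.gauss(50, 10) for _ in range(10_000)]
sample = mrss_sample(population, n=5, cycles=20, rng=rng)
print(len(sample), sum(sample) / len(sample))
```

Ranking errors, which the abstract treats explicitly, could be mimicked by sorting on a noisy copy of each value instead of the value itself.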

15.
Shepherd BE, Yu C. Biometrics 2011, 67(3): 1083-1091
A data coordinating team performed onsite audits and discovered discrepancies between the data sent to the coordinating center and that recorded at sites. We present statistical methods for incorporating audit results into analyses. This can be thought of as a measurement error problem, where the distribution of errors is a mixture with a point mass at 0. If the error rate is nonzero, then even if the mean of the discrepancy between the reported and correct values of a predictor is 0, naive estimates of the association between two continuous variables will be biased. We consider scenarios where there are (1) errors in the predictor, (2) errors in the outcome, and (3) possibly correlated errors in the predictor and outcome. We show how to incorporate the error rate and magnitude, estimated from a random subset (the audited records), to compute unbiased estimates of association and proper confidence intervals. We then extend these results to multiple linear regression where multiple covariates may be incorrect in the database and the rate and magnitude of the errors may depend on study site. We study the finite sample properties of our estimators using simulations, discuss some practical considerations, and illustrate our methods with data from 2815 HIV-infected patients in Latin America, of whom 234 had their data audited using a sequential auditing plan.

16.
Watch lists of invasive species that threaten a particular land management unit are useful tools because they can draw attention to invasive species at the very early stages of invasion when early detection and rapid response efforts are often most successful. However, watch lists typically rely on the subjective selection of invasive species by experts or on the use of spotty occurrence records. Further, incomplete records of invasive plant occurrences bias these watch lists towards the inclusion of invasive plant species that may already be present in a land management unit, because the occurrences have not been formally integrated into publicly accessible biodiversity databases. However, these problems may be overcome by an iterative approach that guides more complete detection and compilation of invasive plant species records within land management units. To address issues from unobserved or unrecorded occurrences, we combined predicted suitable habitat from species distribution models and aggregated invasive plant occurrence records to develop ranked watch lists of 146 priority invasive plant species on >4000 land management units from five different administrative types within the United States. Based on this analysis, we determined that on average 84% of priority invasive plants with suitable habitat within a given land management unit were as yet unobserved, and that 41% of those were ‘doorstep species’ – found within 50 miles of the unit boundary yet not detected within the unit. Two case studies, developed in collaboration with staff at U.S. Fish and Wildlife Service Refuges, showed that by combining both habitat suitability models and invasive plant occurrence records, we could identify additional problematic invasive plants that had been previously overlooked. Model-based watch lists of ‘doorstep species’ are useful tools because they can objectively alert land managers to threats from invasive plants with high likelihood of establishment.

17.
An annotated checklist of 1016 species of fungi (Ascomycota and Basidiomycota), which have been recorded in 95 different localities, from 1990 to 2015, is presented for Umbria (Italy). The checklist was compiled from records of Umbrian fungi in scientific publications, unpublished lists and personal observations. This work represents the first comprehensive checklist of macrofungi for Umbria. Even if not complete, an exhaustive overview of the current knowledge of the mycobiota of Umbria is presented. Although a large amount of the regional territory has still to be explored for mycological diversity, this study offers an important support in compiling red-lists of endangered macrofungi, as well as to identify indicator species of particular habitats to be considered for wildlife reserves, as is currently done in many European countries.

18.
Monitoring biodiversity is necessary but difficult to achieve in practice, in part because standardized field work is often demanding for volunteer field workers. Collecting opportunistic data on presence and absence of species is much less demanding, but such data may suffer from a number of biases, such as variation in observation effort over time. Here we explore whether site-occupancy models may be helpful to reduce such biases in opportunistic data, especially those caused by temporal variation of observation effort and by incomplete reporting of sightings. Site-occupancy models represent a generalisation of classical metapopulation models to account for imperfect detection; they estimate the probability of sites to be occupied (and of the rates of change, colonisation and extinction rates) while taking into account imperfect detection of a species. The models require so-called presence–absence data from replicated visits for a number of sites (e.g., 20–50). We tested whether these models provide reliable trend estimates if collectors of opportunistic data do not report all species detected. We applied the models to three opportunistic datasets of dragonfly species (1999–2007) in the Netherlands: (1) one-species records, (2) short daily species lists and (3) comprehensive daily species lists. Trend estimates based on a fourth dataset from a standardized monitoring scheme were used as a yardstick to judge the results. The analyses showed that occupancy trends based on comprehensive daily species lists in combination with site-occupancy models were generally similar to those based on the monitoring scheme. But trends based on one-species records and short daily lists were too imprecise to be very useful. In addition, site-occupancy models lead to more realistic occupancy estimates than those obtained from conventional logistic regression analysis.
We conclude that comprehensive daily species lists can be useful surrogates for monitoring schemes to assess distributional trends.
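To make the idea of imperfect detection concrete, the sketch below writes out the per-site likelihood of the basic single-season site-occupancy model, with constant occupancy probability psi and constant per-visit detection probability p. This is the standard textbook form, assumed here for illustration. The study above uses richer models with colonisation and extinction, and the function names and detection histories are hypothetical.

```python
import math


def site_log_likelihood(history, psi, p):
    """Log-likelihood of one site's detection history (list of 0/1 per visit).

    Assumes 0 < psi < 1 and 0 < p < 1. A site with at least one detection
    must be occupied; a site with no detections is either occupied but
    missed on every visit, or unoccupied.
    """
    detections = sum(history)
    visits = len(history)
    occupied_term = psi * p**detections * (1 - p) ** (visits - detections)
    if detections == 0:
        return math.log(occupied_term + (1 - psi))
    return math.log(occupied_term)


def total_log_likelihood(histories, psi, p):
    """Sum of site log-likelihoods, to be maximised over psi and p."""
    return sum(site_log_likelihood(h, psi, p) for h in histories)


# Hypothetical detection histories for four sites, three visits each.
histories = [[1, 0, 1], [0, 0, 0], [0, 1, 0], [0, 0, 0]]
naive_occupancy = sum(1 for h in histories if any(h)) / len(histories)

print(naive_occupancy, total_log_likelihood(histories, psi=0.7, p=0.4))
```

The naive proportion of sites with at least one detection treats every all-zero history as a true absence, whereas the likelihood above lets part of that probability mass come from occupied sites that were simply missed.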

19.
20.
The typical habitat of the rare, endemic South African mudsnail Hydrobia knysnaensis is the leaves of upper-shore Zostera capensis within high-salinity salt-marsh pools and channels. In the Knysna estuarine system, it is the numerically dominant member of a guild of six small microphytophagous gastropods; it is absent from lower level and more exposed Zostera meadows, where its place is taken by Rissoa pinna. H. knysnaensis is also here recorded from the nearby Swartvlei Estuary, in the same habitat type. This unusual habitat for a Hydrobia may in part account for the failure of earlier surveys to detect its presence, notwithstanding that it may well locally be the most numerous gastropod in each of these systems. Generally, however, it (and probably other small gastropods) seems to have been confused in estuarine fauna lists with Assiminea. Experiments show that the rate of feeding in H. knysnaensis is curtailed at population densities exceeding 2000–4000 m⁻² and in salinities below some 10 psu. The proportion of non-feeding snails also increases at high population densities and in low salinities. The bearing of these results on whether H. knysnaensis is likely to be the 'Hydrobia sp.' recorded from some other South African localities and on the causes of its rarity is discussed.
