Matched case-control sampling with Stata

A stratified version of nested case-control sampling, which we call countermatching, is presented. For each treated case, MedCalc will try to find a control case with matching age and ... Stata module to match cases and controls using specified variables, Statistical Software Components S457372, Boston College Department of Economics, revised 27 Jan 2015.

Odds ratios (ORs) were calculated to assess the association between different sociodemographic factors and residential fire fatalities using conditional logistic regression. In an important article on case-control studies, Rodrigues and Kirkwood described three designs to sample the participants. Click Continue in the additional output, then click OK in the Case Control Matching dialog box to run the program. Analysis of case-control studies: the odds ratio (OR) is used in case-control studies to estimate the strength of the association between exposure and outcome. From this table, it is clear that matched case-control articles published in the British Medical Journal (BMJ) were consistently analyzed with correct statistical techniques. How to analyze matched case-control data in SPSS (Stack Overflow). How do I conduct this in Stata 9 (the code) and come up with cases and controls to answer my question of interest, namely whether the odds in my cases are higher than in the controls? The findings of this study raise concern that the majority of matched case-control studies report results that are derived from improper statistical analyses. The first is a non-matched case-control study in which we enroll controls without regard to the number or characteristics of the cases. The goal of a case-control study is the same as that of cohort studies, i.e. ... Published formulas for case-control designs provide sample sizes required to determine that a given disease-exposure odds ratio is significantly different from one, adjusting for a potential confounder and possible interaction. Logistic regression for unmatched case-control studies.
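As a rough illustration of the odds ratio mentioned above for an unmatched design, the following Python sketch computes the OR from a 2x2 table together with a Woolf (log-based) confidence interval. The function name `odds_ratio` and the counts are hypothetical and chosen only for illustration; they do not come from any of the studies quoted here.

```python
import math
from statistics import NormalDist


def odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls, level=0.95):
    """Unmatched odds ratio with a Woolf (log-scale) confidence interval.

    All four cell counts must be positive for this simple formula.
    """
    or_hat = (exp_cases * unexp_controls) / (unexp_cases * exp_controls)
    # Standard error of log(OR) from the four cell counts
    se_log = math.sqrt(
        1 / exp_cases + 1 / unexp_cases + 1 / exp_controls + 1 / unexp_controls
    )
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = math.exp(math.log(or_hat) - z * se_log)
    upper = math.exp(math.log(or_hat) + z * se_log)
    return or_hat, lower, upper


# Hypothetical counts: 40 of 100 cases exposed, 20 of 100 controls exposed
or_hat, lower, upper = odds_ratio(40, 60, 20, 80)
print(f"OR = {or_hat:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

This calculation ignores matching. For individually matched data, the dependency within matched sets has to be taken into account, for example with conditional logistic regression.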

Calculates sample size or power for matched case-control studies. The calculations require the usual alpha and beta values, a possible alternative odds ratio (the null is 1), phi (the correlation of exposure between pairs in the case ... Unmatched studies: the procedures for analyzing the results of case-control studies differ depending on whether the cases and controls are matched or unmatched. Nested case-control sampling macro user guide: description. Let n denote the observed number of matched pairs of binomial events A and B, where ... The following code for creating a data set can be copied to the Stata Do-file Editor and executed there. Epi Info 7 allows users to rapidly develop questionnaires, customize data entry, analyze data and create custom reports.

Power analysis for matched case-control studies in Stata. The pool for selecting matched controls is an Excel file that I can easily copy and paste into Stata. Matched subjects designs are often used in education, giving researchers a useful way to compare treatments without having to use huge, randomized groups. For example, a study to compare two new methods for teaching reading uses a matched subjects design. OpenEpi: sample size for unmatched case-control studies. We demonstrate how these calculations can be carried out in Stata using the example of calculating power and sample size for case-control studies. Statistical inference in matched case-control studies of ... This module should be installed from within Stata by typing ssc install ccmatch. How to create a matched sample for sample selection method to ... Mounib, Thiru Satchi, Blue Cross and Blue Shield of Massachusetts, Boston, Massachusetts. Abstract: A case-control study is an observational study in which subjects are classified according to the presence (cases) or absence (controls) of the outcome of interest. Stata module to calculate sample size or power for matched case-control studies.

Stratified analysis of case-control data in Stata (YouTube). Sample size determination in epidemiologic studies, Priya J. ... The design is a retrospective matched case-control analysis. Analytic methods for non-matched case-control studies include ... Assume that sampling into the case-control study depends only on ... Case-control studies are therefore placed low in the hierarchy of evidence.

An Introduction to Categorical Data Analysis by Alan Agresti. After you perform the matching, you obtain an ID variable for treated and control cases that were successfully matched. There are other ways to use propensity scores; at its heart, a propensity score is a way to characterize the probability of being exposed given covariates. Sample size requirements for case-control study designs. For repeated measures, our cluster was the subject. Next, you want to find, download, and store the relevant files to make them ... Furthermore, the BMJ was responsible for publishing the greatest number of articles in this series of matched case-control studies (7/37, or 19%). Logistic regression in case-control study using a statistical tool, Satish Gupta. Generating a nested case-control study is very easy in Stata.

Matched sample in panel data (Statalist, the Stata forum). In fact, the more standard analysis may not only be valid but may be much easier in practice, and yield better statistical precision. In a matched subjects design, researchers attempt to emulate some of the strengths of within-subjects designs and between-subjects designs. This module calculates sample size for an unmatched case-control study.

Can anyone advise me on the matched case-control study: sampling, matching or analysis? The standard formulas used to calculate sample size for an individually matched case-control study assume a constant probability of exposure throughout the pool of possible controls. We propose new formulas that allow for heterogeneity in the probability of exposure among controls in different matched sets. This design uses data available for all cohort members to obtain a sample for collecting additional information in a case-control substudy. Logistic regression for matched case-control studies (Stata textbook examples). In this study design, the number of controls does not necessarily equal the number of cases. In this paper I explore and illustrate these problems using a hypothetical pair-matched case-control study. A matched subjects design uses separate experimental groups for each particular treatment, but relies upon matching every subject in one group with an equivalent in another. Power calculations for matched case-control studies. A discussion of statistical methods for matched data analysis.

Malonza, MD, MPH, Department of Reproductive Health and Research, World Health Organization. That is, the resulting case-control sample is matched with respect to analysis time, the time scale used to compute risk sets. The R statistical programming language is a free, open-source package. Interpretation of the fitted logistic regression model. Sample size for matched case-control studies (StatsDirect). Current status. November 20, 2014, Categorical Data Analysis, Aut 2014. Special case.

Question on matching in a nested case-control study (Statalist). Textbook examples: Applied Logistic Regression, David Hosmer. Analysis of data from countermatched studies may be performed using standard conditional logistic likelihood software which allows for an offset in the model. The formulas are extended from one control per case to f controls per case and adjusted for a potential multicategory confounder in unmatched or matched designs. A countermatched design is a matched case-control design in which information on exposure or a proxy is used to improve statistical efficiency by maximizing the exposure variation within matched sets. However, extensions of our method to other types of case-control study designs, such as matched case-control sampling ... Personally, I prefer the latter because it is somewhat more straightforward.

In these situations a case-control design offers an alternative that is much more efficient. One of the most significant triumphs of the case-control study was the demonstration of the link between tobacco smoking and lung cancer by Richard Doll and Bradford Hill. Note that it is not possible to estimate the incidence of disease from a case-control study unless the study is population based and all cases in a defined population are obtained. Logistic regression for matched case-control studies (IDRE Stats). You enter the desired confidence level, power, a hypothetical percentage of exposure among the controls, and either an odds ratio or a hypothetical percentage of exposure among the cases. Matched pair case-control study: calculates the statistical relationship between exposures and the likelihood of becoming ill in a given patient population. A discussion of statistical methods for matched data analysis, Mingfu Liu. Cohort analysis, risk sets, the nested case-control and case-cohort ...
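The inputs listed above for an unmatched sample size calculation (confidence level, power, exposure among controls, and either an odds ratio or exposure among cases) can be turned into numbers with a short Python sketch. It uses the Kelsey-type formula as it is commonly stated for unmatched case-control studies; the function names, defaults and example values are assumptions made for illustration, and the results are not guaranteed to match any particular calculator to the last digit.

```python
import math
from statistics import NormalDist


def exposure_in_cases(p_controls, odds_ratio):
    """Exposure proportion among cases implied by an odds ratio."""
    return odds_ratio * p_controls / (1 + p_controls * (odds_ratio - 1))


def kelsey_sample_size(p_controls, odds_ratio, alpha=0.05, power=0.80, ratio=1.0):
    """Cases and controls needed for an unmatched case-control study.

    ratio is the number of controls per case. Kelsey-type formula with a
    two-sided test and no continuity correction.
    """
    p_cases = exposure_in_cases(p_controls, odds_ratio)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Exposure proportion pooled over cases and controls
    p_bar = (p_cases + ratio * p_controls) / (1 + ratio)
    n_cases = ((z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) * (ratio + 1)) / (
        ratio * (p_cases - p_controls) ** 2
    )
    n_cases = math.ceil(n_cases)
    n_controls = math.ceil(ratio * n_cases)
    return n_cases, n_controls


# Hypothetical inputs: 40% of controls exposed, odds ratio of 2 to detect
n_cases, n_controls = kelsey_sample_size(0.40, 2.0)
print(n_cases, n_controls)
```

Raising `ratio` above 1 reduces the number of cases needed at the cost of more controls. Matched designs need a different calculation that accounts for the correlation of exposure within matched sets.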

A sample size of 90 cases and 90 matched controls was estimated to detect a thirty percent difference in the breastfeeding rates between the cases and controls. However, it is less well known in the epidemiologic literature that the partial maximum likelihood estimator of a common HR conditional on matched pairs can be written in a simple form, namely, the ratio of the numbers of two pair types. When each case is matched to one control, we say that the study is 1:1 matched. When the resulting dataset is analyzed as a matched case-control study, odds ratios will estimate ... The case-control design is indispensable if the disease is rare or assessment of the exposure is expensive, and in situations where results are needed quickly to inform public health policy. Matched designs and causal diagrams (International Journal ...). Is there a Stata command to generate a sample of matched pairs based on the age frequency distribution for cases that have already been randomly selected? If you use the FUZZY extension command to create the case-control matches, it can create a dataset of the matched pairs. Sample size determination is an important issue in epidemiologic studies.
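The "ratio of the numbers of two pair types" has a familiar counterpart in 1:1 matched case-control data: the conditional estimate of the odds ratio is the ratio of the two kinds of discordant pairs. The Python sketch below illustrates that odds ratio version, not the hazard ratio estimator from the quoted paper. The function name and the pair counts are hypothetical.

```python
import math
from statistics import NormalDist


def matched_pair_or(case_only_exposed, control_only_exposed, level=0.95):
    """Conditional odds ratio for 1:1 matched pairs.

    case_only_exposed: pairs where only the case is exposed
    control_only_exposed: pairs where only the control is exposed
    Concordant pairs carry no information and are not needed.
    """
    b, c = case_only_exposed, control_only_exposed
    or_hat = b / c
    # Large-sample standard error of log(OR) from the discordant pairs
    se_log = math.sqrt(1 / b + 1 / c)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = math.exp(math.log(or_hat) - z * se_log)
    upper = math.exp(math.log(or_hat) + z * se_log)
    # McNemar chi-square statistic (1 degree of freedom, no continuity correction)
    chi2 = (b - c) ** 2 / (b + c)
    return or_hat, lower, upper, chi2


# Hypothetical discordant pair counts
or_hat, lower, upper, chi2 = matched_pair_or(30, 10)
print(f"OR = {or_hat:.2f}, 95% CI {lower:.2f} to {upper:.2f}, McNemar chi2 = {chi2:.1f}")
```

With more than one control per case or with additional covariates, this shortcut no longer applies and conditional logistic regression is the usual tool.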

In terms of application, Stata has facilities for nearest-neighbor matching with nnmatch and pscore2. Hi, I am new to this forum but have been using Stata for a while for very basic statistical analysis. Generating a matched pair sample for a case-control study. A population-based case-control study was conducted in northern Norway and central Sweden in order to study the associations of several potential risk factors with thyroid cancer. The sttocc command selects controls randomly, with replacement, from those members of the cohort who are at risk at the failure time of the case. The nested case-control (NCC) design is the more common of the two. Results are presented using methods of Kelsey, Fleiss, and Fleiss with a continuity correction. Matched case-control studies: dependency within matched pair or cluster. In general, anywhere you have clusters of observations, statisticians say that observations are nested within these clusters. For each case, the controls are chosen randomly from those members of ... The treated cases are coded 1, the controls are coded 0. Model-building strategies and methods for logistic regression. Once you have obtained an acceptable number of matches, you can move to the next steps. Research article, open access: Breastfeeding and the risk ...
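The risk-set sampling idea described above (controls drawn at random from cohort members still at risk at the case's failure time) can be sketched in a few lines of Python. This is only an illustration of the idea, not a reimplementation of sttocc: the function name, the record layout (`id`, `exit_time`, `failed`) and the assumption that everyone enters follow-up at time 0 are all hypothetical.

```python
import random


def sample_risk_set_controls(cohort, n_controls, seed=None):
    """Draw controls for each case from its risk set.

    cohort: list of dicts with keys "id", "exit_time" and "failed".
    Assumes all subjects enter follow-up at time 0.
    A subject may serve as a control in several matched sets, and a
    future case may be sampled as a control before it fails.
    """
    rng = random.Random(seed)
    cases = sorted((s for s in cohort if s["failed"]), key=lambda s: s["exit_time"])
    matched_sets = []
    for set_id, case in enumerate(cases, start=1):
        t = case["exit_time"]
        # At risk at time t: still under follow-up at t, excluding the case itself
        risk_set = [s for s in cohort if s["exit_time"] >= t and s["id"] != case["id"]]
        k = min(n_controls, len(risk_set))
        controls = rng.sample(risk_set, k)
        matched_sets.append(
            {"set": set_id, "time": t, "case": case["id"],
             "controls": [c["id"] for c in controls]}
        )
    return matched_sets


# Hypothetical cohort of six subjects, two of whom fail
cohort = [
    {"id": 1, "exit_time": 2.0, "failed": True},
    {"id": 2, "exit_time": 5.0, "failed": False},
    {"id": 3, "exit_time": 3.5, "failed": True},
    {"id": 4, "exit_time": 6.0, "failed": False},
    {"id": 5, "exit_time": 1.0, "failed": False},
    {"id": 6, "exit_time": 4.0, "failed": False},
]
sets = sample_risk_set_controls(cohort, n_controls=2, seed=1)
for s in sets:
    print(s)
```

Each matched set is tied to the failure time of its case, so the resulting sample is matched on analysis time and would normally be analyzed with conditional logistic regression stratified on the set identifier.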

Matching cases and controls (SAS Support Communities). Sample size calculations for main effects and interactions in case ... Cases and controls were individually matched, and the information on the factors under study was provided by means of a self-completed questionnaire. The theory behind this command is described in Dupont (1988), Power calculations for matched case-control studies, Biometrics. Applied Logistic Regression, Second Edition, by Hosmer and Lemeshow, Chapter 7. Open-source web tool that provides additional epidemiologic statistics. How to find the controls from a subset of cases in an already case ... Identifying sociodemographic risk factors associated with ... A simple extension of the method is given which allows for nonrepresentative sampling of failures.

I wish to match based on a propensity score generated via a logit model. A propensity score isn't just a way of matching groups. I simplified the dataset in my explanation for simplicity's sake; however, it has cases and controls and their responses to survey questions. You can do this even if the cases and controls are in the same dataset. Logistic regression for matched case-control studies (Stata textbook examples): the data files used for the examples in this text can be downloaded in a zip file from the Wiley publications website. On hazard ratio estimators by proportional hazards models in ... The particular set you get will depend on a randomization of the selection order. An Introduction to Categorical Data Analysis by Alan Agresti, Chapter 9. I am attempting to find a program that will let me conduct Cox regression on my matched case-control dataset. Stata module to calculate sample size or power for matched case-control studies, Statistical Software Components S456423, Boston College Department of Economics, revised 03 Jan 2006. Nested case-control and case-cohort studies: an introduction and some new developments.

I have a dataset that has cases and controls matched on age, gender and number of years. However, case-control studies employ a different sampling strategy that gives them greater efficiency. The 2,988 breast cancer cases were linked to the driver's license file to determine whether cases matched a record from the master file of drivers (the sampling frame for controls). The statistical analysis of matched-pairs studies must make allowance for the dependency in the data introduced by the matching. We used the computer-based command for sample size for matched case-control studies in Stata version 9. A matched pair design is used, in which patients are matched on age and clinical stage of disease, with one patient in a matched pair assigned to treatment A and the other to treatment B. Assumed case-control study with ... controls per case (suspect industry by case and nondiseased status: yes 118, no 257); for different values of ... we get the following ... For big datasets where sampling needs to be performed many times, the log window may fill up repeatedly, requiring users to clear it manually.

Risk factors for Staphylococcus capitis pulsotype NRCS-A ... Standard methods for determining sample size in cohort and case-control studies have generally been restricted to dichotomous disease and exposure variables and discrete confounding variables, and are based on simplifying assumptions that could often be unrealistic. Sample size requirements for case-control study designs (BMC ...). Moreover, because the HR is a noncollapsible measure and its constancy ... Otherwise, join the selected controls to the case data using MATCH FILES with a table join. Four controls per case were randomly matched by gender and age. Sample size for individually matched case-control studies. Contributed packages expand the functionality to cutting-edge research. Matched subjects design (matched sampling), Explorable.

Aim of this lecture: to give an introduction to risk set sampling and analysis of the nested case-control (NCC) design; to give an introduction to sampling and analysis of case-cohort studies; and to compare similarities and differences between these study designs in terms of sampling, disease measures, analysis and statistical efficiency. Using the isographs: Figures 2-7 present sample size isographs for paired case-control studies that were derived using equation 7. The language is very powerful for writing programs. Statistical considerations in the analysis of matched case ... Also, take a look at Analysis of matched cohort data from the Stata ...

How to conduct conditional Cox regression for a matched case-control study. The case-control matching procedure is used to randomly match cases and controls based on specific criteria. Application of logistic regression with different sampling models. For example, we may enroll 105 cases and 178 controls. Stata's data management features give you complete control.
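Randomly matching cases to controls on specific criteria, as described above, can be sketched in Python as follows. This is a minimal illustration of exact matching without replacement, not the procedure of any particular package; the function name, the record layout and the matching variables (`age`, `sex`) are hypothetical.

```python
import random
from collections import defaultdict


def match_controls(cases, controls, keys, per_case=1, seed=None):
    """Exact matching on the given keys, sampling controls without replacement.

    Returns a dict mapping each case id to a list of matched control ids.
    A case may receive fewer than per_case controls if its stratum runs out.
    """
    rng = random.Random(seed)
    # Group eligible controls by their values on the matching variables
    pool = defaultdict(list)
    for control in controls:
        pool[tuple(control[k] for k in keys)].append(control)
    for group in pool.values():
        rng.shuffle(group)
    # Process cases in random order; the matches obtained depend on this order
    order = list(cases)
    rng.shuffle(order)
    matches = {}
    for case in order:
        group = pool.get(tuple(case[k] for k in keys), [])
        take = min(per_case, len(group))
        matches[case["id"]] = [group.pop()["id"] for _ in range(take)]
    return matches


# Hypothetical records for illustration
cases = [
    {"id": "c1", "age": 50, "sex": "F"},
    {"id": "c2", "age": 50, "sex": "F"},
    {"id": "c3", "age": 62, "sex": "M"},
]
controls = [
    {"id": "k1", "age": 50, "sex": "F"},
    {"id": "k2", "age": 50, "sex": "F"},
    {"id": "k3", "age": 50, "sex": "F"},
    {"id": "k4", "age": 62, "sex": "M"},
    {"id": "k5", "age": 40, "sex": "M"},
]
matches = match_controls(cases, controls, keys=("age", "sex"), per_case=1, seed=42)
print(matches)
```

Fixing `seed` makes the matched sets reproducible. Allowing the same control to serve several cases would only require sampling from each stratum without removing the selected control.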

If you are willing to have the same control matched to more than one case (control sampling with replacement), then it's very easy from here. Methods: In this matched case-control study, residential fire fatalities (cases, n = 850; age above 19 years) were identified in the national register on fatal fires. When this is adjusted for in any one of a number of ways, including matching, you theoretically break one of the conditions necessary for confounding. Stata is a complete, integrated software package that provides all your data science needs: data manipulation, visualization, statistics, and reproducible reporting. Using propensity scores to reduce case-control selection bias. I am conducting a nested case-control study in which 3 controls are matched to cases on visit number and ethnicity. The value of 4 is constant in each graph and equals ...
