Peer-reviewed journal articles published
Abstract:
Assessments of studies meant to evaluate the effectiveness of interventions, programs, and policies can serve an important role in the interpretation of research results. However, evidence suggests that available quality assessment tools have poor measurement characteristics and can lead to opposing conclusions when applied to the same body of studies. These tools tend to (a) be insufficiently operational, (b) rely on arbitrary post-hoc decision rules, and (c) result in a single number to represent a multidimensional construct. In response to these limitations, a multilevel and hierarchical instrument was developed in consultation with a wide range of methodological and statistical experts. The instrument focuses on the operational details of studies and results in a profile of scores instead of a single score to represent study quality. A pilot test suggested that satisfactory between-judge agreement can be obtained using well-trained raters working in naturalistic conditions. Limitations of the instrument are discussed, but these are inherent in making decisions about study quality given incomplete reporting and in the absence of strong, contextually based information about the effects of design flaws on study outcomes. © 2008 American Psychological Association.