The present article takes the form of a methodological critique, focused on one measure commonly associated with instrument reliability in science education research (Cronbach’s alpha). In a physical sciences context, we might expect to be able to test reliability by taking repeated measurements to see how consistent the readings are. In educational research, however, it may be quite difficult to test the reliability of an instrument such as an attitude scale or a knowledge test simply by undertaking repeated readings, because human beings are constantly changing due to experiences between instrument administrations, and also because they may undergo changes due to the experience of the measurement process itself. So, a student may answer a set of questions, and that very activity may set in train thinking processes that lead to new insights or further integration of knowledge. A day, week, or month later, the student may answer the same questions differently for no other reason than that responding to the original test provided a learning experience. However, when an instrument does not give reliable readings, it may be difficult to distinguish genuine changes in what we are seeking to measure from changes in readings that are an artefact of the unreliability of the instrument. A high reliability does not ensure accuracy (for example, an ammeter which has not been properly calibrated may give very consistent repeat readings without these being accurate), but it does provide a basis for making inferences about changes (an increase in the reading on an ammeter which is poorly calibrated, but has been shown to give repeatable readings, can be inferred to indicate an increased current).
Science education research often involves the adoption of existing, or the development of new, instruments to measure phenomena of interest. When choosing an instrument, or developing a new instrument, for a study, a researcher is expected to consider the relevance of the instrument to particular research questions (National Research Council Committee on Scientific Principles for Educational Research, 2002) as well as the quality of the instrument. Quality may traditionally be understood in terms of such notions as validity (the extent to which an instrument measures what it claims to measure, rather than something else) and reliability (the extent to which an instrument can be expected to give the same measured outcome when measurements are repeated) (Taber, 2013a). In the present paper, two particular types of instrument are considered: tests and scales. Tests are here considered to measure cognitive features such as knowledge and understanding of science concepts and topics. Scales are here considered to measure constructs in the affective domain, such as attitudes. It is argued that a high value of alpha offers limited evidence of the reliability of a research instrument, and that indeed a very high value may actually be undesirable when developing a test of scientific knowledge or understanding. Alpha is also sometimes inappropriately used to claim an instrument is unidimensional. Those authors who do offer readers qualitative descriptors interpreting alpha values adopt a diverse and seemingly arbitrary terminology. More seriously, illustrative examples from the science education literature demonstrate that alpha may be acceptable even when there are recognised problems with the scales concerned. Guidance is offered to authors reporting, and readers evaluating, studies that present Cronbach’s alpha statistic as evidence of instrument quality.
Cronbach’s alpha is a statistic commonly quoted by authors to demonstrate that tests and scales that have been constructed or adopted for research projects are fit for purpose. It is regularly adopted in studies in science education: it was referred to in 69 different papers published in four leading science education journals in a single year (2015), usually as a measure of reliability. Authors often cite alpha values with little commentary to explain why they feel this statistic is relevant, and seldom interpret the result for readers beyond citing an arbitrary threshold for an acceptable value. This article explores how this statistic is used in reporting science education research and what it represents.
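To make concrete what authors are reporting when they quote this statistic, the standard formula for Cronbach’s alpha is α = k/(k−1) · (1 − Σσᵢ²/σₜ²), where k is the number of items, σᵢ² the variance of each item’s scores, and σₜ² the variance of respondents’ total (summed) scores. A minimal sketch of the computation is shown below; the function name and the toy score matrix are illustrative inventions, not data from any study discussed in this article.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, k_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # sample variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 5 respondents answering 4 Likert-type items
scores = np.array([
    [3, 4, 3, 4],
    [5, 5, 4, 5],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [1, 2, 1, 2],
])
print(round(cronbach_alpha(scores), 3))  # → 0.967
```

Because the toy items covary strongly, alpha here is very high; as the article argues, such a value signals high internal consistency rather than accuracy, and may even be undesirable in a knowledge test sampling deliberately distinct content.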