Literature Review
Preamble
On this page, members of the project team are iteratively developing the literature review for this project. Please note that the work you can see here is 'work in progress'. The completed literature review for this project will not provide a comprehensive overview, summary and analysis of all the work available in the fields of formative assessment, e-assessment and formative e-assessment. Instead, it will attempt to highlight the main issues raised in some of the most pertinent texts in the field and to provide an analysis and critique of them. At this stage of the process the literature review comprises responses by individual team members to selected readings. At a later stage in the project, these individual reflections will be woven together into a more structured text.
Full bibliographic records for all papers are listed on bibsonomy
Introductory statement
There is a lot of slippage in much of the literature between terms like 'assessment' and 'learning' and 'formative' and 'summative' (especially in papers exploring computer-based assessment tools - Elliott 2008, Winkley et al 2008). Widely varying ideas about what formative assessment is in practice were illustrated at our first Practical Enquiry Day, which focused on presentations of assessment practices involving technologies (see presentations at http://snipurl.com/feasst). One participant finished with "Is what I'm doing formative or summative?" That is not surprising, considering the differing theoretical emphases in the literature. The more we consider the literature on formative assessment, the more complex and contentious the domain becomes, and that is before thinking about "e-assessment". We think it is important to establish some core meanings of "formative assessment" as a starting point for what should be included in models which capture the processes of formative e-assessment.
Nearly all the studies and theoretical papers we have looked at set out with their own 'definition' of formative assessment, many of them basing this on Black and Wiliam (1998). We have selected just a few examples which illustrate the key differences in understanding within the area, and which have implications for what we work with as examples of formative e-assessment and what might feed into decisions about relevant models for the project.
Some perspectives establish particular relationships between formative and summative assessment:
The purpose of formative assessment is to help with learning, while that of summative assessment is to judge what has been learnt. It is possible for an assessment to have both formative and summative aspects (as in some forms of coursework assessment) (Russell et al, 2006 p. 467).
Cox et al (2008) assert that formative assessment is 'essentially a qualitative exercise' (p. 34) and cite Black and Wiliam (1998), stating it is 'a feedback process that provides information that can be used to fine-tune or modify what has already been done'. They do not say who does the fine-tuning. This has emerged as a core issue in defining processes of formative assessment in terms of learner and teacher roles.
Others conflate formative and summative assessment. Elliott (2008), for example, argues that Web 2.0 technologies bring about the potential for 'modernising' assessment, and in this argument the need to distinguish the two seems to become less relevant. The two are not distinguished in his single overall definition of assessment as a practice centred on evidence of learning:
Assessment is the process of generating evidence of student learning and then making a judgement about that evidence (p.179).
Others conflate formative and summative assessment within a view of "adaptivity" as a core component of e-assessment processes. Winkley et al (2008), for example, argue that formative assessment is a key component of 'The Learning Journey', but they do not define it as having particular processes which differ from serial summative testing: "Formative assessment takes place regularly to review progress against the learning plan."
Such definitions establish differences in the relationship between formative assessment and learning, some of them fairly subtle, but some not. Recognising these differences involves being able to distinguish between describing 'an assessment' and describing 'a practice'.
The task of the project is not to argue for a definitive version of formative assessment using technologies. The domain includes a wide variety of perspectives on formative assessment which prioritise different beliefs and educational goals. Formative assessment has been argued to achieve a wide variety of things: ideological and learning-oriented goals; more effective high stakes assessment; inclusion and wider participation; and more efficient management of high volume and distributed student bodies. These different perspectives on the purposes of formative assessment need to be considered in how the models are derived and what they are able to reflect.
Some questions which emerge
- How will the models capture teacher and learner roles in processes of formative e-assessment?
- Are there core elements here which have precedence over others, e.g. is the teacher's modification of her practice a core requirement of formative assessment? Is a discernible change in the learner's understanding a core requirement? In other words, at what point is any process open to an evaluation of its effectiveness? Can these processes have clear starting and end points?
- How will core process elements emerge over others in a widely disparate domain?
- What problems will synthetic models pose for practitioner input as the project develops, given that the models will certainly involve a considerable element of synthesis?
A computer science perspective
Computational learning theory, or the mathematics of formative assessment
Computational learning theory attempts to provide mathematical and algorithmic models of learning. The qualifier "computational" is misleading; it suggests that these models only apply to machines designed to learn in some sense. In fact, the theory applies to any cognitive agent - natural or artificial, human or animal - and to an extent even to non-animates such as plants, microbes and organisations. The stress is on learning as a computational process by which an organism improves its performance or predictive power. The nature of that process may vary with respect to the organism, the environment, or the methods of encoding knowledge. Some observations are generic, for example the findings regarding the limits on learnability. Some are specific to a particular setting or method, yet can still inspire a new perspective on questions in other contexts.
Computational learning distinguishes between several fundamental modes, of which the most notable are unsupervised, supervised and reinforcement learning.
Unsupervised learning refers to an agent's attempt to make sense of "the massive flow of sensory information that occurs without any associated rewards or punishments" (Barlow, 1999). Such approaches mainly focus on identifying patterns and similarities in the environment. Supervised learning is defined as "algorithms that reason from externally supplied instances to produce general hypotheses, which then make predictions about future instances." (Kotsiantis, 2007). Typically, the agent is attempting to learn a classification, or labelling, of objects from a set of examples provided by a teacher. Reinforcement learning is "learning what to do - how to map situations to actions - so as to maximize a numerical reward signal" (Sutton and Barto, 1998). The agent in this model performs some actions and receives a "reward" or "penalty" value. The complexity of the problem arises from the fact that the feedback is not necessarily immediate and can depend on any combination of the agent's past actions.
Whatever our definition of formative assessment, it includes an element of feedback or guidance. Thus, the supervised and reinforcement models of computational learning would appear to be relevant.
Jordan and Rumelhart (1992), in one of the seminal texts on neural networks, consider an abstract model of learning which includes the learner's intentions and actions, and the environment's outcomes in response to these actions. Intentions are seen as inputs to the learning system: the learner transforms these intentions into actions, which are in turn transformed by the environment into outcomes. This environment can include a human teacher, an automated tutor or any system the learner interacts with. As an example, they suggest a basketball player trying to improve her throw. The player's intention is to get the ball in the basket. Her actions are the contracting and expanding of various muscles in concert. The outcomes are the observed trajectory of the ball with respect to the basket.
The learner in such a scenario needs to contend with two simultaneous tasks: mapping actions to outcomes and, by induction, mapping intentions to actions.
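The two simultaneous tasks can be illustrated with a deliberately small sketch. It is not Jordan and Rumelhart's network model: it assumes a one-dimensional, linear world and a simple error-correcting update, and the names `environment`, `Learner` and `practise` are hypothetical. The learner estimates how actions map to outcomes, and uses that estimate to adjust how it maps intentions to actions.

```python
import random


def environment(action: float) -> float:
    """Hypothetical environment: turns an action (throw force) into an outcome (distance)."""
    return 2.0 * action  # assumed rule, hidden from the learner


class Learner:
    def __init__(self) -> None:
        self.forward_gain = 1.0  # learner's belief about how actions map to outcomes
        self.policy_gain = 1.0   # how the learner turns intentions into actions

    def act(self, intention: float) -> float:
        return self.policy_gain * intention

    def learn(self, intention: float, action: float, outcome: float,
              rate: float = 0.05) -> None:
        # Task 1: refine the action -> outcome model from the observed outcome.
        predicted = self.forward_gain * action
        self.forward_gain += rate * (outcome - predicted) * action
        # Task 2: refine the intention -> action mapping, using the learned
        # model to translate the outcome error back into a change of action.
        outcome_error = intention - outcome
        self.policy_gain += rate * outcome_error * self.forward_gain * intention


def practise(trials: int = 2000, seed: int = 0) -> Learner:
    """Run repeated intend -> act -> observe -> learn cycles."""
    rng = random.Random(seed)
    learner = Learner()
    for _ in range(trials):
        intention = rng.uniform(1.0, 2.0)  # intended distance for this throw
        action = learner.act(intention)
        outcome = environment(action)
        learner.learn(intention, action, outcome)
    return learner
```

In this sketch the only feedback the learner receives is the observed outcome; there is no teacher supplying the correct action, which is why the learner needs its own model of the environment to work out how to change what it does.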
Intelligent tutoring systems (ITS) and Artificial Intelligence in Education (AIED), or serendipitous formative e-Assessment
The science of assessment engineering
Notes by paper
Almond, Steinberg and Mislevy (2002)
One important paper for this JISC project in relation to the delineation of a process model for assessment seems to be Almond, Steinberg and Mislevy (2002). The question arises whether a description of the "persistent elements" of the design and delivery of educational assessment and their relationship is instructive for the task of attempting to outline process models governing formative e-assessment. Almond et al (2002) propose what they call a "four-process architecture" comprising the following elements: activity selection, presentation, response processing and summary scoring. The authors note that their architecture is based on the evidence-centred assessment design (ECD) framework which "describes a process that begins by defining the decisions to be made based upon an assessment and then works backwards to develop tasks, delivery mechanisms, scoring procedures, and feedback mechanisms that provide evidence that informs the pre-defined purposes". Additional thoughts on this text are included in the attachment - LitReviewhgmv1.doc.
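As a rough illustration of how the four processes might fit together in a delivery loop, the following minimal sketch chains activity selection, presentation, response processing and summary scoring. It is an assumption-laden toy, not the architecture as specified by Almond et al: the `Task` and `Scoreboard` types, the fixed task order and the exact-match scoring rule are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Task:
    prompt: str
    answer: str


@dataclass
class Scoreboard:
    observations: List[int] = field(default_factory=list)

    def summary(self) -> float:
        """Summary scoring: accumulate observations into one overall score."""
        if not self.observations:
            return 0.0
        return sum(self.observations) / len(self.observations)


def select_activity(tasks: List[Task], scoreboard: Scoreboard) -> Optional[Task]:
    """Activity selection: here simply the next unseen task, or None when done."""
    index = len(scoreboard.observations)
    return tasks[index] if index < len(tasks) else None


def present(task: Task, respond: Callable[[str], str]) -> str:
    """Presentation: show the task and capture the learner's work product."""
    return respond(task.prompt)


def process_response(task: Task, work_product: str) -> int:
    """Response processing: turn one work product into an observation (1 or 0)."""
    return 1 if work_product.strip().lower() == task.answer.lower() else 0


def run_assessment(tasks: List[Task], respond: Callable[[str], str]) -> float:
    scoreboard = Scoreboard()
    while (task := select_activity(tasks, scoreboard)) is not None:
        work_product = present(task, respond)
        scoreboard.observations.append(process_response(task, work_product))
    return scoreboard.summary()
```

A formative variant would presumably feed the observation back to the learner after response processing rather than only into the summary score; where that feedback step sits is exactly the kind of question a process model for formative e-assessment has to answer.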
Scheuermann & Guimarães Pereira (eds) (2008)
This collection of articles considers a range of perspectives on and issues in computer-based assessment, in particular methodologies, tools and implementation. Like most other publications, this report does not deal specifically with formative dimensions of assessment. The following is a brief summary of the main relevant points made by some of the contributors.
Martin, R.: New possibilities and challenges for assessment through the use of technology
Martin (p. 6) uses as his reference point a chapter by Bunderson, Inouye and Olsen (1989), who distinguish the following four generations:
- Computerised testing (CT): administering conventional tests by computer
- Computerised adaptive testing (CAT): tailoring the difficulty or contents of the next piece presented or an aspect of the timing of the next item on the basis of examinees' responses
- Continuous measurement (CM): using calibrated measures embedded in a curriculum to continuously and unobtrusively estimate dynamic changes in the student's achievement trajectory and profile as a learner
- Intelligent measurement (IM): producing intelligent scoring, interpretation of individual profiles, and advice to learners and teachers, by means of knowledge bases and inferencing procedures
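To make the second generation concrete, the sketch below shows one very simple way of tailoring the difficulty of the next item to the examinee's previous response. It is a minimal illustration only: it assumes a hypothetical five-level difficulty scale and a one-step-up, one-step-down rule, whereas operational adaptive tests typically rely on calibrated item banks and statistical ability estimates.

```python
from typing import Callable, List


def next_difficulty(current: int, was_correct: bool,
                    lowest: int = 1, highest: int = 5) -> int:
    """Move one level up after a correct response, one level down otherwise."""
    step = 1 if was_correct else -1
    return max(lowest, min(highest, current + step))


def run_adaptive_test(answers_correctly: Callable[[int], bool],
                      start: int = 3, length: int = 5) -> List[int]:
    """Return the sequence of difficulty levels that were administered."""
    difficulty = start
    administered = []
    for _ in range(length):
        administered.append(difficulty)
        difficulty = next_difficulty(difficulty, answers_correctly(difficulty))
    return administered
```

For an examinee who can answer items up to a given level, the administered difficulties settle into an oscillation around that level, which is the sense in which the test adapts.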
The collection of user behaviour data during test execution is seen as one obvious innovative element of computer-based testing, particularly if the data yield additional information about the cognitive functioning or processing strategies of test takers (p. 7).
Martin stresses that, besides technological development, the implementation of generations 3 and 4 also depends on advances in the fields of cognitive science and psychometrics, and that we might need a better and deeper understanding of the exact processes underlying assessment tasks in order to make correct interpretations.
He points out that the computer offers new possibilities, for example in relation to multimodal and multisensory stimuli (p. 8), and allows feedback that goes beyond hand-written notes, e.g. through the recording of complex interactions the test taker has with the computer.
The integration of learning and testing environments is seen as adding considerable value, and it is thought that it will lead to a greater demand for diagnostic methods that allow qualitative judgements about the current knowledge state of a learner. One aspect that, according to Martin, is not currently satisfactory is the conceptualisation of learning progress as merely quantitative progress on one or many latent traits.
Instead of huge item banks of pre-calibrated items for a merely quantitative monitoring of the learning process, Martin argues with reference to Williamson, Mislevy and Bejar (2006) for the development of new automated scoring algorithms for complex computer-based tasks (p. 9). This, he argues, would allow for immediate qualitative and quantitative feedback on the learning progress.
Whitelock, D.: Accelerating the assessment agenda: thinking outside the black box
Whitelock argues that assessment has not kept pace with learning and teaching: learning and teaching have increasingly embraced social constructivist and situated learning research, whereas assessment has remained largely transmission orientated (p. 15). Implicit in her argument is that a view of knowledge and understanding as being constituted in and through interaction needs to be reflected in assessment (p. 17).
Her main argument is for the development of new forms of e-assessment driven by sound pedagogy rather than state-of-the-art technological know-how. She sees sound pedagogy as intimately linked to students actively decoding feedback information, internalising it and using it to make judgements (p. 15). She also posits that cognitive benefits in assessment are highly dependent on emotional and motivational factors (p. 16).
Ripley, M.: Technology in the service of 21st century learning and assessment
Ripley notes that there is a danger of technology "dumbing down" education and learning, and he stresses that assessment embodies what is valued in education (p. 22). Of most interest for our purposes is the list of reasons he gives why educationalists should consider e-assessment (pp. 25-6):
- It has a positive effect on motivation and performance;
- It frees up teacher time;
- It provides access to high quality assessment resources;
- It provides rich diagnostic information;
- It is flexible and easy to use;
- It links learning and assessment, empowering the learner;
- It allows for the assessment of higher order thinking skills in ways not possible with paper-and-pencil testing;
- It is inevitable.
Meijer, R.: Stimulating innovative item use in assessment
Meijer posits that the use of feedback, task authenticity and the adaptation of teaching based on assessment outcomes are key characteristics contributing significantly to learning (p. 30). With reference to Bennett (1998), he (p. 30) distinguishes the following three generations of computer-based assessment:
- The 1st generation automates the existing process without reconceptualising it.
- The 2nd generation uses multimedia technology to assess skills in ways that were not previously possible.
- With the 3rd generation, assessment will become indivisible from instruction, with high stakes decisions being made on the basis of many assessments.
Milbradt, A.: Quality criteria in open source software for computer-based assessment
With reference to ISO/IEC 9126, Milbradt (p. 53) reports the following six major quality criteria of software:
- functionality;
- reliability;
- usability;
- efficiency;
- maintainability; and
- portability.
The ISO/IEC 9126 sub-characteristics of these criteria include:
- suitability;
- security;
- fault tolerance;
- learnability;
- understandability;
- time behaviour;
- resource utilization;
- analyzability;
- stability; and
- installability.
Reflection on this report:
One key implication from this report, although none of the contributors really engages with formative dimensions of e-assessment, is the assertion that assessment should not be viewed as separate in terms of technical tools and standards but needs to be integrated into the tools used for teaching and learning.
Cox, Schleyer, Johnson, Eaton and Reynolds (2008)
In their piece on e-assessment on dentistry courses, the authors list the following benefits of computer-assisted summative assessment (p. 36). The question arises to what extent these transfer or apply to formative e-assessment.
Benefits:
- large numbers of assessments can be marked quickly and accurately;
- student responses to questions can be monitored;
- assessments can be provided within an open-access system;
- assessments can be stored and re-used;
- assessment items can be randomly selected to provide a different paper for each student;
- adaptive testing can reduce the number of test items required to assess student knowledge;
- immediate feedback on performance and advice can be provided.
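The benefit of random item selection listed above can be sketched in a few lines. This is an illustrative assumption about how such a feature might work, not a description of any system discussed by Cox et al; the function name `build_paper` and the idea of seeding the generator with a student identifier are hypothetical.

```python
import random
from typing import List


def build_paper(item_bank: List[str], student_id: str, paper_length: int) -> List[str]:
    """Draw a paper of distinct items for one student.

    Seeding with the student identifier (an assumption of this sketch) means
    the same student always receives the same paper, so it can be regenerated
    later for marking or review.
    """
    rng = random.Random(student_id)
    return rng.sample(item_bank, paper_length)
```

Whether papers assembled in this way are of comparable difficulty depends on how the item bank is calibrated, which matters more for summative than for formative use.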
Elliott (2008)
Elliott distinguishes between what he calls Assessment 1.0, 1.5 and 2.0. Assessment 1.0 is characterised as mostly paper-based, mostly classroom-based, very formalised, highly synchronised and highly controlled. Assessment 1.5 is understood as computer-based assessment in the form of e-testing and e-portfolios.
With reference to Pellegrino (1999), Elliott argues against decontextualised, external assessments consisting of isolated tasks and performances with zero validity as indices of educational attainment. One problem identified with Assessment 1.0 and 1.5 is their intensely individualistic nature and the attendant notion of competition as opposed to collaboration. Elliott goes on to discuss the characteristics of Web 2.0 (e.g. user-generated content, the power of the crowd, architecture of participation, openness) and attendant learning styles (e.g. skilled use of tools, active learning, authentic learning experiences, task orientation, just in time learning, searching rather than memorising, utilising social networks, knowing where to find answers rather than knowing the answers, use of Google rather than libraries, and collaboration rather than competition) to point out a growing disconnect between the life of students inside and outside the classroom. Elliott concludes that in order to bridge this gap, assessment with the following characteristics is required (Assessment 2.0):
- authentic
- personalised
- negotiated
- problem oriented
- socially constructed
- collaboratively produced
- recognise existing skills
- naturally occurring
- digital
- multimedia
- distributed.
Queries about some premises for Elliott's paper:
What is 'modernising' assessment in terms of formative assessment ('modernising' is a core point of the paper)? There is an implication that Web 2.0 is capable of bringing improvements in how learning is assessed. The argument is that this is not yet happening because teachers are still in the world of 'traditional' assessment (meaning testing here), for reasons which are not explored. There is a conflation of formative and summative assessment throughout the paper. One overall definition is given of assessment as a practice:
Assessment is the process of generating evidence of student learning and then making a judgement about that evidence (p.179)
Two key actions underpin this definition of assessment as a practice:
- generating evidence
- making a judgement
Evidence is viewed as pivotal to the assessment process. Identifying the characteristics of assessment is another key point in the paper. This also informed the ADoM project (Nottingham) on building process models based on 'characteristics'. The type of characteristics may be problematic as a basis for model development. They seem rather large-scale?
Essentially, processes of assessment are described in terms of Web 2.0 'services' and are ascribed to different parts of the assessment cycle (p. 185), based on a core principle that 'assessment is about evidence generation'. There is a tacit message that people will learn anyway? It is difficult to ascertain the role of the teacher in this argument; the teacher may be viewed as 'capturer' and recorder of evidence, but not particularly involved in the learning process? A relatively passive role compared with other models? Assessment here is presented without a clear need for a relationship with particular purposes?
Boyle (2005)
Boyle, in an attempt to define sophisticated e-assessment tasks, points to Parshall et al.'s (2000, p. 130) five-dimensional framework, which includes the following categories:
- item format
- response action
- media inclusion
- level of interactivity
- scoring algorithm.
He reports assertions from the specialist literature that sophisticated tasks enable a range of question styles and multimedia presentations, which provide the possibility of feedback in a range of visual styles and modes of interaction that are likely to be more consistent with students' styles (and experiences) of learning. However, he urges caution in relation to the potential of sophisticated e-assessment. Firstly, it is very difficult to develop. Secondly, the skills of good item writing do not necessarily co-occur with the skills of good teaching, so it may not be a technology that teachers can easily take ownership of. By implication, it stands in opposition to an important aspect of high quality formative assessment, namely close integration into classroom practice. Boyle also questions whether sufficient evidence is currently available to support assertions that sophisticated e-assessment facilitates good formative assessment: given their inherent complexity, such tasks are likely to be created externally, which "flies in the face of the requirement that formative assessment be low-tech and easy to use in everyday classrooms".
Winkley and Osborne (2008)
Winkley and Osborne argue that a clear technical architecture for e-assessment systems and services is emerging involving three key components:
- authoring tools,
- item and test banks and
- delivery systems.
Shute (2008)
In her recent paper, Shute helpfully reviews the corpus of research on formative, task-level feedback, albeit without reference to e-assessment. She defines formative feedback as "information communicated to the learner that is intended to modify his or her thinking or behavior to improve learning" (p. 153).
Shute notes that formative feedback, which is seen as crucial to improving knowledge and skill acquisition, should be non-evaluative, supportive, timely and specific. It comes in various types (e.g. verification of response accuracy, explanation of the correct answers, hints, topic contingent, response contingent, attribute isolation, worked examples, partial solutions), to be administered at various times during the learning process.
Shute's paper is motivated by a number of questions, such as what the most powerful and efficient types of formative feedback are or under what conditions the different types help learners.
The literature review notes that formative feedback can have debilitating effects on learning when it is construed as critical or controlling, when students' standing relative to peers is the focus or when students are interrupted in their problem solving by external feedback (p. 156). If negative effects on learning are in evidence, Shute argues, feedback cannot be deemed to be formative.
Shute (p. 157) distinguishes two main functions of feedback: directive and facilitative with the former telling the student what needs to be attended to or revised and the latter providing comments and suggestions to help guide students in their own revision and conceptualisation.
Shute (p. 157) notes that formative feedback can signal a gap between where the student currently is and where she ought to be and, in so doing, reduce uncertainty about how well (or poorly) she is performing on a task. It can also effectively reduce the cognitive load of a learner and provide information that may be useful for correcting inappropriate task strategies, procedural errors or misconceptions.
Shute notes (p. 157) that feedback is significantly more effective when it provides details of how to improve. Feedback lacking in specificity is noted as potentially being counterproductive leading to lower levels of learning.
Effective feedback is noted as providing learners with two types of information: verification and elaboration (p. 158). The former is defined as a judgement about whether an answer is correct, the latter as information that provides cues to guide the learner towards a correct answer.
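In an e-assessment setting, the distinction between verification and elaboration might be represented as two separate fields of a feedback message. The sketch below is a hypothetical illustration rather than anything proposed by Shute: the `Feedback` type, the exact-match check and the use of a pre-written hint as the elaboration are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    verification: str  # judgement about whether the answer is correct
    elaboration: str   # cue guiding the learner towards a correct answer


def formative_feedback(response: str, correct_answer: str, hint: str) -> Feedback:
    """Return verification for every response, plus elaboration when it is wrong."""
    is_correct = response.strip().lower() == correct_answer.strip().lower()
    if is_correct:
        return Feedback(verification="correct", elaboration="")
    # The hint is a cue, not the answer itself, so the learner still has to
    # do the work of revising the response.
    return Feedback(verification="incorrect", elaboration=hint)
```

Keeping the two components separate makes it possible to vary them independently, for example giving verification immediately and elaboration later.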
In Table 1, Shute (p. 160) tabulates feedback types by complexity.
One section of Shute's paper discusses goal-directed feedback and motivation. In it she notes that for a learner to remain motivated and engaged there needs to be a close match between her goals and her expectation that they can be met. In order to be challenging for learners, goals must be personally meaningful and easily generated, whether they are internal or external and whether they concern acquisition (acquiring something desirable) or avoidance (avoiding something undesirable). The learner also needs to receive performance feedback, which can act as an important mediating factor in a learner's performance. Formative feedback is deemed to be able to influence a learner's goal orientation and shift it from a focus on performing to an emphasis on learning. Shute also refers to research suggesting that goal-orientation feedback can help learners realise the importance of practice and effort and that mistakes are part of the skill-acquisition process (p. 162).
Furthermore, feedback is deemed to be able to serve as a cognitive support mechanism. Shute suggests (p. 163) that directive feedback may be more helpful during the early stages of learning, and facilitative feedback later.
Shute also examines the issue of timing, i.e. the effects of immediate versus delayed feedback, and refers to studies supporting both types (pp. 163-166). With reference to studies in the field she posits that delayed feedback may be superior for promoting transfer of learning, particularly in relation to concept-formation tasks, and immediate feedback for procedural skills. She also refers to other studies, which conclude that rather than timing, the nature of the task, and the capability of the learner are crucial variables determining the effectiveness of feedback. (p. 165)
Shute discusses some feedback research studies in detail.
With reference to Kluger and DeNisi's feedback intervention (FI) theory (1996), she notes that FIs can change the learner's attention in relation to task learning, task motivation and metatask processes. According to the authors, formative feedback that focuses the learner on aspects of the task promotes learning and achievement, whereas feedback that draws attention to the self can impede learning (Shute, p. 168). Shute also notes (p. 169) that praise was not considered to be as effective a reinforcer as previously believed. Shute concludes (p. 170) that Kluger and DeNisi's work suggests that FIs may be viewed as a double-edged sword.
With reference to Bangert-Drowns et al.'s (1991) meta-analysis, Shute notes (p. 172) that feedback can promote learning if it is received mindfully, and can inhibit learning if it encourages mindlessness or does not match learners' cognitive needs.
With reference to Narciss and Huth (2004) Shute notes (p. 172) that systematically designed formative feedback has positive effects on achievement and motivation. In order to be effective, the instructional context as well as learner characteristics have to be taken into consideration. Narciss and Huth developed a conceptual framework (Shute, pp. 172-3) for the design of formative feedback, which comprises three interacting factors: instruction, learner and feedback.
With reference to Mason and Bruning (2002), Shute notes (p. 174) that immediate feedback for students with low achievement levels is superior to delayed feedback, whereas delayed feedback is deemed to be superior for students with higher achievement levels, particularly for complex tasks.
By way of a summary and conclusion, Shute notes (p. 175) that formative feedback should address the accuracy of a learner's response to a problem or task and may touch on particular errors and misconceptions. It should also permit the comparison of actual performance with some established standard of performance. She also notes that in technology-assisted instruction formative feedback comprises post hoc information with the purpose of shaping the perception, cognition, or action of the learner with the goal of enhancing learning and/or performance and engendering the formation of accurate, targeted conceptualisation of skills. In her view, effective and useful feedback depends on:
- Motive: the learner needs it;
- Opportunity: the learner receives it in time to use it;
- Means: the learner is able and willing to use it.
By way of recommendation for future research, Shute suggests (p. 176) the examination of the relationship(s) between affective components in feedback and outcome performance. And she argues for a multidimensional view of feedback where situational and individual characteristics of the instructional context and learner are considered along with the nature and quality of a feedback message.
Tables 2 & 3 provide guidelines to enhance learning.
Winkley, Rainbow and Baki (2008)
In their paper outlining the development and trialling of sets of adaptive computer-based and (non-adaptive) paper-based tools to assess literacy, language and numeracy skills of adult learners in the UK, the authors posit, inter alia, the following advantages of e-assessment over paper-based assessment (pp. 5-6). E-assessment can:
- be adaptive to learner performance;
- be marked automatically;
- produce a detailed profile of a learner's performance available for immediate review;
- be less threatening for learners;
- be marked accurately;
- include onscreen embedded tools;
- provide support for people with certain disabilities;
- generate test data easily and efficiently.
Taras (2005)
In her attempt to clarify the definitions of the central terms relating to assessment, namely summative and formative, Taras draws heavily on Scriven (1967), Ramaprasad (1983) and Sadler (1989). Following Scriven, she defines assessment as a judgement which can be justified according to specific weighted goals, yielding either comparative or numerical ratings (p. 467). She sees it as a judgement according to standards, goals and criteria (p. 468). Formative assessment, according to her, differs from summative assessment in that it requires feedback which indicates the existence of a gap between the actual level of the work being assessed and the required standard, as well as an indication of how the work can be improved (p. 468). Formative assessment, in her view, has summative judgements as an integral part (p. 468).
Following Sadler, she delineates three conditions for effective feedback: 1) knowledge of the standard or goal, 2) skills in making multicriterion comparisons, and 3) the development of ways and means for reducing the discrepancy between what is produced and what is aimed for (p. 471). She posits two views of formative assessment (p. 471): as a process assessing a product or as a process assessing a process. With reference to Scriven she notes (pp. 471-2) that the difference between summative and formative assessment is a matter of degree of elaborativeness. She also includes Sadler's definition of formative assessment as being 'concerned with how judgements about the quality of student responses (performances, pieces, or works) can be used to shape and improve the student's competence by short-circuiting the randomness and inefficiency of trial-and-error learning' (p. 472). Therefore, it is important whether the judgements and advice are used and acted upon by the learner.
Webb & Cox (2007)
Webb and Cox posit that there is strong evidence that formative assessment, which they seem to view as diagnostic and supporting learners and teachers in deciding next steps in teaching and learning (p. 4), can raise standards of student achievement (p. 1). The authors delineate four main principles of teaching that relate to effective learning (pp. 4-6):- to start from where the learner is and recognising that students have to be active in reconstructing and formulating their ideas; to obtain feedback from individual students to determine what their existing ideas are.
- for students to be active and for teachers to encourage, and listen carefully to a range of responses
- for students to understand the learning goal and what counts as good quality work as well as to have an idea where they stand in relation to the goal; to provide opportunities for students to reflect on their work
- for students to 'talk the talk'.
Roos and Hamilton (2005)
Roos and Hamilton draw attention to different views of humanity which imply different paradigms of learning and, by implication, of evaluation and assessment. They stress that the dualism formative/summative should not be represented as two sides of the same thing (p. 9).
They also posit that formative teacher assessment is often 'essentially summative', "taking 'snapshots' of where the children have 'got to', rather than where they might be going next" (p. 9).
Ridgway, McCusker and Pead (2004)
In their 'literature review' of e-assessment the authors make the point that research evidence suggests that good assessment practices produce large performance gains, whereas poor assessment systems have negative rather than neutral effects on the performance of students; the stakes are therefore high (p. 7).
Rovai (2000)
Rovai reminds his readers that assessment should be viewed as integral to teaching and learning rather than as an add-on. He views it as "the process of gathering, describing, or quantifying information about learner performance" (p. 142). He refers to Anderson et al (1975, p. 27), who describe assessment as 'multitrait-multimethod', i.e. as focussing on a number of variables and using a number of techniques to assay them. According to Rovai (p. 143), a variety of assessment tasks is necessary to provide a well-rounded view of what a student knows and can do. Multiple tasks are required, according to Rovai, to ensure valid, reliable and fair information about student achievement.
Crisp (2007)
According to Crisp, e-assessment "involves the use of any computer- or web-based method that allows systematic inferences and judgements to be made about the students' skills, knowledge and capabilities" (p. 39). He distinguishes the following broad categories of e-assessment (p. 39):
- diagnostic: an assessment task is used to identify the current knowledge and skill level of students so that learning activities can match student requirements
- formative: an assessment task provides practice for students on their learning in the current course and possible development activities they could undertake in order to improve their level of understanding
- summative: assessment task responses are designed to grade and judge a student's level of understanding and skill development for progression or certification.
- do students require access to a synchronous or asynchronous help system during the assessment?
- what level of administrative access should teachers have to the e-assessment engine?
- who administers enrolment and access?
- is the system scalable?
- what are the hardware requirements and is the system based on a client-side or server-side approach?
- how much staff development will be required?
- does the system comply with international interoperability standards?
- is the system commercial or open source?
- how will the system be evaluated once it has been implemented?
- what standards are being applied to the use of an online environment for assessment tasks?
- the initial identification of the purpose of the assessment
- the design of outcomes/assessment methodologies
- the articulation of appropriate methods for preparation and calibration of the assessment process
- the pre-registration process for students
- the distribution of the assessment
- the authentication of candidates
- the delivery of the assessment
- the return of responses
- the scoring, determination of results and the provision of feedback
- the return of data
- the analysis of student responses to the scores
- the appeals process and final certification of achievement.
Black and Wiliam (1998)
Improving learning through assessment involves the following key factors:
- Effective feedback to students
- Active involvement of students in their own learning
- Adapting teaching in response to assessment information
- Recognition of the impact of assessment on students' motivation and self-esteem
- Need for students to self-assess and understand how to improve
Factors which are identified as obstacles to learning brought about by assessment practices:
- Teachers tend to assess quantity and presentational features of work rather than quality of learning
- Grading can have a detrimental effect on student self-esteem and becomes the focus of student attention rather than advice for improvement
- Teachers' feedback is often social and managerial rather than learning-oriented
- Teachers do not always know enough about the learning needs of their students
Crucially, formative assessment is part of a teacher's overall understanding of learning and teaching, and involves active participation of both students and teachers in understanding and reviewing learning as part of classroom practice. This work has been a catalyst for policy-making and further research across curriculum areas.
Black et al (2003)
This is an expansion of Inside the Black Box and Working Inside the Black Box by the same authors. The work is based on research into the practices of schoolteachers in implementing new ideas about formative assessment. It focuses on practice and actual experiences of teachers of English, maths and science. It presents a challenge for modeling in the assertion that assessment for learning is 'usually informal, embedded in all aspects of teaching and learning, and conducted by different teachers as part of their own diverse and individual teaching styles' (p. 2). This points to something potentially eclectic and highly contextualized, as well as personal and individual in the ways teachers work, and perhaps challenges the systematic capture of practices, as opposed to the use of narrative (teachers' own words, for example) adopted in the study.
There is a clear focus on what makes assessment formative, and it is centred on the teacher as instigator of change - the catalyst in the process:
An assessment activity can help learning if it provides information to be used as feedback by teachers, and by their students in assessing themselves and each other, to modify the teaching and learning activities in which they are engaged. Such assessment becomes formative assessment when the evidence is used to adapt the teaching work to meet learning needs (p. 2).
Although it may involve several methods 'it has to be in the control of the individual teacher and, for this reason, change in formative assessment practice is an integral and intimate part of a teacher's daily work' (p. 2) - it implies a constant synthesis of information about how the learners are doing to which certain individual judgements are brought. We need to know whether this can be replicated in AI as well as other e-learning contexts. It is one source of challenge with some of the versions of 'formative assessment' found in the literature.
The work argues strongly that formative assessment makes significant improvements in summative assessment i.e. 'teaching well' is totally compatible with 'getting good results' (p. 29).
Key components of a feedback system are identified:
- Data on the level of some measurable attribute
- Data on the desirable level of that attribute
- A mechanism for comparing the two levels and assessing the gap between them
- A mechanism by which the information can be used to alter the gap
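For the purposes of modelling, the four components above can be read as a simple loop. The following is only a minimal sketch under strong assumptions: it treats the attribute as a single number, and the names `FeedbackLoop`, `act_on_gap` and `close_half_the_gap` are hypothetical, not taken from Black et al. It is not meant to suggest that a teacher's judgement reduces to arithmetic; it merely makes the four parts explicit.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FeedbackLoop:
    """Hypothetical container for the four components of a feedback system."""
    actual_level: float    # data on the level of some measurable attribute
    desired_level: float   # data on the desirable level of that attribute
    # Mechanism by which the information is used to alter the gap:
    # takes (actual level, gap) and returns the new actual level.
    act_on_gap: Callable[[float, float], float]

    def gap(self) -> float:
        # Mechanism for comparing the two levels and assessing the gap.
        return self.desired_level - self.actual_level

    def step(self) -> float:
        # Use the gap information once, then report the remaining gap.
        self.actual_level = self.act_on_gap(self.actual_level, self.gap())
        return self.gap()


def close_half_the_gap(actual: float, gap: float) -> float:
    # Assumed, purely illustrative response: each cycle closes half the gap.
    return actual + 0.5 * gap


if __name__ == "__main__":
    loop = FeedbackLoop(actual_level=40.0, desired_level=70.0,
                        act_on_gap=close_half_the_gap)
    print(loop.gap())   # gap before any feedback is acted upon
    print(loop.step())  # gap after one cycle
```

In this reading, information only counts as feedback when `act_on_gap` is actually invoked; recording the gap without acting on it would leave the loop open.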
Bull and McKenna (2004)
A distinction is made between the idea that CAA can actually be formative in itself, and the idea that it has a role to play in formative assessment. The notion that formative and summative assessment become 'blurred' is again important:
In this sense, perhaps CAA offers a sort of bridge between formative and summative assessment … Brown (1999) suggests that the line between formative and summative assessment is a blurred one which is more to do with when the assessments are delivered and what is done with the marking and feedback rather than a precise difference in kind (p. 12).
Is the project about establishing models for a different type of assessment altogether? This is one possibility as an outcome from the literature: e-assessment practices do not sit comfortably with concepts of formative assessment as they have developed in educational research. We may need a different stance on this.
N.B. "With CAA, a useful term coined by Mackenzie (1999) is 'scored formative', which describes computerized coursework for which numerical scores are automatically assigned and recorded" (p. 13). Is this really just trying to circumvent a key obstacle to using the term 'formative' in test/scoring contexts, which in fact do not share the pedagogical conditions around teacher-student roles which constitute formative assessment in the educational literature (Black and Wiliam, 1998)?
The definition of formative assessment offered here has no reference to teacher adaptation of pedagogy:
Assessments which assist learning by giving feedback which indicates how the student is progressing in terms of knowledge, skills and understanding of a subject. In CAA this often takes the form of objective questions with feedback given to the student either during or immediately after the assessment. Formative assessment may be monitored by the tutor, used purely for self-assessment, or used to contribute marks to a module grade (p. xiv).
The role of the tutor here is potentially non-active and certainly non-adaptive. 'Monitoring' implies a particular, distant form of engagement in the process itself, with implications for the types of model which could describe processes. The list of typical CAA activities on p. 4 confirms this: it is mostly based on taking a variety of forms of tests at different intervals, on the grounds that giving students increased feedback is desirable. Is the implication that 'more is better'?
Nicol (2007)
Nicol delineates the following ten principles of good assessment and feedback:
- Help clarify what good performance is (goals, criteria, standards)
- Encourage 'time and effort' on challenging learning tasks
- Deliver high quality feedback information that helps learners self-correct
- Encourage positive motivational beliefs and self-esteem
- Encourage interaction and dialogue around learning (peer and teacher-student)
- Facilitate the development of self-assessment and reflection in learning
- Give learners choice in assessment – content and processes
- Involve students in decision-making about assessment policy and practice
- Support the development of learning communities
- Help teachers adapt teaching to student needs
AFLA (2003)
The Australian Flexible Learning Framework refers, in its guide to assessment and online teaching, to a paper by McLoughlin and Luca (2001) who, with reference to Laurillard’s framework, map interactive assessment activities onto the following modes of student interaction: attending (intra-team peer review of design and development of peer websites), practicing (intra-team peer review of student posts to weekly problems), discussing (intra-team peer review of weekly journals posted by students) and articulating (critiquing and peer feedback on e-portfolios; negotiating roles in relation to team contract).
JISC (2007)
Figure 1 of the Effective Practice JISC report tries to represent diagrammatically the integration of assessments, feedback and learning resources.
Conole and Warburton (2005)
Conole and Warburton posit that assessment is a critical catalyst for student learning and note that technology is often considered to be able to increase productivity and assessment frequency by automating assessment tasks (p. 17).
They use the term ‘Computer-based assessment’ (CAA) which, according to them, can be subdivided into stand-alone applications on a single computer, applications that work on private networks and those over public networks (p. 18).
They categorise assessment in relation to Bloom’s (1956) taxonomy of cognitive learning outcomes: knowledge recall, comprehension, application, analysis, synthesis and evaluation (p. 19).
With reference to relevant background literature they warn against direct translation of paper-based assessment into online assessment and wonder whether item-based testing is appropriate for examining the full range of learning outcomes (p. 21).
Finally, they note the shift from assessment of products or outputs to the process of learning (p. 28).
Nicol and Macfarlane-Dick (2006)
The authors, who draw on Butler and Winne (1995), assert that formative assessment can be used to empower students as self-regulated learners (p. 199). They consider self-regulated learners to be more effective learners (p. 205). Intelligent self-regulation, they argue, requires that a student has some goals which s/he wants to achieve and against which performance can be compared and assessed. Feedback, they go on to argue, is information about how the student’s present state relates to these goals. Students also generate internal feedback as they monitor their engagement with learning activities and tasks, and assess progress towards these goals (p. 200).
Nicol and Macfarlane-Dick argue that formative assessment and feedback should be seen as the responsibility of the students and, with reference to Sadler (1998), Boud (2000) and Yorke (2003), that feedback should not be conceptualised as transmission, as otherwise it is difficult to see, in their view, how students can become empowered and develop skills of self-regulation (p. 200).
They go on to argue that students require opportunities to actively construct an understanding of feedback given to them, e.g. through discussion, and remind their readers of the link between external feedback and motivation. (p. 201)
They refer to Sadler’s (1989) three conditions for effective feedback; the students must know:
- what good performance is
- how current performance relates to good performance; and
- how to act to close the gap between the two.
Also, they argue that the development of self-regulation in students can be facilitated by structuring learning environments in ways that make learning processes explicit, through meta-cognitive training, self-monitoring and by providing opportunities to practise self-regulation. (p. 205)
Walker, Topping and Rodrigues (2008)
The authors note that students’ expectations and perceptions of e-assessment are under-researched and that their learning strategies are often unclear (p. 221).
On the basis of their empirical study the authors conclude that (pp. 232-233):
- use of feedback was found not to be restricted to questions that were answered incorrectly: the majority of students used formative e-assessment to pro-actively identify areas of strength and weakness with a view to directing their revision
- the majority of students valued detailed explanatory feedback and the absence of sufficient feedback was a source of frustration among students
Black and Wiliam (2009 forthcoming)
For a copy of this paper please contact c.daly@ioe.ac.uk
‘Notes towards a theory of formative assessment’. The paper develops a theory of formative assessment within a broader theory of pedagogy, to avoid formative assessment becoming ‘a theory of everything’. It is located in face-to-face classroom practices around forms of interaction which have formative effects. It is based on the previous ten years of development of practices around formative assessment and widespread adoption and adaptation of core aspects of it. It argues that any theory of formative assessment must include three ‘spheres’ in relationship with each other: the teacher’s agenda, the internal world of each student, and the inter-subjective; ‘these between them map the territory’. The paper has five aims, to:
- ‘provide a unifying basis for the diverse practices which are said to be formative’.
- ‘help to define the precise location of formative action within a comprehensive theory of pedagogy’.
- ‘show how the concept of formative interaction can be enriched and contextualised by drawing on such theories as might be relevant to it’.
- ‘suggest further and new lines of enquiry’.
- ‘suggest ways to extend and/or improve (classroom) practices arising from further theoretical reflection’.
In revisiting and analyzing their original definition of formative assessment (1998), Black and Wiliam identify its essence as located in ‘moments of contingency’, which, in terms of the project, are crucial to capturing and designing for formative assessment processes. The moments of contingency are scrutinized for how they bring the three spheres into relation with each other: 'it is clear that formative assessment is concerned with the creation of, and capitalization upon, ‘moments of contingency’ in instruction for the purpose of the regulation of learning processes…whilst this focus is narrow, its impact is broad, since how teachers, learners, and their peers create and capitalize on these moments of contingency entails considerations of instructional design, curriculum, pedagogy, psychology and epistemology.' Feedback is one focus within this broader conceptual scope, and its treatment is based on face-to-face interactive contexts which affect motivation: 'For feedback to move learning forward, students must engage with the feedback, thus connecting to work on student affect, on the way that students respond to criticism, and on the relative benefits of different kinds of feedback'. The paper then deals with a more complex theoretical context which locates learners’ responses to ‘feedback’ in psychological and inter-subjective perspectives about factors affecting interaction and change in the learning behaviours of individuals.
The authors define a ‘formative interaction’ as one which must essentially influence cognition, i.e. ‘it is an interaction between external stimulus and feedback, and internal production of the individual learner’. They offer a model of this, which captures three aspects (the external, the internal and their interactions), in which ‘The teacher addresses to the learner a task, perhaps in the form of a question, the learner responds to this, and the teacher then composes a further intervention, in the light of that response’. ‘In a formative mode the teacher’s initial prompt is designed to encourage more thought; the learner is more actively involved, and the teacher’s work is far less predictable: formative interaction is contingent’. A key challenge here for the teacher is being able to provide formative feedback which affects cognition: ‘their feedback needs to be constructed in the light of some insight into the mental life that lies behind the student’s utterances’. This challenge they describe as ‘formidable’. In this context, the next stage of their theorization pinpoints ‘self-regulated learning’ (SRL). They work with a review of this field (Boekaerts et al., 2005) which defined SRL as follows: Self-regulation can be defined as a multi-component, multi-level, iterative self-steering process that targets one’s own cognitions, affects and action, as well as features of the environment for modulation in the service of one’s goals. (p. 150)
They draw on Boekaerts and Corno (2005) for a more detailed examination of what is involved in SRL and its complex psychological aspects in terms of learners engaged with processes of ‘mastery’ or ‘well-being’ (or veering between the two).
Gibbs and Simpson (2004)
Gibbs and Simpson (2004) start out by asserting the dominant influence of the (perceived) demands of assessment on what students ‘attended to, how much they did and how they went about their studying’ (p. 4). With reference to Miller & Parlett (1974) they distinguish ‘cue seekers’, ‘cue conscious’ and ‘cue deaf’. They also note (p. 6) that students “are strategic in their use of time and ‘selectively negligent’ in avoiding content that they believe is not likely to be assessed.” One of the key points they make in their introductory sections is that “the trick when designing assessment regimes is to generate engagement with learning tasks without generating piles of marking” (p. 8). They argue that the most powerful single influence on student achievement is feedback (p. 9). Yet the extent to which students engage with, and understand, feedback varies (p. 10), and there are problems around the impact and effectiveness of marks and comments in terms of self-efficacy and affect (p. 11). Gibbs and Simpson also refer to a list of effects of formative assessment by Crooks (1988) based on Gagne (1977) (pp. 11-12):
- Reactivating or consolidating prerequisite skills or knowledge prior to introducing the new material
- Focusing attention on important aspects of the subject
- Encouraging active learning strategies
- Giving students opportunities to practise skills and consolidate learning
- Providing knowledge of results and corrective feedback
- Helping students to monitor their own progress and develop skills of self-evaluation
- Guiding the choice of further instructional or learning activities to increase mastery
- Helping students to feel a sense of accomplishment
Gibbs and Simpson go on to set out ten conditions under which assessment supports learning:
- Sufficient assessed tasks are provided for students to capture sufficient study time
- Use assessment tasks orientating students to allocate appropriate amounts of time and effort to the most important aspects of teaching
- Tackling the assessed task engages students in productive learning activity of an appropriate kind
- Sufficient feedback is provided, both often enough and in enough detail
- Feedback focuses on students’ performance, on their learning and on actions under the students’ control, rather than on the students themselves and on their characteristics
- Feedback is timely in that it is received by the student while it still matters to them and in time for them to pay attention to further learning or to receive further assistance
- Feedback is appropriate to the purpose of the assignment and to its criteria for success
- Feedback is appropriate in relation to students’ understanding of what they are supposed to be doing
- Feedback is received and attended to
- Feedback is acted upon by the student
Hattie and Timperley (2007)
Hattie and Timperley assert that feedback is among the major influences on learning and achievement, but that the type of feedback and the way it is given can be differentially effective (p. 81).
They conceptualise feedback as "information provided by an agent (e.g. teacher, peer, book, parent, self, experience [interesting to note that technology isn't explicitly mentioned in this list, NP]) regarding aspects of one's performance or understanding. … Feedback is thus a 'consequence' of performance." (p. 81) Hattie and Timperley propose a continuum between providing instruction and providing feedback and note that in order to perform an instructional function, feedback needs to "provide information specifically relating to the task or process of learning that fills a gap between what is understood and what is aimed to be understood" (p. 82). They also note that this may be through affective or cognitive processes, and that feedback has no effect outside a learning context to which it is addressed. The authors stress that feedback is most powerful when it addresses faulty interpretation rather than a total lack of understanding (p. 82). The most effective forms of feedback are deemed to be those providing cues or reinforcement to learners, and feedback is considered to be more effective when it provides information on correct rather than incorrect responses and when it builds on changes from previous trials (pp. 82 and 85). The impact of feedback was thought to be influenced by goal and task difficulty, with most impact occurring when goals are specific and challenging but task complexity is low (pp. 85-6). According to the authors, praise for task performance appears to be ineffective (p. 86). The authors also differentiate between 'contingencies to activities' and feedback; rewards are viewed as falling into the former category (p. 84).
Hattie and Timperley see the reduction of discrepancies between current understandings and performance and a goal as the main purpose of feedback and posit that effective feedback must answer the following three questions asked by a student or teacher (p. 86):
- Where am I going? (What are the goals?)
- How am I going? (What progress is being made towards the goal?)
- Where to next? (What activities need to be undertaken to make better progress?)
One important function of feedback for the authors is to support students in increasing their effort, in particular in relation to tackling more challenging tasks or appreciating higher quality experiences rather than doing 'more' (p. 86).
In Figure 1, Hattie and Timperley delineate their model of feedback to enhance learning.
In the rest of their paper, the authors discuss the various elements of their model in some detail. For the purposes of this review, only a few aspects are focused on here. For example:
Hattie and Timperley note that too much feedback only at the task level may encourage students to focus on the immediate goal and not on the strategies to attain the goal (p. 91).
With reference to Winne and Butler (1994) the authors argue that the benefits of feedback about the task depend heavily on learners “(a) being attentive to the varying importance of the feedback information during study of the task, (b) having accurate memories of those features when outcome feedback is provided at the task’s conclusion, and (c) being sufficiently strategic to generate effective internal feedback about predictive validities” (p. 91).
With reference to Black & Wiliam (1998) they note that there is considerable evidence that providing written comments is more effective than providing grades (p. 92).
With reference to Balzer et al. (1989) they note that feedback at the process level appears to be more effective than at the task level for enhancing deeper learning.
They delineate six aspects mediating the effectiveness of feedback about self-regulation (pp. 94ff):
- capability to create internal feedback
- capability to self-assess
- willingness to invest effort into seeking and dealing with feedback information
- degree of confidence or certainty in the correctness of the response
- attributions about success or failure
- level of proficiency at seeking help.
Feedback about the self as a person is deemed to be the least effective in part because it often contains little task-related information (p. 96).
Finally, Hattie and Timperley discuss the following commonly debated issues around feedback: its timing, the effects of positive and negative feedback, the demands of feedback on teachers, the importance of classroom climate, and feedback and the design of assessment tasks (pp. 98-102).
