
Volume 14 Spring 2002


A Writing Rubric to Assess ESL Student Performance

Inaam Mansoor and Suzanne Grant

The Challenge

Performance-based assessments are popular because they are often program-based and learner-centered; however, funders tend to question their credibility. We challenged ourselves to address this issue by finding a way to satisfy technical quality issues, such as validity and reliability, while also keeping in mind how assessment influences learning. We believed that this approach would facilitate reporting student achievement both fairly and credibly.

Who We Are

The Arlington Education and Employment Program (REEP) is an adult English as a Second Language (ESL) program administered through the Arlington Public Schools in Arlington, Virginia. Because of its close proximity to our nation's capital, the area draws large numbers of immigrants attracted by job opportunities in the service industry and in a large number of national and international organizations. Nine levels of ESL instruction are offered, including workplace literacy and computer-assisted instruction. There are some 6,000 enrollment slots at 8-10 locations throughout Arlington County. There are 55 trained and experienced ESL teachers, who are supported by 5 coordinators. In addition, more than 100 volunteers support instruction.

Our Story

In 1995, REEP staff developed a writing rubric. A rubric is a scoring device that specifies performance expectations and the various levels at which learners can perform a particular skill. By articulating what our adult ESL learners could do at various proficiency levels, we hoped to fine-tune placement of learners into appropriate class levels and monitor their progress. We developed our rubric by collecting writing samples from each class level and analyzing them. We found that although we had nine instructional levels, our students' writing fell into six distinct writing performance levels. The differences among these levels could be articulated using five characteristics (learning targets) of our learners' writing: content and vocabulary, organization and development, structure, mechanics, and voice (see the attached REEP Writing Rubric).

As part of our work with the What Works Literacy Partnership (WWLP), we designed and implemented a study to determine the effectiveness of using the REEP Writing Rubric to measure progress. WWLP is a group of adult basic education programs from across the country that are building their capacity to use data effectively for program improvement and decision-making; for more on WWLP, go to www.wwlp.org. With support from WWLP, we developed pre- and post-test writing tasks to assess writing gains.

Developing writing tasks that could be used for program-wide testing of beginning through advanced level students was challenging. To be fair, the tasks needed to generate a wide variety of responses and enable students at different levels to demonstrate their abilities and life experiences. We decided that the performance task of writing a letter of advice based on their own experiences would meet the above criteria and be consistent with skills that students were practicing in class. Moreover, we structured the testing process to mirror instructional practice by engaging students in warm-up activities prior to the actual writing test.

What Works

Reliability of test data is extremely important in the context of program-wide assessment, especially when the assessments are reported to funders. To help us maximize the reliability of our results, WWLP researchers provided extensive guidance on field-testing, test administration procedures, scoring, performance task development, and rater training. As a result, we implemented the following:

  • Field-testing.

    Before administering the writing pre- and post-tests to hundreds of students, we conducted field-testing to answer the following questions:

    1. Can we expect measurable progress within the specified test interval, that is, 120-180 hours of instruction?

    2. Can beginning through advanced level students demonstrate their writing skills in response to our writing tasks?

    3. Are the pre- and post-test tasks equivalent, that is, do they represent the same level of difficulty?

To answer questions 1 and 2, a small group of experienced teachers administered the pre-test to five students from each class level at the beginning of an instructional cycle. At the end of the cycle, the teachers administered the post-test to the same group. Students were asked for feedback and they said they felt that they were able to demonstrate their writing skills with these tests. Teachers also thought that the tests demonstrated the students' writing abilities. Experienced readers scored the tests, and then a WWLP researcher analyzed the results. The analysis showed that significant gains could be measured and that reliable results could be achieved using the scoring procedures we had implemented. We were ready for large-scale testing.

To answer question 3, the same group of students representing all class levels was given the pre-test followed by the post-test within a three-day period. A WWLP researcher analyzed the results and found no difference between students' pre- and post-test scores, which demonstrated that the two tasks represented the same level of difficulty. One of the key elements in achieving equivalence was the use of the letter genre and parallel warm-up activities for both the pre- and post-tests.
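The equivalence check described above can be sketched in a few lines. This is only an illustration: the article does not say which analysis the WWLP researcher used, so the paired comparison below is an assumption, the function name `paired_summary` is hypothetical, and the scores are invented. A mean difference near zero, with a small t statistic, is the pattern that would be consistent with two tasks of similar difficulty.

```python
from math import sqrt
from statistics import mean, stdev

def paired_summary(pre_scores, post_scores):
    """Return (mean difference, paired t statistic) for matched score lists."""
    if len(pre_scores) != len(post_scores):
        raise ValueError("each student needs both a pre- and a post-test score")
    if len(pre_scores) < 2:
        raise ValueError("at least two students are required")
    diffs = [post - pre for pre, post in zip(pre_scores, post_scores)]
    mean_diff = mean(diffs)
    spread = stdev(diffs)
    # With no spread, every difference is identical and t is undefined.
    t_stat = None if spread == 0 else mean_diff / (spread / sqrt(len(diffs)))
    return mean_diff, t_stat

# Hypothetical total scores, invented for illustration only.
pre = [2.4, 3.0, 3.6, 4.2, 1.8, 5.0]
post = [2.6, 2.8, 3.6, 4.4, 1.8, 4.8]
mean_diff, t_stat = paired_summary(pre, post)
print(f"mean difference: {mean_diff:.2f}")
```

The same function could be applied to the field-test data gathered over a full instructional cycle, where a clearly positive mean difference would indicate measurable gains.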

  • Test Administration.

    Prior to each test administration, testers were trained on the ground rules (for example, time limits and no dictionaries), on how to administer the test, and on how to conduct the warm-up activities developed for the particular writing task. This ensured that all students completed the pre-writing activities and the test in a uniform way.

  • Scoring Procedures.

    Each of the five writing characteristics receives a score between 0 and 6, with 6 the highest. The total score is determined by adding each characteristic score and dividing by 5. A sample scoring grid follows.

                      Content &    Organization &                               Total
                      Vocabulary   Development     Structure  Mechanics  Voice  (5 subsections)
    Pretest Score     3            4               3          4          3      3.4 (17/5)
    Post-Test Score   4            4               4          4          3      3.8 (19/5)

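The arithmetic in the sample grid can be expressed as a short sketch. The function and key names are hypothetical, chosen only for illustration; the scoring rule (five characteristic scores from 0 to 6, added and divided by 5) is the one described above.

```python
# The five writing characteristics named in the rubric.
CHARACTERISTICS = (
    "content_vocabulary",
    "organization_development",
    "structure",
    "mechanics",
    "voice",
)

def total_score(scores):
    """Average the five characteristic scores (each 0-6) into a total score."""
    missing = [c for c in CHARACTERISTICS if c not in scores]
    if missing:
        raise ValueError(f"missing characteristic scores: {missing}")
    values = [scores[c] for c in CHARACTERISTICS]
    if any(not 0 <= v <= 6 for v in values):
        raise ValueError("each characteristic score must be between 0 and 6")
    return sum(values) / len(values)

# Scores taken from the sample scoring grid.
pretest = dict(zip(CHARACTERISTICS, (3, 4, 3, 4, 3)))
posttest = dict(zip(CHARACTERISTICS, (4, 4, 4, 4, 3)))
print(total_score(pretest))   # 3.4
print(total_score(posttest))  # 3.8
```
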
  • Building scoring consensus.

    REEP staff were trained to use the writing rubric to score the two performance tasks (pre- and post-test). The readers scored a range of essays. Scores for each writing characteristic were charted as shown above, and the scoring rationale was discussed. This enabled the trainers to see how consistently the rubric was being interpreted, to pinpoint areas of discrepancy, and to build scoring consensus.

    A shortened version of this process was repeated prior to each scoring session to ensure continued consistency in rubric interpretation and scoring. Consistency among the readers was tracked to determine how many tests needed a third reader.

    Each test was scored by two readers, and a third reader was used if the two total scores differed by more than one point. The second reader did not know how the first reader had scored the test. In this way, the first reader's score did not influence the second reader. Similarly, students' class levels were not indicated on the test paper.

    Scoring of the tests occurred in group sessions of no longer than two hours each. This seemed to be the point at which readers began to "burn out."

    The training and scoring procedures described above resulted in an inter-rater reliability of 98%. Only 2% of the tests needed a third reader.
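The third-reader rule can be sketched as follows. The article does not state how the 98% figure was calculated; this sketch assumes it is the share of tests whose two total scores fell within one point of each other, which matches the statement that only 2% needed a third reader. The function names and score pairs are hypothetical.

```python
def needs_third_reader(first_total, second_total, threshold=1.0):
    """True when the two readers' total scores differ by more than the threshold."""
    return abs(first_total - second_total) > threshold

def agreement_rate(score_pairs, threshold=1.0):
    """Share of tests whose two total scores fall within the threshold."""
    if not score_pairs:
        raise ValueError("no score pairs given")
    agreed = sum(
        1 for first, second in score_pairs
        if not needs_third_reader(first, second, threshold)
    )
    return agreed / len(score_pairs)

# Hypothetical (first reader, second reader) totals, invented for illustration.
pairs = [(3.4, 3.6), (2.8, 3.6), (4.0, 5.2), (3.0, 3.0)]
flagged = [pair for pair in pairs if needs_third_reader(*pair)]
print(flagged)                # [(4.0, 5.2)]
print(agreement_rate(pairs))  # 0.75
```
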

Lessons Learned

REEP teachers were involved in every step: developing writing tasks and warm-up activities, administering tests, developing scoring procedures, scoring tests, and analyzing data. Through this involvement, teachers developed a deeper appreciation of testing. They used their students' test results to inform their instruction so that they could better meet the needs of their students. Scoring tests written by beginning to advanced level students gave them a broader picture of writing levels within the program and informed their decisions about subsequent class placements.

Teachers shared the writing rubric with their students, giving them a better sense of how they were being evaluated. Students at all levels started paying more attention to their writing as a result of the more formalized writing test. Many began to embrace writing instruction in the classroom. Learning English now meant more than learning to "speak" English.

We have all gained a greater understanding of the testing process and its need to be both fair and credible to all stakeholders. By participating in the test development process, teachers have developed skills and knowledge that will enable them to develop performance-based classroom assessments that meet these criteria as well. These skills enable us to feel more confident about accepting and reporting gains derived from performance-based assessments.

A Word to the Wise

Developing and using a performance-based assessment requires a tremendous commitment of time and money as well as access to the expertise of researchers. This commitment must be weighed against the outcomes, and in our case, the results for the program were significant and extremely positive.

We had hoped to demonstrate that a performance-based assessment could be a potentially superior instrument for measuring learner gains and thereby gain credibility with funders. Indeed, our work with WWLP gave us access to researchers who both guided us through the testing process and provided feedback on quality issues. At this writing, we are pleased to report that our WWLP researcher has concluded that "the REEP Writing Rubric is a carefully designed and validated instrument with sufficiently high reliability." We were fortunate in having access to the WWLP project and the professional support it provided. Practitioners need opportunities like this in the future if performance-based assessments are to become accepted measurement instruments.


Contact

Inaam Mansoor, Director
Arlington Education and Employment Program (REEP)
2801 Clarendon Boulevard, #218
Arlington, Virginia 22201
Tel.: (703) 228-4200
Fax: (703) 527-6966
Imansoor@arlington.k12.va.us
http://www.arlington.k12.va.us/departments/adulted/REEP

REEP Writing Rubric

 

Originally published in Adventures in Assessment, Volume 14 (Spring 2002), SABES/World Education, Boston, MA, Copyright 2002.

Funding support for the publication of this document on the Web provided in part by the Ohio State Literacy Resource Center as part of the LINCS Assessment Special Collection.
