
Volume 11 Winter 1998





Assessment and Accountability

A Modest Proposal

Heide Spruck Wrigley
Aguirre International, San Mateo, CA

At times it seems that everything there is to say about testing and assessment in adult literacy has been said. By now, practitioners and administrators alike can cite the shortcomings of standardized tests using multiple choice formats and are familiar with the inadequacy of grade levels as indicators of what adult learners know and are able to do.

Yet, pencil and paper, multiple-choice tests continue to be used not only as placement instruments but as measures of learner gains and evidence of program success. Given current reporting requirements, their use is likely to increase, at least in the near future.

From the perspective of programs, there seem few viable alternatives that would meet the information needs of funders interested in reliable data that indicate how a program is doing overall. Portfolio approaches, for example -- considered the last great hope a few years back -- have not quite matured to the level where they might be used as a means to report and aggregate learner gains by group (although they are invaluable as evidence of individual learner progress), largely because the field has not invested in the development of benchmarks and rubrics.

Local approaches have remained just that, local approaches, primarily for two reasons: 1) there has not been enough field testing to establish the reliability of these measures and 2) there have not been sufficient efforts to implement alternative assessments across programs. At this time, it is easy to see how even programs that have been enthusiastic about developing an assessment system that captures what they consider worthwhile outcomes are becoming distressed about the prospects of an alternative system being able to rival the standardized tests currently in fashion.

All is Not Lost

Yet, the picture is not as dim and grim as it might first appear. Indeed, it may be premature to give in to cynicism ("it's all a sham and no one really cares"), paranoia ("next year, all funding will be tied to the results of standardized tests"), and paralysis ("in the end, no one will care about alternative assessment, so let's just sit and wait to see what comes down the pike"). Since a Pollyanna attitude does not appear to be justified either, given recent legislation, perhaps it is time to take an existentialist perspective, where we commit ourselves to forge ahead although (and even because) life in adult literacy does not always make sense. After all, what else are we going to do to stay sane?

Let's ask then if there is anything positive happening in assessment, and how we can help shape new directions on the national or state level, while continuing to strive for sane assessments within and across local programs.

The Federal Outcome Reporting System

You may have heard that the U.S. Department of Education has mandated a uniform outcome-based reporting system that requires that all states send data for all programs funded under Adult Basic Education (ABE) to the Department of Education in Washington. Assessments for capturing outcomes must be "valid and reliable." In other words, they must either be in the form of a standardized test (considered reliable by definition) or take some other form that meets these requirements. States (and the programs they fund) will be asked to report "learner gains" in reading, writing, speaking, and listening (and possibly additional skills related to workforce development) and show that learners are advancing across levels, such as the Student Performance Levels (SPL) established for ESL. These are minimal requirements and individual states can define progress in various ways or even suggest additional outcomes as evidence of literacy progress and program success.

To understand the thinking behind the initiative, it is important to keep in mind that the primary focus is neither curriculum reform nor program improvement (although new assessment systems are often used for these purposes), but rather an accountability measure to bring adult literacy in line with the requirements of GPRA -- the Government Performance and Results Act. GPRA requires all federal agencies to show that they, as well as the agencies and programs they fund, are achieving results or else risk loss of funding. Since the focus of GPRA is on the performance of the overall system (made up of thousands of programs), neither the federal government nor the states are likely to pay a great deal of attention to the progress made by any given learner at any given site, although site performance will be open for review (think standardized testing in K-12). Rather, funders will want to know how a program is doing overall (that is, whether it is positively affecting literacy skills), and they expect to see numbers in aggregate (summarized) form.

While in many ways documenting the kinds of outcomes required by the new reporting system is "doable" (at least for programs that have long reported literacy gains for a sample of their students), two dangers loom as programs try to show gains for all students (not just a sample) and as results are increasingly tied to funding. There is a risk that programs will a) be tempted to manipulate assessment results in their favor and b) succumb to a practice known as "creaming".

Manipulating Assessment Results

Any time success (and subsequent funding) is determined by the data a program reports, there are concerns about administrators "fudging the data." For example, programs have long known that the trick to increasing test scores is to NOT prepare students for the test, but rather to assess them as soon as they walk in the door. This keeps baseline scores artificially low and inflates progress, since gains are due to increases in test-wiseness rather than any real gains in literacy skills. Although this kind of manipulation is considered unethical, since the resulting data "lack integrity," the practice is nevertheless quite commonplace among programs pressured to demonstrate learner progress in short amounts of time.

(Clearly, this trick only works once for each set of students, since the effects tend to level off after subsequent administrations of the test.)

The Dangers of Creaming

It is an unfortunate fact of adult literacy that programs that help those "hardest to serve" (for example, learners who are both new to English and new to literacy) have the greatest difficulties showing gains, not only because their learners need a great deal of time until progress is evident, but because the kind of progress they are making is not easily captured by standardized multiple choice, paper and pencil tests. In addition, programs that serve these students (often community-based organizations) don't have the resources to set up testing alternatives appropriate for a low literacy population.

There is a danger, then, that programs not fully committed to serving learners who need both special support and extended time will decide to focus their efforts instead on those students who most easily advance, since the incremental progress of "slower" students only makes the program "look bad."

Thinking along those lines, ESL programs, for example, might decide to focus the curriculum on immigrants with higher levels of education, rather than serving ESL literacy students. This process of focusing on participants who are easy to serve is known as "creaming" and has long been decried as an unintended outcome of programs that have signed performance-based contracts (where funds are linked to learner outcomes and program impacts, such as job placement).

So far, not many public debates have taken place around this issue in adult literacy on the state level, but concerns are sure to arise as programs realize the difficulties they face in reporting progress across levels in the time periods envisioned by the reporting system.

So Why Not Ask for An Exemption?

Two solutions to the problem of creaming seem possible: 1) set aside monies so that programs can develop an alternative assessment for lower level students or 2) ask that learners who have difficulty negotiating paper and pencil tests be exempted from testing. In my view, exemptions, as attractive as they may seem, are not the best solution in the long run, since we may end up marginalizing both this group and the programs that serve them. As ESL programs in K-12 have seen, being exempt from accountability requirements is not the blessing that it might seem. As a rule, if certain types of learners are excused from testing, they tend to disappear from the radar screen of administrators and are ignored when program decisions are being made. Furthermore, it is difficult to ask for funding for a population for whom no data is available.

I believe that, rather than asking for exemptions for students who cannot cope with the standardized tests approved by a state, we are better off advocating for the development of an alternative assessment framework for this group. There is an additional advantage to advocating for resources to develop an alternative assessment for those new to literacy. Once such an assessment is developed for one group, it is easier to acquire the resources to extend it to other levels and other populations.

Alternative Testing for Low Literate Students

What might an assessment that measures the incremental changes that occur at the initial levels of language and literacy development look like? It is entirely possible to design a framework that allows learners to demonstrate what they can say and understand in English despite limited proficiency (in fact the oral interview component of the BEST test does just that). It is also possible to design a "can-do" literacy assessment (of the type first suggested by Lytle and Wolfe) based on the kinds of texts and tasks that those new to literacy deal with every day. For example, tasks could be designed that allow learners to select pieces of print that they can recognize fairly easily, along with those that give them some difficulty and others that pose a still greater challenge (e.g., McDonald's logos, sale signs, 50% off promotions, their own street address, a letter from the INS or the TANF office). After selecting these print pieces, learners would read the items once together with the friendly teacher/facilitator/assessor and would then try a few text pieces that they have selected on their own.

If a program wants to create an assessment that works double duty (as a basis for program improvement and for accountability), a further step is necessary: the development of scales, rubrics, and benchmarks that indicate the expectations for any given level and to what degree learners are close to acquiring the kind of knowledge, skills, and strategies that are a core part of our curriculum.

The assessor rates individual performance on a scale without making a big deal of it. On the third round, the assessor might select an item that is slightly more difficult than the previous one, again encouraging the person to discuss the item and interpret what it says. Through assessments of this sort, we should be able to tell to what extent learners can handle a variety of literacy tasks at varying levels of confidence and proficiency. It would help us to see evidence of skills worth having, such as: 1) telling an electricity bill from a phone bill or a notice from the INS from a notice from school, 2) recognizing certain types of applications (housing, employment, citizenship), 3) interpreting real life environmental print (reading stop or danger signs), or 4) writing a note to a repair person, the landlord, or the worker on the next shift.

Asking learners to select tasks that they can do with confidence as a starting point for assessment and then moving up from there is not limited to the domains of practical literacy. For those interested in basic skills acquisition that focuses on the subtasks of reading, one-on-one student-initiated assessment can tell us to what extent learners have developed the kind of "phonemic awareness" that allows them to select familiar words that start with the same consonant or identify words that rhyme. Those interested in basic writing proficiency can ask learners to select an evocative photograph or some other prompt, discuss it with the facilitator, and then write a response.

Such an assessment plus conversation model can also provide baseline data on literacy practices, documenting the kinds of print tasks that learners engage in (looking at TV Guide; reading the Bible; checking the horoscope or soccer scores in English or in a native language newspaper) and recording how these practices change over time.

Assessment that allows learners to select a simple task and then branch out is hardly a new concept. In fact, it is the basis for the kind of "adaptive" assessment that has been used in computer-based testing. True, this type of assessment requires one-on-one administration, but as practices in K-12 have shown, after the initial intake assessment has been completed, teachers can take out a few minutes with each student during class time over the course of three weeks or so to document what learners can do that they could not do before (trained facilitators could do short "pull out" sessions as well). As funding for adult literacy is increasing, the old refrain of "there is no money to do this" no longer holds true. There are alternatives to multiple choice tests and we must advocate for their development and their use if we are serious about documenting progress for all learners, including those who still struggle with basic literacy.
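
The branching idea described above (start with a task the learner has chosen, then move to something slightly harder) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the task list, the 1-to-6 rating scale, and the cutoff of 4 for "ready to move up" are all hypothetical and would need to be set by the programs that design the actual assessment.

```python
def run_branching_assessment(items, start_index, rate, max_rounds=3, cutoff=4):
    """Move up a difficulty-ordered task list from a learner-chosen start.

    items: task names ordered from easiest to hardest (hypothetical examples).
    rate: callable that returns the assessor's rating (1-6) for a task.
    The session stops after max_rounds, at the end of the list, or as soon
    as a rating falls below the assumed cutoff.
    """
    results = []
    index = start_index
    for _ in range(max_rounds):
        if index >= len(items):
            break
        item = items[index]
        rating = rate(item)
        results.append((item, rating))
        if rating < cutoff:
            break  # the learner is struggling; do not push further
        index += 1  # otherwise try a slightly harder task
    return results


# Hypothetical print pieces and assessor ratings for one learner.
TASKS = ["fast-food logo", "sale sign", "own street address", "agency letter"]
RATINGS = {"fast-food logo": 6, "sale sign": 5,
           "own street address": 3, "agency letter": 1}

session = run_branching_assessment(TASKS, start_index=1, rate=RATINGS.__getitem__)
for task, rating in session:
    print(task, rating)
```

In a real session the ratings would come from the assessor's judgment during the conversation rather than from a lookup table; the table here only stands in for that judgment.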

Building an Assessment Framework that Yields Worthwhile Results

Developing an assessment that captures gains at the lower levels is only the starting point in a larger effort to build a system that works. Other efforts are needed, at both the local and the state levels, so that we don't end up with an accountability system that is driven in large part by what current standardized tests are able to measure. If we want the quality of adult literacy to increase, we need an approach that measures to what extent learners are acquiring the knowledge, skills, and strategies that matter in the long run. These might include generative skills, such as gaining meaning from various print sources important to one's life; communicating one's thoughts and ideas; learning how to learn; knowing about and using resources effectively; and learning with and from others (along with the subskills that help learners become increasingly more proficient in these areas).

How can this be done? At the local level, a three-pronged approach might be necessary: 1) finding a way to live with the currently available standardized tests, selecting the "LOT" -- the least objectionable test -- and keeping in mind the principle of "first, do no harm" to students; 2) convincing the state that the data a program has provided over the years are at least as valid and reliable as standardized tests such as the TABE and therefore the process should continue; and 3) working with others to develop an assessment system that reflects the realities of adult learners' lives and focuses on what participating programs have deemed to be the core sets of knowledge, skills, and strategies important enough to teach and test.

Components of an Alternative Assessment System

Profiles and Portfolios

What might be the components of such a system? To start with, any program concerned about serving different groups of learners equally well needs to collect demographic information that captures the kind of learner characteristics and experiences that may have a bearing on school success. After all, only by having rich descriptive information can we know what learners want and need to do with English and literacy (given their current circumstances and their goals for the future), how much schooling they have had (and how successful they were), and what print and communication challenges they face in their everyday lives. Having descriptive information of this kind is invaluable since it allows us to see which learners are succeeding in our programs and which are languishing (or leaving) because their needs are not met.

This information can be collected in the form of profiles that travel with the student and to which teachers and learners contribute on an ongoing basis. In addition to background variables such as age, employment status, years of schooling, country of origin and languages spoken, these profiles can 1) capture current literacy practices (who is now speaking to the doctor without a translator; who has started to pick up a newspaper to check the weather); 2) chart shifts in learner goals; and 3) record changes in life circumstances (new job, citizenship, economic self-sufficiency) important to stakeholders.

In these profiles, progress can be captured as it occurs (requiring only a line or two for two or three students per class). Profiles have the added advantage of encouraging teachers to create opportunities for learners to discuss what is happening in their lives, so they can spend some time observing. Profiles of this sort (also known as "running records") can be connected with portfolios that demonstrate student progress through writing samples, reading inventories, and various types of performance tasks. If a standardized test is used, results can be included in the profile as well, helping to flesh out the general picture of achievements and struggles.
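
For programs that keep such profiles electronically, the structure described above might look something like the following Python sketch. It is a minimal illustration, not a recommended record format: the class names, field names, and the three entry categories ("practice", "goal", "life_change") are assumptions drawn loosely from the background variables and running notes listed above.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ProfileEntry:
    """One short, dated note added by a teacher or learner."""
    when: date
    kind: str  # assumed categories: "practice", "goal", "life_change"
    note: str


@dataclass
class LearnerProfile:
    """Background variables plus a running record that travels with the student."""
    learner_id: str
    age: int
    employment_status: str
    years_of_schooling: int
    country_of_origin: str
    languages: list
    entries: list = field(default_factory=list)

    def add_entry(self, when, kind, note):
        self.entries.append(ProfileEntry(when, kind, note))

    def entries_of_kind(self, kind):
        return [e for e in self.entries if e.kind == kind]


# Hypothetical learner and notes, for illustration only.
profile = LearnerProfile("A01", 34, "part-time", 4, "El Salvador",
                         ["Spanish", "English"])
profile.add_entry(date(1998, 2, 3), "practice",
                  "Checked the weather in the newspaper.")
profile.add_entry(date(1998, 3, 10), "life_change", "Started a new job.")
profile.add_entry(date(1998, 4, 7), "practice",
                  "Spoke to the doctor without a translator.")
```

Keeping each note to a line or two, as suggested above, is what makes this kind of record manageable during class time.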

From Learner Success to Accountability

This must be said: While an approach that combines rich profiles and individual portfolios will produce important information on individual students and provide insights into the relative success of certain learner groups, it does not, in and of itself, yield the kind of data needed for accountability. After all, we cannot ship boxes of profile folders to funders to have them realize what a great job we are doing.

To make profiles work for funders, a further step is needed, one that yields data in aggregate form so that policymakers can get a picture of the shape and size of the forest, not just a close-up of the trees. To measure progress and report to funders who is getting better at what, profiles need to include the following: a broad set of language and literacy tasks that are accompanied by rubrics, scales, and benchmarks for transition.

Rubrics are used to indicate what expectations are for any given area (face-to-face communication, dealing with print, accessing resources, etc.) and what evidence of success might look like. The scales that accompany the rubrics allow us to document where learners fall on a continuum of proficiency, documenting what they can do with relative ease, where they succeed with some help, and where they are struggling.

Since rubrics and scales can be designed for different skill domains (SCANS skills, communication strategies, navigating systems, civic involvement, learning how to learn, empowerment, etc.) and for various contexts (school, family, community), they can easily be matched to the goals of learners and adapted to the focus of a particular program. They also allow for the kind of student control in task selection discussed above.

Once rubrics and scales are in place, meeting accountability requirements that call for aggregate data becomes relatively easy. Since the descriptors on a scale can easily be numbered (from 1 for "struggles" to 6 for "no problem", say), assessment results can be easily compiled, summarized, analyzed and reported out. If matched with demographic profiles, they allow a program to see which groups of learners are being served well by the program and where program changes are in order because success is lacking.
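
To make the compiling and summarizing step concrete, here is a small Python sketch of how numbered scale ratings could be rolled up by learner group. Everything in it is hypothetical: the records, the group labels, and the choice of summary figures (average gain and the share of learners who moved up at least one point) are assumptions for illustration, not figures or methods from any actual program or reporting system.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (learner_id, group, skill_domain, intake, exit).
# Ratings use the 1-6 scale mentioned above (1 = "struggles", 6 = "no problem").
RECORDS = [
    ("A01", "ESL literacy", "dealing with print", 1, 2),
    ("A02", "ESL literacy", "dealing with print", 2, 4),
    ("B01", "intermediate ESL", "dealing with print", 3, 5),
    ("B02", "intermediate ESL", "dealing with print", 4, 4),
]


def summarize_gains(records):
    """Group rating gains by (learner group, skill domain) and summarize them."""
    gains = defaultdict(list)
    for _learner, group, domain, intake, exit_rating in records:
        gains[(group, domain)].append(exit_rating - intake)

    summary = {}
    for key, values in gains.items():
        summary[key] = {
            "n": len(values),
            "mean_gain": mean(values),
            # share of learners who moved up at least one point on the scale
            "share_advancing": sum(1 for g in values if g > 0) / len(values),
        }
    return summary


summary = summarize_gains(RECORDS)
for (group, domain), stats in summary.items():
    print(group, domain, stats)
```

A summary like this is only as trustworthy as the ratings behind it, which is why the consensus on rubrics and benchmarks has to come first.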

The beauty is that this kind of approach fulfills the same function as a standardized test: learners are assessed on a variety of skills under standard conditions with common instruments on similar tasks (yet given choices in task selection and afforded multiple opportunities to shine on tasks that matter in a given context important to learners). But unlike the standardized tests currently available, profile assessments do not rely on multiple choice, paper and pencil items.

Rather, they give learners the opportunity to demonstrate what they can do with language and literacy through more open-ended assignments. Furthermore, profile approaches to assessment can be adapted for certain learner groups and modified to match the focus of a particular program (e.g., workplace, family literacy, citizenship). Most importantly, perhaps, they provide rich information that makes sense to teachers and learners, information that is useful to programs, not just funders.

Why, then, are we not seeing more of these kinds of assessments? While extremely worthwhile and high in validity, these types of assessment carry a significant burden: they require consensus building on what is worth teaching and learning and a common understanding of what evidence of success might look like for any given skill domain. To be successful, profiles and portfolios have to be integrated into the curriculum, and ongoing assessment must either be part of the day-to-day teaching we do, or time must be set aside at intake to establish a baseline and toward the end of a teaching cycle to document progress. If that means the end of open-entry/open-exit as we know it and forces us into shorter instructional cycles that have a clear teaching/learning focus, so be it. To give such a framework a chance, a significant amount of teacher orientation, training and buy-in will be needed.

Clearly, there are not many adult literacy programs that have the commitment, energy and resources to embark on that endeavor, although some, like the Arlington Education and Employment Program in Virginia, are well on their way. But, given sufficient advocacy from local programs along with a modicum of political will on the part of state directors and other funders, teams, working groups and consortia could be set up to develop an assessment framework that, if not based on profiles, at least includes them. In fact, the National Institute for Literacy is moving in that direction, developing an assessment framework that combines the use of alternative assessments with standardized tests where appropriate in order to capture the gains that learners make who are part of the "Equipped for the Future" initiative.

What then is the bottom line, given the current climate of accountability for accountability's sake? We have several options: we can decide that cynicism is the only sane response to the current requirements, live with standardized tests as best as we can, try to lay low, figuring "this too shall pass," or commit ourselves to fighting for a saner system for our own sake and that of our students. On the local level, we must be prepared to work with others to decide on the focus of our programs and be willing to map out a core set of knowledge, skills and strategies that matter.

At the federal level, we must push for an accountability system that is driven not by what the current standardized tests are able to assess (which is rather limited), but by outcomes that reflect what sound adult literacy programs should be all about. Furthermore, if we are asked to show accountability related to outcomes and impacts, we must be given the resources to document success in meaningful ways.

Finally, while we may need to play the accountability game for the time being, we can also work toward a system that measures effectiveness where it counts: adult learners acquiring the kinds of knowledge, skills and strategies that are important to them now and that matter in the long run. If we give up too soon, we will only marginalize adult literacy further.

Originally published in Adventures in Assessment, Volume 11 (Winter 1998),
SABES/World Education, Boston, MA, Copyright 1998.

Funding support for the publication of this document on the Web provided in part by the Ohio State Literacy Resource Center as part of the LINCS Assessment Special Collection.

 
