PODCAST: Test Day

In this month’s podcast, we talk with Scott Marion, executive director of the Center for Assessment in Dover, about assessments, their value and limitations, and the challenges of administering and using them this school year.

TRANSCRIPT

I’m Sarah Earle and this is School Talk. Do you still get a knot in your stomach when you hear the word test? Then you may want to take a few deep breaths before this one. It’s assessment season in New Hampshire, and today I’m talking with Scott Marion, executive director of the Center for Assessment, a Dover-based non-profit that works with education leaders around the state and country to develop more meaningful assessment and accountability programs and policies. 

I guess people have a lot of questions about assessment right now, as we round out a full year of pandemic-disrupted learning, so I thought we could start with just a broad question, which is what is the role of assessment and maybe how should that role evolve in light of what schools and students have endured over the past year and are still dealing with?

So let’s think about assessment broadly right? So people say assessment and people think of the state test that is given at the end of the year. But assessment is much more broad than that. Teachers are assessing hopefully every day in very formative and innocuous ways, and then districts have assessments at the end of various units of instruction and things like that, and then there are these commercial — often commercial — interim assessments given a few times a year and then the state summative assessment. And so the most important assessments this year are those that are given closest to the kids, the formative assessments, the unit assessments that are given by teachers to evaluate what kids are learning relative to the curriculum they’re being taught. The larger scale assessments, these interim assessments and state assessments, do have a role in providing some sort of comparability across schools and districts to get a sense of, are certain districts struggling that haven’t struggled before? Are they struggling more than they struggled before in a relative sense? Because if you’re just doing in-district stuff you might not see that. So, they all have their role, and it’s important to keep, you know, stay in their lanes. 

But in terms of the pandemic, you know, everyone is talking about learning loss, which is one of my least favorite terms because it’s very pejorative and borders on a little racism, because they’re not really talking about the rich white kids and learning loss, even though I lose knowledge every day, and hopefully gain more back. We just can’t remember all these facts and things like that. 

Tell me about it, at my age…

Right, and it gets worse as you get older. And so learning loss is this notion that kids haven’t learned as much this year starting from the end of last year, as they would have if things had been normal. I think that’s fair to say, but it’s not like they haven’t learned anything. They’ve learned different things. They’ve learned likely some self-regulatory skills in having to do more work on their own. That’s a really important skill for the world of work. They’ve learned a lot of technology skills that they might not have learned otherwise. So they’re learning some different things. It’s important, though, to get a sense of where kids are at, because that’s what you use assessment for, to understand where people are at and understand what they need to do to move forward. 

Statewide assessments won’t tell you that at the student level because they’re not fine-grained enough. That’s why we need those classroom formative assessments to do that. 

Does your organization have your finger on the pulse of what’s going on with those in-school assessments? 

In certain schools and certain districts. I have a few colleagues who work on it a lot. I work on it a little less, but it’s one of the things we’ve been talking about, and we’re not the only ones talking about this, but in every crisis maybe there’s an opportunity. And so I’ve been saying — I wrote a blog about this last April thinking that I was only giving this advice for the short term and we’d be back in school — but if you’re giving what I call a Google-able assessment, so a kid can actually Google the answer to that assessment, it’s a bad assessment. Now that’s true in normal times I would say, but it’s much more apparent now. So if you’re giving me a test question, and I actually have my phone, but you don’t see it, now I can actually Google the answers or use my calculator or do something else with my phone that, now you don’t even know, do I know how to do it or do I just know how to Google it? Which is not unimportant. Compared to actually giving questions to kids to think more deeply, to solve novel problems, to actually require them to use the internet to look up resources and reason with that evidence. So this actually gives an opportunity to do things differently. I’m afraid that that’s happening, not quite as fast as I’d like. I think there’s still probably a lot of the old stuff. But I think teachers are probably figuring out that we just can’t keep doing the old stuff. 

Right. What are some of the other key concerns that have come to your attention in working with state and district leaders — you work with people across the country, right?

Yeah, we work in about 40 states. Everybody’s concerned with getting kids back into school. It’s happening. Several states that have been really slow to this have just announced that kids will be back in school by mid-April, in one case early April. And those are the states that have been slow to get kids back. As you know here in New Hampshire the Governor basically mandated that students come back into school, although not really on a full-time basis, it’s hybrid, they come in one or two days a week. I do think we could have been more bold on getting kids back into school. I’m of that camp, I’m a science guy. I’m also on the Rye School Board, and we’ve been in school all year long. 

You have? Full time?

Full time except Fridays are remote days, and we did that for two reasons. One, it gave teachers more time to plan while kids did other things, more asynchronous things, but also we wanted to be in shape in case we had to flip the switch and go remote, we didn’t want to have to learn this all over again. So kids were actually, and teachers were sort of in shape in case we had to go remote, but we haven’t, knock on wood. Now, in Rye we have the resources and, our biggest problem before this was declining enrollment so we had the space, but I think people could have been bolder around the country. So that’s the first concern that states are trying to figure out is how to get kids in school. And then in the testing world, they’re trying to wrestle with, do I do my same state test? The feds have said pretty unequivocally, unless you have a very good reason not to, we expect you to do the test. There’s a variety of reasons. They’re getting pushed by quote-unquote civil rights organizations saying that doing the tests is the only way to measure equity. And I say quote-unquote because there are just as many civil rights organizations who are opposed to testing. So, who gets to wear the mantle of civil rights? I’ll let them fight that out. People argue both sides of that. Some argue civil rights is, you’re now going to take invaluable instructional time away from my kid who hasn’t had many learning opportunities this year. And now you’re going to do drill-and-kill test prep. And that’s a threat to their civil rights because they’re not learning what they should be learning, they’re just preparing for the test. Or, worse, you bring your kids back into school to test when it might not be safe for them or their families. So the civil rights thing, I don’t think anybody gets to just wear that mantle just because they declare it. So states are wrestling with that, they’re trying to figure out, testing in a normal year is hard, there’s so many things that can go wrong. 
And it doesn’t take much for it to go wrong and for it to get in the newspaper. This year there’s so many more things that could go wrong. There could be a COVID outbreak, depending on these South African or Brazilian or UK variants. What would happen? We’d have to shut down in the middle of testing. Extending test windows. It just raises all sorts of challenges. So states are dealing with this. Being a state testing director is a very hard job. And it’s hard in normal years. It’s usually like three jobs in one. And now they’re doing all these contingency plans: If this happens we go this way, and if this happens we go this way. This job was already crazy enough, and now you’re adding all these layers of complexity.  

So yes testing equitably and safely is a huge concern. The other concern I hear is how the tests will be used this year, and perhaps there are ways they could adjust how they’re used. I wonder if you could talk about that. 

Just yesterday as a matter of fact, the U.S. Department of Education released this template for accountability waivers. And they had talked about, in the second letter, signaling this before. They had already signaled under the last administration that they would be open to waiving a lot of the accountability requirements under the Every Student Succeeds Act, which is the governing federal law for education in this country. And so with this template that came out yesterday it’s clear that states will not have to use test scores or even calculate accountability at all this year. There’ll be no ratings of schools, there’ll be no identification of schools for comprehensive and targeted support and improvement or anything like that. They’ll all just roll forward a year. So that’s a good thing. The challenge of course is — and I learned this; I was the assessment director in Wyoming over 20 years ago, and I once said to a principal, “Oh but we don’t use this for accountability.” And he said, “The scores get put in the paper. To me that’s accountability.” 

Right. 

So there’s no formal accountability this year. Folks in the schools and districts will still feel some sense of accountability. And the challenge with that is, if the kids in a certain school perform lower than expected, or more poorly, then people will look at that and say, ‘oh that’s too bad, those kids performed poorly.’ But that usually carries some attribution, and that attribution is usually targeted at the teachers and the principals. When it could be the case that the kids didn’t have devices or internet access, and that’s not the school or teachers’ fault. 

So can you offer me kind of a broad overview of what this totally chaotic year has looked like from an assessment standpoint? 

Yeah, and so, this year is fascinating in a lot of ways, and that’s true for education in general. We get folks on the far left and far right agreeing with one another, and somehow the folks in the middle on both sides tend to agree with each other, but they don’t agree with the others. And so for instance what we’ve seen this year, there have been some pretty strong voices that we should not go ahead with state summative assessment as we know it. And I’ve been one of those voices along with many others. The reason is, it’s hard enough to interpret test scores in a normal year. And so, let’s say Sarah was a proficient student every year and this year scores in the lower category, let’s call that basic. So Sarah’s parents are going to say, ‘wow, Sarah didn’t learn that much, or Sarah didn’t do well on a test,’ or some other reason, right? Now, what they might not know is that there have been very well meaning and well reasoned arguments put forth that states and districts should teach fewer standards this year, they should reduce the scope of the curriculum, to focus on the highest leverage learning targets. And reputable groups like Student Achievement Partners have put forth these recommendations. Well let’s say a district did that. But they didn’t have time to change the design of the test, so now you’re getting assessed on 100% of the content, but you really only learned 70% of the content. Let’s say you learned it really well, but you’re being tested on the whole thing. So now you’re going to score worse, and my interpretation is that, ‘oh Sarah got dumber,’ or ‘Sarah’s teachers didn’t do a good job,’ instead of saying ‘she did a really good job on the stuff she was expected to learn.’ But I can’t see that in the test. It’s too black boxy to see that. That’s just one example. 
Now, in cases where districts didn’t follow these recommendations to reduce the content by design, they ended up reducing it by happenstance because there just wasn’t as much instructional time no matter how you sliced it. Everybody’s acknowledged that fact, that kids haven’t received as much instructional time as they would in a normal year. Nobody’s fault, it’s just hard to keep people engaged over Zoom and other things eight hours a day. We all have learned that painfully in our jobs. And so the problem there is teachers are making the decision of what content to jettison and what content not to jettison and so now I have a harder time interpreting that. That’s just part of the issues that get in the way of making accurate interpretations of test scores this year. Now, there are other folks who are saying, ‘we need these tests to understand the scope and scale of the pandemic effects on learning.’ And they’re making the claim that we need this data to plan interventions. And my argument — again, this is where being on a school board comes in handy — is, if you’re not planning these interventions today, you’re already late. So for instance, an opportune time is this summer, to open up schools to allow kids to come in and maybe get two weeks of math camp or four weeks of math or whatever it might be. We see math as the most problematic area because that’s something you learn in school. Unless you’re like my kid, you’re not really learning a lot at home. You get tortured with it at home with me. But most kids, they read at home, but they don’t do math at home. And so that’s one of the things we see as suffering. But you can’t wait until test scores come back in July to start planning for summer. That’s my counterargument to the claim that we need these data. 

Can you talk at all specifically about how New Hampshire’s plans for standardized testing are shaping up this year and how they might look the same or different? 

As far as I know there are plans to administer the state Summative Assessment at the end of the year to all school districts. They’ll still be offered in person, not remote. I don’t think they’ll have a problem being able to administer these tests. There are certain districts — I live down on the Seacoast, and there are certain districts, particularly at the upper grades, middle school and high school, like SAU 16 Exeter, Portsmouth, even Dover, are not really in person. I think they’re moving back to coming back in person. I just read that all the Portsmouth educators who want to get shots are going to get them this weekend, so that should enable them to be able to move back to more full-time in person. And once they do that, if they extend the test windows. Most states are extending the test window to accommodate testing to the end of the school year. 

Okay, but they will all be in person? 

New Hampshire tried remote administration with some of the interim assessments, the tests they mandated earlier this year. That didn’t go that well, and that’s true across the country. 

So what does happen to those students who, for whatever reason, aren’t comfortable being in person? 

The U.S. Department of Education has also signaled that they’re willing to waive the 95% requirement. In law, there’s a requirement that 95% of kids are expected to test, and that’s true not just statewide, but for every school district. There have to be some sanctions. And so what happens typically is that those kids count against you in your achievement indicator. So let’s say I have 100 kids in the school but 90 tests, and I want to figure out how many are proficient. Well, if 45 of them score proficient, I wouldn’t say 45 over 90. I would say 45 over 95. It affects your proficiency rating and things like that. So there’s a strong incentive to get kids into tests. 
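
[Editor’s note: The arithmetic in this answer can be sketched in a few lines of Python. This is only an illustration of the example as described, assuming the denominator is the larger of the number of students tested and 95% of enrollment. The function name and parameters are hypothetical and do not come from any official calculation tool.]

```python
def proficiency_rate(enrolled: int, tested: int, proficient: int,
                     participation_pct: int = 95) -> float:
    """Proficiency rate when students missing below the participation
    requirement still count in the denominator (illustrative only)."""
    # Minimum number of students the school is expected to test.
    required = enrolled * participation_pct / 100
    # Assumption: use whichever is larger, the students actually tested
    # or the required count, so low participation cannot shrink the base.
    denominator = max(tested, required)
    return proficient / denominator


# The example from the interview: 100 enrolled, 90 tested, 45 proficient.
naive_rate = 45 / 90                            # 45 over 90
adjusted_rate = proficiency_rate(100, 90, 45)   # 45 over 95
print(f"naive: {naive_rate:.1%}, adjusted: {adjusted_rate:.1%}")
```

Under these assumptions the school reports roughly 47% proficient instead of 50%, which is the incentive to get kids into tests that the answer describes. Once 95 or more students test, the adjustment no longer changes the result.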

Okay. You contributed to a report published last month by the National Academy of Education, and one of the recommendations from that report that I found really interesting was the idea of assessment literacy. What do you think is missing from conversations around assessment? What do parents, community members, policy makers, the media, need to understand more fully? 

This is one of our longest standing issues in my field. There is this notion that most people, other than people trained in this world, are not very literate. They don’t understand how to select, design, and interpret assessments. And that’s true for most teachers, that’s true for most principals. Even though you say, well teachers make tests all the time. They’re usually horrible, for the most part. I know that that won’t go over well with this audience, but it’s true, because it’s hard to do. If you asked me to knit a sweater, you wouldn’t want to wear it because I don’t know how to do it. I wasn’t trained. When you go to school to become a teacher, you almost never take a course in assessment, and the only assessment training is maybe part of what you might call your methods class. I was a science teacher, so I had a whole course in methods of high school science education. And I don’t even think I got any assessment work in there. I think I got two weeks of assessment as part of my educational psychology course. So how would you expect them to learn it then? Except for the ones who have this interest to go back and pursue professional development. Now, as part of the New Hampshire PACE project, the Performance Assessment project we have here, there were teachers who got a ton of assessment training, building their knowledge and skills around performance assessments and other types of assessments, and as part of that, some of the leaders — we had about 50 or 60 teachers — I would stack them up against anyone. But they had an opportunity to build their skills over multiple years. 

And then the public is even tougher, right? They get a score that’s a number, and they say, “it’s a number. It must be right.” And they have no idea about the amount of “error.” So we say “error,” and in my field, my job is really quantifying error, quantifying uncertainty if you will, and so when I say that you scored an 85%. If that was a teacher-made test, just like when you see political polls with a margin of error, people sort of understand that, although they never really pay attention to it. And so if I would give you the margin of error on that 85% in a classroom test, it would probably be anywhere from 65% to 105%. I would guess your score is somewhere in there. That’s a pretty wide range. And so the way we get away with it in classroom assessment is we just have a lot of assessments. So that way, there’s not as much resting on any single assessment. And teachers get to know their kids in many ways, and I absolutely believe they adjust their grades based on this deeper knowledge they have of kids from working with them every day. People say, “Oh, we’re going to make our grading scale more rigorous. We’re going to raise an A from a 90 to a 93%.” Well guess what: The same percentage of kids get As and Bs and Cs no matter where you put that scale. So that’s all part of assessment literacy. But the part we care about is, can you design or select an appropriate assessment for what you want your kids to be able to know and do, particularly at the depth we want you to be able to do it? That’s really hard. The deeper part is hard. It’s easy to think of factual questions like Trivial Pursuit or Jeopardy. It’s harder to think of questions that elicit this deeper thinking of evidence. 

Then, the other thing that’s as important. It’s like the old Seinfeld episode. It’s one thing to take the reservation, it’s another thing to hold the reservation. And so, in this case, the question is, can I do something with the results? So I see that Nicole might have scored here, or I might see this in her student work. But do I understand enough how to interpret that accurately and make the right instructional adjustment? Otherwise I’m just collecting data. And the point of assessment should be to improve my practices as a teacher and students’ ability to learn what we want them to learn. It’s very hard. It takes time. We have a lot of strategies here at the center. A guy named Rick Stiggins is really the father of assessment literacy in the measurement field. He’s been working on this for 50 years. It’s just very hard. It’s not like you can just add a year to pre-service education so they learn this stuff. Well, how much more time are we going to add to pre-service education? It’s not like these students are coming out getting paid like entry-level engineers or lawyers. They’re getting paid $30,000 or $40,000. We can’t make them go to school and take on debt for seven years. So this has to be part of the work that goes on in schools. The schools, principals, district leaders need to be responsible for ensuring that their teachers both learn the most important pedagogical techniques, but also the assessment that goes with that. And that’s the hard thing. And the leaders often lack these skills. That should be a key focus. 

So, broadly, how do we fix that? But more specifically, are there specific policy pieces that could contribute to or provide a foundation for better quality assessments?

That’s a great question and it’s the thing that keeps me up at night. There’s a lot of things that keep me up at night, but this is one of them. So right now, certainly for the last 20 years, we’ve structured our state assessment systems driven by federal requirements, first it was No Child Left Behind, now it’s the Every Student Succeeds Act, that it’s the end of year assessment that’s created by an outside entity and administered at the state level that has all the weight and all the visibility. And if you think about this logic. A colleague of mine said this: Right now, the school is responsible for administering the test and collecting other data. The school sends all these data to the state, and the state says how the school’s doing. That’s backwards, right? As opposed to the school saying to the state, this is how we’re doing and here’s the data to support that. So we’ve created these incentives, both in accountability policy and assessment policy, that says the state — and I’m not blaming anyone at the state; I was a state official, and many are trying to figure out how to work around this and build more balanced systems — but the state is left as the one to determine whether you are a good school, and back when we were doing more teacher evaluations, whether you were a good teacher, as opposed to this being done locally. So a policy fix, and this would be a big fix, is to think about distributing accountability responsibility and assessment responsibility more broadly, instead of just the State House, but to, the districts are responsible. I’ve been at this a long time and people tease me, showing me pictures of Don Quixote tilting at windmills because I keep doing this — but imagine if we had a system where the district said, ‘we want to work on X. We want to work on much better project-based learning in science and civics because we think that’s a shortcoming and we think it’s going to be important to our kids and our community and our future. 
And here’s how we’re going to do it. Here’s how we’re going to model whether it’s successful, and here’s how we’re going to collect the data, and here’s the criteria that we’re going to use up front to judge whether or not it’s been successful and whether or not we need to adjust.’ If I do that, the state would maybe need to approve the plan so it’s not just laissez faire. But if you think about it, when have you ever gotten good at anything because somebody told you, ‘hey you need to get good at this?’ I say, ‘I want to be a better writer.’ So I talk with these editors, I get feedback on this, and then I have to engage in these metacognitive skills to be a better writer. It’s not just because somebody said, ‘be a better writer.’ Or, ‘you want to be a better runner, okay run faster.’ Oh good, thank you. And so that’s the equivalent of what we have now. So the ways we could actually foster more assessment literacy is if you had people take more responsibility for that and actually get feedback. The guy who founded the Center for Assessment, his name is Rich Hill, my boss here when I first joined it, was involved in the Kentucky Reform Act work back in the 90s. Back in those days they started these writing and math portfolios that local teachers and districts had to assemble portfolios of students’ work and they had to rate the students’ quality. But what they also did was that the state then audited a sample of these. And you can well imagine because we do this with NH’s PACE as well. The audited work always scores lower than the scores the teachers gave it, but it was a very fast couple of cycles before the teachers were able to internalize the criteria and apply the more appropriate criteria to students’ writing and math work. And so we saw these things come into alignment much tighter because teachers actually learned and internalized these criteria. That was really fast. So we can create these kinds of policies to incentivize this type of thing. 
We talk about more balanced assessment, the stuff that happens locally, in classrooms, in districts, in schools, should play a much bigger role. Right now it’s outsourced either to the state or to these commercial groups. We’re farming out the expertise instead of growing it. It takes more work but it’s like everything else in life, it’s better outcomes. 

Thanks again to Scott Marion for joining me today. “School Talk” is produced by our intern, Henry Lavoie. To stay up to date on education news, follow us on Facebook, sign up for our newsletter at reachinghighernh.org, and follow School Talk wherever you get your podcasts.