Last night I posted a short survey on Twitter. I asked participants to analyze the following question.

Then I asked them to answer the question. I purposefully set up the survey so participants could select more than one answer, though I gave no encouragement to do this. All the question said was, “Your answer.” Finally, I left a section for comments.

When I tweeted out the survey, I didn’t provide any details about the problem or my intent. I was trying really hard to see if other people saw the same thing I did without leading them to it.

Turns out many people did. I feel validated.

So here’s the story. This is question #28 from the 2016 Grade 3 Texas state math test, called STAAR, that students took last spring. Back in February, one of our district interventionists emailed me to say that he thought choices G and J are both correct answers. I opened up the test, analyzed the question, and realized he was right. I immediately drafted an email to the Texas Education Agency to ask about it.

Good morning,

I have a question about item 28 on the grade 3 STAAR from spring 2016.

The correct answer is listed as J. This makes sense because the number line directly models a starting amount of 25 people and then taking some away to end at 13, the number of people still in the library.

However, the question isn’t asking for the model that most closely represents the story. Rather, it asks which model can be used to determine the number of people who left the library.

In that case, answer choice G is also correct. Our students understand that addition and subtraction are inverse operations. Rather than thinking about this as 25 – __ = 13, answer choice G represents it as 13 + __ = 25, which is a completely valid way of determining the number of people still in the library.

I look forward to hearing TEA’s thoughts about this question. You can reach me at this email address or by phone.

Have a great day!

About a month later I still hadn’t received a response, so I emailed again and got a call the next day. It turns out I wasn’t the only person who had submitted feedback about this question. Unfortunately, according to the person on the phone, after internal review TEA had decided not to take any action. However, they did acknowledge that the wording of this question could be better, so they will do their best to ensure this doesn’t happen again.

I told her I wasn’t happy with that answer and that I would like to protest that decision. She didn’t think that’s possible, but she offered to pass my email along to her supervisor or ask the supervisor to call me. I asked for her supervisor to call me.

Surprisingly, my phone rang about two minutes later.

The supervisor asked me to go over my concern with her so I explained pretty much what I said in my email. She said she understood, but if we looked at G that way then all of the answer choices could potentially be right answers. This was confusing to me because I don’t think F would help you determine the answer at all. If anything it shows 25 + 13, which will not give you an answer of 12.

I stressed that my concern is that answer choice G is **mathematically correct** *with regard to answering the question asked*. I get that J is a closer match to representing the situation, and if the question had asked, “Which number line best represents the situation?” then I probably wouldn’t be emailing and calling.

But it doesn’t.

The question asks, “Which number line represents **one way** to determine the number of people who left the library?” If you know how to use addition to solve a subtraction problem, then answer choice G is totally **a way** to find the number of people who left the library.

She said that is a strategy, not a way of representing the problem.

“That’s exactly what a **way** is. How you would do something, your strategy,” I replied.

She decided to redirect the conversation: “Let’s look at the data on this question. 68% of students chose J. 9% chose F, 12% chose G, and 10% chose H. The data shows students weren’t drawn to choice G. It’s not a distractor that drew them from choosing J.”

“I don’t care about that. The number of students who selected G doesn’t change the fact that it’s mathematically correct. If anything we should give those students the benefit of the doubt because we don’t know why they picked it.”

“Exactly,” she replied. “We don’t know why they picked it, so we can’t assume they were adding.”

“That’s not okay. Since we don’t know why they picked it, we’re potentially punishing students who chose to use a perfectly appropriate strategy of addition to solve this problem. There are a lot of 3rd graders in Texas, and 12% of them is a large number of kids. Who knows if this is the one question they missed that could have raised their score to passing?”

From this point she steered the conversation back to the question and how J is still the best choice because this is a subtraction problem.

“But you aren’t required to subtract to solve it! We work really hard in our district to ensure our students understand addition and subtraction deeply enough to know that they can add to find the answer to a subtraction problem. We want them to be flexible in how they choose to solve problems. And again, the question isn’t asking students which number line best matches the situation. It just asks for **one way** to find the number of people who left, and both G and J do that.”

She went back to her original argument that if I’m correct then all of the answer choices could be used to find the answer to the question. She talked about how choice F shows both parts of the problem, 25 and 13, so you could technically find the answer. I disagreed because you end up with a total distance of 38. There’s nothing that makes me see or think of the number 12.

We went round and round a few more times. She wasn’t budging, and I was having a hard time listening to her justifications. She assured me they were going to be much more diligent about how number lines are used in future questions, but this question was going to remain as-is because she believes J is the best answer.

The whole exchange left me livid. In some small way, TEA is acknowledging that this question is flawed, but they aren’t willing to do the right thing by either throwing it out or making it so either G or J could be counted as correct.

They’re just going to do better next time.

But we’re talking about a **high stakes test**! Our students, teachers, principals, and schools don’t get to just “do better next time.” They are held accountable for their scores now. They can be punished for their scores. People can be moved out of their jobs because of students’ scores. So much is at stake that if a question is this flawed, TEA should show compassion to our students, not stubbornness. They should admit that both answers are mathematically correct and update each student’s score.

Because we’re not talking about a small handful of kids.

12% may not sound like much, but when 327,905 students took this test, that means nearly **40,000(!)** of them chose answer choice G. That’s 40,000 students who are being punished because of a poorly worded item that has two answers.

That’s not correct.
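
For anyone who wants to check the “nearly 40,000” figure, here is a minimal sketch of the arithmetic. It uses only the two numbers quoted in this post (327,905 test takers and 12% choosing G). The function name and the rounding range are my own assumptions: if the reported 12% was rounded to the nearest whole percent, the true share could be anywhere from 11.5% to 12.5%.

```python
TOTAL_TEST_TAKERS = 327_905  # number of students quoted in the post
SHARE_CHOOSING_G = 0.12      # share quoted in the post, presumably rounded


def estimate_count(total: int, share: float) -> int:
    """Estimate how many students a given share of the total represents."""
    return round(total * share)


estimate = estimate_count(TOTAL_TEST_TAKERS, SHARE_CHOOSING_G)

# Assumption: a reported 12% could be any true share from 11.5% to 12.5%.
low = estimate_count(TOTAL_TEST_TAKERS, 0.115)
high = estimate_count(TOTAL_TEST_TAKERS, 0.125)

print(f"Estimated students choosing G: {estimate:,}")
print(f"Plausible range if 12% was rounded: {low:,} to {high:,}")
```

Either way, the count lands in the high 30,000s to about 41,000, so “nearly 40,000” is a fair summary.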

Ruth Wilson: They are flat out wrong. G and J are both valid. Sad part is when we test we aren’t supposed to be viewing the questions. Sort of hard to find questions that are bogus if you can’t even see and evaluate them…even after the fact!

bstockus (post author): Yeah, I’ve emailed about several questions since they released this batch of tests in August. Unfortunately the test is long over by that point.

Mark Chubb: Hard to get into a debate with a mathematician and a statistician. The mathematician points out what is mathematically possible. And the statistician worries about their data.

Glad you caught this one:)

In the end, I hope they learned something about the mathematics here!

bstockus (post author): Thanks, Mark! I’ve said my piece, and it feels good to get it off my chest. The person I spoke with suggested I volunteer for the item review committees they have. For some reason I thought only teachers could be on them, but I’m definitely going to submit my name now that I know I can join in!

primemathblog: Definitely try to get on those committees! I’m on one in my state and it’s helpful to actually get to address item issues before they get to the test with department members that are actually receptive to feedback. We still occasionally run into “the state says we have to do it this way”, but it’s much less frequent!

Denise: Thank you for being a voice for our children and our teachers. Where were the reviewers before the assessment was released? Makes me wonder how many other questions with misconceptions there are.

bstockus (post author): Yes, the state has committees review the items ahead of time, and they also are field tested. When I submitted feedback about a question in the fall, the response I got was, “The teachers who reviewed this item were okay with it.” I found out I can submit my name to volunteer for these committees. Somehow I thought only classroom teachers were able to, but now that I know differently, I’m going to be proactive and try to help catch these mistakes before they make it into a live test.

lovemathweb: You took a stand! I can clearly see how, with the new thinking where we teach students to interact with information in different ways, these two answers are both right. Instead of subtraction we often use counting on. TEA should not take this so lightly. Like you said, roughly 40,000 students could have passed the test.

bstockus (post author): Thank you for your support! It definitely bothers me how many students this affects. I hate to think of even one student thinking they solved this correctly but getting marked wrong. Granted, they’ll never get to see their test again so I’ll take some small comfort that they have no idea.

Mandy: From Teaching Student-Centered Mathematics, K-3 by Van de Walle and Lovin, p. 94: “‘Think Addition’ is the most powerful way to think of subtraction facts.” Van de Walle then goes on to explain and support why this is such a powerful strategy in several other chapters and how it’s especially useful for students who struggle. So, in my book study where I’m coaching teachers to use research-based, sound instruction for mathematics, am I supposed to include the caveat that, “Oh, by the way, this may or may not apply to standardized tests, but trust me, it is a powerful way to teach students subtraction, especially those with learning disabilities”? So frustrating…

bstockus (post author): I hear you! The last thing I want is for any teacher to be turned off of good instruction for the sake of the way things are tested. I want assessment and instruction to be aligned, but that means I have to have faith and trust that the assessments I have no control over are in alignment with what we know about how kids should learn and do mathematics. Instances like this test that.

Jennifer: Brian, I am so impressed with your commitment, passion and dedication to the students in your district. They are so lucky to have a cheerleader like you supporting them. In Ontario, we also encourage a variety of problem solving strategies. We share and celebrate different ways of thinking. I can’t imagine penalising students for using a less popular strategy. Hopefully because of your undertaking the tests will be more fair in the future!

bstockus (post author): Thank you for your support! That’s one thing TEA did repeat several times: They are going to be very cautious about questions like this in the future. I sure hope so!

Thank you again! I appreciate the kind words. 🙂

fschwope: This is the exact same conversation I had with TEA about this question! Thank you for confirming my thinking.

bstockus (post author): Glad to know I’m not alone, but frustrated to know that they’re hearing the same thing from multiple people and not acting on it.

Shelly: I’ve been on item review committees in my state, but ran into almost this exact scenario. Math teachers: “This is a terrible question for these reasons.” State reps: “This question cost a lot of money to develop, don’t you think it’s good enough?” And on and on, similar to your experience with the lady on the phone. Before I was on the committee, I had serious doubts about the validity of the math test given in my state. Being on the committee didn’t change my views.

bstockus (post author): As discouraging as that is, I appreciate you sharing your experiences. I’m still probably going to volunteer to be on a committee so I can at least feel like I’m being proactive rather than reactive, but I’ll definitely go in realizing I still might run into some issues. Thank you again!

julierwright: This is so maddening, but at least you were allowed to comment. On our state tests, both when they were OAKS (Oregon Assessment of Knowledge & Skills) and now with SBAC, we’re basically barred from writing the kind of stuff you wrote above, even when emailing state assessment people.

I still hold a grudge over how the old sixth grade Oregon math tests had question after question about “direct variation” when the STATE STANDARDS and the curriculum I’d been using only used the words “proportional relationship”. Yeah, the second year my kids took it, I knew to warn them they were the same thing, but the first year they hemorrhaged points over it. My kids’ scores were way higher the second year. Was it because I was that much better a teacher or just because I knew about some of the test’s quirks? STUPID.

Joe Schwartz: Brian, I admire your tenacity and the way you’ve advocated for our students. I feel your fury and exasperation. Your post highlights one of the reasons why my wife and I have refused to allow our children to participate in any high stakes standardized testing.

Rhonda Burger: Very true! At the very least TEA should apologize! They hold teachers and districts accountable for every little detail, and if we make an honest mistake, it could still cost teachers our licenses.

I used this question in a benchmark retest in January, and I mistakenly thought J was the only correct answer (I used the state’s answer key), even though I too teach that subtraction and addition are inverse operations. The state also expects students to recognize inverse operations. After reading Mathematically Correct’s response, I pulled those tests and found results similar to the state. Out of 23 students, 13 chose J, but 4 chose G and 6 were incorrect. And furthermore, TEA is wrong in their explanation that the other two choices could also be correct. Hogwash!

We all make mistakes and so do they. Fix it.

Rhonda

kajimural: WAY to go, Brian. Keep at them!

