Last night I posted a short survey on Twitter. I asked participants to analyze the following question.

Then I asked them to answer the question. I purposefully set up the survey so participants could select more than one answer, though I gave no encouragement to do this. All the question said was, “Your answer.” Finally, I left a section for comments.

When I tweeted out the survey, I didn’t provide any details about the problem or my intent. I was trying really hard to see if other people saw the same thing I did without leading them to it.

Turns out many people did. I feel validated.

So here’s the story. This is question #28 from the 2016 Grade 3 Texas state math test, called STAAR, that students took last spring. Back in February, one of our district interventionists emailed me to say that he thought choices G and J are both correct answers. I opened up the test, analyzed the question, and realized he was right. I immediately drafted an email to the Texas Education Agency to ask about it.

Good morning,

I have a question about item 28 on the grade 3 STAAR from spring 2016.

The correct answer is listed as J. This makes sense because the number line directly models a starting amount of 25 people and then taking some away to end at 13, the number of people still in the library.

However, the question isn’t asking for the model that most closely represents the story. Rather, it asks which model can be used to determine the number of people who left the library.

In that case, answer choice G is also correct. Our students understand that addition and subtraction are inverse operations. Rather than thinking about this as 25 – __ = 13, answer choice G represents it as 13 + __ = 25, which is a completely valid way of determining the number of people still in the library.

I look forward to hearing TEA’s thoughts about this question. You can reach me at this email address or by phone.

Have a great day!

About a month later I still hadn’t received a response, so I emailed again and got a call the next day. It turns out I wasn’t the only person who had submitted feedback about this question. Unfortunately, according to the person on the phone, after internal review TEA has decided not to take any action. However, they do acknowledge that the wording of this question could be better, so they will do their best to ensure this doesn’t happen again.

I told her I wasn’t happy with that answer and that I would like to protest that decision. She didn’t think that was possible, but she offered to pass my email along to her supervisor or ask the supervisor to call me. I asked for her supervisor to call me.

Surprisingly, my phone rang about two minutes later.

The supervisor asked me to go over my concern with her, so I explained pretty much what I said in my email. She said she understood, but if we looked at G that way then all of the answer choices could potentially be right answers. This was confusing to me because I don’t think F would help you determine the answer at all. If anything, it shows 25 + 13, which will not give you an answer of 12.

I stressed that my concern is that answer choice G is **mathematically correct** *with regard to answering the question asked*. I get that J is a closer match to representing the situation, and if the question had asked, “Which number line best represents the situation?” then I probably wouldn’t be emailing and calling.

But it doesn’t.

The question asks, “Which number line represents **one way** to determine the number of people who left the library?” If you know how to use addition to solve a subtraction problem, then answer choice G is totally **a way** to find the number of people who left the library.

She said that is a strategy, not a way of representing the problem.

“That’s exactly what a **way** is. How you would do something, your strategy,” I replied.

She decided to redirect the conversation: “Let’s look at the data on this question. 68% of students chose J. 9% chose F, 12% chose G, and 10% chose H. The data shows students weren’t drawn to choice G. It’s not a distractor that drew them from choosing J.”

“I don’t care about that. The number of students who selected G doesn’t change the fact that it’s mathematically correct. If anything we should give those students the benefit of the doubt because we don’t know why they picked it.”

“Exactly,” she replied. “We don’t know why they picked it, so we can’t assume they were adding.”

“That’s not okay. Since we don’t know why they picked it, we’re potentially punishing students who chose to use a perfectly appropriate strategy of addition to solve this problem. There are a lot of 3rd graders in Texas, and 12% of them is a large number of kids. Who knows if this is the one question they missed that could have raised their score to passing?”

From this point she steered the conversation back to the question and how J is still the best choice because this is a subtraction problem.

“But you aren’t required to subtract to solve it! We work really hard in our district to ensure our students understand addition and subtraction deeply enough to know that they can add to find the answer to a subtraction problem. We want them to be flexible in how they choose to solve problems. And again, the question isn’t asking students which number line best matches the situation. It just asks for **one way** to find the number of people who left, and both G and J do that.”

She went back to her original argument that if I’m correct then all of the answer choices could be used to find the answer to the question. She talked about how choice F shows both parts of the problem, 25 and 13, so you could technically find the answer. I disagreed because you end up with a total distance of 38. There’s nothing that makes me see or think of the number 12.

We went round and round a few more times. She wasn’t budging, and I was having a hard time listening to her justifications. She assured me they were going to be much more diligent about how number lines are used in future questions, but this question was going to remain as-is because she believes J is the best answer.

The whole exchange left me livid. In some small way, TEA is acknowledging that this question is flawed, but they aren’t willing to do the right thing by either throwing it out or making it so either G or J could be counted as correct.

They’re just going to do better next time.

But we’re talking about a **high-stakes test**! Our students, teachers, principals, and schools don’t get to just “do better next time.” They are held accountable for their scores now. They can be punished for their scores. People can be moved out of their jobs because of students’ scores. So much is at stake that if a question is this flawed, TEA should show compassion to our students, not stubbornness. They should admit that both answers are mathematically correct and update each student’s score.

Because we’re not talking about a small handful of kids.

12% may not sound like much, but when 327,905 students took this test, that means nearly **40,000(!)** of them chose answer choice G. That’s 40,000 students who are being punished because of a poorly worded item that has two answers.

That’s not correct.