Assessment 2.0 on the Horizon

By William H. Zaggle

There are some great models for a new kind of assessment, at least in science. While attending the Turning Technologies Higher Education User Conference on the Harvard campus in October, I was treated to a lecture by Harvard physicist Dr. Eric Mazur on the concepts of interactive teaching and “dynamic assessment,” a term that originated, I believe, in the work of the Russian psychologist Lev Vygotsky, specifically Mind in Society: The Development of Higher Psychological Processes (1978).

I was inspired by Dr. Mazur’s creative ways of testing students’ mental models rather than rote knowledge, and by his use of “conceptual” questions and discussions that require students to exercise the “skeptical, critical, and scientific thinking skills” our species developed as early hominids. By presenting mental “problems” in small doses, testing students dynamically, and gradually increasing the complexity of the problem to keep students within their “zone of proximal development,” as defined by Vygotsky, he leads students to develop their own mental models of the problem and to form their own questions rather than just learn someone else’s answers.

He described how he once taught an entire physics class and then, at the end, curious as to why students still seemed confused even after answering the questions correctly, developed and delivered a simple mental-model survey of basic physics. He was shocked that very few could pass. He even found professors on campus who could not pass, even though no complex math or formulas were required. He now starts the course with the mental-model survey, and the students understand early on why they need certain formulas; they work together to seek them out or derive them individually. Now nearly all of his students can pass the survey by the end of the class. One example he shared from the survey was on the expansion of metal when heated. The expansion is based on one simple rule: when heated, all molecules move farther away from all the other molecules. So what happens to the inside circle of a “donut” of metal when it is heated? Does it get bigger, smaller, or stay the same?
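That one rule can be checked in a few lines: uniform expansion scales every length in the piece, including the hole, by the same factor. A minimal Python sketch, with an illustrative coefficient, temperature rise, and radii (none of these numbers come from the lecture):

```python
# Uniform thermal expansion scales every length in an object -- outer
# radius, thickness, AND the hole -- by the same factor (1 + alpha*dT).
# All numbers below are illustrative, not from Mazur's survey.

alpha = 23e-6    # approx. linear expansion coefficient of aluminum, per K
dT = 100.0       # hypothetical temperature rise, in kelvin

r_inner = 0.05   # hypothetical hole radius, meters
r_outer = 0.10   # hypothetical outer radius, meters

scale = 1 + alpha * dT
r_inner_hot = r_inner * scale
r_outer_hot = r_outer * scale

# The heated donut is a photographic enlargement of the cold one:
# the inside circle gets bigger, not smaller.
print(r_inner_hot > r_inner)  # -> True
```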

I know many who would love to implement this type of formative assessment but are apparently locked into the “hijacked” version of the vision of assessment, where failing a state-level, end-of-year exam often means students aren’t promoted, schools lose funding, or worse.

Assessment 2.0 methods seem to be making progress, but there are roadblocks. For example, how do we develop assessments that are not only online but also authentic, personalized, negotiated, engaging, and problem-oriented? How do we get such new forms of assessment accepted and merged into our existing mental models of measurement? Bobby Elliott of the UK, in a 2008 paper, does a fairly good job of describing at least some of the many issues.

I was also present at the State Educational Technology Directors Association conference (SETDA Leadership Summit and Education Forum 2010, “Innovation through State Leadership,” November 7-10, 2010, National Harbor, Maryland), where Arne Duncan announced the new National Education Technology Plan. Apparently, two different state groups are now working on Assessment 2.0-type tests that can be shared by all states. Arne Duncan’s comments, captured from an earlier announcement of the next-generation assessments and moving beyond the “bubbles,” are online.

So it seems there is at least some level of grappling going on at the federal level toward a new generation of testing, Assessment 2.0, with an emphasis on formative, interactive strategies. Hopefully there is a light at the end of the tunnel for the replacement of current outdated testing methods. The ultimate beneficiary will be the students, as it should be.

4 Responses

  1. “So what happens to the inside circle of a ‘donut’ of metal when it is heated? Does it get bigger, smaller, or stay the same?”

    Trivial if you can reason well. However, note well that this is a “bubble” question, the kind so strongly frowned upon by so many educators.

    Multiple-choice tests or any machine-scorable tests are only as good as the questions, but they can be very good indeed if the questions are.

    Finding questions such as this one is not an easy task. It’s so much easier to ask a student to calculate the expansion of a one-meter piece of metal with a given expansion coefficient when heated from one temperature to another. The student can approach this sort of question in two different ways. There’s the memory way: recognize the type of problem, remember the step-by-step procedure for solving it, and “plug and chug.” Then there’s the reasoning way: figure out from the name what an expansion coefficient is, derive the formula (trivial if you understand the first part) or insert numbers directly into that line of reasoning, which amounts to the same thing, and, voilà, you have the answer.
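The plug-and-chug calculation described here really is a one-liner, which is part of the point. A sketch with hypothetical values (the comment specifies no coefficient or temperatures):

```python
# Linear thermal expansion: delta_L = alpha * L0 * delta_T.
# Values are hypothetical stand-ins for "a given expansion coefficient"
# and "from one temperature to another."

alpha = 12e-6          # approx. expansion coefficient of steel, per K
L0 = 1.0               # the one-meter piece of metal
T1, T2 = 20.0, 120.0   # hypothetical start and end temperatures, deg C

delta_L = alpha * L0 * (T2 - T1)
print(f"{delta_L * 1000:.2f} mm")  # -> "1.20 mm"
```

The memory way stops at this number; the reasoning way knows why the formula is linear in both length and temperature.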

    My experience with typical science courses, indirectly through my children, is that the former pedagogy prevails: see the pattern, remember the step-by-step procedure, apply the procedure. Don’t even try to understand what it all means.

    If we produce tests with the second sort of questions (the plug-and-chug sort), then a number of teachers will figure out the easy way to “train” the students to do them. If the questions are of the former sort (the conceptual sort), then students will be able to do both sorts, and the teachers will not have the shortcut method of teaching. To me, it appears as if quite a few science teachers are incapable of teaching the reasoning way. I had an encounter with a mathematics teacher that proved to me convincingly that she couldn’t.

    Let me just say, “Stop knocking machine-scored tests. Knock the test writers instead.”

  2. Harry,

    Could higher order thinking mean, in some cases, “to heck with the calculating process” and “just give me the answer so I can solve the problem”?

    For example, do we need to know “how” an answer is derived when all we need is the answer to make “higher order” decisions?

    For example, decision makers may know squat about how to crunch data to determine if product A is better than B. They just need to have the results to determine the next step.

    Their higher order skill isn’t the mechanics of statistics but the validity and generalizability of results as well as their uses, interpretation, and implications.

    Number crunching could be left to those who love to work with numbers, but their days may be numbered, since machines will replace them sooner rather than later.

    When we place a premium on some forms of processing, we may lose sight of the goal, which is to apply the outcomes of the process to better understand and perhaps solve problems.

    In other words, the hands-on work in some cases may be lower order, and the application of the outcomes, higher order.

    When we make this distinction, we then focus on how results can be applied to real-world problems. Thus, instead of asking students to spend years learning how to crunch numbers, we’d simply skip that step and ask them, instead, to focus on how to apply the answers.

    For example, in building a pyramid, the HOT (higher order thinker) would ask the LOT (lower order thinker) to figure out how many men and how long it would take to move 10-ton stones into place using plan A.

    After getting the answer, the HOT would determine if another method might be better. S/he thinks of two additional plans, B and C, and asks the LOT to figure out how to implement them and to derive the best manpower and time estimates for each.

    Today’s fighter pilots, in flight and fight, don’t need a crew on board to help them calculate various mathematical functions because these are being done via computers or even remotely by techs thousands of miles away. They have all the info they need in front of them, and many of the decisions that were made by humans not very long ago are now being made by computers in nanoseconds.

    In other words, the goal is to leave, as much as possible, all of the lower-order thinking to machines and to simultaneously move toward machines for higher-order thinking, too.

    This leaves the highest order decision-making to the pilot, the human being. One of the implications is that the fighters of the future will be remotely controlled by a pilot sitting at HQ somewhere thousands of miles away from the action. He’s simply no longer needed in the cockpit.

    Another implication is that, in future math classes, students won’t bother with how to solve data crunching problems. This will be left to smart apps or machines. Instead, they’ll focus on how to apply the results to specific problems.

    For writing teachers, the day will come when the how of mechanics is no longer an issue. Grammar, spelling, mechanics, even lower level aspects of style and arrangement, will be handled by apps and smart machines.

    Documentation (APA, MLA, etc.) will be automatically done with zero effort on the part of the writer. The review of lit, too, will be handled by smart apps.

    This will leave the writer with the highest order tasks, i.e., deciding on the desired outcomes and best general strategies and leaving the rest — the how to — to computers.

    The point is that a lot of what we’re focusing on now as “higher order” may be, relatively speaking, lower order.

    I really believe that HOT is a lot more fun and conducive to learning, and this is where our attention ought to be. This means bypassing a lot of the crunchwork and going straight to the application to futz with real-world problems.

    The HOT student would become expert in knowing what kinds of questions to ask to, say, build a better battery for a solar-powered car and in understanding how to weigh and apply the answers in a recursive process that eventually leads to the desired result.

    But don’t get me wrong. HOT is and will always be a lot harder than LOT. It will require more creativity, imagination, logic, intuition, courage, confidence, and intelligence. But freed from the mechanics of number crunching, the work may turn out to be a lot more fun. -Jim S

    • Gee, did I really explain myself that poorly?

      Understanding the question of what happens to a hole in a metal sheet when the metal is heated is by far more important than calculating the expansion of a piece of metal. The numbers do not trump the conceptualization.

      If you must figure out the expansion of the metal, do you use some memorized procedure or use your head? This particular example may be too simple, but I simply grabbed an example that related to the hole question.

      Understanding metal expansion and the effect of different expansion coefficients is more important than crunching the numbers. The latter can always be done some other way, or checked by someone else, or whatever. Beyond that, asking why this metal has one coefficient while another metal has a different one is where we’d like to see thinking go.

      My purpose in writing the response was really simply to suggest that bubble tests need not be evil. I really didn’t intend to create a discussion on higher order thinking, whatever that really is. The level of thinking must be some sort of continuum.

      Still, I did encounter plenty of classes that actively encouraged LOT and discouraged HOT. Better assessments should help to remove that problem.

  3. I remember a somewhat telling comment by Dr. Mazur: when he asked these mental-model questions of his students, some would actually answer, “Do you want to know what I think or what I was taught?” Meaning they probably knew which formula to use but not necessarily what relationship in reality that formula was describing. Their knowledge of how to solve the problem was there, but their mental model, their understanding, was missing or incomplete. They were more LOT than HOT.

    Those with a keen mental model of how something works when it does typically have the best chance of diagnosing what is wrong when it doesn’t, of enhancing it to work better, or of stopping it from working when needed.

    Another interesting expansion example from Dr. Mazur’s modeling survey was the relationship of radius to circumference. He asked: if a wire stretched tightly around a perfect globe the size of the earth were increased in length by only one meter, and then caused to hover above the earth like a ring around Saturn, how far would it be from the surface? The width of a hair? The height of a curb? The height of a desk? The answer might make you stop and think about what moving to the next larger notch in your belt is really telling you.
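The arithmetic behind the wire puzzle is short enough to check: since C = 2πr, an extra meter of circumference raises the wire by ΔC/(2π), no matter how big the globe is. A quick sketch:

```python
import math

# C = 2*pi*r, so a change in circumference of delta_C changes the radius
# by delta_r = delta_C / (2*pi) -- independent of the globe's size.

delta_C = 1.0                      # wire lengthened by one meter
delta_r = delta_C / (2 * math.pi)  # gap between wire and surface

print(f"{delta_r:.3f} m")  # -> "0.159 m": about the height of a curb
```

The same meter of slack added to a wire around a tennis ball would lift it by exactly the same amount, which is what makes the question a good mental-model probe.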

    While the mental-model survey was not meant to be a “test,” it serves the purpose of helping to form a framework for understanding that goes beyond knowledge of construction. It provides the starting point for a chain of dynamic assessments. Dynamic assessment builds upon the answers with more questions, like “So then what would happen if…?” or “How could this be leveraged to solve other problems?”, and is where the mental model of understanding is constructed and real HOT gets its exercise. I believe this is also where the more static forms of testing tend to leave the realm of instruction if not very carefully constructed and applied, with or without bubble sheets.

    I do believe that the hope for the Assessment 2.0 initiative is to build these better assessments and begin to measure a student’s depth of understanding beyond just knowledge of the facts.
