Data Mining and Online Learning

By Jim Shimabukuro
Editor

George Siemens, in an interview with Audrey Watters, says, “In terms of evaluation of learners, assessment should be in-process, not at the conclusion of a course in the form of an exam or a test” (“How Data and Analytics Can Improve Education,” O’Reilly Radar, 25 July 2011). In the context of online learning, he’s underscoring the data mining tools built into learning management systems that do just that: provide on-demand information on student log-ins, participation, completion of activities, etc., that can be used to formatively monitor progress. He also mentions the capacity to mine more complex data such as the quality of a student’s performance, but this area is still relatively unexplored.

Still, teachers are discovering that online classes provide mountains of qualitative digital data for each student. In essence, everything that’s done by everyone in an online class is automatically recorded and archived. For example, I teach completely online writing courses and have access to a mind-boggling amount of performance information. For each student, I can access all email exchanges that we’ve had, all discussion forum and chat posts, all confirmations of tasks completed, all evaluations written for classmates’ drafts, and all drafts written and comments received from peers and from me. Because this data is in digital form, it’s also searchable, fluid, and portable.

This means that the student’s writing process is visible to a degree that we could only dream about in face-to-face hardcopy classrooms. Using traditional on-ground methods, much of the learning process was beyond our reach. For example, we didn’t make and keep hardcopies of every draft that had been reviewed by students and teachers as well as every comment that was made. We didn’t record all discussions. This is not to say that we couldn’t. We could, but the labor would have been so intense that it would have been all but impossible. Add to this the cabinets and space needed to store these files as well as the effort required to search, by hand, documents and files and we begin to understand the enormity of this undertaking.

The unprecedented mother lode of formative qualitative information, though, introduces new problems. Just because we can do something online that we can’t on-ground doesn’t mean we ought to. The slowest part of the virtual instructional process, the bottleneck, is that which requires human processing. A human being still has to dig through, select, and interpret the data to use it. And this requires the one commodity that’s always in short supply for teachers – time. Thus, even though the information is only a few clicks away, teachers may not be able to use much of it.

In the interest of time, I use an academic triage system to determine which students’ data I’m going to mine. I routinely monitor the reviews that students write for one another’s drafts and generate scores based on their knowledge of assignment criteria and writing guidelines and their ability to apply them in their assessments. When students fall below a certain score, I dig into their process. Thus, when I review their current drafts, I also review past drafts and comments they’ve received. This approach reveals patterns, or more precisely, patterns of failure. I can share these patterns with students and pinpoint corrective actions.
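
A minimal sketch of this triage step, in Python, might look like the following. The `Review` record, the 0-10 scale, the averaging of scores, and the threshold of 6.0 are all assumptions made for illustration; the approach described above only requires some score and some cutoff.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Review:
    student: str   # the student who wrote the peer review
    score: float   # teacher-assigned score for that review (hypothetical 0-10 scale)


def students_to_mine(reviews, threshold=6.0):
    """Return, in alphabetical order, students whose average review score is below threshold."""
    scores_by_student = {}
    for review in reviews:
        scores_by_student.setdefault(review.student, []).append(review.score)
    return sorted(
        student
        for student, scores in scores_by_student.items()
        if mean(scores) < threshold
    )


# Made-up sample data, for illustration only.
reviews = [
    Review("Ana", 8.0), Review("Ana", 7.0),
    Review("Ben", 5.0), Review("Ben", 6.0),
    Review("Cy", 9.0),
]
flagged = students_to_mine(reviews)
print(flagged)
```

Only the flagged students would then get the deeper look at past drafts and comments; everyone else is monitored through the routine review scores alone.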

Ultimately, though, triage systems like this are stopgaps. It’s just a matter of time before we invent efficient and effective methods to mine qualitative data. In the digital world, this means “smarter” programs that will allow teachers to set parameters for an application that will, in turn, hunt for, process, and instantly generate the kinds of reports that we need to assess learning as it’s occurring and in a context that sheds light on the entire process. With a click, we’ll literally be able to “see” how the student is learning and intervene at any point to guide performance.
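
To make the idea of teacher-set parameters a little more concrete, here is a hypothetical Python sketch of one very small report: issues that recur across a student's drafts. It assumes that the comments on each draft have already been reduced to short issue tags, which is the genuinely hard part and is not solved here. The tag names and the `min_drafts` parameter are invented for illustration.

```python
from collections import Counter


def repeated_issues(drafts, min_drafts=2):
    """Return issue tags that appear in at least min_drafts separate drafts.

    drafts is a list with one entry per draft; each entry is a list of
    issue tags taken from the comments that draft received.
    """
    counts = Counter()
    for tags in drafts:
        counts.update(set(tags))  # count each tag at most once per draft
    return {tag: n for tag, n in counts.items() if n >= min_drafts}


# Made-up tags for three successive drafts by one student.
drafts = [
    ["comma splice", "thesis missing"],
    ["comma splice", "citation format"],
    ["comma splice", "thesis missing", "thesis missing"],
]
report = repeated_issues(drafts)
print(report)
```

Here the teacher's only parameter is how many drafts an issue must appear in before it counts as a pattern. A "smarter" tool of the kind imagined above would also have to do the tagging itself.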

15 Responses

  2. The potential for use of “data mining” with online learning tools truly is great. These data sit in different repositories with different means of access and summary, however. Furthermore, as Jim points out, we’re not yet certain how best to use this information.

    I see this situation as an exciting moment as we stand on the edge of yet another new frontier. Once sufficiently refined, the process of analyzing the data captured online will allow automatic differentiated instruction and automatic flags provided to instructors for the purposes of “just-in-time” intervention.

    Oddly enough, no new technological breakthroughs must be made to reach this promised land of online instructional tools. We possess all of the technology required right now. So, why not just move ahead and deploy these great learning adjuncts?

    We face several roadblocks. I’ll suggest some and invite others to add in theirs.

    1. Instructor ignorance. It’s hard to be, say, an English instructor and also be familiar with all of the data mining tools you’d use (SQL, to take a trivial example).

    2. The Babel of learning tools. As an instructor, you may use a number of distinct tools in learning. You face a large problem in attempting to combine the data in all of them.

    3. Reluctance to use online tools. Many instructors use few, if any, online tools and many choose those that do not provide online storage of student work.

    4. Inability to translate ideas into reality. While this point may seem similar to #1, it goes considerably deeper. When you see a student make a writing error, you somehow know it due to years of experience. Translating that feel into the almost childlike language of computers can be a challenge even if you are very conversant with computer languages and tools. After all, computers don’t “feel.”

    Despite the hurdles, these problems will be overcome in time. Thousands of instructors in each discipline experimenting with different ways of solving the problems will arrive at solutions. Let’s also not forget that a number of profit-making organizations also have an interest in the solution.

    • Harry, perhaps the biggest problem re analytics is that, all too often, it serves bureaucratic rather than instructional needs. The distinction between the two is critical. Unfortunately, the former usually has little to do with instruction and more to do with justifying the existence of added layers of administration.

      The kind of mining I’m describing in this article is for frontline instruction. As teachers, we need tools that give us insights into how our students are interacting with the learning environments that we create. As Siemens suggests, best practice in assessment is formative, focused on the process of learning.

      Online instruction provides a gold mine of qualitative performance data, but we don’t have the tools to efficiently use it to guide our students — or better yet, to allow our students to facilitate their own learning. However, because the data is digital, it can be mined for the information we want. As you say, the know-how to create these mining tools is already available.

      However, I don’t buy the idea that a teacher must be able to create her or his own data gathering tools. It’s tantamount to saying that people who use computers ought to know how to build them from scratch, or that people who drive should be able to build and maintain their own cars. In fact, the vast majority of administrators who use analytics hire staff to manage the technical aspects of mining. Tool building and tool use aren’t the same, and we need to distinguish between these, too.

      Teachers need tools for formative analytics — tools that they can easily manipulate or “squeeze” to get the information they want when they want it so that they can tell individual students exactly what they need to do to improve. Ideally, this mining tool would be “smart” and determine, from the context, what information to gather and how to present it. Thus, for Student X’s current draft, it might automatically report not only repeated grammatical problems but repeated failures to apply specific guidelines. This report could be sent to both the teacher and student. Mining tools such as this could be used proactively by students to guide their own learning. -Jim S

  3. It’s reassuring that George Siemens repeatedly insists on privacy issues in this interview, because he is not speaking of learning that happens within closed LMS’s, but of Massive Open Online Courses (MOOC), with learning happening outside formal virtual classrooms, in social networks, on blogs, in forums etc. (see Stefanie Panke’s reports on Plenk 2010 here).

    When you enter an e-mail address in Gmail and the darned thing says “You might consider adding X, Y and Z too”, when you go to YouTube.com and the page greets you with suggestions of videos you might be interested in, when you start following someone on Twitter and the page tells you you might be interested in following other people too, it is a bit spooky, but you know that it is just a piece of software knitting Aran patterns with gathered data. And that the humans who wield the software are not interested in you as a person, just in the patterns you might fit in. Or at least you hope so.

    But educators wielding the same kind of powerful software to assess students’ informal learning outside the learning institution – this could be really scary.

    • Claude, I disagree with your statement: “George Siemens . . . is not speaking of learning that happens within closed LMS’s, but of Massive Open Online Courses (MOOC), with learning happening outside formal virtual classrooms, in social networks, on blogs, in forums etc.”

      Of course he is. And he also mentions MOOCs. -Jim S

      • Hi Jim,
        I was thinking of:

        … An area of data gathering that universities and schools are largely overlooking relates to the distributed social interactions learners engage in on a daily basis through Facebook, blogs, Twitter, and similar tools. Of course, privacy issues are significant here. However, as we are researching at Athabasca University, social networks can provide valuable insight into how connected learners are to each other and to the university. (…)
        The existing data gathering in schools and universities pales in comparison to the value of data mining and learning analytics opportunities that exist in the distributed social and informational networks that we all participate in on a daily basis. It is here, I think, that most of the novel insights on learning and knowledge growth will occur. When we interact in a learning management system (LMS), we do so purposefully — to learn or to complete an assignment. Our interaction in distributed systems is more “authentic” and can yield novel insights into how we are connected, our sentiments, and our needs in relation to learning success.

        in the interview you review, and of the fact that Siemens’ recent work has been about getting learning outside walled-in LMSs.
        For an example of analytics applied to social networks, see the second part (from 11.01) of Deb Roy’s The Birth of a Word TED talk, where he explains how his lab is now applying the tools developed for analyzing how his son learned to speak to social network comments about news in traditional media.
