Khan Academy, Google apps, Moodle - Data Mining

gerri.songer's picture

Data mining your children
Consider the popular nonprofit tutorial service Khan Academy. It’s free. But users do pay a price: In effect, they trade their data for the tutoring.

“Data is the real asset,” founder Sal Khan told an academic conference last fall.

The site tracks the academic progress of students 13 and older as they work through online lessons in math, science and other subjects. It also logs their location when they sign in and monitors their Web browsing habits. And it reserves the right to seek out personal details about users from other sources, as well, potentially building rich profiles of their interests and connections.

After POLITICO inquired about Khan Academy’s privacy policy, which gave it the right to draw on students’ personal information to send them customized advertising, the policy was completely rewritten. The new text, posted online late last week, emphasizes Khan Academy’s commitment to protecting privacy and deletes the line about targeted advertising.

But the revised policy makes clear that Khan Academy still allows third parties, such as YouTube and Google, to place the tiny text files known as “cookies” on students’ computers to collect and store information about their Web usage. Khan Academy also states that it may share personal information with app developers and other external partners, with students’ consent.

Parents and teachers typically turn to companies’ privacy policies to try to figure out what student data is being collected and how it could be used. Clarity is a rarity.

Even companies that assert they do not sell personal information typically reserve the right to change that policy at any time. Most won’t notify users in the event of such a change. Instead, they recommend reading the online privacy policy regularly to see if it’s been updated.

Most policies also indicate that student information will trade hands, and may be subject to an entirely new privacy policy, if the company is sold — a common fate for a start-up.

Then there’s the legal jargon and fuzzy terminology to unravel.

Moodle, which many schools use as a forum for students to post work and communicate with teachers, states that it won’t share users’ personal information — “but it may be accessible to those volunteers and staff who administer the site and infrastructure.” Who are those volunteers? Are they trained to protect user privacy? The site lists an email address for users to get more information, but questions sent to that address bounced back.

Google’s privacy policy is considerably more detailed, but until recently, it did not make clear that the company scanned all emails sent through its Google Apps for Education platform, which is used by millions of students and teachers. The automated scan picked out key words that might suggest a user was, say, planning a camping trip. Google could then use that information to target ads to that individual. It did not routinely send ads to students, but it did direct them to alumni who used the Google Apps for Education platform.

After angry students filed a lawsuit, Google updated its terms of service to acknowledge the email scanning, and then announced late last month that it would stop the practice altogether for customers using Apps for Education.

Other companies don’t make any privacy policy at all available for parents to review, POLITICO has found.

The data storage and analytics firm eScholar, which sells software to help districts manage records on 20 million students — and stores some of that data on its servers — does not have a posted privacy policy. Spokeswoman Ann Tarasena said the company is working on it. In the meantime, eScholar writes privacy protections into its contracts with districts. It wouldn’t release the contracts — citing privacy concerns.

On Thursday, responding to questions raised by this article, the company posted online a statement of its general privacy principles, including a pledge not to sell student data.

Then there’s Panorama Education, a data analytics platform used by thousands of schools and backed by investors including Facebook’s Mark Zuckerberg and actor Ashton Kutcher. CEO Aaron Feuer said the company abides by each district’s privacy rules, but it does not have a blanket policy to share with the public.

The lack of consistent standards troubles Sen. Markey, who has become a leading voice on consumer privacy in Congress. “The goal here should be to help scholars make the grade,” Markey said, “not help companies make a sale.”

"Parents need to understand: Why is this information being collected? How is it going to help my child? What is being done to protect it? Lots of places aren't doing these things," Guidera says. "We're encouraging states and districts to annually publish a list so people can see exactly what's being collected and why." But disclosure is not enough, Guidera says; the responsible use of data means using that data to help students, not to trigger negative consequences.

What Parents Need To Know About Big Data And Student Privacy

Policymakers and legal experts are being forced to improvise new policies and procedures aimed at protecting the privacy of young people. Critics fear the misuse of student data by hackers, marketers and, most worryingly, by the government authorities who themselves are collecting it.

Concern is growing:
This month, a working group on big data and privacy appointed by President Obama released its findings. Alongside recommendations to update the "Consumer Privacy Bill of Rights" and pass national legislation regarding data breaches, the group singled out data collected on students in school as a matter of special concern.

In March, New York became the first state to make it someone's job to oversee this vital issue, creating a position called chief privacy officer in the Education Department. The job description? "Establishing standards for educational agency data security and privacy policies." Translation: providing the state's 698 school districts and over 500 colleges and universities, as well as state agencies, with uniform approaches to managing — and protecting — student data such as test scores, transcripts, health information, even dates of birth, racial or ethnic standing and Social Security numbers.

In that same legislation, New York became the last state to pull the plug on InBloom. The project was supposed to create a shared infrastructure for storing student data and making them available to educational software developers, but it had to shut down after drawing the ire of privacy advocates.
But the potential here is great as well. A report by McKinsey last year singled out education as the sector that could benefit the most from the free exchange of data, adding as much as $1.2 trillion to the economy through more efficient, effective instruction.

Student data used to be the pet cause of a small group of lawyers and activists. Now, in part because of the InBloom controversy, the issue is gaining broader attention. This year, 82 bills in 32 states have been introduced that somehow address student privacy.

In 2005, things changed. The federal government began awarding grants for the creation of Statewide Longitudinal Data Systems, or SLDS. That marked the entrance of big data into education, enabled by the leaps forward in the ability to store and process information on remote servers "in the cloud." States and schools for the first time could centralize, organize, search and analyze information on millions of students, in the ways that corporations have been doing for decades. And many for-profit companies, like Google, have rushed in to help them do this, providing software to collect and crunch this information.

As the name implies, Statewide Longitudinal Data Systems (SLDS) create unique numbers to identify students and track them from the day they enter kindergarten, or even preschool. In other words, a wealth of information, contained in a single record, can follow a student for 20 years or more.
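To make the idea of a single record following a student concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `LongitudinalStore` class, its field names, and the use of a random UUID as the identifier are hypothetical, and the article does not describe how any state's SLDS is actually built.

```python
import uuid
from collections import defaultdict


class LongitudinalStore:
    """Toy store (hypothetical): one identifier per student, many yearly entries."""

    def __init__(self):
        self._ids = {}                     # (name, birth date) -> student id
        self._records = defaultdict(list)  # student id -> list of entries

    def enroll(self, name, birth_date):
        """Return the student's identifier, creating it on first enrollment."""
        key = (name, birth_date)
        if key not in self._ids:
            self._ids[key] = uuid.uuid4().hex
        return self._ids[key]

    def add_entry(self, student_id, school_year, kind, value):
        """Attach one data point (a test score, an enrollment, ...) to the record."""
        self._records[student_id].append(
            {"school_year": school_year, "kind": kind, "value": value}
        )

    def history(self, student_id):
        """Everything known about one student, ordered by school year."""
        return sorted(self._records[student_id], key=lambda e: e["school_year"])
```

Because every entry hangs off the same identifier, anyone who holds that identifier and has access to the store can read the student's whole history at once, which is the property that makes such systems both useful and sensitive.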

In some states these systems are being extended and shared across state lines, to private and for-profit colleges, and with employers. They're also being used in at least 17 states to track teachers from their grades in teachers' college, to their students' performance in the classroom.

So, you can see why privacy advocates have concerns. "My younger son's records were breached when he was in college," says Sheila Kaplan, a New York student privacy activist who was involved in drafting the legislation creating a chief privacy officer. "I was amazed. Why would you collect records that you can't protect? As more parents become aware of this, they are freaked out."

"Most people, when you say the word 'educational data,' the first thing that comes to mind is a test score," says Aimee Guidera of the Data Quality Campaign, a group backing the data push. "We're helping to redefine what data is in education, so it really is a wide breadth of data points that come together to provide a richer picture."

Here's an example. In 2013, New York City learned with the help of student data tracking that almost four of five public high school graduates needed remediation when they got to city community colleges. That suggests a mismatch between what was on the state high school tests, and what students actually needed to know in college.

Data systems could soon integrate software designed to monitor learning and provide feedback to teachers, schools, students and parents. Programs such as DreamBox Learning, Khan Academy and Scholastic's Math 180 automatically crunch information at split-second intervals, from how many problems a student solved to the time he or she spent doing it. This information can create a detailed picture of student performance, and prompt teacher interventions at just the right moment — an innovation known as "learning analytics."

In 2009, the U.S. Department of Education made creation of an SLDS mandatory for any state that wanted to win funds under its Race to the Top program. "That provided a boost to underscore the work we were doing," Guidera says. "It reinforced what states were already doing, raised the priority and made the data issue a sexy one by calling it out as something we needed to focus on."

Today, all 50 states and the District of Columbia, Puerto Rico and the Virgin Islands have made progress on creating an SLDS. Guidera says 44 states have at least some ability to connect K-12 with postsecondary data.

What The Law Says
The main law that governs data kept by public schools is the 1974 Family Educational Rights and Privacy Act, or FERPA. It gives parents and students, once they turn 18, three rights: to inspect their own records, to correct those records, and to give consent in writing before the release of those records to any third party.

Well, for the most part. There are two blanket exemptions. One covers the "what" (of student information) and the other the "who" (is authorized to see it).

"The big hole in FERPA is directory information," says Sheila Kaplan, the privacy activist. She explains: FERPA allows schools to release a student's "name, address, telephone number, date and place of birth, honors and awards, and dates of attendance" without first obtaining consent (although they are supposed to disclose the release and allow parents to opt out of directories).

The second hole got much, much wider in the past few years.

FERPA always allowed school officials to release records to other education officials without parental consent. In 2008, that right was expanded to contractors and volunteers, as long as they were under "direct control" of schools. This included for-profit cloud service providers.

"That opened the doors for the Googles and the Microsofts," says Khaliah Barnes, director of the Student Privacy Project at the Electronic Privacy Information Center, or EPIC. In 2011, a second exemption allowed schools to release information to "authorized representatives" of state authorities.

The Data Quality Campaign was among those that pushed for the changes. It argued that individual parental authorization was "impractical" for big-data systems.

In 2012, EPIC sued the Education Department to fight the new regulations. The suit was dismissed for lack of standing.

Privacy advocates say that all of this creates three separate categories of risk — from hackers, marketers and spies.

First, the more information is collected, the more it is centralized, and the longer it is stored, the greater the danger from hackers. "If a school district loses a computer disk with information of about 200 kids on it, it's terrible," says Joel Reidenberg, director of the Center on Law and Information Policy at Fordham Law School. "Now let's say it's a large data set that includes 20,000 students that gets breached. The magnitude is much greater."

There have been several recent examples at major universities where tens of thousands of student records were stolen or accidentally exposed.

Reidenberg notes that school districts may have lower security than large corporations or universities. "You have failures at institutions that are spending millions trying to protect the security of their data. Is there any reason to believe that school systems are going to be more successful?"

Increasing the concern, many apps by their very nature are designed to make student data more shareable than ever before. According to a recent investigation by Politico, apps such as Learnboost, a free online gradebook program, make it easy for teachers to email student records around the web. Similar concerns arise when teachers start Facebook pages, as many do, exposing classroom discussions to the commercial web.

So far, there have been no reports of large breaches of SLDS, although Reidenberg says that could be because we don't know about them. Last year his center at Fordham Law examined agreements between school districts and their cloud providers and found that most do not require providers to disclose any leaks.

The second concern is that student data will be monetized. Reidenberg's study found that fewer than 7% of district contracts restricted the sale or marketing of student information by vendors. It did not, however, say how many of the cloud service providers are actually selling that info.

But often the issue is murkier than the outright sale of information. For many cloud services, like Google Apps, the entire business model is based on mining data for marketing. "A quarter of the services are free to the districts; the providers are monetizing it somehow," Reidenberg says. Even the nonprofit Khan Academy allows third parties like YouTube to track students' web usage.

In practice, defining the commercial misuse of student data is tricky. A program such as Pearson's enVisionMATH, a software-based tutoring platform, continuously analyzes millions of data points on student performance in order to improve its products and pitch more relevant products to school systems. That's both an educational and a commercial use.

The final potential threat comes not from shady hackers or greedy vendors. It's that the very people who create and maintain these databases will misuse the information.

Recall, advocates want students tracked all the way into the workforce. "We have a real disconnect about knowing how well we're preparing students for the world of work," the Data Quality Campaign's Guidera says. But for privacy experts, this raises the specter that a student's suspension in third grade could be used to deny him a job 15 years later. Or what about someone being denied a spot in a public university because big data predict she is overwhelmingly likely to drop out?

"If someone makes an 'adverse decision' about you based on your credit report, they have to inform you." Yet no such protection exists for student records.

Khan Academy
"We want them to take ownership of their learning," Khan says. At Khan Academy, that ownership starts when user accounts are set up, not by schools or districts, but by the students. The academy sends out a confirmation e-mail, the user logs in, and he or she is presented with a "Knowledge Map" from which to choose a topic. A set of problems for the topic appears, one after the other. When students get stuck, they can choose to watch a related video or start showing hints, either of which lays out how to work through a problem step by step.

At each decision point, the system is collecting data: how many problems are solved, how many hints are taken, how long the student takes to solve the problem, what videos are watched, how much time is spent on each video. Students can add "coaches," such as the teacher and other students, who can view the same data stream.
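The kind of per-student tally described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Event` type, its three event kinds, and the `summarize` function are hypothetical names invented here, not Khan Academy's actual data model or API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    """One logged interaction (hypothetical schema)."""
    student: str
    kind: str       # "problem_solved", "hint_taken", or "video_watched"
    seconds: float  # time spent on that problem, hint, or video


# Maps each event kind to the counter it increments in the summary.
COUNTERS = {
    "problem_solved": "problems_solved",
    "hint_taken": "hints_taken",
    "video_watched": "videos_watched",
}


def summarize(events):
    """Roll a stream of events up into one summary row per student."""
    summary = defaultdict(
        lambda: {"problems_solved": 0, "hints_taken": 0,
                 "videos_watched": 0, "seconds": 0.0}
    )
    for event in events:
        row = summary[event.student]
        row["seconds"] += event.seconds
        counter = COUNTERS.get(event.kind)
        if counter is not None:  # unknown kinds still count toward time spent
            row[counter] += 1
    return dict(summary)


if __name__ == "__main__":
    log = [
        Event("ana", "problem_solved", 30.0),
        Event("ana", "hint_taken", 12.5),
        Event("ben", "video_watched", 90.0),
    ]
    for student, row in summarize(log).items():
        print(student, row)
```

A "coach" view in this sketch would simply be another reader of the same summary rows, which is why adding coaches widens who can see the data.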

It's that data stream that Khan would actually like us to think about when we hear his name, not the videos. And that's where the investment needs to be, he says. "It seems, anecdotally, that people find value in [the videos]; but if we can use that as a way to bring them and engage in this deeper experience and then leverage the data that we can collect to make anything we create better, that'd be a win."

It's the Data
Kami Thordarson understands the power of the data behind the videos. Los Altos School District, where she works, has been experimenting with Khan Academy resources for a couple of years. Thordarson was chosen as one of the test pilots in those efforts because, she explains, "I had been somewhat of a risk taker and I love technology; I love bringing new, innovative things into the classroom."

Recently, Thordarson became the California district's innovative strategies coach, a new position where she spends her time helping teachers come up with new ways to deliver instruction. But she has her teaching chops. She'd spent four years at Los Altos teaching fifth and sixth grades; prior to that she'd taught in Colorado for a decade.

The innovation of Khan, she insists, isn't in the videos. "To be honest, that is such a small part of how we use Khan Academy and where our focus is. That it's not really our main concern. The videos are there, and they're an awesome resource. But that isn't what drives us to use the product."

The value of Khan lies with its lesser-known components: open-ended and interactive math exercises and the data those produce. "Khan Academy for us is a tool that helps us drive curriculum decisions. It generates data unlike any other tool that we've got. I can get immediate feedback on how kids are performing on certain skills that I can't get from other assessments. It's real time."

Generating detailed data is clearly something that standard video materials don't do, so saying that Khan is just video instruction is hardly fair or accurate. But the Khan data is not as sophisticated as what would be generated by true diagnostic software, which can analyze students' underlying skill deficiencies and branch them to appropriate relearning activities before moving them on in the skill sequence.

Khan Academy: This video is a MUST SEE.

In a send-up of the Comedy Central classic Mystery Science Theater 3000, two teacher-educators sit in front of a Khan Academy video on multiplying and dividing positive and negative integers and offer their critical commentary. Dave Coffey and John Golden are the hosts here (they really do need at least one talking robot), and they clearly are not big fans of Mr. Khan or his patron Mr. Gates.

The two teachers systematically dissect the video, noting a variety of missteps. There are a few unquestionable errors of mathematics: Khan uses incorrect terminology at a couple of points. Khan is also inconsistent in his language about positive and negative numbers (using plus when he means positive, or minus when he means negative), which is perhaps a lesser sin, but poor practice and misleading for students. He's also inconsistent in his use of symbols, sometimes writing "+4", sometimes writing "4", never explaining why he does or doesn't. He is making the kinds of mistakes that would reduce his score on the Mathematical Quality of Instruction observational instrument, used in the Gates-funded Measures of Effective Teaching Project.

The true fuel of their satire is their broader critique of Khan's approach. Khan teaches students to memorize a small set of procedural rules for dealing with multiplying negative numbers, with essentially zero effort expended to explain conceptually what the symbolic manipulations represent. In fact, in the final minute of the video, Khan says verbatim, "In your own time, think about why these rules apply."

Khan Academy pulled down the video satirized in MTT2K, Episode 1 within a day or so of publication. It will be interesting to see if they simply fix the outright errors, or if they address some of the broader pedagogical concerns.


Khan Academy teaches only one part of mathematics—procedures—and that isn't the most important part. Writing about mathematics, developing a disposition for mathematical thinking, demonstrating a conceptual understanding of mathematical topics are all more important than procedures. That said, procedures are still important, and Khan Academy provides one venue where students can learn them.

The Wrath Against Khan: Why Some Educators Are Questioning Khan
Objections to Khan Academy:
• Technology Replacing Teachers
• The Bill Gates Connection
• Old Wine, New Bottles, Bad Pedagogy
• Learning or Leveling Up?
• Khan Academy: Part of a Larger Trend - the "gamification of everything"; the potential for widespread distribution of educational materials online; YouTube-created stars bypassing the sanctioning of older institutions (Rebecca Black, Justin Bieber, Salman Khan); an anti-teacher climate (Waiting for Superman, Wisconsin, etc); a reliance on standardized testing to gauge students' learning; and various education reform movements.

Sylvia Martinez, Generation Yes:
• Khan Academy and the mythical math cure
• Khan Academy: Algorithms and Autonomy
• "Don't we need balance?" and other questions about Khan Academy

Frank Noschese, John Jay High School:
• Khan Academy is an Indictment of Education
• Khan Academy and the Effectiveness of Science Videos
• Khan Academy: My Final Remarks

How Khan Academy Is Changing the Rules of Education:

Khan Academy: The revolution that isn’t

Sal Khan, founder of the popular Khan Academy, explains how he prepares for each of his video lessons. He doesn’t use a script. In fact, he admits, “I don’t know what I’m going to say half the time.”

The highest ranking official in American education says that effective teaching requires training and planning, and then holds up as his archetype someone who openly admits to showing up to class every day unprepared. If a teacher said that, they’d be fired.

Khan Academy boasts almost 3,300 videos that have been viewed over 160 million times. But there’s a problem: the videos aren’t very good.


Take Khan’s explanation of slope, which he defines as “rise over run.” An effective math teacher will point out that “rise over run” isn’t the definition of slope at all but merely a way to calculate it.

Experienced educators have begun to push back against what they see as fundamental problems with Khan’s approach to teaching.

When asked why so many teachers have such adverse reactions to Khan Academy, Khan suggests it’s because they’re jealous. “It’d piss me off, too, if I had been teaching for 30 years and suddenly this ex-hedge-fund guy is hailed as the world’s teacher.”

Of course, teachers aren’t “pissed off” because Sal Khan is the world’s teacher. They’re concerned that he’s a bad teacher who people think is great; that the guy who’s delivered over 170 million lessons to students around the world openly brags about being unprepared and considers the precise explanation of mathematical concepts to be mere “nitpicking.” Experienced educators are concerned that when bad teaching happens in the classroom, it’s a crisis; but that when it happens on YouTube, it’s a “revolution.”

The truth is that there’s nothing revolutionary about Khan Academy at all. In fact, Khan’s style of instruction is identical to what students have seen for generations: a “do this, then do this” approach to teaching that presents mathematics as a meaningless series of steps. Khan himself says that “math is not just random things to memorize and regurgitate,” yet that’s exactly how his videos present it.

The real problem with Khan Academy is not the low-quality videos or the absence of any pedagogical intentionality. It’s just one resource among many, after all. Rather, the danger is that we believe the promise of silver bullets – of simple solutions to complex problems – and in so doing become deaf to what really needs to be done.

We need to invest in professional development, and provide teachers with the support and resources they need to be successful. We need to give them time to collaborate, and create relevant content that engages students and develops not just rote skills but also conceptual understanding. We have to help new teachers figure out classroom management – to reach the student who shows up late to class every day and never brings a pencil – and free up veteran teachers to mentor younger colleagues.

We need to stop focusing on the teachers who are doing it wrong and instead recognize the ones who are doing it right.

We have to recognize the good, and then cultivate it.

Before we can do that, though, we have to agree on what “good” is. “I don’t know what I’m going to say half the time” isn’t good enough, and we have to stop pretending that it is.

We face very real challenges in K-12 education today, and they will not be solved with just a Wacom tablet and a YouTube account. Instead, they’ll be solved by teachers who understand their content; who understand how children learn; who walk into the classroom every day and think, “I know exactly what I’m going to say, because that’s what teaching means.”

The Khan Academy Controversy

History behind the Khan Academy:
In 2004, Sal Khan began sending tutorial videos to his cousin to help with her studies. The Khan Academy itself was founded in 2006, and Sal Khan began working on the videos full-time in 2009. Sal Khan gave a much-publicized TED talk about the Khan Academy and “flipped classrooms” in 2011. The Khan Academy was pilot tested in the Los Altos school district for the first time in the 2010-2011 school year.

Interesting critiques of the Khan Academy:
An intriguing starting point is the Washington Post debate between Karim Kai Ani, the founder of Mathalicious, and Sal Khan about the meaning of slope. It starts with a critique from Kai Ani, followed by a response post from Khan.

To grossly summarize this debate, the argument was that Sal Khan’s definition of slope (“rise over run”) was too simplistic and, in part because of that simplicity, did not teach anything more than a series of rules needed to calculate slope. In other words, it did not teach the “concept” of slope.

This Wired article gives a balanced presentation of the Khan Academy, highlighting both its strengths and weaknesses. Here’s an interesting blog post on why students may actually be worse off when using the Khan Academy as opposed to traditional teaching. And here’s a recent followup to the Washington Post stream featuring another critique from other math educators on the treatment of decimals and equal signs in the Khan Academy. And then there was the MTT2K challenge this summer to point out outright flaws in the Khan Academy videos, and instances in which procedure was emphasized over concept (something Khan claims to be against but critics argue he does repeatedly in his videos).

Also interesting is this blog from the Los Altos school district. I suppose many may take this as a criticism of the Khan Academy since they describe the hallmark videos as unengaging to some extent, but it also reaffirms the many aspects of the Khan Academy package that do work very well in their classrooms.

Other criticisms:
• Arguing about whether Khan’s videos are better than the lecture of an experienced teacher just detracts from the real issue - what happens outside the lecture? Changing the place of the lecture in the curriculum and the activities that surround the lecture is what makes something revolutionary, rather than coming up with a revolutionary new lecture. Most certainly Khan’s videos are not the best lectures, but perhaps the most revolutionary aspect of the Khan Academy is the de-emphasis of the lecture in his academy. The fact that his academy is so successful with such poor lectures says something.
• The Khan Academy offers video lectures that promote rote learning.

• Learning is not a one-size-fits-all package but individualized and customized
• Information is presented on-demand and in the kids’ control
• Instant feedback
• Competence is rewarded through achievements and badges