Will Data Scientists Have a Big Impact on Education?

The data explosion in public schools and universities has increased the demand for people who understand its potential.

by Adam Stone / June 2, 2017
Data could change the course of education for years to come. That makes the emerging role of the data scientist especially important.

For education to make good on the promise of a data-driven future, a new role needs to emerge within the leadership ranks. While a few universities and a handful of school districts presently employ “data scientists” or other similarly titled professionals, this position — which is rapidly gaining ground in government and in diverse industry verticals — has yet to take deep root in education. 

Depending on what level of education you investigate and whom you talk to, the data scientist is either a boon to education’s future or a position without a purpose. But there’s no question that data, used the right way, is knowledge, and how it’s used could change the course of education for years to come. Because so much depends on how the data is interpreted and analyzed, it is important to understand the role, current and future, of the data scientist.

In the K-12 world, data scientists may have to fight to prove their worth. While the business community has begun to invest in data as a driver of success, many educators feel lukewarm about it. They’ve been collecting data for years, and it’s mostly been used to penalize them. “We have used it as a zinger, a way to shut down schools and fire superintendents,” said Kecia Ray, director of the Center for Digital Education (the Center is part of e.Republic, Converge’s parent company).

These professionals will be highly educated. Eighty-eight percent of data scientists have at least a master’s degree, and 46 percent have Ph.D.s, according to recruiting firm Burtch Works. But it will take more than just a knowledge of how the numbers work to succeed in this field, experts say.

Even those willing to admit the possible usefulness of data still come to the table with little understanding of its potential. “Because we have focused so much on collecting and reporting, we don’t have people well trained in the synthesizing, analyzing and interpreting roles,” Ray said.

So the K-12 data scientist will have to be an educator, someone who is willing and able to help teachers understand and engage with data in new, more meaningful ways. Data scientists also will have to be big thinkers, individuals capable of envisioning far-reaching uses for their efforts.

“In K-12, our ideas have to scale to large numbers,” said Andrew Berning, a research scientist in K-12 Learning Analytics at the University of Texas at Arlington. “There are 5.5 million K-12 students in Texas, and some of our districts have 100,000 students. If the data scientists are going to be worth their salt, they are going to have to be able to scale their ideas across all the backgrounds and the levels of readiness that kids bring to school.”

Considering that potential large-scale impact, Berning believes most K-12 data scientists should be positioned fairly high in the administration. “They will always have to report to the instructional side of the house, but it should probably be a cabinet-level position reporting to the superintendent,” he said. “It should be on a par with the chief technology officer.”

From such a lofty perch, data scientists will need to deliver more than just good ideas and interesting reports: They will need to deliver outcomes. Given the media hype around artificial intelligence, big data and all the related buzzwords, educators are going to expect data scientists to make a noticeable difference.

“They need to show good working models,” said Leo Brehm, CIO/CTO of the Public Schools of Northborough and Southborough in Massachusetts. “That means they need to show how data can take the logistical burden off teachers. Teachers want to reach their students, and if you can free up their time to explore and make connections, they will be on board with that.”

Experts in higher education echo this thirst for tangible results. They say data scientists will need to be skilled in presenting information in ways that educators can readily put to use. “It’s one thing to create data that sits in tables, and it’s something else to be able to visualize that information,” said Susan Metros, a former associate vice provost and former associate CIO for Technology Enhanced Learning at the University of Southern California. “If we are really going to be able to understand data and compare it and adopt it into practice, people are going to have to be able to see it, to compare it, to interact with it in ways that are meaningful.”

Data scientists will need to be discreet. They will need to be remarkably sensitive to the requirements for privacy and confidentiality that surround their work. “There is a growing call for open access, but with education, just like with medical data, you can’t really have that,” said Alex J. Bowers, an associate professor of education leadership at Teachers College, Columbia University.

They’ll also be expected to respond to such calls for openness while at the same time keeping individual student data locked up tight. Bowers suggested the way to strike the balance is to keep data private, but give open access to all the algorithms and analyses that school districts use. This invites the public to be a part of the process, while still respecting discretion.

Overcoming Skepticism

The role of the education data scientist becomes more apparent when you consider the fact that educators at all levels stand knee-deep in data. Colleges have enrollment information, financial data and a range of personal metrics. In K-12, “you know where students live, their household financial status, how many parents they have. Then you have all this academic data, quantitative and qualitative, about how they respond in math and history and science and reading,” said Ray.

Administrators at the New York Institute of Technology use data to make real-estate decisions. The Public Schools of Northborough and Southborough leans on data to maximize its technology investments, while Georgia public schools leverage data to give teachers a 360-degree view of students’ progress.

But challenges are inherent in the emerging wave of data-driven education. Teachers may be wary after years of seeing student metrics wielded like a club and test scores used as a rationale for penalizing districts.

Some are skeptical of data scientists’ ability to peer deeply enough into the inner workings of education. “In an online course, I can tell you how many minutes someone spent on a page,” said Metros. “But they could have been checking Facebook or having three other conversations at the same time.”

In higher education, data scientists are tapping metrics to improve student outcomes, but the push goes far beyond that. Schools are using data to better manage their facilities, fine-tune online courses and allocate their course offerings.

At the New York Institute of Technology, Mark C. Hampton, vice president for planning, analytics and decision support, has tapped into analytics to find more classroom space. With campuses on Long Island and Manhattan, where space is always at a premium, data can make a difference.

“It’s not like we can just pitch a tent. We need to have a very concrete plan, and absent this data, we haven’t been able to do that,” he said. A recent dive into the data showed that classes in the health professions are among the school’s strongest performers. “As a direct result of these analyses, the conversations have turned to be about getting them more space.”

Tristan Denley is using data to ensure students can maximize the number of courses they take. As vice chancellor for academic affairs at the Tennessee Board of Regents, he is responsible for 46 schools, including six universities, 13 community colleges and 27 colleges of applied technology.

“To enable students to take a fuller schedule, you have to do more than just say, ‘Take a fuller schedule.’ If you want them to do that, and you want them to take the right classes, you need to crank the numbers,” he said.

Data in the Ed Tech Ecosystem

The world of ed tech is being dramatically reshaped by the application of data.

Take for instance Brainly, whose online K-12 social learning platform draws 80 million students a month from 35 countries. Students use the site to exchange curated questions and answers about academic topics.

“We have about 30 million answered questions and 8,000 new questions daily. So we have a lot of data to analyze in terms of the content,” said Erik Choi, Brainly’s principal researcher. “We also let our users ‘friend’ each other, so we have a lot of data on who is friending whom. We also have more than 1,000 moderators curating our content, so we have information about why this answer is bad, why this question is bad.”

Site operators are developing algorithms to automatically detect the quality of questions and answers. They integrate dozens of data points in order to quickly discard bad answers, speedily approve winning replies, and focus their moderators’ energies on the middle zone: Questions and answers that require a little bit of attention. These predictive models make for dramatically improved efficiencies.
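To make the three-way split concrete, here is a minimal sketch of how such a triage step might look. It is not Brainly's actual system: the quality score, the two cutoffs and every name in the code are assumptions chosen purely for illustration, and the article does not say how the real models score content.

```python
from dataclasses import dataclass

# Hypothetical cutoffs; the article does not say what thresholds Brainly uses.
REJECT_BELOW = 0.2
APPROVE_ABOVE = 0.8


@dataclass
class Answer:
    answer_id: int
    quality_score: float  # assumed output of a predictive model, between 0 and 1


def triage(answer: Answer) -> str:
    """Route an answer to one of three queues based on its predicted quality."""
    if answer.quality_score < REJECT_BELOW:
        return "discard"
    if answer.quality_score > APPROVE_ABOVE:
        return "approve"
    # Everything in between is the "middle zone" that needs human attention.
    return "moderator_review"


def build_queues(answers):
    """Group answer IDs by the queue that triage() assigns them to."""
    queues = {"discard": [], "approve": [], "moderator_review": []}
    for answer in answers:
        queues[triage(answer)].append(answer.answer_id)
    return queues
```

In a setup like this, moderators would see only the IDs in the "moderator_review" queue, which is where the efficiency gain described above would come from.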

“We can message them that they are at a six out of 10 in terms of their learning activities, and tell them the steps they can take to improve their learning score,” Choi said. “Proper interventions can improve each individual’s learning process, and data can help to drive that.”

Denley uses a mix of off-the-shelf software products to dig deep into course utilization data, to see who is taking what courses and also to see where students get stuck. “We have to offer the courses students need in sufficient capacity, at different times, during the day and in the evening, online and on branch campuses. There is real hard science sitting behind those decisions,” he said.

Such analyses could cut right to the heart of the academic enterprise. “The challenge in higher ed is about capacity,” Ray said. “In order to keep all these people employed and all these buildings running, we have to fill a certain number of seats. So how do you recruit and retain students, with the right ratio of people paying cash and taking loans and taking grants, all while being financially stable enough to attract the best faculty and build the best facilities?”

Data can help schools to meet that high mark. Some also see a role for data in improving student outcomes across both online and traditional higher education. “We are able to identify students within three key classes and know whether or not they are going to graduate,” said Todd Gary, a visiting scholar in data science at WPC Healthcare, serving in the Office of Research at Middle Tennessee State University. “We can do it in the first semesters of freshman year. If you look at how they are doing in a writing class, a math class and the first class in their major, that is all you need, and if you have that, then you can do the early intervention.”

Such predictive algorithms rely on the availability of rich sources of data. Universities already have metrics aplenty, and lately the rise of online learning has brought with it a significant new data stream in support of educators’ efforts. Who looked at what course materials, when and for how long? It’s all measurable, and such data is especially helpful given that teachers in an online setting often do not have as close a connection to their students as do their brick-and-mortar peers.

“When you teach an online course, you often don’t actually know the students personally. The interaction is very different from an in-person class,” said Manfred Minimair, associate professor and program director for data visualization and analysis at Seton Hall University. Online educators can close the gap by digging into the information they do have available. “You can actually track when do they open some document, how often do they see it, what time of day. You can learn more about their study habits because everything is being tracked.”

The New and Unexpected in K-12

In the K-12 world, data can challenge educators and administrators to view old problems in new ways. For example, conventional wisdom says the way to combat principal turnover is by paying principals more. The data say otherwise, according to Bowers. He ran a data analytics project to understand why principals leave their jobs and found that the ones who complain about being underpaid also are among the least effective leaders.

“At a policy level, this means that the idea of paying principals more might end up keeping people in those jobs who aren’t the best people for those jobs,” Bowers said.

That’s just one small example of the new and unexpected ways modern data analytics can change the K-12 picture. Others, meanwhile, are looking to data science to shape how school districts spend their precious IT dollars.

In the Public Schools of Northborough and Southborough, for example, Brehm recently started using the ed tech accountability platform CatchOn to track how apps were being used. While the district has a big inventory of apps that it pays for, it turned out the most-used app is a freebie from National Geographic.

“We had no fewer than 50 apps that we were paying for, but we had no idea what the usage was like,” he said. “If only 10 teachers are using an app, I can’t justify spending $25,000 a year on that. The new data enable me to ask the right questions: What is working? What are people using or not using?”
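The arithmetic behind that judgment is simple enough to sketch. The example below is illustrative only: the record layout, the function names and the $500 ceiling are assumptions, not anything CatchOn or the district is reported to use. It divides each paid app's annual cost by the number of teachers using it and flags the ones that cost too much per teacher.

```python
def cost_per_active_teacher(annual_cost, active_teachers):
    """Return annual cost divided by active teachers, or None if nobody uses the app."""
    if active_teachers == 0:
        return None
    return annual_cost / active_teachers


def flag_underused(apps, max_cost_per_teacher=500.0):
    """Return names of paid apps that are unused or exceed the per-teacher ceiling.

    `apps` is a list of dicts with hypothetical keys:
    "name", "annual_cost" and "active_teachers".
    """
    flagged = []
    for app in apps:
        if app["annual_cost"] == 0:
            continue  # free apps cost nothing to keep
        per_teacher = cost_per_active_teacher(app["annual_cost"], app["active_teachers"])
        if per_teacher is None or per_teacher > max_cost_per_teacher:
            flagged.append(app["name"])
    return flagged


# Made-up inventory; only the $25,000 and 10-teacher figures echo the quote above.
inventory = [
    {"name": "App A", "annual_cost": 25000, "active_teachers": 10},
    {"name": "App B", "annual_cost": 4000, "active_teachers": 40},
    {"name": "Free App", "annual_cost": 0, "active_teachers": 120},
]
print(flag_underused(inventory))
```

With these made-up figures, the $25,000 app used by 10 teachers works out to $2,500 per teacher and is the only one flagged.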

Student outcomes likewise can be swayed by creative data initiatives. Robert Swiggum, CIO of the Georgia Department of Education, has been working along those lines. Specifically, his team has developed a streamlined way for educators to meld data from mid-year testing into the district’s longitudinal data system in order to combine long-term data on a student’s progress with snapshot status checks. “That has really helped teachers a lot. By combining the longitudinal data with the present-day data, they can generate more personalized learning,” he said.

The state system, which saw 66 million hits in 2016, also offers a resource link with educational support materials. Teachers can use those materials based on the combined testing and longitudinal information in order to tailor instruction. “Now they have a 360-degree view of this student, and they have the materials they need to act on that,” Swiggum said.

To ensure teacher buy-in, the IT team worked hard to make the system user friendly. “The hardest part is just to get people to look at it. Teachers are so busy with so many different things, they have a ton of work,” Swiggum said.

While teacher buy-in may be a hurdle in some districts, others wrangle with the technical aspects of data science. It can be a complex chore to integrate and perform analytics on multiple data points for thousands or tens of thousands of students. The task requires a robust architecture, said Greg Hughes, CIO of the Delaware Department of Education.

Hughes said he is fortunate in that his state supports a single information system for all 135,000 public school students. “That means we have a really nice pipeline to feed our data warehouse. We have a common interface with all the schools,” he said. “That makes it a little easier for us to pull data together.”

Availability of data is a key enabler, Hughes added. Of equal or even greater importance is the availability of a skilled data scientist.  

Adam Stone, Contributing Writer

A seasoned journalist with 20+ years' experience, Adam Stone covers education, technology, government and the military, along with diverse other topics. His work has appeared in dozens of general and niche publications nationwide.