Machine learning research and business data analysis were just two of the projects that 18 students worked on this summer with the Watson supercomputer, which garnered worldwide fame after winning a $1 million challenge on the game show Jeopardy! in 2011.
IBM chose 18 student interns out of 1,400 applicants to explore how the Watson computing system could help solve real-world challenges.
A professor who advises a group of Yale students shared how his students learn analytics skills much as the interns did. His students were not part of the intern program, but they worked on a project for IBM clients that involved analyzing business data and making recommendations.
"I find that the questions they bring to us are not the standard ones; they're interesting and challenging," said Ravi Dhar, professor of marketing and director of the Yale Center for Customer Insights, who advised the students when needed. "I like the way the students took a big challenge and broke it up into mini challenges and looked at many different perspectives. These projects have the potential to be both analytical and also creative."
And they also involve social media. Consumers leave their digital footprints everywhere on social networks and the Web. These footprints show what products they're looking for, how they're choosing them and which ones they buy.
Social media and data analysis skills are especially important for businesses, whose managers are typically older and not trained as well on social media. Many younger business students are using social media every day and are learning to analyze the data on social media sites. That way, they can bring new insights to bear.
In the internship this summer, Malcolm Greaves and other students gained experience with computer science research, specifically in the subdiscipline of machine learning. Greaves, a senior at Carnegie Mellon, hopes to design and build programs that improve their performance as they gain more experience.
At Carnegie Mellon last fall, students and faculty watched the supercomputer's performance on the $1 million Jeopardy! challenge.
"What I saw in Watson was this state-of-the-art artificial intelligence system," Greaves said, "and I was humbled by its awesomeness and its ability. I remember thinking, 'I would love to work on it.'"
But since Watson had already come out of the research phase, Greaves figured he wouldn't get the chance to do so. A few months later, his adviser sent him an email about IBM's internship. Greaves applied that week and was later offered an internship.
While at IBM, Greaves improved his programming skills, his technical knowledge and his understanding of machine learning-specific algorithms. The project he worked on involved feature selection, a task that's important in machine learning. In machine learning, people design a model -- a way of viewing and understanding the task that they want the machine to complete.
In the case of Watson, the broad task is to answer a question posed in sentence form. Watson returns a ranked list of up to five answers, with the first answer being the one the supercomputer places the most confidence in.
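The ranked list described above can be sketched in a few lines of Python. This is only an illustration, not Watson's actual code: the function name `rank_answers`, the candidate answers and the confidence scores are all invented for the example.

```python
from typing import Dict, List, Tuple


def rank_answers(confidences: Dict[str, float], limit: int = 5) -> List[Tuple[str, float]]:
    """Return up to `limit` candidate answers, highest confidence first."""
    ranked = sorted(confidences.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


# Hypothetical candidates and confidence scores, made up for illustration.
candidates = {
    "answer_a": 0.71,
    "answer_b": 0.14,
    "answer_c": 0.05,
    "answer_d": 0.03,
    "answer_e": 0.02,
    "answer_f": 0.01,
}

top_answers = rank_answers(candidates)
for position, (answer, confidence) in enumerate(top_answers, start=1):
    print(position, answer, confidence)
```

Six candidates go in, but only the five with the highest confidence come back, with the most trusted answer in first position.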
A series of complex models helps the supercomputer understand the task. These models include features that describe the data. Once the supercomputer has data to learn from, it learns from that data and can then answer questions correctly when it receives new data. Watson uses about 400 features to make sense of data.
An essential question with any machine learning project is, "Do you need all these features, and are they all relevant to the task you're trying to do?" If they aren't, then researchers try to figure out which features could be removed and see whether removing them improves performance.
The project Greaves worked on involved researching and developing methods that the Watson core development team could use to improve the supercomputer's performance. For a computer program, performance means speed and memory use. Machine learning performance refers to how well the system does its task; in this case, researchers measured how accurately Watson returns answers to questions.
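One common way to measure accuracy for a system that returns a ranked list of answers is to count how often the correct answer appears among the top k positions. The sketch below assumes that measure; the article does not say exactly which metric the Watson team used, and the function name `top_k_accuracy` and the toy data are invented for illustration.

```python
from typing import List, Sequence


def top_k_accuracy(ranked_answers: Sequence[Sequence[str]],
                   correct_answers: Sequence[str],
                   k: int) -> float:
    """Fraction of questions whose correct answer is in the first k ranked answers."""
    hits = sum(
        correct in ranked[:k]
        for ranked, correct in zip(ranked_answers, correct_answers)
    )
    return hits / len(correct_answers)


# Toy data: four questions, each with a ranked answer list (made up).
ranked: List[List[str]] = [
    ["a", "b", "c"],  # correct answer ranked first
    ["b", "a", "c"],  # correct answer ranked second
    ["c", "b", "a"],  # correct answer ranked third
    ["b", "c", "d"],  # correct answer missing
]
correct = ["a", "a", "a", "a"]

for k in (1, 2, 3):
    print(k, top_k_accuracy(ranked, correct, k))
```

On this toy data the score rises from 0.25 to 0.5 to 0.75 as k grows from one to three, since each larger k forgives a lower ranking of the correct answer.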
By cutting out features, researchers eliminate computations that don't need to be performed, which saves time and memory.
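The idea of removing features and checking whether performance holds up can be sketched with greedy backward elimination, one of several standard feature selection techniques. The article does not say which methods Greaves studied, so this is an assumption chosen for illustration. The toy data, the nearest-neighbor classifier and the function names `loo_accuracy` and `backward_eliminate` are all hypothetical.

```python
from typing import List, Sequence, Tuple


def loo_accuracy(rows: Sequence[Sequence[float]],
                 labels: Sequence[int],
                 keep: Sequence[int]) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier
    that looks only at the feature indexes in `keep`."""
    correct = 0
    for i, row in enumerate(rows):
        nearest = min(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: sum((row[k] - rows[j][k]) ** 2 for k in keep),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(rows)


def backward_eliminate(rows: Sequence[Sequence[float]],
                       labels: Sequence[int]) -> Tuple[List[int], float]:
    """Repeatedly drop the feature whose removal gives the best accuracy,
    stopping when every possible removal would lower accuracy."""
    keep = list(range(len(rows[0])))
    best = loo_accuracy(rows, labels, keep)
    while len(keep) > 1:
        trials = [[k for k in keep if k != f] for f in keep]
        scores = [loo_accuracy(rows, labels, t) for t in trials]
        top = max(scores)
        if top < best:
            break
        keep, best = trials[scores.index(top)], top
    return keep, best


# Toy data: feature 0 separates the classes, feature 1 is misleading
# noise, and feature 2 is a constant that carries no information.
rows = [
    (0.0, 5.0, 1.0), (0.1, -4.0, 1.0), (0.2, 3.0, 1.0), (0.3, -6.0, 1.0),
    (1.0, 5.2, 1.0), (1.1, -4.1, 1.0), (1.2, 3.1, 1.0), (1.3, -6.2, 1.0),
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

kept, score = backward_eliminate(rows, labels)
print(kept, score)
```

In this toy case the noisy feature hurts accuracy and the constant feature only adds computation, so both are dropped and the classifier ends up both more accurate and cheaper to run.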
"When you do feature selection, if you do it properly, it will improve not only your program performance, but it will give you that answer faster, which will improve your machine learning performance," Greaves said. "So it will give you the correct answer more often or you'll have a higher likelihood of having the correct answer as one of the top one, two or three answer positions."
Greaves said he had a good amount of control in the project and felt like he was trusted to solve problems. He really appreciated that trust, which is not always given to interns.
And once he graduates this school year, Greaves hopes to go on to graduate school so he can continue to hone his skills in machine learning research.