March 23, 2015
COMPETITION CHALLENGED PARTICIPANTS TO USE DATA SCIENCE TO EFFECT CHANGE ON A GLOBAL SCALE
McLean, VA — After 90 days of intense competition, in which the data science community applied its skills to effect change on a global scale, a team of deep learning specialists has won the inaugural National Data Science Bowl. The competition, co-sponsored by Booz Allen Hamilton and Kaggle and created with data from oceanographers at Oregon State University Hatfield Marine Science Center, challenged participants to create an algorithm that automates an ocean health assessment process, which would have taken marine researchers more than two lifetimes to manually complete.
The Hatfield Marine Science Center, the data provider and beneficiary of the event, has received an ‘in kind’ donation of analytics research, estimated at more than four million dollars, through the participants’ submissions. Team Deep Sea from Ghent University developed the most effective algorithm to automatically classify more than 100,000 underwater images of plankton, marking a major step forward for the marine research community, as plankton populations are key indicators of ocean health. The work by the seven-person team of graduate students and post-doctoral researchers automated the classification process for the first time in history. Together, they beat more than 1,000 other teams and achieved better results than 15,000+ competing submissions in quickly and accurately distinguishing 121 distinct categories of oceanic organisms.
Through their work, they have enabled the rapid analysis of data sets with a digital size equivalent to 400,000 three-minute YouTube videos, enough to watch continuously for over 800 days. Team Deep Sea’s work therefore represents a major advance for both the marine research and data science communities.
“The excitement and participation that the National Data Science Bowl generated during its inaugural competition was inspiring,” said Booz Allen Hamilton’s Josh Sullivan, a senior vice president in the firm’s Strategic Innovation Group. “This was an extremely difficult problem to solve; and through hard work, perseverance and ingenuity, the participants had a massive impact on the marine research community. The National Data Science Bowl was born from the realization that, in order to thrive, the data science community must be given opportunities to use its talents to benefit both business and society. It’s also a testament to the growing importance of data science across all disciplines that, most recently, received mainstream attention from the White House with President Obama’s appointment of the country’s first Chief Data Scientist.”
“The quality of submissions and types of ideas being discussed by our community were truly amazing,” said Anthony Goldbloom, Kaggle’s founder and CEO. “The winning team used a cutting edge deep learning approach to create their winning model. Currently, even basic machine learning techniques are not widely used in the marine sciences and this competition has done a tremendous amount towards further exposing researchers in the field to its benefits.”
“We were originally drawn to this competition because of the vital social cause it supported,” said Team Deep Sea member Pieter Buteneers. “What we found was a truly life-changing opportunity to collaborate as a team and build something great together, and we are proud to have competed against such a high-caliber field of data scientists.”
For researchers at the Hatfield Marine Science Center, the algorithms will serve as critical tools for assessing ocean health. A large and thriving plankton population is crucial for driving some of the planet’s most life-sustaining processes. Plankton cycle nearly half of the carbon absorbed by the Earth’s ocean each year and form the foundation of marine food webs. They are also extremely susceptible to changes in ocean temperature and chemistry, making them a key indicator of broader ecological health. The algorithms created through the National Data Science Bowl will allow rapid assessment of plankton populations, enabling the marine research community to monitor ocean health at unprecedented speed and scale. These types of real-time insights have not been possible through manual identification and analysis, and they represent an important step forward in understanding as well as protecting the environment.
“We’re excited to receive the winning algorithms from the National Data Science Bowl and to test and validate these proofs of concepts in our own labs,” said Hatfield Marine Science Center Director Bob Cowen. “Our hope is that we will be able to expand upon this research and, eventually, make it an open source tool for the marine research community.”
For more information about the winning teams and to apply to participate in a future National Data Science Bowl, please visit www.datasciencebowl.com.