Cook County's Open Data Week is partnering with Big Data Week for a global spin on local data.
This week, on 28 April, data scientists from around the world will come together for the first-ever 24-hour global data science hackathon. The challenge: to improve the US Environmental Protection Agency’s Air Quality Index by developing more accurate predictive models of metropolitan air pollution.
The EPA’s Air Quality Index is used daily by people suffering from asthma and other respiratory diseases to avoid dangerous levels of outdoor air pollutants, which can trigger attacks. The aim of the hackathon is to help build local early warning systems that are capable of accurately predicting dangerous levels of air pollutants on an hourly basis.
Hosted by Data Science London and Data Science Global as part of a Big Data Week event, and organized by Kaggle, the first-ever global data science hackathon will take place simultaneously in several cities around the world over a 24-hour period. During this time, data scientists will compete for cash prizes, using a large dataset provided by the Cook County government in Illinois to create the best predictive model.
One of the goals of the event is to promote the exchange of ideas, disseminate knowledge and raise awareness of local data science communities around the world. The hope is also that encouraging group thinking, teamwork and free-spirited competition will promote an ethos of mutuality among data scientists across the globe.
Participants in the Data Science Global competition will be able to compete remotely or by attending any of the local venues. Host cities span five participating countries and include London, New York, Boston, Chicago, San Francisco, Melbourne, Canberra, Berlin and Turku.
Those attending the London venue will be able to take part in both the Data Science London competition and the global prize. Remote participants and those attending other venues will only be able to compete for the global prize.
With prizes worth GBP 3,000 (USD 4,750), the event is sponsored by EMC, a world leader in Data Science and Big Data Solutions. London contestants will also be provided with a copy of the Community Edition of EMC/Greenplum HD, Base and MADlib.
“EMC is delighted to sponsor the hackathon,” said Chris Roche, Regional Director for Greenplum. “Healthcare provision and – specifically – the treatment of chronic diseases is one of the major concerns of governments worldwide. Serious respiratory disease affects over 700 million people globally and chronic disease accounts for over 80% of all primary care consultations.”
“If the hackathon can in some small way contribute to positive healthcare outcomes then the event will prove more than worthwhile,” said Roche. “What I like about the hackathon and the data science community is the accelerated innovation that they create. These open learning environments complement well Greenplum’s open source, agile and social approach to data science. Data science is, after all, a team sport.”
The Data Science Global hackathon will be organized as a Kaggle event. Kaggle’s innovative solution for statistical/analytics outsourcing is the leading platform for predictive modelling competitions. Its unique competition platform will provide real-time leaderboards so that participants can continuously keep track of their scores.
“This is a great opportunity to bring together the best data science minds in the world and see what they can achieve in just 24 hours,” said Jeremy Howard, President and Chief Scientist of Kaggle. It will also be the first Kaggle competition to be held at venues, so it should be quite a social event: for the first time, participants will actually have a chance to meet each other.
Amazon AWS will provide free access to AWS cloud computing services to all the participants in the competition. Around 350 data scientists have already signed up for the event, with many more expected to join in the coming days.
According to the World Health Organization, an estimated 235 million people now suffer from asthma. Globally, it is now the most common chronic disease among children, with incidence in the US doubling since 1980. In Chicago alone, it is responsible for more than 70,000 emergency room visits each year.
“We are excited that Cook County data will be utilized during the Big Data Week Hackathon to create an air quality predictor model,” said Cook County Board President Toni Preckwinkle. “My administration has made a commitment to transparency and improved services, which includes pushing for open data across the board. I look forward to the innovation that will emerge from this event – it’s a testament to the enterprise and creativity that can occur when data is available in the public domain.”
Venue Details: Hub Westminster, a startup hub in London. New Zealand House, 80 Haymarket, London SW1Y 4TE
Data Science London is a non-profit organization dedicated to the free and open dissemination of data science. With more than 476 members, it is the second largest Data Science community in the world. Created by Data Scientists for Data Scientists, it acts as a forum for discussions and the exchange of ideas.
Data Science Global is a non-profit organization dedicated to bringing together the world’s communities of data scientists, artists, technologists and visionaries.
The goal of the Data Science hackathon is to promote the exchange of ideas and to disseminate data science knowledge, as well as to raise awareness of local data science communities across the globe.
Big Data Week is a series of events aimed at bringing together Big Data communities to gain a better understanding of the diverse aspects of Big Data, from the technology challenges to the commercial opportunities.
Kaggle is the global leader in running predictive modeling competitions.
The company has run approximately 100 competitions with major enterprise, government, and academic customers including Allstate Insurance, Boehringer Ingelheim, Dunnhumby, Ford, Heritage Health, Microsoft, NASA, Stanford and Wikipedia. Over 33,000 data scientists worldwide have contributed to competitions that tackled the toughest predictive problems in the marketing, life sciences, insurance, financial services, travel, and science verticals.