What is Kaggle?
Kaggle is the most popular platform for hosting data science and machine learning competitions. A whole community of Kagglers has grown around the platform, ranging from those just starting out all the way to Geoffrey Hinton.
In 2017, Kaggle was acquired by Google and integrated with Google Cloud Platform (GCP). Now the competition data can be hosted in the cloud, and the computation can run there as well. Kagglers can run their competition code on GCP by creating so-called Kaggle kernels (interactive Python or R notebooks), which make it possible to share code and submit competition entries.
Kaggle has already hosted other competitions organized by financial-sector companies: forecasting stock movements based on news, predicting the value of a customer's transactions, or predicting fluctuations in real estate value. Financial data is most often tabular in nature. Recently, image classification and segmentation competitions have outnumbered those based on tabular and time-series data: pneumonia detection on X-rays, and work with satellite imagery, seismic images, or just ordinary photographs.
The competition organized by Home Credit ended up being incredibly popular: by the number of competition entries, it was the biggest competition ever on Kaggle. By the number of participants, it’s second only to the playground challenge on predicting survival on the Titanic, which is a very romantic but completely moot problem.
In fact, Home Credit’s competition might currently hold the status of the biggest data science competition ever, given that Kaggle’s counterparts are not nearly as popular. CrowdAI focuses more on academia (e.g., NIPS benchmarks) and governmental institutions, while startcrowd.club and drivendata.co focus on companies, and all of them combined do not attract nearly as many participants as Kaggle does.