Introduction
I have built prediction models in Python to forecast horse racing outcomes from public Jockey Club data. The models include neural networks, logistic regression, and random forests. However, their prediction accuracy is often below 50%, and I wondered whether that is the best result we can expect from a forecasting perspective.
To address this, I set up a website at https://nyp.pythonanywhere.com/racecard/?1 and invited horse racing fans to submit their choices. For each race, they can submit 3 selected horses.
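For context on the sub-50% accuracy mentioned above, it helps to know what pure guessing would score. The sketch below is only an illustration under stated assumptions: it assumes the top 3 finishers count as a Place and that each pick is drawn uniformly at random, and the field sizes are hypothetical examples rather than figures from my data. The function name `random_place_hit_rate` is made up for this example.

```python
from fractions import Fraction


def random_place_hit_rate(field_size: int, place_slots: int = 3) -> Fraction:
    """Chance that one horse picked uniformly at random finishes in a place slot.

    Assumption: `place_slots` finishers count as a Place (3 by default).
    """
    if field_size <= 0 or place_slots < 0:
        raise ValueError("field_size must be positive and place_slots non-negative")
    # With fewer runners than place slots, every runner places.
    return Fraction(min(place_slots, field_size), field_size)


if __name__ == "__main__":
    # Hypothetical field sizes, for illustration only.
    for runners in (8, 10, 12, 14):
        rate = random_place_hit_rate(runners)
        print(f"{runners} runners: random per-pick hit rate = {float(rate):.1%}")
```

Under these assumptions, a random pick in a 12-runner field places 25% of the time, so a model scoring in the 40s is already well above guessing, even if it is below 50%.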
The competition
In the latest race meeting at the Hong Kong Jockey Club, I tracked the performance of 5 human participants. I was amazed to find that some of them did exceptionally well. For instance, a website member with the username “ML” consistently picked 2 out of 3 horses correctly for the Place selection.
The final results showed that humans won the competition with 52% accuracy, while Neural Network Regression came in 2nd with an accuracy rate of 48%.
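The accuracy figures above can be read as a hit rate over Place picks. The sketch below shows one plausible way to compute such a rate; it is not necessarily the exact scoring used on the website. It assumes each participant submits 3 horses per race and that a pick counts as a hit when the horse finishes in the top 3. The function names and the sample races are hypothetical.

```python
from typing import Iterable, Sequence, Tuple

# One race: (horse numbers picked, horse numbers that actually placed).
Race = Tuple[Sequence[int], Sequence[int]]


def place_hits(picks: Sequence[int], placed: Sequence[int]) -> int:
    """Count how many picked horses finished in a place position."""
    return len(set(picks) & set(placed))


def place_accuracy(races: Iterable[Race]) -> float:
    """Fraction of all picks, across races, that finished in a place position."""
    total_picks = 0
    total_hits = 0
    for picks, placed in races:
        total_picks += len(set(picks))  # ignore duplicate picks within a race
        total_hits += place_hits(picks, placed)
    return total_hits / total_picks if total_picks else 0.0


if __name__ == "__main__":
    # Hypothetical picks and results, for illustration only.
    sample_races = [
        ([1, 4, 7], [4, 7, 9]),    # 2 hits out of 3
        ([2, 3, 5], [5, 8, 11]),   # 1 hit out of 3
        ([6, 1, 10], [1, 6, 12]),  # 2 hits out of 3
    ]
    print(f"Place accuracy: {place_accuracy(sample_races):.1%}")
```

Scoring humans and models with the same function keeps the comparison fair, since both are judged on the same number of picks per race.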
Conclusion
This result suggests that the machine learning data lacks some factors that give human judgment an advantage. This is the area I am going to explore further. I hope that my models will eventually outperform humans.
The monkey believes that facts and science are the best ways to make decisions and predictions. That's why it created this website, to make different types of machine learning predictions and verify their accuracy. Although many people say that horse racing is unpredictable and random, I want to see what level of accuracy can be achieved by applying machine learning to this data.