14 Pierre Aucoin
I joined this group this year to get a better understanding of what data solutions teams in professional sports are like. Becoming a data scientist for a sports team is something that I have always wanted to do, so this was an excellent experience for me. This season, it was great working with like-minded individuals who shared this passion.
Projects
usportsR and usportspy
My main project this year was a pair of packages, one in R and one in Python, that contain data from U SPORTS box scores for a variety of sports. I worked alongside Rithika Silva, David Awosoga, and Shamar Phillips to build them. My main focus was data scraping and validation: I had to ensure that every game was scraped and that no duplicates were included. The packages contain data for hockey, soccer, basketball, and volleyball (men’s and women’s for each) and football, from the 2009-10 season to the end of this past season (2025-26). Over the summer, I will refine the process so that the packages can update on a regular basis during the 2026-27 season. The R package is available through CRAN link and the Python package through PyPI link. Both packages are also available on GitHub: link for R, link for py.
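The validation step described above (every game scraped, no duplicates) can be illustrated with a minimal sketch. This is not the code used in usportsR or usportspy. The record layout, the `game_id` field, the sample IDs, and the `expected_ids` schedule list are all hypothetical and only show the shape of such a check.

```python
from collections import Counter


def validate_games(scraped, expected_ids):
    """Return (duplicates, missing) for a list of scraped game records.

    `scraped` is a list of dicts with a hypothetical "game_id" key;
    `expected_ids` is the list of game IDs the schedule says should exist.
    """
    # Count how many times each game ID appears in the scraped data.
    counts = Counter(game["game_id"] for game in scraped)

    # Any ID seen more than once was scraped twice.
    duplicates = sorted(gid for gid, n in counts.items() if n > 1)

    # Any scheduled ID that never appears was not scraped at all.
    missing = sorted(set(expected_ids) - set(counts))

    return duplicates, missing


# Hypothetical sample data: one game scraped twice, one never scraped.
scraped = [
    {"game_id": "mbb-001", "sport": "basketball", "season": "2025-26"},
    {"game_id": "mbb-002", "sport": "basketball", "season": "2025-26"},
    {"game_id": "mbb-002", "sport": "basketball", "season": "2025-26"},
]
expected_ids = ["mbb-001", "mbb-002", "mbb-003"]

duplicates, missing = validate_games(scraped, expected_ids)
print("Duplicated games:", duplicates)
print("Missing games:", missing)
```

Comparing against an independent list of expected games is what lets the check catch games that were never scraped. Counting IDs in the scraped data alone would only reveal duplicates.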
Reflections
Overall, I am very proud of helping finish both usportsR and usportspy. I am hopeful that, with enough outreach, data scientists not just from Waterloo but from other U SPORTS-affiliated schools across the country will use these packages to help improve their sports teams.
Next season, I hope to use usportsR and usportspy to identify where the Waterloo men’s basketball team needs to improve if it hopes to make the playoffs in 2026-27, while maintaining the packages’ data quality.