20  David Awosoga

Author

Graduate Student
PhD Statistics
Program Coordinator

I started the data solutions stream of the Waterloo Warriors IST to further my vision of making the University of Waterloo a leader in sports analytics, particularly in its application to varsity athletics. To create a sustainable pipeline of performance analysts and data scientists equipped to fill these roles across the athletics department, I developed a curriculum in the form of a directed reading program and actively mentor and guide students on their respective projects. I trained, coordinated, and supervised the other members of the data solutions stream. While the work required to get this initiative off the ground has been tremendous, the positive feedback from coaches, teams, and members has provided assurance that this group will continue to blossom.

Projects

Administrative Continuity and Sustainability

In the spirit of “containerizing” data solutions - that is, converting data solutions into a self-contained program that can be maintained by any data scientist with minimal transition overhead - I focused on formalizing processes that help ensure accessibility, sustainability, and continuity.

Some low-hanging fruit included setting up a program email account so that communications no longer live in my personal inbox, creating an Instagram account for increased social media presence, and establishing a consistent bi-weekly meeting cadence for the group.

For member accountability, we introduced check-ins that were due before each meeting. These provided a record of student IST meeting attendance and project status updates, as well as an opportunity for students to give feedback on program operations and request support.

In the same vein, we introduced performance reviews that students completed at the end of every term for themselves and for the other data solutions members within their team. Coaches also had the option to submit reviews, and aggregated feedback was summarized and disseminated back to each member.

Given the high volume of decisions that directly impact the long-term future of the program, I enlisted several senior members - Fauzan Lodhi, Rithika Silva, Kyu Min Shim, Shamar Phillips, and Arun Ramji - to act as an advisory panel to whom I would submit proposed initiatives and ideas. I messaged these “Uncs” regularly to get second opinions, and they also helped interview prospective applicants during our recruitment cycle.

Finally, I introduced annual year-end interviews with each member at the end of April, to directly hear their feedback on the year that was and preview upcoming academic and co-op considerations for returning members.

To streamline recruitment, we introduced a two-part take-home assessment for applicants. This consisted of a data analysis component and an infrastructure development component (developed by the brilliant Rithika Silva). We used GitHub Classroom to manage submissions, which simplified application review considerably.

Applied Data Science Systems and Infrastructure

This year we also formalized a working version of our “tech stack” - the software systems, platforms, and tools used to develop a successful technology infrastructure. The Uncs and I spent a lot of time deliberating how to balance competing requirements: incorporating industry-standard practices within data solutions, managing costs, reducing the overhead of knowledge transfer, and centralizing core operations.

In doing so, we curated an infrastructure system that comfortably supports a wide range of applied data science tasks, such as data acquisition, application deployment, and project management, while remaining secure, scalable, and extremely low-cost. Our approach has some practical limitations, but its benefits far outweigh these drawbacks. By prioritizing pragmatic solutions over perfect ones, we have matured our software architecture so that sports analytics and data solutions in varsity sport can remain accessible and cost-effective. Our software choices are detailed below:

Our chosen technology infrastructure for 2025-2026.
Infrastructure Component                          Chosen Software
Programming Languages                             R, Python
Developer Collaboration                           GitHub Team
Workflow Automation and Computational Resources   GitHub Actions
BLOB Storage                                      GitHub Releases
Relational Database                               Azure SQL
Stakeholder Communication                         Posit Connect Cloud
Project Management                                GitHub Issues & GitHub Projects
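
Using GitHub Releases for BLOB storage may be the least familiar choice in this table, so here is a minimal Python sketch of how a data file attached to a release could be retrieved. It is an illustration under stated assumptions, not a description of our actual pipeline: the owner, repository, tag, and file names are hypothetical, and the plain download URL shown here only works for public repositories (private ones would need authenticated requests through the GitHub API).

```python
from pathlib import Path
from urllib.parse import quote
from urllib.request import urlopen


def release_asset_url(owner: str, repo: str, tag: str, asset: str) -> str:
    """Build the public download URL for a file attached to a GitHub release."""
    # Percent-encode each path segment so unusual characters stay valid in the URL.
    parts = [quote(part, safe="") for part in (owner, repo, tag, asset)]
    return "https://github.com/{}/{}/releases/download/{}/{}".format(*parts)


def download_asset(url: str, dest: Path) -> Path:
    """Stream a release asset to disk in chunks (public repositories only)."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urlopen(url) as response, dest.open("wb") as out:
        while chunk := response.read(1 << 16):
            out.write(chunk)
    return dest


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    url = release_asset_url("example-org", "example-data", "v2025.09", "games.parquet")
    print(url)
    # download_asset(url, Path("data/games.parquet"))  # uncomment to fetch a real asset
```

Tagging each release gives every dataset snapshot a stable, versioned address, which is part of what makes this a workable low-cost substitute for a dedicated object store.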

Reflections

  • Year 2 was awesome. We have a very strong graduating class who will definitely be missed, but the maturation of the returning members and the injection of bright new students are quite exciting. It has been an absolute privilege to work alongside such talented, humble, and driven people.

  • My goals for next year mirror my contributions this year - refine our tech stack and optimize our administrative workflows.