20 David Awosoga
I started the data solutions stream of the Waterloo Warriors IST to further my vision of making the University of Waterloo a leader in sports analytics, particularly in its application to varsity athletics. To create a sustainable pipeline of performance analysts and data scientists equipped to fill these roles across the athletics department, I developed a curriculum in the form of a directed reading program and actively mentor and guide students on their respective projects. I trained, coordinated, and supervised the other members of the data solutions stream, and while the work required to get this initiative off the ground has been tremendous, the positive feedback from coaches, teams, and members has provided assurance that this group will continue to blossom.
Projects
Administrative Continuity and Sustainability
In the spirit of “containerizing” data solutions - that is, converting data solutions into a self-contained program that can be maintained by any data scientist with minimal transition overhead - I focused on formalizing processes that help ensure accessibility, sustainability, and continuity.
Some low-hanging fruit included setting up a program email account so that communications no longer live in my personal inbox, creating an Instagram account for increased social media presence, and establishing a consistent bi-weekly meeting cadence for the group.
For member accountability, we introduced check-ins that were due before each meeting. These provided a record of student IST meeting attendance and project status updates, as well as an opportunity for students to give feedback on program operations and request support.
In the same vein, we introduced performance reviews that students completed at the end of every term for themselves and the other data solutions members within their team. Coaches also had the option to submit reviews, and aggregated feedback was summarized and disseminated back to each member.
With the high volume of decisions that directly impact the long-term future of the program, I enlisted several senior members - Fauzan Lodhi, Rithika Silva, Kyu Min Shim, Shamar Phillips, and Arun Ramji - to act as an advisory panel to whom I would submit proposed initiatives and ideas. I messaged these “Uncs” regularly to get second opinions, and they also helped interview prospective applicants during our recruitment cycle.
Finally, I introduced annual year-end interviews with each member at the end of April, to directly hear their feedback on the year that was and preview upcoming academic and co-op considerations for returning members.
To streamline recruitment, we introduced a two-part take-home assessment for applicants. This consisted of a data analysis component and an infrastructure development component (developed by the brilliant Rithika Silva). We used GitHub Classroom to manage submissions, which simplified application review considerably.
Applied Data Science Systems and Infrastructure
This year we also formalized a working version of our “tech stack” - the software systems, platforms, and tools used to develop a successful technology infrastructure. The Uncs and I spent a lot of time deliberating how to balance competing requirements: incorporating industry-standard practices within data solutions, managing costs, reducing the overhead of knowledge transfer, and centralizing core operations.
In doing so, we curated an infrastructure system that comfortably supports a wide range of applied data science tasks such as data acquisition, application deployment, and project management, while remaining secure, scalable, and extremely low-cost. While there are some practical limitations to our approach, the benefits of our methods far outweigh these drawbacks. By prioritizing pragmatic solutions over perfect ones, the maturation of our software architecture ensures that sports analytics and data solutions in varsity sport can continue to be accessible and cost-effective. Our software choices are detailed below:
| Infrastructure Component | Chosen Software |
|---|---|
| Programming Languages | R, Python |
| Developer Collaboration | GitHub Team |
| Workflow Automation and Computational Resources | GitHub Actions |
| BLOB Storage | GitHub Releases |
| Relational Database | Azure SQL |
| Stakeholder Communication | Posit Connect Cloud |
| Project Management | GitHub Issues & GitHub Projects |
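As a rough illustration of the "BLOB Storage" row, the sketch below shows one way a file attached to a GitHub release could be pulled into a Python workflow. It is not the group's actual code: the function names, the `example-org`/`example-data` repository, the tag, and the file name are all hypothetical. It also assumes a public repository; a private one would need an authenticated request through the GitHub API instead.

```python
from pathlib import Path
from urllib.parse import quote
from urllib.request import urlretrieve


def release_asset_url(owner: str, repo: str, tag: str, asset: str) -> str:
    """Build the public download URL for a file attached to a GitHub release."""
    return (
        f"https://github.com/{owner}/{repo}/releases/download/"
        f"{quote(tag, safe='')}/{quote(asset, safe='')}"
    )


def fetch_release_asset(
    owner: str, repo: str, tag: str, asset: str, dest_dir: str = "data"
) -> Path:
    """Download a release asset into dest_dir and return the local path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / asset
    # Assumes a public repository: no authentication is sent with the request.
    urlretrieve(release_asset_url(owner, repo, tag, asset), str(target))
    return target


if __name__ == "__main__":
    # Hypothetical repository, tag, and file name, for illustration only.
    print(release_asset_url("example-org", "example-data", "v2025.04", "games.csv"))
```

Pinning the tag in the call means an analysis always reads the same snapshot of the data, which is one reason versioned releases can stand in for a storage bucket at small scale.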
Reflections
Year 2 was awesome. We have a very strong graduating class and they will definitely be missed, but the maturation of the returning members and the injection of bright new students is quite exciting. It has been an absolute privilege to work alongside such talented, humble, and driven people.
My goals for next year mirror my contributions this year - refine our tech stack and optimize our administrative workflows.
On the administrative side, I will pilot volunteer agreements, improved project and organizational management, and better-defined student support. I plan to learn as much as I can about the operations of similar programs at NCAA institutions, such as the University of Connecticut Sports Statistics Experiential Learning Program, Centre College’s Sports Analytics Program, and the Charlotte Athletics Analytics Internship Program.
On the infrastructure side, I’ll focus on refining member technical skill acquisition and domain knowledge training, streamlining our data engineering pipelines, securing access to more computational resources, and perhaps adding web development frameworks to our current toolkit.