On October 23rd, Sparks Consulting published the first in a series of articles examining the link between various factors and college rowing team performance. Specifically, the Sparks research team examined the statistical impact of international athletes on team performance at the IRA National Championship and the NCAA Division I National Championship. We found a small but statistically significant effect when studying performance at IRAs, and no statistically significant effect on performance at NCAAs. This result indicates that the importance of international recruiting may be overstated. More importantly, this result lends itself to further investigative study of factors that could potentially influence performance at championship regattas.
To this end, the Sparks research team next decided to study the role of coaching experience on team performance. We used two metrics for coaching experience: total experience in current position (e.g. Coach Gladstone’s seven-year tenure at Yale) and total college head coaching experience (e.g. all 42 years that Steve Gladstone has been a head coach at the collegiate level). Intuitively, there should be a relationship between both metrics and team performance. For total experience in current position, it logically follows that superior job performance should be accompanied by increased job security, and vice versa. For total head coaching experience, it follows that more relevant experience should be accompanied by increased knowledge and learned skills. But is this the case?
A First Look at the Numbers
Initial examination of the IRA numbers does not show a clear relationship. The top six finishing teams at the 2017 IRA National Championship featured coaches with a combined 45 years in their current position. In contrast, the next six teams featured coaches with a combined 81 years in their current position. In terms of total collegiate head coaching experience, the top six teams featured coaches with 111 combined years. This total is skewed heavily by Yale coach Steve Gladstone’s 42 years and Harvard coach Charley Butt’s 32 years of college head coaching experience. The next six teams featured coaches with a combined 98 years of total experience. The only outlier here is Dartmouth coach Wyatt Allen, who at the time was three seasons into his first collegiate head coaching job.
The NCAA coaching data is even more striking. The top six teams at the 2017 NCAA Championship Regatta featured coaches with a combined 73 years of experience in their current position, bookended by champion Washington’s first-year coach Yaz Farooq and sixth-place finisher Stanford’s first-year coach Derek Byrnes. The next six coaches boasted 121 years of combined experience. In terms of total collegiate head coaching experience, the top six coaches had a combined 110 years of experience, and the next six coaches had a combined 160.
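To put the combined totals above on a per-coach footing, the short Python sketch below divides each quoted total by six. It assumes one head coach per team (so six coaches per group); the dictionary layout and the function name are illustrative only and are not part of the original analysis.

```python
# Combined years of experience quoted in the text for the 2017 regattas.
combined_years = {
    ("IRA", "current position"): {"top six": 45, "next six": 81},
    ("IRA", "total head coaching"): {"top six": 111, "next six": 98},
    ("NCAA", "current position"): {"top six": 73, "next six": 121},
    ("NCAA", "total head coaching"): {"top six": 110, "next six": 160},
}

# Assumption: each team has exactly one head coach, so six coaches per group.
COACHES_PER_GROUP = 6


def per_coach_average(total_years, coaches=COACHES_PER_GROUP):
    """Return the average years of experience per coach in a group."""
    return total_years / coaches


for (regatta, metric), groups in combined_years.items():
    top = per_coach_average(groups["top six"])
    nxt = per_coach_average(groups["next six"])
    print(f"{regatta:4} {metric:20} top six: {top:5.1f}  next six: {nxt:5.1f}")
```

Averages make the groups easier to compare, but they remain sensitive to a single long-tenured coach, which is one reason a formal model is needed.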
Into the Weeds
Driven by the interesting initial examination of the data, the Sparks research team decided to run the same statistical procedure as in our previous article. To ensure seamless comparison with our previous study, we used the same scope of data (the 2015, 2016, and 2017 NCAA and IRA Championship Regattas for open women and heavyweight men; see footnote 1). The data were modelled using mixed effects multiple regressions.
As a refresher, multiple regression models measure the relationship between a certain dependent variable (in this case total points) and a set of independent variables that we believe may influence the dependent variable. In this case, we believe that coaching experience, as well as points earned at the previous year’s regatta, number of boats entered in the regatta, and roster size, may influence points earned. The prior year points variable was included to account for the possibility that a team’s success in the previous year could lead to success in the current year. The number of boats entered by each team at the regatta was used to account for any correlation between number of entries and number of points earned. This variable was dropped in the NCAA model, as all teams at the NCAA regatta entered three boats. Roster size was used to account for team depth. A team with 48 athletes competing for 24 spots could potentially be more successful than a team with 28 athletes competing for 24 spots.
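For readers who want to see the mechanics, here is a minimal Python sketch of a multiple regression of points on coaching experience, prior year points, and roster size. It is not the Sparks model: it uses plain ordinary least squares rather than mixed effects, it omits the number-of-boats variable, and the data are synthetic, generated with made-up coefficients (experience is deliberately given no effect). All function and variable names are hypothetical.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(predictors, outcome):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    X = [[1.0] + list(row) for row in predictors]  # leading 1.0 is the intercept
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(X, outcome)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic data only: 60 hypothetical team-seasons with invented effects.
rng = random.Random(0)
rows, points = [], []
for _ in range(60):
    experience = rng.randint(1, 40)     # years as head coach
    prior_points = rng.randint(0, 200)  # points at the previous year's regatta
    roster = rng.randint(28, 60)        # roster size
    noise = rng.gauss(0, 5)
    rows.append((experience, prior_points, roster))
    points.append(10 + 0.0 * experience + 0.8 * prior_points + 0.5 * roster + noise)

coefficients = fit_ols(rows, points)
names = ["intercept", "experience", "prior_points", "roster_size"]
for name, value in zip(names, coefficients):
    print(f"{name:>13}: {value:7.3f}")
```

Each fitted coefficient estimates how much points change when that variable rises by one unit while the others are held fixed, which is the sense in which the regression separates the effect of experience from the effect of, say, roster size.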
Mixed effects models generate “handicaps” for each subject to control for inherent differences between the subjects. In this case, the model theoretically accounts for differences between teams with respect to team culture, coaching, and even weather. It is highly important to note that mixed effects do not account for variables that change significantly during this timeframe. Indeed, this series of articles will attempt to study these elusive variables. For more information about mixed effects models, check out this informative tutorial.
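One common way to write such a model is with a random intercept for each team, which plays the role of the “handicap” described above. The exact specification used in this study is not spelled out here, so the form below is an assumed sketch, with i indexing teams and t indexing years; the symbol names are illustrative.

```latex
\text{points}_{it} = \beta_0
  + \beta_1\,\text{experience}_{it}
  + \beta_2\,\text{prior points}_{it}
  + \beta_3\,\text{boats}_{it}
  + \beta_4\,\text{roster}_{it}
  + u_i + \varepsilon_{it},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
\quad \varepsilon_{it} \sim \mathcal{N}(0, \sigma^2)
```

Here u_i is the team-specific handicap, constant across the three years, and the error term captures year-to-year variation that the other variables do not explain. Under this assumed form, the boats term would simply be dropped for the NCAA model.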
After building and running a set of statistical models, the Sparks research team ran a series of tests to determine the statistical significance of the results. Most statistical procedures, from electoral polls to TV ratings, use a sample that represents a small percentage of the total population. As a result, statisticians must determine whether or not the results found with the small sample should be inferred to apply to the population as a whole. Statistical significance, simply put, reflects how likely it is that a result at least as strong as ours would appear through random chance or sampling error alone if there were no real underlying relationship. If a result is deemed “statistically significant,” it means that the result is unlikely to be the product of random variation alone, and as such it can more reliably be used to make inferences about the broader population.
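The idea behind a significance test can be illustrated with a permutation test, sketched below in Python. This is a stand-in technique chosen for clarity, not the test the Sparks team ran on its regression models, and the data are synthetic. The sketch shuffles the outcome many times to see how often chance alone produces a correlation at least as strong as the observed one.

```python
import random


def correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Share of shuffles whose |correlation| is at least the observed one."""
    rng = random.Random(seed)
    observed = abs(correlation(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real link between xs and ys
        if abs(correlation(xs, shuffled)) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)


# Synthetic example: experience and points are generated independently,
# so any correlation between them is due to chance.
rng = random.Random(1)
experience = [rng.randint(1, 40) for _ in range(24)]
points = [rng.randint(0, 200) for _ in range(24)]

p_value = permutation_p_value(experience, points)
print(f"observed correlation: {correlation(experience, points):+.3f}")
print(f"permutation p-value:  {p_value:.3f}")
```

A small p-value (conventionally below 0.05) means that chance alone rarely produces a relationship this strong; a large one means the observed relationship is well within what random variation would generate.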
Procedures and Results
First, the Sparks team tested the effect of coaching experience in current position. Recall that this means total seasons employed as head coach of that specific team. It is important to note that if a team changed from club to varsity during the coach’s tenure, only varsity years are counted in this metric. The club years are picked up by the total head coaching experience metric (see below).
No statistically significant results were found for either IRA coaches or NCAA coaches. In practical terms, this means that the length of a head coach’s tenure in their current position cannot reliably be used to predict championship regatta results.
Second, the team tested the effect of total college head coaching experience. This means total seasons employed as the head coach of a collegiate varsity or club team. This includes heavyweight and lightweight men’s and women’s teams.
Again, no statistically significant results were found for either IRA coaches or NCAA coaches. This means that even a head coach’s total years of head coaching experience cannot reliably be used to predict championship regatta results.
Conclusions
So what did our tests show?
The only clear conclusion that can be drawn here is that we found no statistically significant relationship between coaching experience and championship regatta performance in this data, and as such coaching experience cannot reliably be used to predict performance.
When comparing the results of this study with the results of our international athletes study, the data suggest that there is a stronger relationship between international recruiting and performance, at least at IRAs, than there is between coaching experience and performance.
It is important to note that this study does not necessarily dispute the link between coaching and team performance. Indeed, many aspects of effective coaching cannot be reliably quantified, let alone studied in a statistical setting. Our model did not attempt to quantify leadership skills, technical skills, teaching ability, or physiological knowledge among coaches. It is highly possible that any or all of these factors influence team performance. However, if coaching intangibles influence team performance, then this model indicates that coaching intangibles are not related to coaching experience.
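To further elaborate, say we define coaching intangibles as A, coaching experience as B, and team performance as C. If A is correlated with C, and B is not correlated with C, then A is probably not strongly correlated with B. This is a rule of thumb rather than a strict logical guarantee, since correlations do not always carry over in this way.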
Essentially, this means that a coach’s status as the longtime head of a varsity program does not necessarily mean that they have the “right stuff” as it relates to building a successful team. This means that there must be other factors at play here.
Going forward, the Sparks research team will continue to examine the role of various organizational factors in team performance, as we study whether there is a relationship between retention and performance.
Footnotes:
1. We chose the IRA and NCAA Championship Regattas because both regattas had a limited number of entries, varsity-only participation, and a full slate of lower level finals. Varsity-only regattas allowed us to compare apples to apples, as many club teams do not boast the recruiting resources and funding that many varsity teams do. Lower level finals allow for a full ranking of all boats in each boat class. We only use heavyweight teams because lightweight participation at the IRA National Championship did not constitute a large enough sample size. For women, we only use Division I NCAA teams because the number of internationally recruited Division II and III athletes was not large enough. We chose 2015-2017 as an initial starting point because we wanted to gauge the effect in recent years.