I wrote five long articles this season. The first three were something like a series on release angles. In each, I argued that markerless motion capture data would define the next information asymmetry between major league teams. I’m declaring victory on that one: Kyle Boddy, the founder of Driveline and a “special advisor” to the Boston Red Sox, wrote earlier this week that “early org adopters of Hawkeye, Kinatrax, and other biomech provider data are way ahead of the others.”
The next two were more personal. I wrote an essay about going to a sabermetrics conference in Chicago two days before Filipa started chemo. Then I published something this week on the potential for biomechanical data to revolutionize the sport’s understanding of pitcher injuries. This was also personal, albeit in a different way. I wanted to write this story because of my belief that putting data and software and research in the hands of the public is worth advocating for, in all parts of life.
It took advocacy, for example, to get California to be transparent about its transportation spending. Assembly Bill 2086, which just last week was signed into law, requires the state to make clear which specific projects are funded in the California Transportation Plan. This sounds dry but is super important: if researchers aren’t exactly sure how much money is being spent on highways instead of on buses and bikes and trains, it’s impossible to advocate for a different set of choices. Wanting the public to be able to find valuable information is an active choice.
In geospatial analysis, the lack of an open access ethos hurts both the maps and the mappers. Those who make maps for a living must subject themselves to working with Esri. Esri owns and operates ArcGIS. Because so many cities and counties contract with Esri, the company has a virtual monopoly on geospatial analysis. If a planning department needs to produce a map for a council report or whatever, it’s going to be made in ArcGIS Pro. And so Esri charges cities hundreds or thousands of dollars to use its software, and the expectation is that everyone learns how to use it, even though it’s a clunky and subpar product. Who benefits from this arrangement? Esri executives, definitely. Who loses? People who have to make maps, I would argue.
Of broader concern is scientific research. Most academic research is published in journals which have been described, frequently, as an oligopoly, with five publishers controlling over half the market. Academic articles are only accessible if the reader wants to shell out dozens or sometimes hundreds of dollars. The researcher pockets zero dollars while the journals rake in massive profits. In exchange for making it more difficult to solve the serious problems of the world, some journal executives get to line their pockets.
Baseball is not so serious, and the conversation around truly open biomechanical data is really complicated. As I get into a bit in that article, there are serious privacy concerns to work out, and players need to be central in those conversations. But there are also other, more selfish reasons teams might not want to share their findings. As Boddy mentioned in that same Twitter post, those early org adopters of biomech data “may not want to contribute their proprietary research that cost them millions of dollars in labor + technology investments.” Privacy isn’t the full explanation for the closed data status quo in baseball.
This produces understandable skepticism about the sport’s ability to fully embrace the principle of “open access.” I understand that Sean M. O’Rourke spoke to this in depth at a 2023 Saberseminar presentation, where he posed the question, “Can Baseball Data Ever Be Democratic?” And what I’m trying to say is, even if I share some of O’Rourke’s skepticism, I believe that the answer is yes — if we fight for it.