Automated Sports Journalism? No thanks

Thu, Nov 25, 2010

News

Welcome to the future of the Internet, my friends, where websites sprout up overnight and, believe it or not, may be written by complex computer algorithms rather than actual people. Sound crazy? Sure, but a North Carolina startup called StatSheet recently launched an extensive network of sports sites catering to each of the NCAA’s 345 Division I basketball teams. Every post on a team’s site is completely auto-generated, according to founder Robbie Allen: “The only human involvement is with creating the algorithms that generate the posts.”

TechCrunch reports:

StatSheet started out as basically a stats database for sports junkies. It stores 500 million different stats across most of the major sports. Now, it is taking all of those stats and creating news stories out of them. It has about 20 different types of articles that it generates, from season previews to game recaps. StatSheet might analyze 10,000 data points and 4,000 possible phrases to generate a single story.

So what does the end result look like?

Well, as a sampling, here are the official pages for the Duke Blue Devils, the Florida Gators, and the Michigan Wolverines. Below is an excerpt from the Florida Gators’ auto-generated 2010-2011 season preview.

Florida will play their first game of the 2010-2011 season in Gainesville on November 12 against North Carolina-Wilmington. Expectations are high for the Gators to best last season’s performance. They bring back players who contributed 79% of their total minutes last season and add the contributions of 3 Top 100 recruits, including #19 Patric Young, and 2 other freshmen. The Florida defense will suffer from a 29.6% drop in steals, the largest gap that the team will need to fill. Following that is assists, where they lost 17.5% of last year’s output…

Florida lost last year’s senior Dan Werner (27 minutes per game). The Gators will also have to get by without Ray Shipman (12 MPG), Kenny Kadji (transferred to Miami (FL)), Nimrod Tishman (2 MPG), and Hudson Fricke (1 MPG).

Is it readable? Sure. Perhaps even passable. But there’s something cold and uninviting about reading a computer-generated story. I mean, the algorithm works just fine when it comes to regurgitating statistics and presenting bland factual information, but if you’re looking for actual analysis, the kind that draws on things only a human watching a live game could gauge, you might want to stick with, oh I don’t know, a real sports site.
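For the curious, here is a rough idea of how a stats-to-sentences generator like this might work. To be clear, this is not StatSheet's code, which isn't public as far as I know; it's a minimal Python sketch under my own assumptions. The phrase bank, the `preview` function, and the 60% cutoff for "high expectations" are all made up for illustration. The only real numbers are the ones from the Florida excerpt above (79% of minutes returning, a 29.6% drop in steals, a 17.5% drop in assists).

```python
import random

# Hypothetical phrase bank: a few interchangeable wordings per situation.
# A real system would reportedly have thousands of these.
PHRASES = {
    "returning_high": [
        "Expectations are high for {team}, who bring back players "
        "responsible for {pct:.0f}% of last season's minutes.",
        "{team} return {pct:.0f}% of last season's minutes, so the bar is set high.",
    ],
    "returning_low": [
        "{team} return only {pct:.0f}% of last season's minutes and have plenty to rebuild.",
    ],
    "biggest_loss": [
        "The largest gap for {team} to fill is a {drop:.1f}% drop in {stat}.",
        "{team} will feel a {drop:.1f}% drop in {stat} more than any other loss.",
    ],
}


def preview(team, returning_pct, losses, rng=None):
    """Build a two-sentence season preview from a handful of numbers.

    returning_pct: percent of last season's minutes that return.
    losses: mapping of stat name -> percent of last season's output lost.
    """
    if rng is None:
        rng = random.Random()

    # Assumed rule: 60% or more of minutes returning counts as "high expectations".
    key = "returning_high" if returning_pct >= 60 else "returning_low"
    sentences = [rng.choice(PHRASES[key]).format(team=team, pct=returning_pct)]

    # Pick the stat with the largest percentage drop and report it.
    stat, drop = max(losses.items(), key=lambda item: item[1])
    sentences.append(
        rng.choice(PHRASES["biggest_loss"]).format(team=team, stat=stat, drop=drop)
    )
    return " ".join(sentences)


if __name__ == "__main__":
    print(preview("Florida", 79, {"steals": 29.6, "assists": 17.5}))
```

Run it a few times and the wording changes while the facts stay put, which is more or less the whole trick: a rule picks the situation, a random choice picks the phrasing, and the numbers get dropped into the blanks. It also shows the limitation pretty plainly, since nothing in there knows anything a box score doesn't.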


1 Comment For This Post

  1. Rich Says:

    You know what? I think Mitch Albom is going to be an early adopter of this.

    ( For those who miss the joke: http://www.azcentral.com/arizonarepublic/viewpoints/articles/0508maceachern0508.html )
