Open Content: Data Available to the Public

Several valiant data-gathering projects have blossomed on the Web, in which groups of volunteers work together to enter public-domain historical data for the benefit of sports data junkies everywhere. This page aims to track these efforts, congratulate their leaders, and encourage volunteers to get involved.

Know of a project that should be listed on this page? Contact us and let us know!

One goal of our companion Open Software section is to gather tools that can process the datasets below and normalize them into the open SportsML and SportsDB formats.

  • United States
    • MLB: Major League Baseball
      • Retrosheet (uses RoSIN format; see tools in the Open Software section)
        • Download a sample SportsDB database of stats from all games in the 2000 through 2007 seasons:
          • MySQL
          • PostgreSQL
          • SQL Server
            • Unzip the download and copy the files into the SQL Server data directory (such as C:\Program Files\Microsoft SQL Server\MSSQL\Data)
            • Open SQL Server Enterprise Manager
            • Open the "Databases" folder (Console Root\Microsoft SQL Servers\SQL Server Group\(local)(Windows NT)\Databases)
            • Right-click the folder and choose "All Tasks->Attach Database..."
            • Navigate to your \Data directory and select the *.MDF file
        • Use the Retrosheet Spider script to pull down more than 50 years of data and (optionally) convert it into SportsML
    • NFL: National Football League
    • NHL: National Hockey League
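
The SQL Server attach steps listed under Retrosheet can also be scripted rather than clicked through in Enterprise Manager. The following is only a sketch: it builds a T-SQL call to SQL Server's sp_attach_db system stored procedure, which you would then run in a query tool of your choice. The helper name build_attach_statement and the file name sportsdb_sample.mdf are hypothetical, chosen for illustration; the actual file names inside the download may differ, and a log (.LDF) file may or may not be included.

```python
from pathlib import PureWindowsPath
from typing import Optional, Sequence


def build_attach_statement(files: Sequence[str], db_name: Optional[str] = None) -> str:
    """Build a T-SQL sp_attach_db call for the given database files.

    The first entry must be the primary .MDF file; any further entries
    (for example an .LDF log file) are passed as @filename2, @filename3, ...
    This helper is illustrative and is not part of SportsDB.
    """
    if not files:
        raise ValueError("at least one file (the .MDF) is required")

    paths = [PureWindowsPath(f) for f in files]
    if paths[0].suffix.lower() != ".mdf":
        raise ValueError(f"first file must be the .MDF file, got: {files[0]}")

    # Default the database name to the .MDF file name without its extension.
    name = db_name or paths[0].stem

    def quote(value: str) -> str:
        # Double any single quotes so the value is safe in a T-SQL string literal.
        return "N'" + value.replace("'", "''") + "'"

    args = [f"@dbname = {quote(name)}"]
    for index, path in enumerate(paths, start=1):
        args.append(f"@filename{index} = {quote(str(path))}")
    return "EXEC sp_attach_db " + ", ".join(args) + ";"


if __name__ == "__main__":
    # Hypothetical file name; substitute the .MDF file from the unzipped download.
    print(build_attach_statement(
        [r"C:\Program Files\Microsoft SQL Server\MSSQL\Data\sportsdb_sample.mdf"]
    ))
```

Generating the statement as text keeps the sketch free of any database driver dependency; paste the output into your SQL Server query tool, or pass it to whichever client library you already use.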

Other companies gather live and archival sports data and syndicate that information to publishers and enthusiasts. This page will also seek to track content published in the open standards described at that is available for licensing.