Skip to contents

Introduction

The homefield package helps gather data and create visuals to show spatial relevance of entities. By producing engaging graphics, homefield can answer questions like:

  • Which teams are remaining in the March Madness basketball tournament?
  • Where was each U.S. President from?
  • What teams are undefeated in college football?

For example, homefield can query online databases to determine which football teams were undefeated in the fourth week of the 1999 season and then generate a map where the teams are represented by their school colors and logo.


Installation

  1. Install R and R Studio.

  2. Install the latest compatible version of RTools.

  3. Install the devtools package.

install.packages("devtools")
  1. Install the homefield package via the Github repository.
devtools::install_github("Landon-Getting/homefield")

homefield relies on an online database for up-to-date college football information. This database requires API keys which can be requested and installed via the following instructions:

  1. Follow the directions here to receive and install a College Football Data API Key. Required for homefield::cfb_undefeated() and homefield::cfb_conquest().

  2. Once the package and keys have been installed, homefield can be initialized in an environment via library():

library(homefield)
# your code here

homefield_map()

homefield_map() is the crux of the homefield package. It uses structured data about entities and generates a homefield map. homefield_map() colors counties based on the closest entity and adds an image representing the entity over large groups of counties.

Entities can be sports teams, political candidates, businesses, or anything that might be represented spatially or possess territory. Users can create their own data describing entities or use other homefield package functions like cfb_undefeated() to generate already structured data.

Entities are required to have several attributes including identifier, latitude, longitude, color, and image. Each entity should be encoded as a row in a dataframe of entities.

For instance, a map may be composed of sport teams. The Iowa State Cyclones entity would have an entity identifier (Iowa State), a *latitude** (42.0266573), a longitude (-93.6464516), a color (#C8102E), and an image (http://a.espncdn.com/i/teamlogos/ncaa/500/66.png).

A dataframe with multiple entities and their attributes may look similar to the following:

entity <- c("Iowa State Cyclones", "Florida Gators", "UCLA Bruins")

# must be valid locations
lat <- c(42.01400, 29.64994, 34.16133)
lng <- c(-93.63577, -82.34858, -118.16765)

# must be a color in hexcode format
color <- c("#660015","#0021A5", "#ffc72c")

# can be a link or local file
image <- c("http://a.espncdn.com/i/teamlogos/ncaa/500/66.png",
           "http://a.espncdn.com/i/teamlogos/ncaa/500/57.png",
           "http://a.espncdn.com/i/teamlogos/ncaa/500/26.png")

homefield_data <- data.frame(entity,
                             lat,
                             lng,
                             color,
                             image)

homefield_data
#>                entity      lat        lng   color
#> 1 Iowa State Cyclones 42.01400  -93.63577 #660015
#> 2      Florida Gators 29.64994  -82.34858 #0021A5
#> 3         UCLA Bruins 34.16133 -118.16765 #ffc72c
#>                                              image
#> 1 http://a.espncdn.com/i/teamlogos/ncaa/500/66.png
#> 2 http://a.espncdn.com/i/teamlogos/ncaa/500/57.png
#> 3 http://a.espncdn.com/i/teamlogos/ncaa/500/26.png

Next, the dataframe is inputted into the homefield_map() function along with a valid local file path for saving the map. Additionally, there are optional arguments to:

  • threshold: Change how big a territory must be to receive an image
  • title: Add a title to the map
  • credit: Give credit to the map author
homefield_map(x = homefield_data, # dataframe
             output_file = "C:/Users/darthvader/Downloads/example_map.png", # save location
             threshold = 10000, # area in square km
             title = "Example Map", # map title
             credit = "Landon Getting") # credit for author

In this case, UCLA’s territory includes most of the western U.S. since they are located in California whereas the Florida Gators are located in the southeast.

UCLA’s territory also extends to Alaska and Hawaii. Although these regions are disconnected from their primary territory, their area is well above the default threshold of 10,000 \(\mathrm{km^{2}}\) and therefore the UCLA image are placed at their centers.


Getting data for homefield maps

In the previous example, the input dataframe describing the entities was created from scratch.

However, the homefield package currently provides 2 functions to query interesting college football data directly into the structured format: cfb_undefeated() and cfb_conquest(). Remember to complete the steps in Installation to receive and install an College Football Database API key.

cfb_undefeated()

cfb_undefeated() returns a dataframe with the undefeated teams for a particular season and week.

cfb_undefeated_s1999_w4 <- cfb_undefeated(season = 1999, week = 4)

head(cfb_undefeated_s1999_w4)
#>           entity      lat        lng   color
#> 1      Air Force 38.99697 -104.84362 #004a7b
#> 2       Arkansas 36.06807  -94.17895 #9c1831
#> 3         Auburn 32.60255  -85.48975 #03244d
#> 4            BYU 40.25753 -111.65452 #001E4C
#> 5 Boston College 42.33510  -71.16644 #88001a
#> 6  East Carolina 35.59685  -77.36456 #4b1869
#>                                                     image
#> 1 http://a.espncdn.com/i/teamlogos/ncaa/500-dark/2005.png
#> 2         http://a.espncdn.com/i/teamlogos/ncaa/500/8.png
#> 3         http://a.espncdn.com/i/teamlogos/ncaa/500/2.png
#> 4  http://a.espncdn.com/i/teamlogos/ncaa/500-dark/252.png
#> 5  http://a.espncdn.com/i/teamlogos/ncaa/500-dark/103.png
#> 6       http://a.espncdn.com/i/teamlogos/ncaa/500/151.png

The dataframe output from cfb_undefeated() can be used directly in homefield_map().

homefield_map(x = cfb_undefeated(season = 1999, week = 4),
             output_file = "C:/Users/darthvader/Downloads/cfb_undefeated_s2022_w4.png",
             title = "College Football Undefeated - Season 1999 Week 4",
             credit = "Landon Getting")

cfb_conquest()

In cfb_conquest(), teams start with the land closest to them. As the season progresses, teams acquire the land of the teams they defeat.

cfb_conquest_s1999_w4 <- cfb_conquest(season = 1999, week = 4)

head(cfb_conquest_s1999_w4)
#>      entity      lat        lng   color
#> 1 Air Force 38.99697 -104.84362 #004a7b
#> 2     Akron 42.99913  -78.77751 #84754e
#> 3  Arkansas 36.06807  -94.17895 #9c1831
#> 4  Arkansas 32.83772  -96.78279 #9c1831
#> 5    Auburn 36.21143  -81.68543 #03244d
#> 6    Auburn 32.60255  -85.48975 #03244d
#>                                                     image
#> 1 http://a.espncdn.com/i/teamlogos/ncaa/500-dark/2005.png
#> 2      http://a.espncdn.com/i/teamlogos/ncaa/500/2006.png
#> 3         http://a.espncdn.com/i/teamlogos/ncaa/500/8.png
#> 4         http://a.espncdn.com/i/teamlogos/ncaa/500/8.png
#> 5         http://a.espncdn.com/i/teamlogos/ncaa/500/2.png
#> 6         http://a.espncdn.com/i/teamlogos/ncaa/500/2.png

The dataframe output from cfb_conquest() can be used directly in homefield_map().

homefield_map(x = cfb_conquest(season = 1999, week = 4),
             output_file = "C:/Users/darthvader/Downloads/cfb_conquest_s2022_w4.png",
             title = "College Football Conquest - Season 1999 Week 4",
             credit = "Landon Getting")

Since Iowa State University defeated UNLV in week 3 of the 1999 season, they gained UNLV’s territory.

cfb_conquest() has an additional argument, division, that allows NCAA D1 FCS teams to start with land in addition to the FBS teams.


Summarizing Maps

homefield_stats() generates summary statistics about a particular map including the total land, water, and population within each entity’s territory.

The same dataframe input used in homefield_map() can be used in homefield_stats().

cfb_undefeated_s1999_w4_stats <- homefield_stats(x = cfb_undefeated(season = 1999, week = 4))

head(cfb_undefeated_s1999_w4_stats)
#>           entity         land       water       domain      pop
#> 1      Air Force 610234978656  2533273302 612768251958  6638127
#> 2       Arkansas 186833484833  4234865393 191068350226  5122951
#> 3         Auburn 106215319059  4688840187 110904159246  4365585
#> 4            BYU 292222970451  1844563724 294067534175  1448537
#> 5 Boston College 223771886456 31278621299 255050507755 31475972
#> 6  East Carolina 104201019210 15782222629 119983241839  7592900

These summary statistics can be used to further explore the data and answer questions like:

  • Which teams had the most territory at the start of the season?
  • Did any teams control more water than land in week 7?
  • What were the top 5 teams by population located within their territory?

homefield_stats() can be paired with the dplyr package to answer the last question.

dplyr::slice_max(cfb_undefeated_s1999_w4_stats, # output from homefield_stats()
                 order_by = pop, # sorting by population
                 n = 5) |> # top 5
  dplyr::select(entity, # selecting only entity and pop columns
                pop)
#>           entity      pop
#> 1 Boston College 31475972
#> 2  East Carolina  7592900
#> 3      Air Force  6638127
#> 4       Arkansas  5122951
#> 5         Auburn  4365585

Map Stats over Time

homefield_stats() can also create summary statistics for maps over time using the temporal argument. For example, homefield_stats() can be combined with cfb_undefeated() to show statistics over the course of a season.

# undefeated for each week, 0 through 15
cfb_undefeated_2022 <- list(cfb_undefeated(season = 2022, week = 0),
                            cfb_undefeated(season = 2022, week = 1),
                            cfb_undefeated(season = 2022, week = 2),
                            cfb_undefeated(season = 2022, week = 3),
                            cfb_undefeated(season = 2022, week = 4),
                            cfb_undefeated(season = 2022, week = 5),
                            cfb_undefeated(season = 2022, week = 6),
                            cfb_undefeated(season = 2022, week = 7),
                            cfb_undefeated(season = 2022, week = 8),
                            cfb_undefeated(season = 2022, week = 9),
                            cfb_undefeated(season = 2022, week = 10),
                            cfb_undefeated(season = 2022, week = 11),
                            cfb_undefeated(season = 2022, week = 12),
                            cfb_undefeated(season = 2022, week = 13),
                            cfb_undefeated(season = 2022, week = 14),
                            cfb_undefeated(season = 2022, week = 15))

# date of each week
cfb_dates_2022 <- lubridate::ymd(c("2022-08-27",
                                   "2022-09-03",
                                   "2022-09-10",
                                   "2022-09-17",
                                   "2022-09-24",
                                   "2022-10-1",
                                   "2022-10-8",
                                   "2022-10-15",
                                   "2022-10-22",
                                   "2022-10-29",
                                   "2022-11-5",
                                   "2022-11-12",
                                   "2022-11-19",
                                   "2022-11-26",
                                   "2022-12-3",
                                   "2022-12-10"))

cfb_undefeated_2022_stats <- homefield_stats(x = cfb_undefeated_2022,
                                  temporal = cfb_dates_2022,
                                  keep_max = FALSE,
                                  keep_visuals = TRUE)

Converting the output of square meters to square miles.

# converting land from square meters to square miles
cfb_undefeated_2022_stats <- cfb_undefeated_2022_stats |>
  dplyr::mutate(land = land/2.59e6) |>
  dplyr::select(entity,
                land,
                time,
                color,
                image)

head(cfb_undefeated_2022_stats)
#>              entity      land       time   color
#> 1         Air Force 73410.774 2022-08-27 #004a7b
#> 2             Akron  5384.247 2022-08-27 #84754e
#> 3           Alabama  9698.773 2022-08-27 #690014
#> 4 Appalachian State 11482.544 2022-08-27 #000000
#> 5           Arizona 23098.589 2022-08-27 #002449
#> 6     Arizona State 66025.140 2022-08-27 #942139
#>                                                     image
#> 1 http://a.espncdn.com/i/teamlogos/ncaa/500-dark/2005.png
#> 2      http://a.espncdn.com/i/teamlogos/ncaa/500/2006.png
#> 3  http://a.espncdn.com/i/teamlogos/ncaa/500-dark/333.png
#> 4      http://a.espncdn.com/i/teamlogos/ncaa/500/2026.png
#> 5        http://a.espncdn.com/i/teamlogos/ncaa/500/12.png
#> 6         http://a.espncdn.com/i/teamlogos/ncaa/500/9.png

The dataframe output has summary statistics for each team and week. In this example, the population within Alabama’s territory increases throughout the season.

cfb_undefeated_2022_stats |>
  dplyr::filter(entity == "Alabama") |>
  dplyr::select(entity, pop, time) |>
  head()
#>    entity     pop       time
#> 1 Alabama  429223 2022-08-27
#> 2 Alabama  452185 2022-09-03
#> 3 Alabama 2463971 2022-09-10
#> 4 Alabama 4137329 2022-09-17
#> 5 Alabama 8449732 2022-09-24
#> 6 Alabama 9614778 2022-10-01

The outputted summary data from homefield_stats() can be used to create other graphics describing the maps. The homefield package has a built-in function to visualize the state of homefield maps over time - homefield_racing().


Visualizing Stats

homefield_racing() generates a racing bar chart animation where entities compete for the top 10 spots based on summary statistics from homefield_stats().

For example, the output dataframe from homefield_stats() describing the 2022 college football season can be visualized with the homefield_racing() function. As teams gain territory, they appear higher on chart. However, when a team loses, they are removed from the chart. This can show the impact of triumphant victories or heartbreaking defeats.

homefield_racing(x = cfb_undefeated_2022_stats,
                stat_name = "land",
                title = "2022 Season Week by Week - Undefeated CFB homefield Map",
                subtitle = "Area in Square Miles",
                caption = "Data Source: cfbd.com",
                output_file = "C:/Users/darthvader/Downloads/cfb_undefeated_2022_racing.gif")


Interactive Maps

homefield_shiny() showcases several functions from the homefield package in an interactive application visualizing College Football seasons. Users can choose a type of map, season, and week and the homefield map will be automatically generated.