Title: | Updated US State Facts and Figures |
---|---|
Description: | Updated versions of the 1970s "US State Facts and Figures" objects from the 'datasets' package included with R. The new data is compiled from a number of sources, primarily the United States Census Bureau or the relevant federal agency. |
Authors: | Kiernan Nicholls [aut, cre, cph] |
Maintainer: | Kiernan Nicholls <[email protected]> |
License: | CC BY 4.0 |
Version: | 0.1.2 |
Built: | 2024-11-07 02:40:18 UTC |
Source: | https://github.com/k5cents/usa |
The United States Postal Service's official names for the cities in which ZIP codes are contained. This vector contains unique values, sorted alphabetically; because of this, it does not line up with the other vectors the way zip.code and zip.center do.
city.name
A character vector of length 19108.
Daniel Coven's web site and the CivicSpace US ZIP Code Database written by Schuyler Erle [email protected], 5 August 2004.
The county subdivisions of the US states and territories.
counties
A tibble with 3,232 rows and 3 variables:
Federal Information Processing Standard Publication 5-2 code
Census county names
USPS official state, territory abbreviation code
The name of distinct US counties.
county.name
A character vector of length 19108.
Updated version of the datasets::state.x77 matrix, which provides eight statistics from the 1970s. This version uses a modern data frame format with updated (and alternative) statistics.
facts
A tibble with 52 rows and 9 variables:
Full state name
Population estimate (September 26, 2019)
Votes in the Electoral College (following the 2010 Census)
The date on which the state was admitted to the Union
Per capita income (2018)
Life expectancy in years (2017-18)
Murder rate per 100,000 population (2018)
Percent of adult population with at least a bachelor's degree (2019)
Mean number of degree days (temperature requires heating) per year from 1981-2010
Population: https://www2.census.gov/programs-surveys/popest/datasets/2010-2018/state/detail/SCPRC-EST2018-18+POP-RES.csv
Electoral College: https://www.archives.gov/electoral-college/allocation
Income: https://data.census.gov/cedsci/table?tid=ACSST1Y2018.S1903
GDP: https://www.bea.gov/system/files/2019-11/qgdpstate1119.xlsx
Literacy: https://nces.ed.gov/naal/estimates/StateEstimates.aspx
Life Expectancy: https://web.archive.org/web/20231129160338/https://usa.mortality.org/
Education: https://data.census.gov/cedsci/table?q=S1501
Temperature: ftp://ftp.ncdc.noaa.gov/pub/data/normals/1981-2010/products/temperature/ann-cldd-normal.txt
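For comparison, the 1970s matrix that this tibble updates can be reshaped into a data frame with base R alone. The sketch below uses only datasets::state.x77; the object names old and old_df are illustrative and are not part of this package.

```r
# The 1970s matrix shipped with R: 50 states by 8 statistics.
old <- datasets::state.x77
dim(old)

# Store the state names in an ordinary column instead of row names,
# which is closer to how a tibble holds the same information.
old_df <- data.frame(name = rownames(old), old, stringsAsFactors = FALSE)
rownames(old_df) <- NULL

# data.frame() makes column names syntactic, so "Life Exp" becomes "Life.Exp".
head(old_df[, c("name", "Population", "Life.Exp", "Murder")], 3)
```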
A statistically representative synthetic sample of 20,000 Americans. Each record is a simulated survey respondent.
people
A tibble with 20,000 rows and 40 variables:
Sequential unique ID
Random first name, see details
Random last name, see details
Biological sex
Age capped at 85
Race and Ethnicity
Educational attainment
Census regional division
Marital status
Household size
Has children
Is a US citizen
Was born in the US
Family income
Employment status
Employment sector
Hours worked per week
Hours vary week to week
Has served in the military
Home ownership
Lives in metropolitan area
Household has internet access
Receives food stamps
Moved in the last year
Contacted or visited a public official
Participated in a community association
Talked with neighbors
Trusts neighbors
Uses a tablet or e-reader
Uses text messaging
Uses social media
Volunteered
Is registered to vote
Voted in the 2014 midterm elections
Political party
Religious (evangelical) affiliation
Political ideology
Follows government and public affairs
Owns a gun
This dataset was originally produced by the Pew Research Center for its paper entitled "For Weighting Online Opt-In Samples, What Matters Most?" The synthetic population dataset was created to serve as a reference for making online opt-in surveys more representative of the overall population.
See Appendix B: Synthetic population dataset for a more detailed description of the method for and rationale behind creating this dataset.
In short, the dataset was created to overcome the limitations of using large, federal benchmark survey datasets such as the American Community Survey (ACS) or Current Population Survey (CPS). These surveys often do not contain the exact questions asked in online opt-in surveys, which keeps them from being used for proper adjustment.
This synthetic dataset was created by combining nine separate benchmark datasets. Each had a set of common demographic variables but many added unique variables such as gun ownership or voter registration. The surveys were combined, stratified, sampled, combined, and imputed to fill missing values from each. From this large dataset, the original 20,000 surveys from the ACS were kept to ensure accurate demographic distribution.
The names were randomly assigned to respondents to better simulate a synthetic sample of the population. First names were taken from the babynames dataset, which contains the Social Security Administration's record of baby names from 1880 to 2017 along with gender and proportion. First names were proportionally randomly assigned by birth year and sex. Last names were taken from the Census Bureau, which provides the 162,254 most common last names in the 2010 Census, covering over 90% of the population. For a given surname, the proportion of that name belonging to members of each race and ethnicity is provided. The last names were proportionally randomly assigned by race.
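Proportional random assignment of this kind can be sketched in base R with weighted sampling. The names and proportions below are hypothetical placeholders for a single birth-year and sex group; they are not taken from babynames or the Census surname file, and this is not necessarily how the package author performed the assignment.

```r
set.seed(1)

# Hypothetical name proportions within one birth-year/sex group.
name_pool <- data.frame(
  name = c("Mary", "Linda", "Susan"),
  prop = c(0.5, 0.3, 0.2),
  stringsAsFactors = FALSE
)

# Draw one name per respondent, weighting each name by its proportion.
n_respondents <- 1000
assigned <- sample(
  name_pool$name,
  size = n_respondents,
  replace = TRUE,
  prob = name_pool$prop
)

# The observed shares should be close to the supplied proportions.
round(prop.table(table(assigned)), 2)
```

Repeating the draw separately for each group (birth year and sex for first names, race for last names) would give the stratified assignment described above.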
“For Weighting Online Opt-In Samples, What Matters Most?” Pew Research Center, Washington, D.C. (January 26, 2018) https://www.pewresearch.org/methods/2018/01/26/for-weighting-online-opt-in-samples-what-matters-most/
Take a vector of state identifiers and convert to a common format.
state_convert(x, to = NULL)
x | A character vector of state names, abbreviations, or FIPS codes. |
to | The format returned: "abb", "name" or "fips". |
A character vector of state identifiers in a single format.
state_convert(c("AL", "Vermont", "06"))
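The idea behind this kind of conversion can be illustrated with a small lookup table in base R. The helper convert_state_sketch() and its three-row table are hypothetical; this is a sketch of the general technique and makes no claim about how state_convert() is implemented or how it treats the default to = NULL.

```r
# Hypothetical three-row lookup table; a real table would cover every state.
lookup <- data.frame(
  abb  = c("AL", "CA", "VT"),
  name = c("Alabama", "California", "Vermont"),
  fips = c("01", "06", "50"),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of the package): find the row whose
# abbreviation, name, or FIPS code matches each input, ignoring case,
# then return the requested column for those rows.
convert_state_sketch <- function(x, to = c("abb", "name", "fips")) {
  to <- match.arg(to)
  row <- rep(NA_integer_, length(x))
  for (col in names(lookup)) {
    hit <- match(toupper(x), toupper(lookup[[col]]))
    unmatched <- is.na(row)
    row[unmatched] <- hit[unmatched]
  }
  lookup[[to]][row]
}

convert_state_sketch(c("AL", "Vermont", "06"), to = "name")
```

Inputs that match no column come back as NA, which makes unrecognized identifiers easy to spot.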
The 2-letter abbreviations for the US state names.
state.abb
A character vector of length 52.
https://www2.census.gov/geo/docs/reference/state.txt
The area in square miles of the US states.
state.area
A numeric vector of length 52.
https://tigerweb.geo.census.gov/tigerwebmain/Files/acs19/tigerweb_acs19_state_us.html
A list with components named x and y giving the approximate geographic center of each state in negative longitude and latitude.
state.center
A list of length two, each element a numeric vector of length 52.
Center longitudinal coordinate
Center latitudinal coordinate
https://tigerweb.geo.census.gov/tigerwebmain/Files/acs19/tigerweb_acs19_state_us.html
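A list of this shape is easy to flatten into a data frame. The sketch below uses the original 50-state datasets::state.center and datasets::state.name from base R as stand-ins, on the assumption that the updated 52-element objects share the same x and y layout.

```r
# Base R's 1970s objects stand in for the updated versions here.
ctr <- datasets::state.center

# One row per state: x holds the (negative) longitude, y the latitude.
centers <- data.frame(
  name = datasets::state.name,
  lon  = ctr$x,
  lat  = ctr$y,
  stringsAsFactors = FALSE
)

# Every center lies in the western hemisphere, so all longitudes are negative.
all(centers$lon < 0)
head(centers, 3)
```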
The Census division to which each state belongs, one of nine:
New England
Middle Atlantic
East North Central
West North Central
South Atlantic
East South Central
West South Central
Mountain
Pacific
state.division
A factor vector of length 52.
https://www2.census.gov/programs-surveys/popest/geographies/2018/state-geocodes-v2018.xlsx
The full names for the US states.
state.name
A character vector of length 52.
https://tigerweb.geo.census.gov/tigerwebmain/Files/acs19/tigerweb_acs19_state_us.html
The Census region to which each state belongs, one of four:
Northeast
Midwest
South
West
state.region
A factor vector of length 52.
https://www2.census.gov/programs-surveys/popest/geographies/2018/state-geocodes-v2018.xlsx
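Because regions and divisions are stored as factors, they can be tabulated directly. The sketch below uses the 50-state factors from base R's datasets package as stand-ins for the updated versions; note that base R labels the Midwest as "North Central".

```r
# Base R's 1970s factors stand in for the updated 52-element versions.
region   <- datasets::state.region
division <- datasets::state.division

# Count the states in each region.
table(region)

# List the divisions nested within each region.
tapply(as.character(division), region, unique)
```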
A matrix version of the facts tibble, used to more closely align with the datasets::state.x77 matrix included with R.
state.x19
A matrix with 52 rows and 9 variables:
2-letter abbreviation
Population estimate as of September 26, 2019
Votes in the Electoral College (following the 2010 Census)
Per capita income (2017)
Life expectancy in years (2017-18)
Murder rate per 100,000 population (2018)
Percent of population with at least a high school degree (2019)
Percent of population with at least a bachelor's degree (2019)
Mean number of "degree days" per year from 1981-2010
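The relationship between a data frame and its matrix form can be sketched in base R. The two-row facts_like object below is a hypothetical stand-in that keeps only the electoral vote counts (following the 2010 Census); it is meant to show the shape of the conversion, not how state.x19 was actually built.

```r
# Hypothetical two-row stand-in for the facts tibble.
facts_like <- data.frame(
  abb   = c("AL", "VT"),
  votes = c(9, 3),
  stringsAsFactors = FALSE
)

# Keep only numeric columns in the matrix and move the
# abbreviations into the row names, as state.x77 does with state names.
m <- as.matrix(facts_like[, "votes", drop = FALSE])
rownames(m) <- facts_like$abb

# Matrix cells can then be indexed by row and column name.
m["VT", "votes"]
```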
The 50 states, District of Columbia, and Puerto Rico.
states
A tibble with 52 rows and 8 variables:
2-letter abbreviation
Full legal name
Federal Information Processing Standard Publication 5-2 code
Census Bureau region
Census Bureau division
Area in square miles
Center latitudinal coordinate
Center longitudinal coordinate
The 6 non-state territories and federal district.
territory
A tibble with 7 rows and 6 variables:
2-letter abbreviation
Full legal name
Federal Information Processing Standard Publication 5-2 code
Area in square miles
Center latitudinal coordinate
Center longitudinal coordinate
The 2-letter abbreviations for the US territory names.
territory.abb
A character vector of length 52.
https://www2.census.gov/geo/docs/reference/state.txt
The area in square miles of the US territories.
territory.area
A numeric vector of length 52.
https://tigerweb.geo.census.gov/tigerwebmain/Files/acs19/tigerweb_acs19_state_us.html
A list with components named x and y giving the approximate geographic center of each territory in negative longitude and latitude.
territory.center
A list of length two, each element a numeric vector of length 5.
Center longitudinal coordinate
Center latitudinal coordinate
https://tigerweb.geo.census.gov/tigerwebmain/Files/acs19/tigerweb_acs19_state_us.html
The full names for the US territories.
territory.name
A character vector of length 52.
https://tigerweb.geo.census.gov/tigerwebmain/Files/acs19/tigerweb_acs19_state_us.html
A list with components named x and y giving the approximate geographic center of each ZIP code in negative longitude and latitude.
zip.center
A list of length two, each element a numeric vector of length 44336.
Center longitudinal coordinate
Center latitudinal coordinate
Daniel Coven's web site and the CivicSpace US ZIP Code Database written by Schuyler Erle [email protected], 5 August 2004.
The United States Postal Service's 5-digit codes used to identify a particular postal delivery area.
zip.code
A character vector of length 44336.
Daniel Coven's web site and the CivicSpace US ZIP Code Database written by Schuyler Erle [email protected], 5 August 2004.
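Storing ZIP codes as character rather than numeric preserves leading zeros. As a minimal sketch, codes that were read in as integers can be restored to five digits with sprintf(); the three example codes are illustrative values, not rows drawn from this vector.

```r
# ZIP codes read as integers lose their leading zeros.
zip_num <- c(501L, 2138L, 90210L)

# Pad back to five digits and keep the result as character.
zip_chr <- sprintf("%05d", zip_num)
zip_chr
nchar(zip_chr)
```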
This tibble contains city, state, latitude, and longitude for U.S. ZIP codes from the CivicSpace Database (August 2004) augmented by Daniel Coven's web site (updated on January 22, 2012). The data was originally contained in the zipcode CRAN package, which was archived on January 1, 2020.
zipcodes
A tibble with 44,336 rows and 5 variables:
5 digit ZIP code or military postal code (FPO/APO)
USPS official city name
USPS official state, territory abbreviation code
Decimal Latitude
Decimal Longitude
Daniel Coven's web site and the CivicSpace US ZIP Code Database written by Schuyler Erle [email protected], 5 August 2004.