Preparing the life expectancy data for plotting
I downloaded the life expectancy data from Our World in Data. I selected this dataset because I am interested in life expectancy and want to see how long I can expect to live.
This is the link to the data.
The following code chunk loads the package I will use to read in and prepare the data for analysis.
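The chunk itself is not shown on this page. A minimal sketch of what such a chunk might look like, assuming the package is the tidyverse (it provides read_csv, glimpse, and the %>% pipe used later). The file name in the comment is hypothetical, and three inline rows copied from the glimpse() output stand in for the downloaded file so the example runs on its own:

```r
# Assumption: the package being loaded is the tidyverse
library(tidyverse)

# With the real download, the call might look like this (file name is hypothetical):
#   life_expectancy_by_country <- read_csv("life-expectancy.csv")

# Stand-in for the downloaded CSV: three rows taken from the glimpse() output in this post
csv_text <- "Entity,Code,Year,Life expectancy
Afghanistan,AFG,1950,27.638
Afghanistan,AFG,1951,27.878
Afghanistan,AFG,1952,28.361"

# I() tells read_csv to treat the string as literal data rather than a file path
sample_rows <- read_csv(I(csv_text), show_col_types = FALSE)

# Column names containing spaces, such as `Life expectancy`, need backticks when used in code
print(sample_rows)
```

The sample is stored under its own name (sample_rows) so it is not confused with the full life_expectancy_by_country object used in the rest of the post.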
glimpse(life_expectancy_by_country)
Rows: 19,028
Columns: 4
$ Entity <chr> "Afghanistan", "Afghanistan", "Afghanistan…
$ Code <chr> "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", …
$ Year <dbl> 1950, 1951, 1952, 1953, 1954, 1955, 1956, …
$ `Life expectancy` <dbl> 27.638, 27.878, 28.361, 28.852, 29.350, 29…
Create the object regions, a list of the regions I want to extract from the dataset
Rename the 1st column to Region and the 4th column to Life_Expectancy
Use filter to extract the rows that I want to keep: Year >= 1900 and Region in regions
Select the columns to keep: Region, Year, Life_Expectancy
Assign the output to regional_life_expectancy
Display the first 10 rows of regional_life_expectancy
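# Regions to extract; the duplicate "Oceania" entry has been removed.
# Any name that does not appear in the Entity column simply matches no rows.
regions <- c("Oceania",
             "International transport",
             "Asia (excl. China & India)",
             "China",
             "India",
             "Africa",
             "South America",
             "North America (excl. USA)",
             "United States",
             "Europe (excl. EU-27)",
             "EU-27")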
regional_life_expectancy <- life_expectancy_by_country %>%
  # Rename by position: column 1 is Entity, column 4 is `Life expectancy`
  rename(Region = 1, Life_Expectancy = 4) %>%
  filter(Year >= 1900, Region %in% regions) %>%
  # Life expectancy is already in years, so no unit rescaling is applied
  select(Region, Year, Life_Expectancy)
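Because %in% silently drops names that do not occur in the data, it may be worth comparing the regions vector with the Entity column before filtering. A small base R sketch of that check, using a made-up data frame (toy_data) and a made-up name list (wanted) rather than the real dataset:

```r
# Hypothetical stand-in for the real data; only the Entity column matters here
toy_data <- data.frame(
  Entity = c("Africa", "Africa", "Oceania", "India"),
  Year   = c(2018, 2019, 2019, 2019),
  stringsAsFactors = FALSE
)

# Hypothetical list of names to look for
wanted <- c("Oceania", "Africa", "EU-27")

# Names requested but absent from the data: these would match no rows in filter()
missing_names <- setdiff(wanted, unique(toy_data$Entity))
print(missing_names)

# Names that are present and would therefore be kept
found_names <- intersect(wanted, unique(toy_data$Entity))
print(found_names)
```

Running the same two calls with regions and life_expectancy_by_country$Entity would show which of the requested regions are actually present in the download.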
Check that the total for 2019 equals the total in the graph
life_expectancy_by_country %>%
  filter(Year == 2019) %>%
  summarise(total_emm = sum(`Life expectancy`))
# A tibble: 1 × 1
total_emm
<dbl>
1 17941.
Add a Picture
See how to change the width in the R Markdown Cookbook
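One common way to set an image's width in R Markdown is to pass the file to knitr::include_graphics() and set the out.width chunk option. A sketch, assuming knitr is installed. The image path is hypothetical, and error = FALSE is used only so the call does not stop when that placeholder file is absent:

```r
# The width is set in the chunk header of the .Rmd file, for example:
#   {r, out.width = "50%", fig.align = "center"}

# Hypothetical image path; replace it with a real file in the project
image_path <- "images/life_expectancy.png"

# include_graphics() marks the file for embedding in the knitted output;
# error = FALSE is here only because the placeholder file may not exist
shown <- knitr::include_graphics(image_path, error = FALSE)
shown
```

With a percentage such as "50%", the picture scales relative to the width of the output container rather than to a fixed pixel size.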
Write the data to a file in the project directory
write_csv(life_expectancy_by_country, file="life_expectancy.csv")
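To confirm that a file was written as expected, it can be read back in and compared with the original. A sketch of that round trip, assuming readr is installed. It uses a small data frame with made-up values and a temporary file, so it does not touch the project directory:

```r
library(readr)

# Made-up rows standing in for the real data (values are not from the dataset)
toy <- data.frame(
  Region          = c("Africa", "Oceania"),
  Year            = c(2019, 2019),
  Life_Expectancy = c(60.5, 78.25),
  stringsAsFactors = FALSE
)

# Write to a temporary location instead of the project directory
out_path <- tempfile(fileext = ".csv")
write_csv(toy, file = out_path)

# Read the file back and compare its shape and values with the original
round_trip <- read_csv(out_path, show_col_types = FALSE)
same_names  <- identical(names(round_trip), names(toy))
same_values <- isTRUE(all.equal(round_trip$Life_Expectancy, toy$Life_Expectancy))
print(c(same_names = same_names, same_values = same_values))
```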