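Choose two of the indicators below, record a description of each, download the data with WDI, and carry out the analysis that follows.
For each step, write down your observations (what you noticed, questions that arose, and so on).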
CO2 emissions (metric tons per capita) :EN.ATM.CO2E.PC [Link] (co2pcap)
Forest area (% of land area):AG.LND.FRST.ZS [Link] (forest)
Renewable electricity output (% of total electricity output):EG.ELC.RNEW.ZS [Link] (renewable)
Electricity production from oil, gas and coal sources (% of total):EG.ELC.FOSL.ZS [Link] (fossil)
Electricity production from nuclear sources (% of total):EG.ELC.NUCL.ZS [Link] (nuclear)
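As a minimal sketch of how a choice of two indicators maps onto the WDI call used later in this notebook, the named vector below pairs each short variable name with its indicator code. The object names `indicators` and `df_energy` are hypothetical, chosen only for illustration, and the download itself is left commented out because it needs network access.

```r
# Named character vector: names become column names, values are WDI codes.
# Both pairs are taken from the indicator list above.
indicators <- c(renewable = "EG.ELC.RNEW.ZS",
                fossil    = "EG.ELC.FOSL.ZS")

# Check the short names that will appear as columns after download.
names(indicators)

# The download would then look like this (requires network access):
# library(WDI)
# df_energy <- WDI(indicator = indicators, extra = TRUE)
```

Keeping the pairs in one object makes it easy to reuse the same names in `select()` and `drop_na()` later on.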
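Check the result with Preview.
In a web browser, open the R Notebook (for example, w5_c123456.nb.html) and check it.
If there is a problem, choose Run All from the triangle to the right of the Run button and check that no errors occur.
Go back to the beginning.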
Data 1: CO2 emissions (metric tons per capita), “EN.ATM.CO2E.PC”, co2pcap [Link]
Description: Carbon dioxide emissions are those stemming from the burning of fossil fuels and the manufacture of cement. They include carbon dioxide produced during consumption of solid, liquid, and gas fuels and gas flaring.
Data 2: Forest area (% of land area), “AG.LND.FRST.ZS”, forest [Link]
Description: Forest area is land under natural or planted stands of trees of at least 5 meters in situ, whether productive or not, and excludes tree stands in agricultural production systems (for example, in fruit plantations and agroforestry systems) and trees in urban parks and gardens.
library(tidyverse)
library(WDI)
Download and save the data: specify the indicator codes and the variable names.
df_w6eda <- WDI(indicator = c(co2pcap = "EN.ATM.CO2E.PC",
forest = "AG.LND.FRST.ZS"),
extra = TRUE)
From the second run onward, read the data from the data folder. Check that a data folder exists in the directory where this file (Rmd) is saved.
write_csv(df_w6eda, "data/w6eda.csv")
df_w6eda <- read_csv("data/w6eda.csv")
Rows: 16758 Columns: 14
── Column specification ─────────────────────────────────────────────────────────────────
Delimiter: ","
chr (7): country, iso2c, iso3c, region, capital, income, lending
dbl (5): year, co2pcap, forest, longitude, latitude
lgl (1): status
date (1): lastupdated
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
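The download-then-cache steps above can be sketched as a single helper that reads the CSV when it already exists and downloads only otherwise. `load_or_download()` is a hypothetical name, not part of WDI or readr, and the `download` argument is assumed to be any zero-argument function that returns a data frame.

```r
library(readr)

# Hypothetical helper: read the cached CSV if present; otherwise call
# `download`, save its result to `path`, and return it.
load_or_download <- function(path, download) {
  if (file.exists(path)) {
    return(read_csv(path, show_col_types = FALSE))
  }
  # Create the folder (for example "data") if it does not exist yet.
  dir.create(dirname(path), showWarnings = FALSE, recursive = TRUE)
  df <- download()
  write_csv(df, path)
  df
}

# Usage with the download from this notebook (requires network access):
# df_w6eda <- load_or_download("data/w6eda.csv", function() {
#   WDI::WDI(indicator = c(co2pcap = "EN.ATM.CO2E.PC",
#                          forest = "AG.LND.FRST.ZS"),
#            extra = TRUE)
# })
```

With this pattern, Run All does not contact the World Bank server again once the file has been written.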
df_w6eda
str(df_w6eda)
spc_tbl_ [16,758 × 14] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ country : chr [1:16758] "Afghanistan" "Afghanistan" "Afghanistan" "Afghanistan" ...
$ iso2c : chr [1:16758] "AF" "AF" "AF" "AF" ...
$ iso3c : chr [1:16758] "AFG" "AFG" "AFG" "AFG" ...
$ year : num [1:16758] 2014 1971 2006 2013 1995 ...
$ status : logi [1:16758] NA NA NA NA NA NA ...
$ lastupdated: Date[1:16758], format: "2023-12-18" "2023-12-18" ...
$ co2pcap : num [1:16758] 0.2837 NA 0.0898 0.2981 0.0888 ...
$ forest : num [1:16758] 1.85 NA 1.85 1.85 1.85 ...
$ region : chr [1:16758] "South Asia" "South Asia" "South Asia" "South Asia" ...
$ capital : chr [1:16758] "Kabul" "Kabul" "Kabul" "Kabul" ...
$ longitude : num [1:16758] 69.2 69.2 69.2 69.2 69.2 ...
$ latitude : num [1:16758] 34.5 34.5 34.5 34.5 34.5 ...
$ income : chr [1:16758] "Low income" "Low income" "Low income" "Low income" ...
$ lending : chr [1:16758] "IDA" "IDA" "IDA" "IDA" ...
- attr(*, "spec")=
.. cols(
.. country = col_character(),
.. iso2c = col_character(),
.. iso3c = col_character(),
.. year = col_double(),
.. status = col_logical(),
.. lastupdated = col_date(format = ""),
.. co2pcap = col_double(),
.. forest = col_double(),
.. region = col_character(),
.. capital = col_character(),
.. longitude = col_double(),
.. latitude = col_double(),
.. income = col_character(),
.. lending = col_character()
.. )
- attr(*, "problems")=<externalptr>
df_w6 <- df_w6eda |>
select(country, iso2c, year, co2pcap, forest, region, income)
df_w6
df_w6eda |> drop_na(co2pcap, forest) |>
ggplot(aes(year)) + geom_bar()
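The bar chart shows how many rows per year have both variables; the same information can be read as numbers with `count()`. The sketch below uses a small made-up table (`toy` is not WDI data) so that it runs without a download.

```r
library(dplyr)
library(tidyr)

# Made-up data: five rows, some with missing values.
toy <- tibble(
  year    = c(2019, 2019, 2020, 2020, 2020),
  co2pcap = c(1.2, NA, 0.8, 2.5, 3.1),
  forest  = c(30, 40, NA, 55, 60)
)

# Keep rows where both variables are present, then count rows per year.
counts <- toy |>
  drop_na(co2pcap, forest) |>
  count(year)
counts
```

Applied to df_w6eda, the same pipeline gives the heights of the bars as a table, which helps when choosing a year for a scatter plot.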
The country column contains both countries and regions (aggregates). The iso2c codes of the regions are listed below.
REGION <- c("1A", "1W", "4E", "6F", "6N", "6X", "7E", "8S", "A4", "A5",
"A9", "B1", "B2", "B3", "B4", "B6", "B7", "B8", "C4", "C5", "C6",
"C7", "C8", "C9", "D2", "D3", "D4", "D5", "D6", "D7", "EU", "F1",
"F6", "M1", "M2", "N6", "OE", "R6", "S1", "S2", "S3", "S4", "T2",
"T3", "T4", "T5", "T6", "T7", "V1", "V2", "V3", "V4", "XC", "XD",
"XE", "XF", "XG", "XH", "XI", "XJ", "XL", "XM", "XN", "XO", "XP",
"XQ", "XT", "XU", "XY", "Z4", "Z7", "ZB", "ZF", "ZG", "ZH", "ZI",
"ZJ", "ZQ", "ZT")
df_w6eda |> filter(iso2c %in% REGION) |> distinct(country, iso2c)
df_w6eda |> filter(!(iso2c %in% REGION)) |> distinct(country, iso2c, region, income)
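How the two filters split the rows can be seen on a small made-up table. `toy` and `REGION_TOY` are hypothetical stand-ins for df_w6eda and REGION; only the `%in%` logic is the same.

```r
library(dplyr)

# Made-up rows: two countries and two aggregates.
toy <- tibble(
  country = c("Japan", "World", "Brazil", "European Union"),
  iso2c   = c("JP", "1W", "BR", "EU")
)
REGION_TOY <- c("1W", "EU")

# Rows whose code is in the list are aggregates ...
aggregates <- toy |> filter(iso2c %in% REGION_TOY)
# ... and negating the condition keeps the individual countries.
countries <- toy |> filter(!(iso2c %in% REGION_TOY))

aggregates
countries
```

Every row lands in exactly one of the two results, so the row counts of the two filters add up to the row count of the input.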
Select the BRICS countries.
BRICS <- c("Brazil", "Russian Federation", "India", "China", "South Africa")
df_w6 |> drop_na(co2pcap) |> filter(country == "Japan") |>
  ggplot(aes(year, co2pcap)) + geom_line() +
  labs(title = "CO2 emissions per capita in Japan")
Observations and questions
df_w6 |> drop_na(forest) |> filter(country == "Japan") |>
  ggplot(aes(year, forest)) + geom_line() +
  labs(title = "Forest area in Japan (% of land area)")
Observations and questions
df_w6 |> drop_na(co2pcap) |> filter(country %in% BRICS) |>
  ggplot(aes(year, co2pcap, linetype = country)) + geom_line() +
  labs(title = "CO2 emissions per capita in the BRICS countries")
Observations and questions
df_w6 |> drop_na(forest) |> filter(country %in% BRICS) |>
  ggplot(aes(year, forest, linetype = country)) + geom_line() +
  labs(title = "Forest area in the BRICS countries (% of land area)")
Observations and questions
Use a log10 scale where needed (+ scale_y_log10()).
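A minimal sketch of adding the log10 scale to a line plot, using a made-up series (`toy` is not WDI data) whose values span two orders of magnitude:

```r
library(ggplot2)

# Made-up series: each step multiplies the value by ten.
toy <- data.frame(
  year    = c(2000, 2010, 2020),
  co2pcap = c(1, 10, 100)
)

# scale_y_log10() is added as one more layer after the geom.
p <- ggplot(toy, aes(year, co2pcap)) +
  geom_line() +
  scale_y_log10() +
  labs(title = "Toy series on a log10 y axis")
p
```

On the log10 axis equal ratios take equal vertical distance, so this series appears as a straight line. A log scale cannot show zero or negative values, so it suits co2pcap better than a percentage that may be 0.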
df_w6 |> ggplot(aes(forest, co2pcap, col = region)) + geom_point()
df_w6 |> drop_na(co2pcap, forest) |>
ggplot(aes(forest, co2pcap)) + geom_point(aes(col = region)) +
geom_smooth(formula = 'y~x', method = "lm", se = FALSE)
df_w6 |> filter(!(iso2c %in% REGION)) |> drop_na(co2pcap, forest) |>
ggplot(aes(forest, co2pcap)) + geom_point(aes(col = region)) +
geom_smooth(formula = 'y~x', method = "lm", se = FALSE)
df_w6 |> filter(!(iso2c %in% REGION)) |> filter(year == 2020) |> drop_na(co2pcap, forest) |>
ggplot(aes(forest, co2pcap)) + geom_point(aes(col = region)) +
geom_smooth(formula = 'y~x', method = "lm", se = FALSE)
Observations and questions
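The scatter plot above fixes year == 2020. Whether a given year is well covered depends on the indicators chosen, so one possible way to pick a year is to take the one with the most complete rows. The sketch below does this on made-up data; `toy` and `best_year` are hypothetical names.

```r
library(dplyr)
library(tidyr)

# Made-up data: 2019 is complete for all three countries, 2020 is not.
toy <- tibble(
  country = c("A", "B", "C", "A", "B", "C"),
  year    = c(2019, 2019, 2019, 2020, 2020, 2020),
  co2pcap = c(1.0, 2.0, 3.0, 1.1, NA, 3.2),
  forest  = c(10, 20, 30, 11, 21, NA)
)

# Count complete rows per year and take the year with the largest count.
best_year <- toy |>
  drop_na(co2pcap, forest) |>
  count(year, sort = TRUE) |>
  slice(1) |>
  pull(year)
best_year
```

With real data, ties are possible, and the most complete year may be older than the most recent one, so the choice is worth noting in the observations.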
df_w6 |> filter(!(iso2c %in% REGION)) |> filter(year == 2020) |> drop_na(co2pcap, forest) |>
select(co2pcap, forest) |> cor()
co2pcap forest
co2pcap 1.00000000 -0.09914706
forest -0.09914706 1.00000000
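The pipeline above calls drop_na() before cor() because cor() returns NA when either vector contains a missing value. A small base R sketch with made-up vectors shows this behaviour and the `use = "complete.obs"` alternative:

```r
# Made-up vectors: y is exactly twice x, and x has one missing value.
x <- c(1, 2, 3, 4, NA)
y <- c(2, 4, 6, 8, 10)

# Default behaviour: a single NA makes the result NA.
r_default <- cor(x, y)

# Use only the pairs in which both values are present.
r_complete <- cor(x, y, use = "complete.obs")

r_default
r_complete
```

Here the four complete pairs lie on a straight line, so the second correlation is 1. By comparison, the value of about -0.10 obtained above indicates only a very weak linear relationship between forest and co2pcap in that year.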
Data 1: CO2 emissions (metric tons per capita), “EN.ATM.CO2E.PC”, co2pcap [Link]
Description: Carbon dioxide emissions are those stemming from the burning of fossil fuels and the manufacture of cement. They include carbon dioxide produced during consumption of solid, liquid, and gas fuels and gas flaring.
Data 2: name, code, variable name, link
Description:
library(tidyverse)
library(WDI)
Download and save the data: specify the indicator codes and the variable names.
It is better to rename the data objects, so this example uses df_w6eda_1 and df_w6_1.
df_w6eda_1 <- WDI(indicator = c(co2pcap = "EN.ATM.CO2E.PC",
forest = "AG.LND.FRST.ZS"),
extra = TRUE)
From the second run onward, read the data from the data folder. Check that a data folder exists in the directory where this file (Rmd) is saved.
write_csv(df_w6eda_1, "data/w6eda_1.csv")
df_w6eda_1 <- read_csv("data/w6eda_1.csv")
df_w6eda_1
str(df_w6eda_1)
df_w6_1 <- df_w6eda_1 |>
select(country, iso2c, year, co2pcap, forest, region, income)
df_w6_1
df_w6eda_1 |> drop_na(co2pcap, forest) |>
ggplot(aes(year)) + geom_bar()
The country column contains both countries and regions (aggregates). The iso2c codes of the regions are listed below.
REGION <- c("1A", "1W", "4E", "6F", "6N", "6X", "7E", "8S", "A4", "A5",
"A9", "B1", "B2", "B3", "B4", "B6", "B7", "B8", "C4", "C5", "C6",
"C7", "C8", "C9", "D2", "D3", "D4", "D5", "D6", "D7", "EU", "F1",
"F6", "M1", "M2", "N6", "OE", "R6", "S1", "S2", "S3", "S4", "T2",
"T3", "T4", "T5", "T6", "T7", "V1", "V2", "V3", "V4", "XC", "XD",
"XE", "XF", "XG", "XH", "XI", "XJ", "XL", "XM", "XN", "XO", "XP",
"XQ", "XT", "XU", "XY", "Z4", "Z7", "ZB", "ZF", "ZG", "ZH", "ZI",
"ZJ", "ZQ", "ZT")
df_w6eda_1 |> filter(iso2c %in% REGION) |> distinct(country, iso2c)
df_w6eda_1 |> filter(!(iso2c %in% REGION)) |> distinct(country, iso2c, region, income)
BRICS <- c("Brazil", "Russian Federation", "India", "China", "South Africa")
df_w6_1 |> drop_na(co2pcap) |> filter(country == "Japan") |>
  ggplot(aes(year, co2pcap)) + geom_line() +
  labs(title = "CO2 emissions per capita in Japan")
Observations and questions
df_w6_1 |> drop_na(forest) |> filter(country == "Japan") |>
  ggplot(aes(year, forest)) + geom_line() +
  labs(title = "Forest area in Japan (% of land area)")
Observations and questions
df_w6_1 |> drop_na(co2pcap) |> filter(country %in% BRICS) |>
  ggplot(aes(year, co2pcap, col = country)) + geom_line() +
  labs(title = "CO2 emissions per capita in the BRICS countries")
Observations and questions
df_w6_1 |> drop_na(forest) |> filter(country %in% BRICS) |>
  ggplot(aes(year, forest, col = country)) + geom_line() +
  labs(title = "Forest area in the BRICS countries (% of land area)")
Observations and questions
Use a log10 scale where needed.
df_w6_1 |> ggplot(aes(forest, co2pcap, col = region)) + geom_point()
df_w6_1 |> drop_na(co2pcap, forest) |>
ggplot(aes(forest, co2pcap)) + geom_point(aes(col = region)) +
geom_smooth(formula = 'y~x', method = "lm", se = FALSE)
df_w6_1 |> filter(!(iso2c %in% REGION)) |> drop_na(co2pcap, forest) |>
ggplot(aes(forest, co2pcap)) + geom_point(aes(col = region)) +
geom_smooth(formula = 'y~x', method = "lm", se = FALSE)
df_w6_1 |> filter(!(iso2c %in% REGION)) |> filter(year == 2020) |> drop_na(co2pcap, forest) |>
ggplot(aes(forest, co2pcap)) + geom_point(aes(col = region)) +
geom_smooth(formula = 'y~x', method = "lm", se = FALSE)
Observations and questions
df_w6_1 |> filter(!(iso2c %in% REGION)) |> filter(year == 2020) |> drop_na(co2pcap, forest) |>
select(co2pcap, forest) |> cor()
Observations and questions