internal-impact.Rmd
This vignette describes how to use vimpact to calculate impact as a member of VIMC. This requires a connection to the montagu database so can only be used internally. Note that this is all in development and the interface is likely to change.
impact <- vimpact::calculate_impact(
con, method = "calendar_year", touchstone = "201910gavi-5",
modelling_group = "CDA-Razavi", disease = "HepB",
focal_scenario_type = "default", focal_vaccine_delivery = list(
list(vaccine = "HepB_BD", activity_type = "routine"),
list(vaccine = "HepB", activity_type = "routine")
),
baseline_scenario_type = "novac",
burden_outcomes = c("hepb_deaths_acute", "hepb_deaths_dec_cirrh",
"hepb_deaths_hcc"))
str(impact)
#> tibble [9,898 × 4] (S3: tbl_df/tbl/data.frame)
#> $ country : chr [1:9898] "AFG" "AFG" "AFG" "AFG" ...
#> $ burden_outcome: chr [1:9898] "deaths" "deaths" "deaths" "deaths" ...
#> $ year : int [1:9898] 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 ...
#> $ impact : num [1:9898] 0 0 0 0 0 0 0 53 81 97 ...
Or for impact by birth year, use method = "birth_year"
. This will get burden_estimate_set
ids for the baseline and focal scenarios for this particular touchstone, modelling group, disease. Then uses those to pull the burden_estimate
data for specified burden_outcomes
. Optionally filtering on country via countries
arg, year via vaccination_years
and under 5 age groups if is_under5 = TRUE
. We then use raw impact from burden_estimate
table and call relevant public facing impact method. For method = "calendar_year"
impact_by_calendar_year
. For method = "birth_year"
impact_by_birth_year
.
Define a recipe either as a csv or using recipe_template
recipe <- data.frame(
touchstone = "201910gavi-5",
modelling_group = "CDA-Razavi", disease = "HepB",
focal = "default:HepB_BD-routine;HepB-routine",
baseline = "novac",
burden_outcome = "hepb_deaths_acute,hepb_deaths_dec_cirrh,hepb_deaths_hcc;hepb_cases_acute_severe,hepb_cases_dec_cirrh,hepb_cases_hcc;dalys")
t <- tempfile(fileext = ".csv")
write.csv(recipe, t, row.names = FALSE)
This is a set of properties defining what data we want to extract from the db e.g. touchstone
, modelling_group
, disease
, focal and baseline scenarios and burden_outcome
. It captures the same info as the args to calculate_impact
above. Then use the recipe to define meta data frame
meta <- vimpact:::get_meta_from_recipe(default_recipe = FALSE, recipe = t, con = con)
And use this to calculate impact
old_impact <- vimpact:::get_raw_impact_details(con, meta, "deaths")
str(old_impact)
#> 'data.frame': 9898 obs. of 7 variables:
#> $ country : int 104 104 104 104 104 104 104 104 104 104 ...
#> $ time : int 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 ...
#> $ baseline_value: num 5636 5765 5901 6034 6172 ...
#> $ focal_value : num 5636 5765 5901 6015 6075 ...
#> $ value : num 0 0 0 19 97 180 252 312 344 381 ...
#> $ index : int 1 1 1 1 1 1 1 1 1 1 ...
#> $ burden_outcome: chr "deaths" "deaths" "deaths" "deaths" ...
Very similar to examples for calendar year and birth year, to calculate impact by year of vaccination stratified by activity type run
impact <- vimpact::calculate_impact(
con, method = "yov_activity_type", touchstone = "201910gavi-5",
modelling_group = "CDA-Razavi", disease = "HepB",
focal_scenario_type = "default", focal_vaccine_delivery = list(
list(vaccine = "HepB_BD", activity_type = "routine"),
list(vaccine = "HepB", activity_type = "routine")
),
baseline_scenario_type = "novac",
burden_outcomes = "dalys")
str(impact)
#> tibble [4,279 × 8] (S3: tbl_df/tbl/data.frame)
#> $ country : chr [1:4279] "AFG" "AFG" "AFG" "AFG" ...
#> $ vaccine : chr [1:4279] "HepB" "HepB" "HepB" "HepB" ...
#> $ activity_type : chr [1:4279] "routine" "routine" "routine" "routine" ...
#> $ year : int [1:4279] 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 ...
#> $ burden_outcome: chr [1:4279] "dalys" "dalys" "dalys" "dalys" ...
#> $ impact : num [1:4279] 103859 106575 105852 111953 115165 ...
#> $ impact_ratio : num [1:4279] 0.154 0.154 0.154 0.154 0.154 ...
#> $ fvps : num [1:4279] 676396 694088 689380 729111 750028 ...
or use method = "yov_birth_cohort
for impact by year of vaccination stratified by birth cohort. This uses the same logic as other impact methods but before calling the public facing impact function it will also extract FVP data from the data base for this touchstone
and vaccine delivery. It then calls either impact_by_year_of_vaccination_activity_type
or impact_by_year_of_vaccination_birth_cohort
to get impact.
We can re-use the recipe from above and get meta table for this method
meta <- vimpact:::get_meta_from_recipe(default_recipe = FALSE, recipe = t,
method = "method2a", con = con)
Then use this to get raw impact
old_raw_impact <- vimpact:::get_raw_impact_details(con, meta, "dalys")
Extract the fvp data and map output to required interface
fvps <- vimpact::extract_vaccination_history(con, "201910gavi-5", year_min = 2000,
year_max = 2030,
disease_to_extract = "HepB")
#> User defined touchstone version is used.
#> Converting input coverage data......
#> Extracted interpolated population.
#> Extracted raw coverage data...
#> Transformed coverage data.
fvps$fvps <- fvps$fvps_adjusted
fvps$country <- fvps$country_nid
Then can use meta
, raw_impact
and fvps
to calculate impact by year of vaccination
old_impact <- vimpact:::impact_by_year_of_vaccination(
meta, old_raw_impact, fvps, vaccination_years = 2000:2030)
str(old_impact)
#> 'data.frame': 4964 obs. of 27 variables:
#> $ country : int 4 4 4 4 4 4 4 4 4 4 ...
#> $ vaccine : chr "HepB_BD" "HepB" "HepB_BD" "HepB_BD" ...
#> $ activity_type : chr "routine" "routine" "routine" "routine" ...
#> $ scenario_description: chr "best-estimates" "best-estimates" "best-estimates" "best-estimates" ...
#> $ coverage_set : int 1750 1751 1750 1750 1751 1751 1751 1751 1751 1751 ...
#> $ delivery_id : int 487 3093 1169 1175 3376 4029 4272 3375 4564 3096 ...
#> $ disease : chr "HepB" "HepB" "HepB" "HepB" ...
#> $ scenario_type : chr "default" "default" "default" "default" ...
#> $ gavi_support_level : chr "with" "with" "with" "with" ...
#> $ year : int 2016 2017 2014 2025 2030 2015 2019 2027 2029 2020 ...
#> $ gavi_support : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
#> $ gender : chr "Both" "Both" "Both" "Both" ...
#> $ age : num 0 0 0 0 0 0 0 0 0 0 ...
#> $ target_source : num 1118424 1131834 1103729 1180725 1188620 ...
#> $ coverage_source : num 0.18 0.66 0.04 0.473 0.749 ...
#> $ cohort_size : num 1118424 1131834 1103729 1180725 1188620 ...
#> $ delivery_population : num 1118424 1131834 1103729 1180725 1188620 ...
#> $ fvps_source : num 201316 747010 44149 558483 890514 ...
#> $ fvps_adjusted : num 201316 747010 44149 558483 890514 ...
#> $ coverage_adjusted : num 0.18 0.66 0.04 0.473 0.749 ...
#> $ country_nid : int 4 4 4 4 4 4 4 4 4 4 ...
#> $ fvps : num 201316 747010 44149 558483 890514 ...
#> $ time : num 2016 2017 2014 2025 2030 ...
#> $ burden_outcome : chr "dalys" "dalys" "dalys" "dalys" ...
#> $ impact_ratio : num 0.154 0.154 0.154 0.154 0.154 ...
#> $ impact : num 30912 114701 6779 85754 136736 ...
#> $ index : int 1 1 1 1 1 1 1 1 1 1 ...
calculate_impact
has a very similar interface as a single row in an impact recipe. The output is slightly different format but output from previous method can be transformed into same format.
country <- dplyr::tbl(con, "country")
old_impact <- old_impact %>%
dplyr::left_join(country, by = c("country" = "nid"), copy = TRUE) %>%
dplyr::select(country = id, vaccine, activity_type, year = time,
burden_outcome, impact) %>%
dplyr::filter(!is.na(impact)) %>%
dplyr::arrange(activity_type, country, year, vaccine)
str(old_impact)
#> 'data.frame': 4279 obs. of 6 variables:
#> $ country : chr "AFG" "AFG" "AFG" "AFG" ...
#> $ vaccine : chr "HepB" "HepB" "HepB" "HepB" ...
#> $ activity_type : chr "routine" "routine" "routine" "routine" ...
#> $ year : num 2007 2008 2009 2010 2011 ...
#> $ burden_outcome: chr "dalys" "dalys" "dalys" "dalys" ...
#> $ impact : num 103859 106575 105852 111953 115165 ...
and we can see from a plot that the output is very similar
htmltools::tags$iframe(
src = "impact_comparisons_hepb.html",
width = "100%",
height = "400",
scrolling = "no",
seamless = "seamless",
frameBorder = "0",
`data-external` = "1"
)
htmltools::tags$iframe(
src = "impact_comparisons_measles.html",
width = "100%",
height = "400",
scrolling = "no",
seamless = "seamless",
frameBorder = "0",
`data-external` = "1"
)
This shows bars of mean difference between old impact & new impact method for each country for HepB, Measles and YF. Tooltips show the number of observations and the min and max order of the values. i.e. an impact of 23432.342 has order 5.
htmltools::tags$iframe(
src = "impact_comparisons.html",
width = "100%",
height = "400",
scrolling = "no",
seamless = "seamless",
frameBorder = "0",
`data-external` = "1"
)
There are small differences in impact because of differences in precision. calculate_impact
tries to do large parts of the aggregation on the database where burden_estimate
values are stored as Postgres real
type which has 4 bytes of storage size and 6 decimal digits of precision. Whereas if we pull this before aggregation the values are stored in R as doubles which are much higher precision leading to small differences when aggregated.
calculate_impact
at the moment won’t be able to generate impact for you for more than 1 set of scenario comparisons. Next steps will be to add a wrapper which can take some data like the impact recipe and call calculate_imapct
for multiple scenarios.