Converts a modelling group's stochastic files into the summary format, ready for later upload to the Montagu data annex. Four files are produced, aggregating over age to give an all-age total and an under-5 total, each by calendar year and by birth-cohort year.
stone_stochastic_process(
  con,
  modelling_group,
  disease,
  touchstone,
  scenarios,
  in_path,
  files,
  cert,
  index_start,
  index_end,
  out_path,
  pre_aggregation_path = NULL,
  deaths = "deaths",
  cases = "cases",
  dalys = "dalys",
  runid_from_file = FALSE,
  allow_missing_disease = FALSE,
  upload_to_annex = FALSE,
  annex = NULL,
  allow_new_database = FALSE,
  bypass_cert_check = FALSE,
  testing = FALSE,
  lines = Inf,
  log_file = NULL,
  silent = FALSE
)
con: DBI connection to production, used for verifying the certificate against its expected properties.
modelling_group: The modelling group id.
disease: The disease id.
touchstone: The touchstone (including version) for these estimates.
scenarios: A vector of scenario_descriptions. If the files parameter has length greater than 1, it must be the same length as scenarios, and a one-to-one mapping between the two is assumed.
in_path: The folder containing the stochastic files.
files: Either a single string containing placeholders used to build the filenames, or a vector of filenames, one per scenario. Placeholders can include :group, :touchstone, :scenario, :disease and :index.
cert: Name of the certificate file accompanying the estimates.
index_start: A scalar, or a vector matching the length of scenarios. Each entry is either an integer or NA, giving the first number in a sequence of files; NA means there is a single file with no sequence. The placeholder :index in the filenames is replaced with this.
index_end: As index_start, but giving the last number in a sequence of files. Can be a scalar applying to all scenarios, or a vector with an integer or NA entry for each scenario.
out_path: Path to the folder to write output files into.
pre_aggregation_path: Path to a directory into which to write the pre-aggregation (still age-disaggregated) data. If NULL, this step is skipped.
deaths: If deaths must be calculated as a sum of other burden outcomes, provide a vector of the outcome names here. The default is the existing deaths burden_outcome.
cases: If cases must be calculated as a sum of other burden outcomes, provide a vector of the outcome names here. The default is the existing cases burden_outcome.
dalys: If DALYs must be calculated as a sum of other burden outcomes, provide a vector of the outcome names here. The default is the existing DALYs burden_outcome. Alternatively, for the one remaining group that does not provide DALYs, you can supply a data frame here, and stoner will calculate DALYs using that recipe. The data frame must have the names outcome, proportion, average_duration and disability_weight. See stoner_calculate_dalys, and the sketch of a recipe below.
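As an illustration of the recipe format only (the outcome names, proportions, durations and disability weights below are invented, not taken from any group's real recipe), such a data frame might look like:

# Hypothetical DALYs recipe: every value here is illustrative only.
# Each row describes how one burden outcome contributes to DALYs.
dalys_recipe <- data.frame(
  outcome           = c("cases_acute", "cases_chronic"),
  proportion        = c(1.00, 0.05),
  average_duration  = c(0.25, 30),
  disability_weight = c(0.051, 0.133)
)
# This would then be passed as the dalys argument: dalys = dalys_recipe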
runid_from_file: Occasionally groups have omitted the run_id from the stochastic file and instead provided 200 files, one per run_id. Set runid_from_file to TRUE in that case, to deduce the run_id from the filenames; index_start and index_end must then be 1 and 200.
allow_missing_disease: Occasionally groups have omitted the disease column from their stochastic data. Set this to TRUE to expect that circumstance and avoid generating warnings.
upload_to_annex: Set to TRUE to upload the results straight into annex. (Files are still created either way; the upload is relatively fast, whereas creating the csvs is slower and worth caching.)
annex: DBI connection to annex, used if upload_to_annex is TRUE.
allow_new_database: If uploading, set this to TRUE to enable creating the stochastic_file table if it is not found.
bypass_cert_check: If TRUE, no checks are carried out on the certificate named by the cert parameter (if one is provided).
testing: For internal use only.
lines: Number of lines to read from each file; Inf (the default) reads all lines. Set a lower number to test a subset of the process before doing the full run.
log_file: Path to a file to save logs to, or NULL to not log to a file. If the file exists it will be appended to; otherwise it will be created.
silent: TRUE to silence console logs.
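For orientation, here is a sketch of a typical call. The group, disease, touchstone, scenario names, paths and filename template are invented placeholders (only the parameter names come from the usage above), and the connection assumes a PostgreSQL production database reachable via DBI/RPostgres. It supposes the group supplied 200 files per scenario, indexed 1 to 200, with run_id omitted from the files themselves.

# Illustrative call only; all identifiers and paths below are made up.
con <- DBI::dbConnect(RPostgres::Postgres(),
                      dbname = "montagu", host = "localhost", user = "readonly")

stone_stochastic_process(
  con              = con,
  modelling_group  = "EXAMPLE-Group",
  disease          = "MenA",
  touchstone       = "201910gavi-5",
  scenarios        = c("mena-no-vaccination", "mena-routine-default"),
  in_path          = "incoming/EXAMPLE-Group",
  files            = ":group_:scenario_:index.csv",
  cert             = "certificate.json",
  index_start      = 1,                # files indexed 1..200 per scenario
  index_end        = 200,
  out_path         = "processed/EXAMPLE-Group",
  runid_from_file  = TRUE,             # run_id deduced from :index in the filenames
  lines            = 1000,             # read only the first 1000 lines of each file while testing
  log_file         = "process.log"
)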