Converts a modelling group's stochastic files into the summary format, ready for later upload to the Montagu data annex. Four files are produced, aggregating over age to give an all-age total and an under-5 total, each by calendar year and by birth-cohort year.
stone_stochastic_process(
  con,
  modelling_group,
  disease,
  touchstone,
  scenarios,
  in_path,
  files,
  cert,
  index_start,
  index_end,
  out_path,
  pre_aggregation_path = NULL,
  deaths = "deaths",
  cases = "cases",
  dalys = "dalys",
  runid_from_file = FALSE,
  allow_missing_disease = FALSE,
  upload_to_annex = FALSE,
  annex = NULL,
  allow_new_database = FALSE,
  bypass_cert_check = FALSE,
  testing = FALSE,
  lines = Inf,
  log_file = NULL,
  silent = FALSE
)
con: DBI connection to production, used for verifying the certificate against its expected properties.
modelling_group: The modelling group id.
disease: The disease id.
touchstone: The touchstone (including version) for these estimates.
scenarios: A vector of scenario_descriptions. If the files parameter has length greater than 1, it must be the same length as scenarios, and a one-to-one mapping between the two is assumed.
in_path: The folder containing the stochastic files.
files: Either a single string containing placeholders used to build the filenames, or a vector of filenames, one per scenario. Placeholders can include :group, :touchstone, :scenario, :disease and :index.
cert: Name of the certificate file accompanying the estimates.
index_start: A scalar, or a vector matching the length of scenarios. Each entry is either an integer or NA, giving the first number in a sequence of files; NA means there is a single file with no sequence. The placeholder :index in the filenames is replaced with this.
index_end: As index_start, but giving the last number in a sequence of files. Can be a scalar applying to all scenarios, or a vector with an integer or NA entry for each scenario.
out_path: Path to the folder to write output files into.
pre_aggregation_path: Path to a directory into which to write the pre-aggregation (still age-disaggregated) data. If NULL, this step is skipped.
deaths: If deaths must be calculated as a sum of other burden outcomes, provide a vector of the outcome names here. The default is the existing deaths burden_outcome.
cases: If cases must be calculated as a sum of other burden outcomes, provide a vector of the outcome names here. The default is the existing cases burden_outcome.
dalys: If DALYs must be calculated as a sum of other burden outcomes, provide a vector of the outcome names here. The default is the existing DALYs burden_outcome. Alternatively, for the one remaining group that does not provide DALYs, you can supply a data frame here, and stoner will calculate DALYs using that recipe. The data frame must have the names outcome, proportion, average_duration and disability_weight. See stoner_calculate_dalys, and the sketch of a recipe below.
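As an illustration of the recipe format only (the outcome names, proportions, durations and disability weights below are invented, not taken from any group's real recipe), such a data frame might look like:

# Hypothetical DALYs recipe: every value here is illustrative only.
# Each row describes how one burden outcome contributes to DALYs.
dalys_recipe <- data.frame(
  outcome           = c("cases_acute", "cases_chronic"),
  proportion        = c(1.00, 0.05),
  average_duration  = c(0.25, 30),
  disability_weight = c(0.051, 0.133)
)
# This would then be passed as the dalys argument: dalys = dalys_recipe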
runid_from_file: Occasionally groups have omitted the run_id from the stochastic file and instead provided 200 files, one per run_id. Set runid_from_file to TRUE in that case, to deduce the run_id from the filenames; index_start and index_end must then be 1 and 200.
allow_missing_disease: Occasionally groups have omitted the disease column from their stochastic data. Set this to TRUE to expect that circumstance and avoid generating warnings.
upload_to_annex: Set to TRUE to upload the results straight into annex. (Files are still created either way; the upload is relatively fast, whereas creating the csvs is slower and worth caching.)
annex: DBI connection to annex, used if upload_to_annex is TRUE.
allow_new_database: If uploading, set this to TRUE to enable creating the stochastic_file table if it is not found.
bypass_cert_check: If TRUE, no checks are carried out on the certificate named by the cert parameter (if one is provided).
testing: For internal use only.
lines: Number of lines to read from each file; Inf (the default) reads all lines. Set a lower number to test a subset of the process before doing the full run.
log_file: Path to a file to save logs to, or NULL to not log to a file. If the file exists it will be appended to; otherwise it will be created.
silent: TRUE to silence console logs.
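For orientation, here is a sketch of a typical call. The group, disease, touchstone, scenario names, paths and filename template are invented placeholders (only the parameter names come from the usage above), and the connection assumes a PostgreSQL production database reachable via DBI/RPostgres. It supposes the group supplied 200 files per scenario, indexed 1 to 200, with run_id omitted from the files themselves.

# Illustrative call only; all identifiers and paths below are made up.
con <- DBI::dbConnect(RPostgres::Postgres(),
                      dbname = "montagu", host = "localhost", user = "readonly")

stone_stochastic_process(
  con              = con,
  modelling_group  = "EXAMPLE-Group",
  disease          = "MenA",
  touchstone       = "201910gavi-5",
  scenarios        = c("mena-no-vaccination", "mena-routine-default"),
  in_path          = "incoming/EXAMPLE-Group",
  files            = ":group_:scenario_:index.csv",
  cert             = "certificate.json",
  index_start      = 1,                # files indexed 1..200 per scenario
  index_end        = 200,
  out_path         = "processed/EXAMPLE-Group",
  runid_from_file  = TRUE,             # run_id deduced from :index in the filenames
  lines            = 1000,             # read only the first 1000 lines of each file while testing
  log_file         = "process.log"
)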