Tutorial 5: Monte Carlo Simulations and Sensitivity Analysis Support
This tutorial walks through the Monte Carlo simulation and sensitivity analysis (SA) functionality of Mimi, including core routines and examples. We will start with looking at using the Monte Carlo and SA routines with the multi-region Mimi model built in the second half of Tutorial 3, which is also available in the Mimi repository at examples/tutorial/02-multi-region-model
. Then we will show some more advanced features using a real Integrated Assessment model, MimiDICE2010.
For a more complete understanding of the Monte Carlo and SA Support, we recommend following up by reading How-to Guide 3: Conduct Monte Carlo Simulations and Sensitivity Analysis.
Working through the following tutorial will require:
- Julia v1.4.0 or higher
- Mimi v0.10.0 or higher
If you have not yet prepared these, go back to the first tutorial to set up your system.
MimiDICE2010 is required for the second example in this tutorial. If you are not yet comfortable with downloading and running a registered Mimi model, refer to Tutorial 2 for instructions.
The API
The best current documentation on the SA API is the how to guide How-to Guide 3: Conduct Sensitivity Analysis. This file can be used in conjunction with the examples below for details since the documentation covers more advanced options such as non-stochastic scenarios and running multiple models, which are not yet included in this tutorial.
Below we will refer separately to two types, SimulationDef
and SimulationInstance
. They are referred to as sim_def
and sim_inst
respectively as function arguments, and sd
and si
respectively as local variables.
Multi-Region Model Example
This section will walk through a simple example of how to define a simulation, run the simulation for a given model, and access the outputs.
Step 1. Setup
You should have Mimi
installed by now, and if you do not have the Distributions
package, take a moment to add that package by entering ]
to enter the Pkg REPL mode and then typing add Distributions
.
As a reminder, the following code is the multi-region model that was constructed in the second half of tutorial 3. You can either load the MyModel
module from tutorial 3, or run the following code which defines the same construct_Mymodel
function that we will use.
using Mimi
# Define the grosseconomy component
@defcomp grosseconomy begin
regions = Index() #Note that a regional index is defined here
YGROSS = Variable(index=[time, regions]) #Gross output
K = Variable(index=[time, regions]) #Capital
l = Parameter(index=[time, regions]) #Labor
tfp = Parameter(index=[time, regions]) #Total factor productivity
s = Parameter(index=[time, regions]) #Savings rate
depk = Parameter(index=[regions]) #Depreciation rate on capital - Note that it only has a region index
k0 = Parameter(index=[regions]) #Initial level of capital
share = Parameter() #Capital share
function run_timestep(p, v, d, t)
# Note that the regional dimension is defined in d and parameters and variables are indexed by 'r'
# Define an equation for K
for r in d.regions
if is_first(t)
v.K[t,r] = p.k0[r]
else
v.K[t,r] = (1 - p.depk[r])^5 * v.K[t-1,r] + v.YGROSS[t-1,r] * p.s[t-1,r] * 5
end
end
# Define an equation for YGROSS
for r in d.regions
v.YGROSS[t,r] = p.tfp[t,r] * v.K[t,r]^p.share * p.l[t,r]^(1-p.share)
end
end
end
# define the emissions component
@defcomp emissions begin
regions = Index() # The regions index must be specified for each component
E = Variable(index=[time, regions]) # Total greenhouse gas emissions
E_Global = Variable(index=[time]) # Global emissions (sum of regional emissions)
sigma = Parameter(index=[time, regions]) # Emissions output ratio
YGROSS = Parameter(index=[time, regions]) # Gross output - Note that YGROSS is now a parameter
# function init(p, v, d)
# end
function run_timestep(p, v, d, t)
# Define an equation for E
for r in d.regions
v.E[t,r] = p.YGROSS[t,r] * p.sigma[t,r]
end
# Define an equation for E_Global
for r in d.regions
v.E_Global[t] = sum(v.E[t,:])
end
end
end
# Define values for input parameters to be used when constructing the model
l = Array{Float64}(undef, 20, 3)
for t in 1:20
l[t,1] = (1. + 0.015)^t *2000
l[t,2] = (1. + 0.02)^t * 1250
l[t,3] = (1. + 0.03)^t * 1700
end
tfp = Array{Float64}(undef, 20, 3)
for t in 1:20
tfp[t,1] = (1 + 0.06)^t * 3.2
tfp[t,2] = (1 + 0.03)^t * 1.8
tfp[t,3] = (1 + 0.05)^t * 2.5
end
s = Array{Float64}(undef, 20, 3)
for t in 1:20
s[t,1] = 0.21
s[t,2] = 0.15
s[t,3] = 0.28
end
depk = [0.11, 0.135 ,0.15]
k0 = [50.5, 22., 33.5]
sigma = Array{Float64}(undef, 20, 3)
for t in 1:20
sigma[t,1] = (1. - 0.05)^t * 0.58
sigma[t,2] = (1. - 0.04)^t * 0.5
sigma[t,3] = (1. - 0.045)^t * 0.6
end
# Define a function for building the model
function construct_MyModel()
m = Model()
set_dimension!(m, :time, collect(2015:5:2110))
set_dimension!(m, :regions, [:Region1, :Region2, :Region3]) # Note that the regions of your model must be specified here
add_comp!(m, grosseconomy)
add_comp!(m, emissions)
update_param!(m, :grosseconomy, :l, l)
update_param!(m, :grosseconomy, :tfp, tfp)
update_param!(m, :grosseconomy, :s, s)
update_param!(m, :grosseconomy, :depk,depk)
update_param!(m, :grosseconomy, :k0, k0)
update_param!(m, :grosseconomy, :share, 0.3)
update_param!(m, :emissions, :sigma, sigma)
connect_param!(m, :emissions, :YGROSS, :grosseconomy, :YGROSS)
return m
end
Then, we obtain a copy of the model:
m = construct_MyModel()
Step 2. Define the Simulation
The @defsim
macro is the first step in the process, and returns a SimulationDef
. The following syntax allows users to define random variables (RVs) as distributions, and associate model parameters with the defined random variables.
There are two ways of assigning random variables to model parameters in the @defsim
macro. Notice that both of the following syntaxes are used in the following example.
The first is the following:
rv(rv1) = Normal(0, 0.8) # create a random variable called "rv1" with the specified distribution
param1 = rv1 # then assign this random variable "rv1" to the shared model parameter "param1" in the model
comp1.param2 = rv1 # then assign this random variable "rv1" to the unshared model parameter "param2" in component `comp1`
The second is a shortcut, in which you can directly assign the distribution on the right-hand side to the name of the model parameter on the left hand side. With this syntax, a single random variable is created under the hood and then assigned to our shared model parameter param1
and unshared model parameter param2
.
param1 = Normal(0, 0.8)
comp1.param2 = Normal(1,0)
Note here that if we have a shared model parameter we can assign based on its name, but if we have an unshared model parameter specific to one component/parameter pair we need to specify both. If the component is not specified Mimi will throw a warning and try to resolve under the hood with assumptions, proceeding if possible and erroring if not.
It is important to note that for each trial, a random variable on the right hand side of an assignment, be it using an explicitly defined random variable with rv(rv1)
syntax or using shortcut syntax as above, will take on the value of a single draw from the given distribution. This means that even if the random variable is applied to more than one parameter on the left hand side (such as assigning to a slice), each of these parameters will be assigned the same value, not different draws from the distribution
The @defsim
macro also selects the sampling method. Simple random sampling (also called Monte Carlo sampling) is the default. Other options include Latin Hypercube sampling and Sobol sampling. Below we show just one example of a @defsim
call, but the How-to guide referenced at the beginning of this tutorial gives a more comprehensive overview of the options.
using Mimi
using Distributions
sd = @defsim begin
# Define random variables. The rv() is only required when defining correlations
# or sharing an RV across parameters. Otherwise, you can use the shortcut syntax
# to assign a distribution to a parameter name.
rv(name1) = Normal(1, 0.2)
rv(name2) = Uniform(0.75, 1.25)
rv(name3) = LogNormal(20, 4)
# Define the sampling strategy, and if you are using LHS, you can define
# correlations like this:
sampling(LHSData, corrlist=[(:name1, :name2, 0.7), (:name1, :name3, 0.5)])
# assign RVs to model Parameters
grosseconomy.share = Uniform(0.2, 0.8)
# you can use the *= operator to replace the values in the parameter with the
# product of the original value and the value of the RV for the current
# trial (note that in both lines below, all indexed values will be mulitplied by the
# same draw from the given random parameter (name2 or Uniform(0.8, 1.2))
emissions.sigma[:, Region1] *= name2
emissions.sigma[2020:5:2050, (Region2, Region3)] *= Uniform(0.8, 1.2)
# For parameters that have a region dimension, you can assign an array of distributions,
# keyed by region label, which must match the region labels in the model
grosseconomy.depk = [Region1 => Uniform(0.7, .9),
Region2 => Uniform(0.8, 1.),
Region3 => Truncated(Normal(), 0, 1)]
# Indicate which variables to save for each model run.
# The syntax is: component_name.variable_name
save(grosseconomy.K, grosseconomy.YGROSS,
emissions.E, emissions.E_Global)
end
Step 3. Run Simulation
Next, use the run
function to run the simulation for the specified simulation definition, model (or list of models), and number of trials. View the internals documentation here for critical and useful details on the full signature of the run
function.
In its simplest use, the run
function generates and iterates over a sample of trial data from the distributions of the random variables defined in the SimulationDef
, perturbing the subset of Mimi's model parameters that have been assigned random variables, and then runs the given Mimi model(s) for each set of trial data. The function returns a SimulationInstance
, which holds a copy of the original SimulationDef
in addition to trials information (trials
, current_trial
, and current_data
), the model list models
, and results information in results
. Optionally, trial values and/or model results are saved to CSV files. Note that if there is concern about in-memory storage space for the results, use the results_in_memory
flag set to false
to incrementally clear the results from memory.
# Run 100 trials, and optionally save results to the indicated directories
si = run(sd, m, 100; trials_output_filename = "/tmp/trialdata.csv", results_output_dir="/tmp/tutorial4")
# Explore the results saved in-memory by using getdataframe with the returned SimulationInstance.
# Values are saved from each trial for each variable or parameter specified by the call to "save()" at the end of the @defsim block.
K_results = getdataframe(si, :grosseconomy, :K)
E_results = getdataframe(si, :emissions, :E)
Step 4. Explore and Plot Results
As described in the internals documentation here, Mimi provides both explore
and Mimi.plot
to explore the results of both a run Model
and a run SimulationInstance
.
To view your results in an interactive application viewer, simply call:
explore(si)
If desired, you may also include a title
for your application window. If more than one model was run in your Simulation, indicate which model you would like to explore with the model
keyword argument, which defaults to 1. Finally, if your model leverages different scenarios, you must indicate the scenario_name
.
explore(si; title = "MyWindow", model_index = 1) # we do not indicate scen_name here since we have no scenarios
To view the results for one of the saved variables from the save
command in @defsim
, use the (unexported to avoid namespace collisions) Mimi.plot
function. This function has the same keyword arguments and requirements as explore
(except for title
), and three required arguments: the SimulationInstance
, the component name (as a Symbol
), and the variable name (as a Symbol
).
Mimi.plot(si, :grosseconomy, :K)
To save your figure, use the save
function to save typical file formats such as PNG, SVG, PDF and EPS files. Note that while explore(sim_inst)
returns interactive plots for several graphs, Mimi.plot(si, :foo, :bar)
will return only static plots.
p = Mimi.plot(si, :grosseconomy, :K)
save("MyFigure.png", p)
Advanced Features - Social Cost of Carbon (SCC) Example
This example will discuss the more advanced SA capabilities of post-trial functions and payload objects.
Case: We want to do an SCC calculation with MimiDICE2010
, which consists of running both a base
and modified
model (the latter being a model including an additional emissions pulse, see the create_marginal_model
function or create your own two models). We then take the difference between the consumption level in these two models and obtain the discounted net present value to get the SCC.
The beginning steps for this case are identical to those above. We first define the typical variables for a simulation, including the number of trials N
and the simulation definition sd
. In this case we only define one random variable, t2xco2
, but note there could be any number of random variables defined here.
using Mimi
using MimiDICE2010
using Distributions
# define the number of trials
N = 100
# define your simulation (defaults to Monte Carlo sampling)
sd = @defsim begin
t2xco2 = Truncated(Gamma(6.47815626,0.547629469), 1.0, Inf) # a dummy distribution
end
Payload object
Simulation definitions can hold a user-defined payload object which is not used or modified by Mimi. In this example, we will use the payload to hold an array of pre-computed discount factors that we will use in the SCC calculation, as well as a storage array for saving the SCC values.
# Choose what year to calculate the SCC for
scc_year = 2015
year_idx = findfirst(isequal(scc_year), MimiDICE2010.model_years)
# Pre-compute the discount factors for each discount rate
discount_rates = [0.03, 0.05, 0.07]
nyears = length(MimiDICE2010.model_years)
discount_factors = [[zeros(year_idx - 1)... [(1/(1 + r))^((t-year_idx)*10) for t in year_idx:nyears]...] for r in discount_rates]
# Create an array to store the computed SCC in each trial for each discount rate
scc_results = zeros(N, length(discount_rates))
# Set the payload object in the simulation definition
my_payload_object = (discount_factors, scc_results) # In this case, the payload object is a tuple which holds both both arrays
Mimi.set_payload!(sd, my_payload_object)
Post-trial function
In the simple multi-region simulation example, the only values that were saved during each trial of the simulation were values of variables calculated internally by the model. Sometimes, a user may need to perform other calculations before or after each trial is run. For example, the SCC is calculated using two models, so this calculation needs to happen in a post-trial function, as shown below.
Here we define a post_trial_function
called my_scc_calculation
which will calculate the SCC for each trial of the simulation. Notice that this function retrieves and uses the payload object that was previously stored in the SimulationDef
.
function my_scc_calculation(sim_inst::SimulationInstance, trialnum::Int, ntimesteps::Int, tup::Nothing)
mm = sim_inst.models[1]
discount_factors, scc_results = Mimi.payload(sim_inst) # Unpack the payload object
marginal_damages = mm[:neteconomy, :C] * -1 * 10^12 * 12/44 # convert from trillion $/ton C to $/ton CO2; multiply by -1 to get positive value for damages
for (i, df) in enumerate(discount_factors)
scc_results[trialnum, i] = sum(df .* marginal_damages .* 10)
end
end
Run the simulation
Now that we have our post-trial function, we can proceed to obtain our two models and run the simulation. Note that we are using a Mimi MarginalModel mm
from MimiDICE2010, which is a Mimi object that holds both the base model and the model with the additional pulse of emissions.
# Build the marginal model
mm = MimiDICE2010.get_marginal_model(year = scc_year) # The additional emissions pulse will be added in the specified year
# Run
si = run(sd, mm, N; trials_output_filename = "ecs_sample.csv", post_trial_func = my_scc_calculation)
# View the scc_results by retrieving them from the payload object
scc_results = Mimi.payload(si)[2] # Recall that the SCC array was the second of two arrays we stored in the payload tuple
Other Helpful Functions
A small set of unexported functions are available to modify an existing SimulationDef
. Please refer to How-to Guide 3: Conduct Monte Carlo Simulations and Sensitivity Analysis for an in depth description of their use cases. The functions include the following:
delete_RV!
add_RV!
replace_RV!
delete_transform!
add_transform!
delete_save!
add_save!
get_simdef_rvnames
set_payload!
payload
Full list of keyword options for running a simulation
View How-to Guide 3: Conduct Monte Carlo Simulations and Sensitivity Analysis for critical and useful details on the full signature of this function, as well as more details and optionality for more advanced use cases.
function Base.run(sim_def::SimulationDef{T}, models::Union{Vector{Model}, Model}, samplesize::Int;
ntimesteps::Int=typemax(Int),
trials_output_filename::Union{Nothing, AbstractString}=nothing,
results_output_dir::Union{Nothing, AbstractString}=nothing,
pre_trial_func::Union{Nothing, Function}=nothing,
post_trial_func::Union{Nothing, Function}=nothing,
scenario_func::Union{Nothing, Function}=nothing,
scenario_placement::ScenarioLoopPlacement=OUTER,
scenario_args=nothing,
results_in_memory::Bool=true) where T <: AbstractSimulationData