Scripts¶
scripts (top level)¶
animate¶
Interactive animation of per-node epidemic states using graph-tool.
This script visualises a pre-computed node-state time-series (produced by
enabling save_node_states in a simulation run) as an animated graph
rendered with graph-tool and GTK.
Each node is rendered as a gender-specific icon (lady / man); nodes that are in an infectious state are highlighted with a red halo. Nodes that die are replaced with a zombie icon. The animation advances one day every two seconds.
Requires graph-tool, cairo, and the GTK bindings to be installed.
Typical usage:
python animate.py config.ini --nodes_file ../data/output/model/node_states.csv
plot_experiments¶
Command-line script for plotting MAIS simulation results.
Reads one or more simulation result files (.csv, .zip, or
.feather), computes aggregate statistics (median/mean with IQR or SD
shading), and saves the resulting plot. An optional fit/observed-data curve
can be overlaid.
Typical usage:
python plot_experiments.py results_a.zip results_b.zip \
--column I_d --out_file plot.png --label_names "Scenario A,Scenario B"
- plot_experiments.process_zip(zip_path: str, save_feather=False)[source]¶
Extract all CSVs from a ZIP archive and concatenate them into one DataFrame.
Creates a temporary directory next to the ZIP file, extracts all
*.csv entries, reads them (ignoring comment lines starting with #),
adds an "id" column to each, and concatenates the results. The temporary
directory is removed in a finally block regardless of errors.
- Parameters:
  - zip_path (str) – Path to the .zip archive produced by run_multi_experiment.py.
  - save_feather (bool) – If True, the concatenated DataFrame is also saved as a .feather file with the same base name as the ZIP. Defaults to False.
- Returns:
  Concatenated DataFrame with all replicate results and an "id" column identifying each source file.
- Return type:
  pandas.DataFrame
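A minimal sketch of the extract-and-concatenate logic described above, assuming only the standard library plus pandas (the function name `process_zip_sketch` is illustrative, not the script's actual implementation):

```python
import os
import shutil
import tempfile
import zipfile

import pandas as pd

def process_zip_sketch(zip_path: str) -> pd.DataFrame:
    """Concatenate all CSV entries of a ZIP archive into one DataFrame."""
    frames = []
    tmp_dir = tempfile.mkdtemp(dir=os.path.dirname(zip_path) or ".")
    try:
        with zipfile.ZipFile(zip_path) as zf:
            for name in zf.namelist():
                if not name.endswith(".csv"):
                    continue
                path = zf.extract(name, tmp_dir)
                # comment="#" skips the config header lines prepended to each CSV
                df = pd.read_csv(path, comment="#")
                df["id"] = name  # tag each replicate with its source file
                frames.append(df)
    finally:
        # remove the temporary directory even if reading fails
        shutil.rmtree(tmp_dir, ignore_errors=True)
    return pd.concat(frames, ignore_index=True)
```

The finally block mirrors the documented guarantee that the temporary directory is cleaned up regardless of errors.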
- plot_experiments.plot_dfs(dfs, column, figsize, out_path, xlabel, ylabel, labels=None, title=None, ymax=None, use_median=True, use_sd=False, fit_me=None, show_whole_fit=False, day_indices=None, day_labels=None)[source]¶
Create and save a multi-series line plot from a list of DataFrames.
Each DataFrame in dfs is plotted as one line with an uncertainty band
(IQR or SD). An optional fit/observed data curve can be overlaid. Axis
limits are inferred from the data unless overridden.
- Parameters:
  - dfs (list[pandas.DataFrame]) – List of result DataFrames, each containing at minimum a "T" column and the column named by column.
  - column (str) – Name of the y-axis column to plot.
  - figsize (tuple[int, int]) – Figure size (width, height) in inches.
  - out_path (str) – File path where the plot image is saved.
  - xlabel (str) – Label for the x-axis.
  - ylabel (str) – Label for the y-axis.
  - labels (list[str] or None) – Legend labels, one per DataFrame in dfs. Defaults to None (no legend).
  - title (str or None) – Plot title. Defaults to None.
  - ymax (int or None) – Upper limit of the y-axis. Inferred from data if None. Defaults to None.
  - use_median (bool) – Use median as the central estimator when True; use mean otherwise. Defaults to True.
  - use_sd (bool) – Use standard-deviation shading when True; use interquartile-range shading otherwise. Defaults to False.
  - fit_me (pandas.DataFrame or None) – Optional DataFrame with columns "T" and column representing observed/fit data to overlay as an unshaded line. Defaults to None.
  - show_whole_fit (bool) – If True, include fit_me in the x-axis range calculation. Defaults to False.
  - day_indices (list[int] or None) – Positions along the x-axis at which to place custom tick marks. Must be combined with day_labels. Defaults to None.
  - day_labels (list[str] or None) – String labels for the ticks at day_indices. Defaults to None.
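The central-line-plus-band idea can be sketched with plain matplotlib; this is a simplified stand-in for plot_dfs (the function name and reduced signature are illustrative), using the median and IQR shading described above:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

def plot_with_band(dfs, column, out_path, use_median=True, labels=None):
    """Plot one central line plus an IQR uncertainty band per result DataFrame."""
    fig, ax = plt.subplots()
    for i, df in enumerate(dfs):
        grouped = df.groupby("T")[column]
        center = grouped.median() if use_median else grouped.mean()
        lo, hi = grouped.quantile(0.25), grouped.quantile(0.75)  # IQR band
        label = labels[i] if labels else None
        ax.plot(center.index, center.values, label=label)
        ax.fill_between(lo.index, lo.values, hi.values, alpha=0.3)
    if labels:
        ax.legend()
    fig.savefig(out_path)
    plt.close(fig)
```

Grouping by "T" aggregates replicates day by day, so all replicate rows can live in a single concatenated DataFrame.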
preload_graph¶
Command-line script for pre-loading and pickling a contact-network graph.
This script reads a model configuration file, constructs the graph described
in the [GRAPH] section, and saves it as a pickle file at the path given
by the file key in [GRAPH]. On subsequent runs, load_graph
automatically detects the pickle file and skips the (potentially expensive)
CSV parsing step.
Typical usage:
python preload_graph.py config.ini
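The pickle-caching pattern the script relies on can be sketched as follows; the helper name `load_graph_cached` and the callback-based design are assumptions for illustration, not the actual load_graph implementation:

```python
import os
import pickle

def load_graph_cached(csv_path, pickle_path, parse_csv):
    """Return the graph from the pickle if present; otherwise parse and cache it."""
    if os.path.exists(pickle_path):
        with open(pickle_path, "rb") as fh:
            return pickle.load(fh)      # fast path: skip CSV parsing entirely
    graph = parse_csv(csv_path)         # potentially expensive parse on first run
    with open(pickle_path, "wb") as fh:
        pickle.dump(graph, fh)          # cache for subsequent runs
    return graph
```

Running preload_graph.py once plays the role of the slow branch, so every later simulation run takes the fast path.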
run_experiment¶
Command-line script for running a single MAIS epidemic simulation.
This script loads a model configuration from an INI file (or a set of
configurations generated by ConfigFileGenerator), runs the simulation one
or more times, and writes the per-day state history to CSV files.
Typical usage:
python run_experiment.py config.ini my_run_id --n_repeat 5 -R 12345
Output files are written to the directory specified by output_dir in the
config’s [TASK] section (or the current working directory if not set).
Each repetition produces history_<test_id>_<i>.csv (or history.csv
for a single run). Optionally, per-node state histories are saved as
node_states_<test_id>.csv.
- run_experiment.demo(cf, test_id=None, model_random_seed=42, print_interval=1, n_repeat=1)[source]¶
Load, run, and save an epidemic model from a config file.
Builds a ModelM from cf, runs it n_repeat times (resetting between
repetitions), and writes the history CSV for each run. If the config
enables save_node_states, per-node state files are written as well.
- Parameters:
  - cf (ConfigFile) – Loaded configuration object describing the experiment.
  - test_id (str or None) – Tag appended to output file names. Defaults to None (no tag).
  - model_random_seed (int) – Random seed for the first run. Subsequent runs use the same seed (the model is reset, not re-seeded, between repetitions unless a different seed is set explicitly). Defaults to 42.
  - print_interval (int) – How often (in simulated days) to print a progress summary. Defaults to 1.
  - n_repeat (int) – Number of independent repetitions to execute. Defaults to 1.
run_multi_experiment¶
Command-line script for running many parallel MAIS epidemic simulations.
This script is the parallel, multi-repetition counterpart of
run_experiment.py. It uses a Pool of worker processes to run
n_repeat independent stochastic replicates of the same model and writes
the results either into a single .zip archive (one CSV per replicate,
default) or into a combined .feather file.
Typical usage:
python run_multi_experiment.py config.ini my_run --n_repeat 100 --n_jobs 8
Output files are written to the directory specified by output_dir in the
config’s [TASK] section.
- run_multi_experiment.evaluate_model(model, setup)[source]¶
Run one replicate of the model and return its results.
This function is designed to be called by a worker process inside
utils.pool.Pool. It resets the model to a new random seed, runs the
simulation, and returns the resulting DataFrame alongside bookkeeping
information. On AssertionError, the failure is logged to a .FAILED
file and (idx, None, None, None, None) is returned so the pool can
continue.
- Parameters:
  - model (ModelM) – The model instance assigned to this worker.
  - setup (tuple) – A five-element tuple (idx, random_seed, test_id, config, args) where
    - idx (int) – worker index used to route answers back to the pool.
    - random_seed (int or None) – seed to pass to model.reset. None means the seed is unchanged.
    - test_id (str) – tag appended to output file names.
    - config (ConfigFile) – configuration object (used only for error reporting).
    - args (tuple) – (ndays, print_interval, verbose) run parameters.
- Returns:
  (idx, df, deads, random_seed, suffix) where
  - idx (int) – worker index.
  - df (pandas.DataFrame or None) – per-day state history with an added "id" column.
  - deads – always None in the current implementation.
  - random_seed (int) – actual seed used by the model.
  - suffix (str) – file-name suffix derived from test_id.
- Return type:
  tuple
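The error-handling contract above can be sketched as a worker shell; the `run_simulation` callback and the exact suffix format are illustrative assumptions, but the AssertionError handling mirrors the documented behaviour:

```python
def evaluate_model_sketch(run_simulation, setup):
    """Run one replicate; on AssertionError, log it and let the pool continue."""
    idx, random_seed, test_id, config, args = setup
    suffix = f"_{test_id}_{idx}"            # illustrative suffix format
    try:
        df = run_simulation(random_seed, *args)
        return (idx, df, None, random_seed, suffix)
    except AssertionError as exc:
        # append to a .FAILED file so failed replicates can be inspected later
        with open(f"{test_id}.FAILED", "a") as fh:
            fh.write(f"replicate {idx}, seed {random_seed}: {exc}\n")
        return (idx, None, None, None, None)
```

Returning a tuple of the same shape on failure is what lets the pool keep routing results by idx instead of crashing the whole batch.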
- run_multi_experiment.demo(cf, test_id=None, model_random_seed=42, print_interval=1, n_repeat=1, n_jobs=1, output_type='ZIP')[source]¶
Run many parallel epidemic model replicates and aggregate their output.
Loads the graph once, creates n_jobs duplicates of the model, and feeds
tasks into a Pool. Results arrive asynchronously and are written to disk
on the fly (ZIP mode) or collected in memory and written at the end
(FEATHER mode).
- Parameters:
  - cf (ConfigFile) – Loaded configuration object describing the experiment.
  - test_id (str or None) – Tag appended to output file and archive names. Defaults to None (no tag).
  - model_random_seed (int or list[int]) – Base seed (or explicit list of seeds, one per replicate). When an integer, replicate i uses seed model_random_seed + i. Defaults to 42.
  - print_interval (int) – Simulated-day interval for progress output. Defaults to 1.
  - n_repeat (int) – Total number of replicates to run. Defaults to 1.
  - n_jobs (int) – Number of parallel worker processes. Defaults to 1.
  - output_type (str) – Output format – "ZIP" (default) saves each replicate as a separate CSV inside a zip archive; "FEATHER" concatenates all results into a single feather file.
- Raises:
  ValueError – If output_type is not "ZIP" or "FEATHER".
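The seed-expansion rule for model_random_seed can be sketched in a few lines (the helper name is illustrative):

```python
def replicate_seeds(model_random_seed, n_repeat):
    """Expand the seed argument into one seed per replicate."""
    if isinstance(model_random_seed, int):
        # integer base: replicate i gets base + i
        return [model_random_seed + i for i in range(n_repeat)]
    # explicit list: must already provide at least one seed per replicate
    assert len(model_random_seed) >= n_repeat
    return list(model_random_seed[:n_repeat])
```

Distinct per-replicate seeds are what make the replicates independent stochastic runs rather than copies of the same trajectory.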
- run_multi_experiment.save_to_zipped_csv(test_id, idx, df, config, random_seed, suffix, zipname)[source]¶
Append a single replicate’s result DataFrame to a shared ZIP archive.
The CSV content is preceded by the configuration file text (each line
prefixed with #) and the random seed, matching the single-run CSV format
produced by run_experiment.py.
- Parameters:
  - test_id (str) – Experiment tag (used only for internal bookkeeping; the file name is derived from suffix).
  - idx (int) – Replicate index (unused in the current implementation; reserved for future use).
  - df (pandas.DataFrame) – Per-day simulation results to write.
  - config (ConfigFile) – Configuration object whose string representation is prepended as comments.
  - random_seed (int) – Actual random seed used for this replicate.
  - suffix (str) – File-name suffix that forms part of the entry name inside the archive (e.g. "_my_run_3").
  - zipname (str) – Path to the ZIP archive to append to. The archive must already exist (created by the caller).
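A sketch of the append-to-shared-archive pattern, assuming the config is available as plain text; the "history" entry-name prefix is an illustrative choice, not a confirmed detail of the script:

```python
import io
import zipfile

def append_replicate_to_zip(df, config_text, random_seed, suffix, zipname):
    """Append one replicate's CSV (config + seed as # comments) to an existing ZIP."""
    buf = io.StringIO()
    for line in config_text.splitlines():
        buf.write(f"# {line}\n")                # config echoed as a comment header
    buf.write(f"# random_seed = {random_seed}\n")
    df.to_csv(buf, index=False)
    with zipfile.ZipFile(zipname, "a") as zf:   # "a" appends to the shared archive
        zf.writestr(f"history{suffix}.csv", buf.getvalue())
```

Because the header lines start with #, process_zip can later read the entries back with pandas' comment="#" option and recover only the data rows.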
run_search¶
Command-line script for hyperparameter search on MAIS epidemic models.
Loads a model configuration and a hyperparameter search configuration (JSON), then runs the selected search method (grid search or CMA-ES) to minimise a specified loss function (RMSE, MAE, or R²) against observed gold data.
Typical usage:
python run_search.py config.ini gridsearch.json \
--fit_data ../data/fit_data/fit_me.csv \
--return_func rmse --n_jobs 4
Results are saved as a CSV file in the directory specified by --out_dir.
An optional evolution log can be written with -l.
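The grid-search branch of the search can be sketched generically; the callback-based `grid_search` below is an illustrative simplification (the real script drives full simulations), minimising RMSE over every parameter combination:

```python
import itertools

import numpy as np

def rmse(pred, gold):
    """Root-mean-square error between predicted and observed series."""
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(gold)) ** 2)))

def grid_search(run_model, param_grid, gold):
    """Evaluate every parameter combination and return the RMSE-minimising one."""
    best_params, best_loss = None, float("inf")
    keys = sorted(param_grid)
    for values in itertools.product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        loss = rmse(run_model(**params), gold)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss
```

CMA-ES replaces the exhaustive product loop with an adaptive sampler, but the loss-function interface stays the same.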
- run_search.load_gold_data(csv_path, first_n_zeros=0, data_column=2, from_day=0, until_day=None, use_dates=False)[source]¶
Load and pre-process observed (gold) data for model fitting.
Reads a CSV file, aligns its time index, optionally pads the beginning
with zero-valued rows, and returns a DataFrame with "day" and "infected"
columns sliced to the requested day range. Forward-fill is applied to
handle any missing values in the data column.
- Parameters:
  - csv_path (str) – Path to the CSV file containing observed data. The file must have either a "datum" column (parsed as dates when use_dates=True) or a "T" column with day indices.
  - first_n_zeros (int) – Number of zero-valued days to prepend. The existing day indices are shifted by this value. Defaults to 0.
  - data_column (int or str) – Either the integer column position (0-based) or the string column name of the observed values. Defaults to 2.
  - from_day (int) – First row index (0-based) to include after all pre-processing. Defaults to 0.
  - until_day (int or None) – Exclusive upper row index. None means include all remaining rows. Defaults to None.
  - use_dates (bool) – If True, derive day indices from the "datum" column (parsed as dates, with day 0 = the first date). Otherwise use the "T" column. Defaults to False.
- Returns:
  DataFrame with columns "day" (int) and "infected" (float), sliced to [from_day, until_day).
- Return type:
  pandas.DataFrame
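The pad-fill-slice pipeline can be sketched for the non-date case; `preprocess_gold` is an illustrative helper operating on an already-loaded DataFrame, not the actual load_gold_data implementation:

```python
import pandas as pd

def preprocess_gold(df, value_col, first_n_zeros=0, from_day=0, until_day=None):
    """Pad leading zeros, forward-fill gaps, and slice to the requested day range."""
    values = df[value_col].ffill()          # forward-fill missing observations
    days = df["T"] + first_n_zeros          # shift existing day indices
    padded = pd.DataFrame({
        "day": list(range(first_n_zeros)) + list(days),
        "infected": [0.0] * first_n_zeros + list(values.astype(float)),
    })
    # iloc slicing gives the half-open range [from_day, until_day)
    return padded.iloc[from_day:until_day].reset_index(drop=True)
```

Prepending zeros while shifting the day index lets observed data that starts mid-epidemic line up with a simulation that begins at day 0.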