Scripts¶
scripts (top level)¶
animate¶
Interactive animation of per-node epidemic states using graph-tool.
This script visualises a pre-computed node-state time-series (produced by
enabling save_node_states in a simulation run) as an animated graph
rendered with graph-tool and GTK.
Each node is rendered as a gender-specific icon (lady / man); nodes that are in an infectious state are highlighted with a red halo. Nodes that die are replaced with a zombie icon. The animation advances one day every two seconds.
Requires graph-tool, cairo, and the GTK bindings to be installed.
Typical usage:
python animate.py config.ini --nodes_file ../data/output/model/node_states.csv
plot_experiments¶
Command-line script for plotting MAIS simulation results.
Reads one or more simulation result files (.csv, .zip, or
.feather), computes aggregate statistics (median/mean with IQR or SD
shading), and saves the resulting plot. An optional fit/observed-data curve
can be overlaid.
Typical usage:
python plot_experiments.py results_a.zip results_b.zip \
--column I_d --out_file plot.png --label_names "Scenario A,Scenario B"
- plot_experiments.process_zip(zip_path: str, save_feather=False)[source]¶
Extract all CSVs from a ZIP archive and concatenate them into one DataFrame.
Creates a temporary directory next to the ZIP file, extracts all
*.csv entries, reads them (ignoring comment lines starting with #),
adds an "id" column to each, and concatenates the results. The temporary
directory is removed in a finally block regardless of errors.
- Parameters:
  - zip_path (str) – Path to the .zip archive produced by run_multi_experiment.py.
  - save_feather (bool) – If True, the concatenated DataFrame is also saved as a .feather file with the same base name as the ZIP. Defaults to False.
- Returns:
  Concatenated DataFrame with all replicate results and an "id" column identifying each source file.
- Return type:
  pandas.DataFrame
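A minimal sketch of the extract-and-concatenate logic described above, assuming only the standard library plus pandas (the function name `process_zip_sketch` is illustrative, not the script's actual implementation):

```python
import os
import shutil
import tempfile
import zipfile

import pandas as pd

def process_zip_sketch(zip_path: str) -> pd.DataFrame:
    """Concatenate all CSV entries of a ZIP archive into one DataFrame."""
    frames = []
    tmp_dir = tempfile.mkdtemp(dir=os.path.dirname(zip_path) or ".")
    try:
        with zipfile.ZipFile(zip_path) as zf:
            for name in zf.namelist():
                if not name.endswith(".csv"):
                    continue
                path = zf.extract(name, tmp_dir)
                # comment="#" skips the config header lines prepended to each CSV
                df = pd.read_csv(path, comment="#")
                df["id"] = name  # tag each replicate with its source file
                frames.append(df)
    finally:
        # remove the temporary directory even if reading fails
        shutil.rmtree(tmp_dir, ignore_errors=True)
    return pd.concat(frames, ignore_index=True)
```

The finally block mirrors the documented guarantee that the temporary directory is cleaned up regardless of errors.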
- plot_experiments.plot_dfs(dfs, column, figsize, out_path, xlabel, ylabel, labels=None, title=None, ymax=None, use_median=True, use_sd=False, fit_me=None, show_whole_fit=False, day_indices=None, day_labels=None)[source]¶
Create and save a multi-series line plot from a list of DataFrames.
Each DataFrame in dfs is plotted as one line with an uncertainty band
(IQR or SD). An optional fit/observed data curve can be overlaid. Axis
limits are inferred from the data unless overridden.
- Parameters:
  - dfs (list[pandas.DataFrame]) – List of result DataFrames, each containing at minimum a "T" column and the column named by column.
  - column (str) – Name of the y-axis column to plot.
  - figsize (tuple[int, int]) – Figure size (width, height) in inches.
  - out_path (str) – File path where the plot image is saved.
  - xlabel (str) – Label for the x-axis.
  - ylabel (str) – Label for the y-axis.
  - labels (list[str] or None) – Legend labels, one per DataFrame in dfs. Defaults to None (no legend).
  - title (str or None) – Plot title. Defaults to None.
  - ymax (int or None) – Upper limit of the y-axis. Inferred from data if None. Defaults to None.
  - use_median (bool) – Use median as the central estimator when True; use mean otherwise. Defaults to True.
  - use_sd (bool) – Use standard-deviation shading when True; use interquartile-range shading otherwise. Defaults to False.
  - fit_me (pandas.DataFrame or None) – Optional DataFrame with columns "T" and column representing observed/fit data to overlay as an unshaded line. Defaults to None.
  - show_whole_fit (bool) – If True, include fit_me in the x-axis range calculation. Defaults to False.
  - day_indices (list[int] or None) – Positions along the x-axis at which to place custom tick marks. Must be combined with day_labels. Defaults to None.
  - day_labels (list[str] or None) – String labels for the ticks at day_indices. Defaults to None.
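The central-line-plus-band idea can be sketched with plain matplotlib; this is a simplified stand-in for plot_dfs (the function name and reduced signature are illustrative), using the median and IQR shading described above:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

def plot_with_band(dfs, column, out_path, use_median=True, labels=None):
    """Plot one central line plus an IQR uncertainty band per result DataFrame."""
    fig, ax = plt.subplots()
    for i, df in enumerate(dfs):
        grouped = df.groupby("T")[column]
        center = grouped.median() if use_median else grouped.mean()
        lo, hi = grouped.quantile(0.25), grouped.quantile(0.75)  # IQR band
        label = labels[i] if labels else None
        ax.plot(center.index, center.values, label=label)
        ax.fill_between(lo.index, lo.values, hi.values, alpha=0.3)
    if labels:
        ax.legend()
    fig.savefig(out_path)
    plt.close(fig)
```

Grouping by "T" aggregates replicates day by day, so all replicate rows can live in a single concatenated DataFrame.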
preload_graph¶
Command-line script for pre-loading and pickling a contact-network graph.
This script reads a model configuration file, constructs the graph described
in the [GRAPH] section, and saves it as a pickle file at the path given
by the file key in [GRAPH]. On subsequent runs, load_graph
automatically detects the pickle file and skips the (potentially expensive)
CSV parsing step.
Typical usage:
python preload_graph.py config.ini
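The pickle-caching pattern the script relies on can be sketched as follows; the helper name `load_graph_cached` and the callback-based design are assumptions for illustration, not the actual load_graph implementation:

```python
import os
import pickle

def load_graph_cached(csv_path, pickle_path, parse_csv):
    """Return the graph from the pickle if present; otherwise parse and cache it."""
    if os.path.exists(pickle_path):
        with open(pickle_path, "rb") as fh:
            return pickle.load(fh)      # fast path: skip CSV parsing entirely
    graph = parse_csv(csv_path)         # potentially expensive parse on first run
    with open(pickle_path, "wb") as fh:
        pickle.dump(graph, fh)          # cache for subsequent runs
    return graph
```

Running preload_graph.py once plays the role of the slow branch, so every later simulation run takes the fast path.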
run_experiment¶
Command-line script for running a single MAIS epidemic simulation.
This script loads a model configuration from an INI file (or a set of
configurations generated by ConfigFileGenerator), runs the simulation one
or more times, and writes the per-day state history to CSV files.
Typical usage:
python run_experiment.py config.ini my_run_id --n_repeat 5 -R 12345
Output files are written to the directory specified by output_dir in the
config’s [TASK] section (or the current working directory if not set).
Each repetition produces history_<test_id>_<i>.csv (or history.csv
for a single run). Optionally, per-node state histories are saved as
node_states_<test_id>.csv.
- run_experiment.demo(cf, test_id=None, model_random_seed=42, print_interval=1, n_repeat=1)[source]¶
Load, run, and save an epidemic model from a config file.
Builds a ModelM from cf, runs it n_repeat times (resetting between
repetitions), and writes the history CSV for each run. If the config
enables save_node_states, per-node state files are written as well.
- Parameters:
  - cf (ConfigFile) – Loaded configuration object describing the experiment.
  - test_id (str or None) – Tag appended to output file names. Defaults to None (no tag).
  - model_random_seed (int) – Random seed for the first run. Subsequent runs use the same seed (the model is reset, not re-seeded, between repetitions unless a different seed is set explicitly). Defaults to 42.
  - print_interval (int) – How often (in simulated days) to print a progress summary. Defaults to 1.
  - n_repeat (int) – Number of independent repetitions to execute. Defaults to 1.
run_multi_experiment¶
Command-line script for running many parallel MAIS epidemic simulations.
This script is the parallel, multi-repetition counterpart of
run_experiment.py. It uses a Pool of worker processes to run
n_repeat independent stochastic replicates of the same model and writes
the results either into a single .zip archive (one CSV per replicate,
default) or into a combined .feather file.
Typical usage:
python run_multi_experiment.py config.ini my_run --n_repeat 100 --n_jobs 8
Output files are written to the directory specified by output_dir in the
config’s [TASK] section.
- run_multi_experiment.evaluate_model(model, setup)[source]¶
Run one replicate of the model and return its results.
This function is designed to be called by a worker process inside
utils.pool.Pool. It resets the model to a new random seed, runs the
simulation, and returns the resulting DataFrame alongside bookkeeping
information. On AssertionError, the failure is logged to a .FAILED
file and (idx, None, None, None, None) is returned so the pool can
continue.
- Parameters:
  - model (ModelM) – The model instance assigned to this worker.
  - setup (tuple) – A five-element tuple (idx, random_seed, test_id, config, args) where
    - idx (int) – worker index used to route answers back to the pool.
    - random_seed (int or None) – seed to pass to model.reset. None means the seed is unchanged.
    - test_id (str) – tag appended to output file names.
    - config (ConfigFile) – configuration object (used only for error reporting).
    - args (tuple) – (ndays, print_interval, verbose) run parameters.
- Returns:
  (idx, df, deads, random_seed, suffix) where
  - idx (int) – worker index.
  - df (pandas.DataFrame or None) – per-day state history with an added "id" column.
  - deads – always None in the current implementation.
  - random_seed (int) – actual seed used by the model.
  - suffix (str) – file-name suffix derived from test_id.
- Return type:
  tuple
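The error-handling contract above can be sketched as a worker shell; the `run_simulation` callback and the exact suffix format are illustrative assumptions, but the AssertionError handling mirrors the documented behaviour:

```python
def evaluate_model_sketch(run_simulation, setup):
    """Run one replicate; on AssertionError, log it and let the pool continue."""
    idx, random_seed, test_id, config, args = setup
    suffix = f"_{test_id}_{idx}"            # illustrative suffix format
    try:
        df = run_simulation(random_seed, *args)
        return (idx, df, None, random_seed, suffix)
    except AssertionError as exc:
        # append to a .FAILED file so failed replicates can be inspected later
        with open(f"{test_id}.FAILED", "a") as fh:
            fh.write(f"replicate {idx}, seed {random_seed}: {exc}\n")
        return (idx, None, None, None, None)
```

Returning a tuple of the same shape on failure is what lets the pool keep routing results by idx instead of crashing the whole batch.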
- run_multi_experiment.demo(cf, test_id=None, model_random_seed=42, print_interval=1, n_repeat=1, n_jobs=1, output_type='ZIP')[source]¶
Run many parallel epidemic model replicates and aggregate their output.
Loads the graph once, creates n_jobs duplicates of the model, and feeds
tasks into a Pool. Results arrive asynchronously and are written to disk
on the fly (ZIP mode) or collected in memory and written at the end
(FEATHER mode).
- Parameters:
  - cf (ConfigFile) – Loaded configuration object describing the experiment.
  - test_id (str or None) – Tag appended to output file and archive names. Defaults to None (no tag).
  - model_random_seed (int or list[int]) – Base seed (or explicit list of seeds, one per replicate). When an integer, replicate i uses seed model_random_seed + i. Defaults to 42.
  - print_interval (int) – Simulated-day interval for progress output. Defaults to 1.
  - n_repeat (int) – Total number of replicates to run. Defaults to 1.
  - n_jobs (int) – Number of parallel worker processes. Defaults to 1.
  - output_type (str) – Output format – "ZIP" (default) saves each replicate as a separate CSV inside a zip archive; "FEATHER" concatenates all results into a single feather file.
- Raises:
  ValueError – If output_type is not "ZIP" or "FEATHER".
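The seed-expansion rule for model_random_seed can be sketched in a few lines (the helper name is illustrative):

```python
def replicate_seeds(model_random_seed, n_repeat):
    """Expand the seed argument into one seed per replicate."""
    if isinstance(model_random_seed, int):
        # integer base: replicate i gets base + i
        return [model_random_seed + i for i in range(n_repeat)]
    # explicit list: must already provide at least one seed per replicate
    assert len(model_random_seed) >= n_repeat
    return list(model_random_seed[:n_repeat])
```

Distinct per-replicate seeds are what make the replicates independent stochastic runs rather than copies of the same trajectory.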
- run_multi_experiment.save_to_zipped_csv(test_id, idx, df, config, random_seed, suffix, zipname)[source]¶
Append a single replicate’s result DataFrame to a shared ZIP archive.
The CSV content is preceded by the configuration file text (each line
prefixed with #) and the random seed, matching the single-run CSV format
produced by run_experiment.py.
- Parameters:
  - test_id (str) – Experiment tag (used only for internal bookkeeping; the file name is derived from suffix).
  - idx (int) – Replicate index (unused in the current implementation; reserved for future use).
  - df (pandas.DataFrame) – Per-day simulation results to write.
  - config (ConfigFile) – Configuration object whose string representation is prepended as comments.
  - random_seed (int) – Actual random seed used for this replicate.
  - suffix (str) – File-name suffix that forms part of the entry name inside the archive (e.g. "_my_run_3").
  - zipname (str) – Path to the ZIP archive to append to. The archive must already exist (created by the caller).
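A sketch of the append-to-shared-archive pattern, assuming the config is available as plain text; the "history" entry-name prefix is an illustrative choice, not a confirmed detail of the script:

```python
import io
import zipfile

def append_replicate_to_zip(df, config_text, random_seed, suffix, zipname):
    """Append one replicate's CSV (config + seed as # comments) to an existing ZIP."""
    buf = io.StringIO()
    for line in config_text.splitlines():
        buf.write(f"# {line}\n")                # config echoed as a comment header
    buf.write(f"# random_seed = {random_seed}\n")
    df.to_csv(buf, index=False)
    with zipfile.ZipFile(zipname, "a") as zf:   # "a" appends to the shared archive
        zf.writestr(f"history{suffix}.csv", buf.getvalue())
```

Because the header lines start with #, process_zip can later read the entries back with pandas' comment="#" option and recover only the data rows.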
run_search¶
Command-line script for hyperparameter search on MAIS epidemic models.
Loads a model configuration and a hyperparameter search configuration (JSON), then runs the selected search method (grid search or CMA-ES) to minimise a specified loss function (RMSE, MAE, or R²) against observed gold data.
Typical usage:
python run_search.py config.ini gridsearch.json \
--fit_data ../data/fit_data/fit_me.csv \
--return_func rmse --n_jobs 4
Results are saved as a CSV file in the directory specified by --out_dir.
An optional evolution log can be written with -l.
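The grid-search branch of the search can be sketched generically; the callback-based `grid_search` below is an illustrative simplification (the real script drives full simulations), minimising RMSE over every parameter combination:

```python
import itertools

import numpy as np

def rmse(pred, gold):
    """Root-mean-square error between predicted and observed series."""
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(gold)) ** 2)))

def grid_search(run_model, param_grid, gold):
    """Evaluate every parameter combination and return the RMSE-minimising one."""
    best_params, best_loss = None, float("inf")
    keys = sorted(param_grid)
    for values in itertools.product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        loss = rmse(run_model(**params), gold)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss
```

CMA-ES replaces the exhaustive product loop with an adaptive sampler, but the loss-function interface stays the same.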
- run_search.load_gold_data(csv_path, first_n_zeros=0, data_column=2, from_day=0, until_day=None, use_dates=False)[source]¶
Load and pre-process observed (gold) data for model fitting.
Reads a CSV file, aligns its time index, optionally pads the beginning
with zero-valued rows, and returns a DataFrame with "day" and "infected"
columns sliced to the requested day range. Forward-fill is applied to
handle any missing values in the data column.
- Parameters:
  - csv_path (str) – Path to the CSV file containing observed data. The file must have either a "datum" column (parsed as dates when use_dates=True) or a "T" column with day indices.
  - first_n_zeros (int) – Number of zero-valued days to prepend. The existing day indices are shifted by this value. Defaults to 0.
  - data_column (int or str) – Either the integer column position (0-based) or the string column name of the observed values. Defaults to 2.
  - from_day (int) – First row index (0-based) to include after all pre-processing. Defaults to 0.
  - until_day (int or None) – Exclusive upper row index. None means include all remaining rows. Defaults to None.
  - use_dates (bool) – If True, derive day indices from the "datum" column (parsed as dates, with day 0 = the first date). Otherwise use the "T" column. Defaults to False.
- Returns:
  DataFrame with columns "day" (int) and "infected" (float), sliced to [from_day, until_day).
- Return type:
  pandas.DataFrame
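The pad-fill-slice pipeline can be sketched for the non-date case; `preprocess_gold` is an illustrative helper operating on an already-loaded DataFrame, not the actual load_gold_data implementation:

```python
import pandas as pd

def preprocess_gold(df, value_col, first_n_zeros=0, from_day=0, until_day=None):
    """Pad leading zeros, forward-fill gaps, and slice to the requested day range."""
    values = df[value_col].ffill()          # forward-fill missing observations
    days = df["T"] + first_n_zeros          # shift existing day indices
    padded = pd.DataFrame({
        "day": list(range(first_n_zeros)) + list(days),
        "infected": [0.0] * first_n_zeros + list(values.astype(float)),
    })
    # iloc slicing gives the half-open range [from_day, until_day)
    return padded.iloc[from_day:until_day].reset_index(drop=True)
```

Prepending zeros while shifting the day index lets observed data that starts mid-epidemic line up with a simulation that begins at day 0.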