Class AnalysisGraph

Class Documentation

class AnalysisGraph

The AnalysisGraph class is the main model/interface for Delphi.

Public Functions

inline AnalysisGraph()
inline ~AnalysisGraph()
std::string to_json_string(int indent = 0)
void set_res(size_t res)
inline void set_n_kde_kernels(size_t kde_kernels)
size_t get_res()
void from_causemos_json_dict(const nlohmann::json &json_data, double belief_score_cutoff, double grounding_score_cutoff)

Construct an AnalysisGraph object from JSON exported by CauseMos.

AnalysisGraph(const AnalysisGraph &rhs)

Copy constructor.

TODO: The copy most probably shares the same random number generator (RNG) instance as the original.

TODO: If a copy of AnalysisGraph does not behave as intended, a member may not have been copied, or may have been copied incorrectly. This is one place to look for bugs.

AnalysisGraph &operator=(AnalysisGraph rhs)

Copy assignment operator (copy-and-swap idiom)
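The copy-and-swap idiom mentioned here can be sketched as follows. This is a minimal illustration, not the AnalysisGraph implementation: `Graph` and its `names_` member are hypothetical stand-ins, since the members of AnalysisGraph are not listed in this section.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a class with value semantics.
class Graph {
public:
  Graph() = default;
  explicit Graph(std::vector<std::string> names) : names_(std::move(names)) {}
  Graph(const Graph& rhs) = default;

  // rhs is taken by value, so the copy is made before the body runs.
  // Swapping then installs the copy; the old state is destroyed with rhs.
  Graph& operator=(Graph rhs) {
    swap(*this, rhs);
    return *this;
  }

  // Member-wise swap; every data member has to be listed here.
  friend void swap(Graph& a, Graph& b) noexcept {
    using std::swap;
    swap(a.names_, b.names_);
  }

  std::size_t size() const { return names_.size(); }

private:
  std::vector<std::string> names_;
};
```

Because the parameter is passed by value, a failed copy throws before the left-hand side is modified, which is the usual reason for choosing this idiom.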

inline bool get_trained()
inline bool get_stopped()
inline double get_log_likelihood()
inline double get_previous_log_likelihood()
inline double get_log_likelihood_MAP()
std::string generate_create_model_response()

Generate the response for the create model request from the HMI. Calculate and return a JSON string with edge weight information for visualizing AnalysisGraph models in CauseMos. For now this always reports success; errors still need to be conveyed in this response.

FormattedProjectionResult run_causemos_projection_experiment_from_json_string(std::string json_string)
FormattedProjectionResult run_causemos_projection_experiment_from_json_file(std::string filename)
unsigned short freeze_edge_weight(std::string source, std::string target, double scaled_weight, int polarity)
Parameters:
  • source – Source concept name

  • target – Target concept name

  • scaled_weight – A value in the range [0, 1]. Delphi edge weights are angles in the range [-π/2, π/2]. Values in the range ]0, π/2[ represent positive polarities and values in the range ]-π/2, 0[ represent negative polarities.

  • polarity – Polarity of the edge. Should be either 1 or -1.

Returns:

  • 0 – Freezing the edge is successful

  • 1 – scaled_weight outside accepted range

  • 2 – Source concept does not exist

  • 4 – Target concept does not exist

  • 8 – Edge does not exist
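A caller may want to turn the return value into readable messages. The sketch below assumes that the non-zero codes, being powers of two, can be OR-ed together when several problems occur at once; the documentation above does not state this. The enum and function names are hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical names for the documented return values.
enum FreezeStatus : unsigned short {
  FREEZE_OK = 0,
  WEIGHT_OUT_OF_RANGE = 1,
  SOURCE_MISSING = 2,
  TARGET_MISSING = 4,
  EDGE_MISSING = 8
};

// Translate a return code into messages.
// Assumption: codes may be combined bitwise; unknown bits are ignored.
std::vector<std::string> describe_freeze_status(unsigned short code) {
  std::vector<std::string> messages;
  if (code == FREEZE_OK) {
    messages.push_back("freezing the edge is successful");
    return messages;
  }
  if (code & WEIGHT_OUT_OF_RANGE)
    messages.push_back("scaled_weight outside accepted range");
  if (code & SOURCE_MISSING)
    messages.push_back("source concept does not exist");
  if (code & TARGET_MISSING)
    messages.push_back("target concept does not exist");
  if (code & EDGE_MISSING)
    messages.push_back("edge does not exist");
  return messages;
}
```

If the codes turn out to be mutually exclusive, the same function still works, since each single code produces exactly one message.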

std::string serialize_to_json_string(bool verbose = true, bool compact = true)
void export_create_model_json_string()
inline size_t num_vertices()

Number of nodes in the graph

inline Node &operator[](std::string node_name)
inline Node &operator[](int v)
inline size_t num_edges()
inline auto edges() const
inline Edge &edge(EdgeDescriptor e)
inline Edge &edge(int source, int target)
inline Edge &edge(int source, std::string target)
inline Edge &edge(std::string source, int target)
inline Edge &edge(std::string source, std::string target)
inline boost::range_detail::integer_iterator<unsigned long> begin()
inline boost::range_detail::integer_iterator<unsigned long> end()
inline Eigen::VectorXd &get_initial_latent_state()
inline double get_MAP_log_likelihood()
int add_node(std::string concept)
bool add_edge(CausalFragment causal_fragment)
void add_edge(CausalFragmentCollection causal_fragments)
std::pair<EdgeDescriptor, bool> add_edge(int, int)
std::pair<EdgeDescriptor, bool> add_edge(int, std::string)
std::pair<EdgeDescriptor, bool> add_edge(std::string, int)
std::pair<EdgeDescriptor, bool> add_edge(std::string, std::string)
void remove_node(std::string concept)
void remove_nodes(std::unordered_set<std::string> concepts)
void remove_edge(std::string src, std::string tgt)
void remove_edges(std::vector<std::pair<std::string, std::string>> edges)
AnalysisGraph get_subgraph_for_concept(std::string concept, bool inward = false, int depth = -1)

Returns the subgraph of the AnalysisGraph around a concept.

Parameters:
  • concept – The concept to center the subgraph about.

  • depth – The maximum number of hops from the concept provided to be included in the subgraph.

  • inward – Sets the direction of the causal influence flow to examine. False (default): a subgraph rooted at the concept provided. True: a subgraph with all the paths ending at the concept provided.

AnalysisGraph get_subgraph_for_concept_pair(std::string source_concept, std::string target_concept, int cutoff = -1)

Returns a new AnalysisGraph, a subgraph of this graph, related to the source concept and the target concept provided. This subgraph contains all the simple directed paths from the source to the target of length less than or equal to the provided cutoff.

Parameters:
  • source_concept – The concept where the influence starts.

  • target_concept – The concept where the influence ends.

  • cutoff – Maximum length of a directed simple path from the source to target to be included in the subgraph.
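The idea of collecting simple directed paths up to a cutoff can be sketched with a depth-first search over a plain adjacency list. This is not Delphi's implementation: the `Adjacency` type and function names are hypothetical, path length is assumed to be counted in edges, and a negative cutoff is assumed to mean "no limit" (suggested by the default of -1, but not stated above).

```cpp
#include <vector>

using Adjacency = std::vector<std::vector<int>>;  // adj[u] lists successors of u
using Path = std::vector<int>;

namespace detail {
// Extend the current path depth-first, recording it when the target is reached.
void extend(const Adjacency& adj, int target, int cutoff, Path& path,
            std::vector<bool>& on_path, std::vector<Path>& found) {
  int current = path.back();
  if (current == target) {
    found.push_back(path);
    return;
  }
  // path.size() - 1 edges have been used so far.
  if (cutoff >= 0 && static_cast<int>(path.size()) - 1 >= cutoff) return;
  for (int next : adj[current]) {
    if (on_path[next]) continue;  // keep the path simple (no repeated vertex)
    on_path[next] = true;
    path.push_back(next);
    extend(adj, target, cutoff, path, on_path, found);
    path.pop_back();
    on_path[next] = false;
  }
}
}  // namespace detail

// All simple directed paths from source to target with at most cutoff edges.
std::vector<Path> simple_paths(const Adjacency& adj, int source, int target,
                               int cutoff = -1) {
  std::vector<Path> found;
  Path path{source};
  std::vector<bool> on_path(adj.size(), false);
  on_path[source] = true;
  detail::extend(adj, target, cutoff, path, on_path, found);
  return found;
}
```

The union of the vertices and edges on the returned paths would then form the subgraph described above.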

void prune(int cutoff = 2)
void merge_nodes(std::string concept_1, std::string concept_2, bool same_polarity = true)

Merges the CAG nodes for the two concepts concept_1 and concept_2 with the option to specify relative polarity.

void change_polarity_of_edge(std::string source_concept, int source_polarity, std::string target_concept, int target_polarity)
int set_indicator(std::string concept, std::string indicator, std::string source)
void delete_indicator(std::string concept, std::string indicator)
void delete_all_indicators(std::string concept)
void map_concepts_to_indicators(int n = 1, std::string country = "")

Map each concept node in the AnalysisGraph instance to one or more tangible quantities, known as ‘indicators’.

Parameters:

n – Int representing number of indicators to attach per node. Default is 1 since our model so far is configured for only 1 indicator per node.

void find_all_paths()
void set_random_seed(int seed)
void set_derivative(std::string, double)
void train_model(int start_year = 2012, int start_month = 1, int end_year = 2017, int end_month = 12, int res = 200, int burn = 10000, std::string country = "South Sudan", std::string state = "", std::string county = "", std::map<std::string, std::string> units = {}, InitialBeta initial_beta = InitialBeta::ZERO, InitialDerivative initial_derivative = InitialDerivative::DERI_ZERO, bool use_heuristic = false, bool use_continuous = true)

Train a prediction model given a CAG with indicators

Parameters:
  • start_year – Start year of the sequence of data

  • start_month – Start month of the sequence of data

  • end_year – End year of the sequence of data

  • end_month – End month of the sequence of data

  • res – Sampling resolution. The number of samples to retain.

  • burn – Number of samples to discard. Samples are retained only after this many have been thrown away.

  • country – Country the data is about

  • state – State the data is about

  • county – County the data is about

  • units – Units for each indicator. Maps indicator name → unit

  • initial_beta – Criteria to initialize β

  • use_heuristic – Informs how to handle missing observations. false => let them be missing. true => fill them. See data.hpp::get_observations_for() for missing data rules.

  • use_continuous – Choose between continuous vs discretized versions of the differential equation solution. Default is to use the continuous version with matrix exponential.
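The relationship between res and burn can be illustrated with a generic sampling loop. This is only a sketch of the burn-in pattern the parameters describe, assuming one sample is retained per iteration once burn-in is over; `retain_after_burn_in` and the `draw_next` callback are hypothetical and do not reflect how Delphi's sampler is organized.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// draw_next is a hypothetical callback that advances a sampler by one step
// and returns the new sample.
std::vector<double> retain_after_burn_in(
    int res, int burn, const std::function<double()>& draw_next) {
  assert(res >= 0 && burn >= 0);
  // Burn-in: advance the sampler but throw the samples away.
  for (int i = 0; i < burn; ++i) draw_next();
  // Retain the next res samples.
  std::vector<double> kept;
  kept.reserve(static_cast<std::size_t>(res));
  for (int i = 0; i < res; ++i) kept.push_back(draw_next());
  return kept;
}
```

With the defaults shown in the signature (res = 200, burn = 10000), such a loop would take 10200 sampler steps and keep the last 200.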

void run_train_model(int res = 200, int burn = 10000, HeadNodeModel head_node_model = HeadNodeModel::HNM_NAIVE, InitialBeta initial_beta = InitialBeta::ZERO, InitialDerivative initial_derivative = InitialDerivative::DERI_ZERO, bool use_heuristic = false, bool use_continuous = true, int train_start_timestep = 0, int train_timesteps = -1, std::unordered_map<std::string, int> concept_periods = {}, std::unordered_map<std::string, std::string> concept_center_measures = {}, std::unordered_map<std::string, std::string> concept_models = {}, std::unordered_map<std::string, double> concept_min_vals = {}, std::unordered_map<std::string, double> concept_max_vals = {}, std::unordered_map<std::string, std::function<double(unsigned int, double)>> ext_concepts = {})
void run_train_model_2(int res = 200, int burn = 10000, InitialBeta initial_beta = InitialBeta::ZERO, InitialDerivative initial_derivative = InitialDerivative::DERI_ZERO, bool use_heuristic = false, bool use_continuous = true)
inline void set_initial_latent_state(Eigen::VectorXd vec)
void set_default_initial_state(InitialDerivative id = InitialDerivative::DERI_ZERO)
Prediction generate_prediction(int start_year, int start_month, int end_year, int end_month, ConstraintSchedule constraints = ConstraintSchedule(), bool one_off = true, bool clamp_deri = true)

Given a trained model, generate this->res number of predicted observed state sequences.

Parameters:
  • start_year – Start year of the prediction. Should be >= the start year of training.

  • start_month – Start month of the prediction. If the training and prediction start years are equal, should be >= the start month of training.

  • end_year – End year of the prediction

  • end_month – End month of the prediction

Returns:

Predicted observed state (indicator value) sequence for the prediction period including start and end time points. This is a tuple. The first element is a std::vector of std::strings with labels for each time point predicted (year-month). The second element contains predicted values. Access it as: [ sample number ][ time point ][ vertex name ][ indicator name ]
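Since the prediction period includes both the start and the end time points, the number of labels is the inclusive month count. The sketch below generates one label per month; the label format (year, hyphen, unpadded month) is an assumption, as the text above only says the labels are year-month, and `month_labels` is a hypothetical helper.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One label per month from (start_year, start_month) to
// (end_year, end_month), both ends included.
// Assumption: labels look like "2017-11"; the real format may differ.
std::vector<std::string> month_labels(int start_year, int start_month,
                                      int end_year, int end_month) {
  assert(start_month >= 1 && start_month <= 12);
  assert(end_month >= 1 && end_month <= 12);
  std::vector<std::string> labels;
  int year = start_year;
  int month = start_month;
  while (year < end_year || (year == end_year && month <= end_month)) {
    labels.push_back(std::to_string(year) + "-" + std::to_string(month));
    if (++month > 12) {  // roll over into January of the next year
      month = 1;
      ++year;
    }
  }
  return labels;
}
```

For example, a prediction from November 2017 to February 2018 covers four time points.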

void generate_prediction(int pred_start_timestep, int pred_timesteps, ConstraintSchedule constraints = ConstraintSchedule(), bool one_off = true, bool clamp_deri = true)
std::vector<std::vector<double>> prediction_to_array(std::string indicator)

this->generate_prediction() must be called before calling this method. Outputs raw predictions for a given indicator that were generated by generate_prediction(). Each column is a time step and the rows are the samples for that time step.

Parameters:

indicator – A std::string representing the indicator variable for which we want predictions.

Returns:

A (this->res × this->pred_timesteps) 2D array (std::vector of std::vectors)
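Given an array shaped as described, with one row per sample and one column per time step, a per-time-step summary can be computed by iterating over rows. The sketch below averages over samples; `mean_per_timestep` is a hypothetical helper, not part of AnalysisGraph.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// samples[s][t] is the predicted value of sample s at time step t.
// Returns the mean over samples for each time step.
std::vector<double> mean_per_timestep(
    const std::vector<std::vector<double>>& samples) {
  assert(!samples.empty());
  const std::size_t timesteps = samples.front().size();
  std::vector<double> means(timesteps, 0.0);
  for (const auto& row : samples) {
    assert(row.size() == timesteps);  // every sample covers every time step
    for (std::size_t t = 0; t < timesteps; ++t) means[t] += row[t];
  }
  for (double& m : means) m /= static_cast<double>(samples.size());
  return means;
}
```

The result has one entry per time step, so it can be plotted directly against the prediction timeline.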

void generate_synthetic_data(unsigned int num_obs = 48, double noise_variance = 0.1, unsigned int kde_kernels = 1000, InitialBeta initial_beta = InitialBeta::PRIOR, InitialDerivative initial_derivative = InitialDerivative::DERI_PRIOR, bool use_continuous = false)
void initialize_random_CAG(unsigned int num_obs, unsigned int kde_kernels, InitialBeta initial_beta, InitialDerivative initial_derivative, bool use_continuous)

TODO: This is very similar to the initialize_parameters() method defined in parameter_initialization.cpp. It might be possible to merge the two.

Parameters:
  • kde_kernels – Number of KDE kernels to use when constructing beta prior distributions

  • initial_beta – How to initialize betas

  • initial_derivative – How to initialize derivatives

  • use_continuous – Whether to use matrix exponential or not

void interpolate_missing_months(std::vector<int> &filled_months, Node &n)
std::string to_dot()

Output the graph in DOT format
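For readers unfamiliar with DOT, the sketch below shows what a directed graph looks like in that format. It is not the output of to_dot(), whose node and edge attributes are not described here; `edges_to_dot` is a hypothetical helper, and it assumes concept names contain no double quotes.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Write a list of (source, target) edges as a DOT digraph.
// Assumption: names contain no '"' characters, so no escaping is done.
std::string edges_to_dot(
    const std::vector<std::pair<std::string, std::string>>& edges) {
  std::ostringstream out;
  out << "digraph G {\n";
  for (const auto& e : edges) {
    out << "  \"" << e.first << "\" -> \"" << e.second << "\";\n";
  }
  out << "}\n";
  return out.str();
}
```

A string in this form can be passed to the Graphviz `dot` tool to render an image.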

void to_png(std::string filename = "CAG.png", bool simplified_labels = false, int label_depth = 1, std::string node_to_highlight = "", std::string rankdir = "TB")
Parameters:
  • simplified_labels – Whether to create simplified labels or not.

  • label_depth – Depth in the ontology to which simplified labels extend

void print_nodes()
void print_edges()
void print_name_to_vertex()
void print_indicators()
void print_A_beta_factors()
void print_latent_state(const Eigen::VectorXd&)
void print_all_paths()
void print_cells_affected_by_beta(int source, int target)
void print_training_range()
void print_MAP_estimate()
CredibleIntervals get_credible_interval(Predictions preds)
CompleteState get_complete_state()
sqlite3 *open_delphi_db(int mode = SQLITE_OPEN_READONLY)
void write_model_to_db(std::string model_id)
AdjectiveResponseMap construct_adjective_response_map(std::mt19937 gen, std::uniform_real_distribution<double> &uni_dist, std::normal_distribution<double> &norm_dist, size_t n_kernels)

This is a helper function used by construct_theta_pdfs()

void initialize_profiler(int res = 100, int kde_kernels = 1000, InitialBeta initial_beta = InitialBeta::ZERO, InitialDerivative initial_derivative = InitialDerivative::DERI_ZERO, bool use_continuous = true)
void profile_mcmc(int run = 1, std::string file_name_prefix = "mcmc_timing")
void profile_kde(int run = 1, std::string file_name_prefix = "kde_timing")
void profile_prediction(int run = 1, int pred_timesteps = 24, std::string file_name_prefix = "prediction_timing")
void profile_matrix_exponential(int run = 1, std::string file_name_prefix = "mat_exp_timing", std::vector<double> unique_gaps = {1, 2, 5}, int repeat = 30, bool multi_threaded = false)

Public Members

std::string id
std::string experiment_id = "experiment_id_not_set"
bool data_heuristic = false

Public Static Functions

static AnalysisGraph from_indra_statements_json_dict(nlohmann::json json_data, double belief_score_cutoff = 0.9, double grounding_score_cutoff = 0.0, std::string ontology = "WM")

A method to construct an AnalysisGraph object given a JSON-serialized list of INDRA statements.

Parameters:

json_data – The JSON-serialized INDRA statements.

static AnalysisGraph from_indra_statements_json_string(std::string json_string, double belief_score_cutoff = 0.9, double grounding_score_cutoff = 0.0, std::string ontology = "WM")
static AnalysisGraph from_indra_statements_json_file(std::string filename, double belief_score_cutoff = 0.9, double grounding_score_cutoff = 0.0, std::string ontology = "WM")
static AnalysisGraph from_causal_fragments(std::vector<CausalFragment> causal_fragments)

A method to construct an AnalysisGraph object from a std::vector of ( subject, object ) pairs (Statements).

Parameters:

causal_fragments – A std::vector of CausalFragment objects

static AnalysisGraph from_causal_fragments_with_data(std::pair<std::vector<CausalFragment>, ConceptIndicatorAlignedData> cag_ind_data, int kde_kernels = 5)
static AnalysisGraph from_json_string(std::string)

Construct an AnalysisGraph object from the internal string representation output by to_json_string.

static AnalysisGraph from_causemos_json_string(std::string json_string, double belief_score_cutoff = 0, double grounding_score_cutoff = 0, int kde_kernels = 4)

Construct an AnalysisGraph object from a JSON string exported by CauseMos.

static AnalysisGraph from_causemos_json_file(std::string filename, double belief_score_cutoff = 0, double grounding_score_cutoff = 0, int kde_kernels = 4)

Construct an AnalysisGraph object from a file containing JSON data from CauseMos.

static AnalysisGraph deserialize_from_json_string(std::string json_string, bool verbose = true)
static AnalysisGraph deserialize_from_json_file(std::string filename, bool verbose = true)
static void check_multithreading()
static AnalysisGraph generate_random_CAG(unsigned int num_nodes, unsigned int num_extra_edges = 0)