Question: Support for synthetic scale-free graph generation #29

@seongwoohan

Description

Hello SDCD developers! I'm following the SDCD tutorial to generate synthetic data and was wondering whether it is possible to generate scale-free graphs as well. From the SDCD paper, it looks like the experiments are evaluated on Erdős–Rényi (ER) graphs, not scale-free graphs. Could you please confirm that this is correct?

In the simulation function random_model_gaussian_global_variance, the dag_type argument supports "ER" but not "scale_free". However, I noticed that the lower-level random_dag function does include a "scale_free" option.

Is there a recommended way to generate scale-free graphs with the SDCD simulation utilities? Or is the most straightforward approach to pass distribution="scale_free" to random_dag and have random_model_gaussian_global_variance call it, as in the snippet below? I'd like to confirm the best practice for generating scale-free graphs, since the tutorial only demonstrates the ER case!

Thanks very much for your guidance!

import networkx as nx
import numpy as np

def random_dag(n_nodes: int = 20, n_edges: int = 20, distribution: str = "scale_free"):
    """Return a random DAG.

    Args:
        n_nodes: Number of nodes.
        n_edges: Number of edges (only used for uniform distribution).
        distribution: Distribution of the random graph, one of "uniform" (or "erdos_renyi") or "scale_free".
    """
    if distribution in ["uniform", "erdos_renyi"]:
        graph = nx.gnm_random_graph(n_nodes, n_edges, directed=False)
    elif distribution == "scale_free":
        graph = nx.scale_free_graph(n_nodes, alpha=0.41, beta=0.54, gamma=0.05)
    else:
        raise ValueError(f"Unknown distribution {distribution}.")

    return random_dag_from_undirected_graph(graph)

np.random.seed(42)

n = 10000
n_per_intervention = 500
d = 50
n_edges = 200   # d * s, i.e. s = 4 edges per node on average


true_causal_model = random_model_gaussian_global_variance(
    d,
    n_edges,
    dag_type="ER",
    scale=0.5,
    hard=True,
)

X_df = true_causal_model.generate_dataframe_from_all_distributions(
    n_samples_control=n,
    n_samples_per_intervention=n_per_intervention,
)
X_df.iloc[:, :-1] = (X_df.iloc[:, :-1] - X_df.iloc[:, :-1].mean()) / X_df.iloc[
    :, :-1
].std()  # Normalize the data (all columns except the intervention label)
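
In case it helps to make the question concrete, here is a minimal, self-contained sketch of the kind of thing I have in mind. It only uses networkx and numpy; scale_free_dag and the random-order edge orientation are my own illustration of one way to turn nx.scale_free_graph output into a DAG, not a claim about how SDCD's random_dag_from_undirected_graph actually works:

```python
import networkx as nx
import numpy as np

def scale_free_dag(n_nodes: int = 20, seed: int = 0) -> nx.DiGraph:
    """Sketch: draw a scale-free graph and orient its edges into a DAG."""
    # Same alpha/beta/gamma as in the random_dag snippet above;
    # nx.scale_free_graph returns a MultiDiGraph.
    g = nx.scale_free_graph(n_nodes, alpha=0.41, beta=0.54, gamma=0.05, seed=seed)

    # Collapse to a simple undirected graph, dropping self-loops
    # and parallel edges.
    und = nx.Graph(g.to_undirected())
    und.remove_edges_from(list(nx.selfloop_edges(und)))

    # Orient every edge along a random node ordering; orienting edges
    # consistently with any total order guarantees acyclicity.
    rng = np.random.default_rng(seed)
    order = {node: rank for rank, node in enumerate(rng.permutation(n_nodes))}
    dag = nx.DiGraph()
    dag.add_nodes_from(und.nodes)
    dag.add_edges_from(
        (u, v) if order[u] < order[v] else (v, u) for u, v in und.edges
    )
    return dag

dag = scale_free_dag(n_nodes=50, seed=42)
assert nx.is_directed_acyclic_graph(dag)
```

If this is roughly what random_dag_from_undirected_graph does internally, then passing distribution="scale_free" through should be safe; I just wanted to check before relying on it.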
