Notes on "DAGs with NO TEARS: Continuous Optimization for Structure Learning"
The paper contributes to score-based causal discovery, where structure learning is posed as optimizing a score function under a combinatorial constraint. The key idea is that the combinatorial constraint, acyclicity, can be recast as a smooth equality constraint on the weighted adjacency matrix (in the linear case, h(W) = tr(e^{W∘W}) − d = 0), which turns structure learning into a continuous optimization problem. The resulting problem can then be solved with standard numerical algorithms such as the augmented Lagrangian method.
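To make the continuous formulation concrete, below is a minimal NumPy/SciPy sketch (not the authors' reference implementation) of the two pieces it combines: a least-squares score of a linear SEM over the weighted adjacency matrix W, and the smooth acyclicity function h(W) = tr(e^{W∘W}) − d from the paper, which is zero exactly when W encodes a DAG. The regularization weight and the toy data are illustrative choices, not values from the paper.

```python
import numpy as np
from scipy.linalg import expm


def least_squares_score(W, X, lam=0.1):
    """Least-squares score of a candidate adjacency matrix W on data X (n x d):
    reconstruction error of the linear SEM X = XW + noise, plus an l1 penalty.
    lam is an illustrative choice here, not a value prescribed by the paper."""
    n = X.shape[0]
    residual = X - X @ W
    return 0.5 / n * np.sum(residual ** 2) + lam * np.abs(W).sum()


def acyclicity(W):
    """Smooth acyclicity function h(W) = tr(e^{W ∘ W}) - d and its gradient
    (e^{W ∘ W})^T ∘ 2W; h(W) = 0 if and only if W encodes a DAG."""
    d = W.shape[0]
    E = expm(W * W)            # matrix exponential of the elementwise square
    h = np.trace(E) - d
    grad = E.T * 2 * W
    return h, grad


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W_dag = np.triu(rng.normal(size=(3, 3)), k=1)   # strictly upper triangular => DAG
    W_cyc = W_dag + 0.5 * np.eye(3, k=-2)           # extra edge closes a cycle
    # Sample from the linear SEM X = XW + E, i.e. X = E (I - W)^{-1}
    X = rng.normal(size=(100, 3)) @ np.linalg.inv(np.eye(3) - W_dag)
    print("h(DAG)    =", acyclicity(W_dag)[0])      # ~0
    print("h(cyclic) =", acyclicity(W_cyc)[0])      # > 0
    print("score     =", least_squares_score(W_dag, X))
```

In the paper, the score is minimized subject to h(W) = 0 via an augmented Lagrangian, with the gradient of h above supplied to an off-the-shelf numerical solver.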
- In the linear case, this is an optimization problem over a matrix that represents the graph structure (see the sketch above). In (post-)nonlinear cases, this advantage may disappear.
- What is the philosophy behind score-based methods? Why can an edge recovered this way be interpreted as a causal relation? If we remove the acyclicity constraint, what does the result mean?
- A score function should be coherent with the optimization step. In other words, the score of a graph structure should tell us something about the correct structure we are looking for. This information is built into the score function but is not fully exploited; can we make better use of it to search over structures? The gradient is one useful guide, but what other kinds of guides could there be?
- What could be the relation between AutoML and score-based methods?