Algorithmic differentiation and vertex elimination on computational hypergraphs.

Keywords

Abstract

Algorithmic differentiation is a numerical method for computing derivatives of functions defined by computer codes. It systematically applies the chain rule to the operations executed during a program evaluation, typically represented by a directed acyclic graph. The vertices of this graph correspond to intermediate scalar values computed and stored during the program execution, and edges represent the data dependencies between these values. The process of accumulating derivatives involves successively eliminating vertices in a specific order within this graph. While the graph model captures perfectly the situation where the program executes scalar-valued operations with one or two input scalars, it does not immediately allow for raising the abstraction to a coarser level. The major contribution of this article is a novel combinatorial model that is capable of representing algorithmic differentiation for more general operations. Specifically, the model accommodates operations with multiple-input and multiple-output relations as well as intermediate values that are non-scalar. The new combinatorial model is based on computational hypergraphs, a novel class of directed hypergraphs where vertices represent non-scalar intermediate values and hyperedges correspond to operations involving multiple inputs and outputs-essentially, subroutines. Accumulating derivatives then corresponds to vertex elimination within these computational hypergraphs. In contrast to the graph model, the new elimination rules in the hypergraph model differ.