Department of Mathematics
 Search | Help | Login

Math @ Duke





.......................

.......................


Publications [#371705] of Shijun Zhang

Papers Published

  1. Shen, Z; Yang, H; Zhang, S, Neural Network Architecture Beyond Width and Depth, Advances in Neural Information Processing Systems, vol. 35 (January, 2022), ISBN 9781713871088
    (last updated on 2024/06/29)

    Abstract:
    This paper proposes a new neural network architecture by introducing an additional dimension called height beyond width and depth. Neural network architectures with height, width, and depth as hyper-parameters are called three-dimensional architectures. It is shown that neural networks with three-dimensional architectures are significantly more expressive than the ones with two-dimensional architectures (those with only width and depth as hyper-parameters), e.g., standard fully connected networks. The new network architecture is constructed recursively via a nested structure, and hence we call a network with the new architecture nested network (NestNet). A NestNet of height s is built with each hidden neuron activated by a NestNet of height ≤ s−1. When s = 1, a NestNet degenerates to a standard network with a two-dimensional architecture. It is proved by construction that height-s ReLU NestNets with O(n) parameters can approximate 1-Lipschitz continuous functions on [0,1]d with an error O(n−(s+1)/d), while the optimal approximation error of standard ReLU networks with O(n) parameters is O(n−2/d). Furthermore, such a result is extended to generic continuous functions on [0,1]d with the approximation error characterized by the modulus of continuity. Finally, we use numerical experimentation to show the advantages of the super-approximation power of ReLU NestNets.

 

dept@math.duke.edu
ph: 919.660.2800
fax: 919.660.2821

Mathematics Department
Duke University, Box 90320
Durham, NC 27708-0320