MILP

Training Neural Networks with Mixed-Integer Linear Programming

  • Advisor: Dr. Chen Chen
  • Duration: May 2019 - Aug. 2019
  • Affiliation: Department of Integrated Systems Engineering, Ohio State University
  • Summary: In practice, neural networks are trained with gradient-based methods. However, the structure of the linear threshold activation function lends itself naturally to mixed-integer linear programming (MILP). In this project, we built a MILP model with the Gurobi solver for training neural networks on classification tasks, with the objective of minimizing the 0/1 loss. In the simple perceptron case, the linear threshold can be modeled with big-M disjunction constraints: αᵀx + β < M·r and αᵀx + β ≥ −M(1 − r), where the binary variable r indicates whether the unit fires. Moreover, to work around cycling among parameter configurations with the same sub-optimal loss, we adopt a loose symmetry-breaking technique. We further explore a simple timeout trick that stops the simplex pivoting early, trading a little accuracy for efficiency (see the sketch after this list). Empirical results show that MILP performs consistently better than gradient descent on small- and medium-scale datasets with low to medium network complexity.
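
For concreteness, here is a minimal gurobipy sketch of the perceptron formulation described above, on a toy dataset. The big-M value, the ε margin, the [−1, 1] weight bounds (used here as a hypothetical stand-in for the loose symmetry breaking, since they remove the scale invariance of (α, β)), and the 60-second `TimeLimit` are illustrative assumptions, not the project's actual settings.

```python
import gurobipy as gp
from gurobipy import GRB
import numpy as np

# Toy linearly separable data (logical AND) -- illustrative only.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0, 0, 0, 1])
n, d = X.shape

M = 100.0    # big-M constant, assumed to dominate |alpha^T x + beta|
eps = 1e-4   # small margin that makes the strict inequality enforceable

m = gp.Model("perceptron_milp")
m.Params.TimeLimit = 60  # timeout trick: stop the solve early, keep the incumbent

# Bounding the weights is a hypothetical stand-in for the loose
# symmetry breaking (it rules out arbitrarily rescaled copies of a solution).
alpha = m.addVars(d, lb=-1.0, ub=1.0, name="alpha")  # weights
beta = m.addVar(lb=-1.0, ub=1.0, name="beta")        # bias
r = m.addVars(n, vtype=GRB.BINARY, name="r")         # predicted label per sample

for i in range(n):
    score = gp.quicksum(alpha[j] * X[i, j] for j in range(d)) + beta
    # Big-M disjunction: r[i] = 1 => score >= 0;  r[i] = 0 => score <= -eps
    m.addConstr(score <= M * r[i] - eps * (1 - r[i]))
    m.addConstr(score >= -M * (1 - r[i]))

# 0/1 loss: count samples whose predicted label r[i] differs from y[i].
m.setObjective(
    gp.quicksum(1 - r[i] if y[i] == 1 else r[i] for i in range(n)),
    GRB.MINIMIZE,
)
m.optimize()
print("misclassified:", int(m.ObjVal), [int(r[i].X) for i in range(n)])
```

Because the objective counts misclassifications directly, any incumbent returned at the time limit is still a valid (if possibly sub-optimal) set of perceptron weights, which is what makes the early-stopping trade-off workable.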