MILP

Training Neural Networks with Mixed-Integer Linear Programming

  • Advisor: Dr. Chen Chen
  • Duration: May 2019 - Aug. 2019
  • Summary: In practice, neural networks are trained with gradient-based methods. However, the structure of the linear-threshold activation function lends itself naturally to mixed-integer linear programming (MILP). In this project, we built an MILP model with the Gurobi solver for training neural networks on classification tasks, with the objective of minimizing 0/1 loss. In the simple perceptron case, the linear threshold can be modeled with disjunctive big-M constraints, αᵀx + β < Mr and αᵀx + β ≥ −M(1 − r), where r is a binary variable encoding the unit's output and M is a sufficiently large constant. Moreover, to work around cycling among parameter configurations with the same sub-optimal loss, we adopted a loose symmetry-breaking technique. We further explored a simple timeout trick that stops the pivoting of the Simplex algorithm early, trading a small loss in accuracy for efficiency. Empirically, the MILP approach consistently outperformed gradient descent on small- and medium-scale datasets with low to medium network complexity. A minimal sketch of such a model appears below.
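
The following is a minimal, hypothetical sketch (not the project's actual code) of the perceptron-training MILP described above, written with gurobipy. The toy dataset, variable bounds, big-M value, the epsilon standing in for the strict inequality, the specific symmetry-breaking constraint, and the use of Gurobi's `TimeLimit` parameter as the timeout mechanism are all illustrative assumptions.

```python
import numpy as np
import gurobipy as gp
from gurobipy import GRB

# Toy binary-classification data (assumption: features scaled to [-1, 1]).
X = np.array([[0.2, 0.7], [0.9, -0.3], [-0.5, 0.4], [-0.8, -0.6]])
y = np.array([1, 1, 0, 0])
n, d = X.shape

M = 100.0    # big-M constant, assumed large enough given the weight bounds
EPS = 1e-4   # small slack standing in for the strict inequality "< M r"

m = gp.Model("perceptron_milp")

# Weights and bias of the single linear-threshold unit, bounded so big-M stays valid.
alpha = m.addVars(d, lb=-10.0, ub=10.0, name="alpha")
beta = m.addVar(lb=-10.0, ub=10.0, name="beta")
# r[i] = 1 iff the unit fires (predicts class 1) on sample i.
r = m.addVars(n, vtype=GRB.BINARY, name="r")

for i in range(n):
    score = gp.quicksum(alpha[j] * X[i, j] for j in range(d)) + beta
    # Disjunctive big-M constraints from the summary:
    #   alpha^T x + beta <  M * r        (score forced negative when r = 0)
    #   alpha^T x + beta >= -M * (1 - r) (score forced nonnegative when r = 1)
    m.addConstr(score <= M * r[i] - EPS, name=f"upper_{i}")
    m.addConstr(score >= -M * (1 - r[i]), name=f"lower_{i}")

# 0/1 loss: count samples whose prediction r[i] differs from the label y[i].
# Since y[i] is data, the mismatch is r[i] when y[i] = 0 and 1 - r[i] when y[i] = 1.
loss = gp.quicksum(r[i] if y[i] == 0 else 1 - r[i] for i in range(n))
m.setObjective(loss, GRB.MINIMIZE)

# Illustrative stand-in for the loose symmetry breaking mentioned above:
# pin the sign of one weight to cut mirrored solutions (an assumption,
# not the project's exact scheme).
m.addConstr(alpha[0] >= 0, name="sym_break")

# Timeout trick: cap solve time so the solver returns its incumbent early.
m.Params.TimeLimit = 60

m.optimize()
if m.SolCount > 0:
    print("0/1 loss:", m.ObjVal)
    print("alpha:", [alpha[j].X for j in range(d)], "beta:", beta.X)
```

With a time limit set, Gurobi reports the best feasible weight vector found so far rather than a certified optimum, which is one concrete way to trade accuracy for efficiency as described in the summary.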