DOI

10.17077/etd.a1iu-1h9f

Document Type

Dissertation

Date of Degree

Summer 2019

Degree Name

PhD (Doctor of Philosophy)

Degree In

Computer Science

First Advisor

Yang, Tianbao

First Committee Member

Lin, Qihang

Second Committee Member

Oliveira, Suely

Third Committee Member

Varadarajan, Kasturi

Fourth Committee Member

Xu, Weiyu

Abstract

In recent years, unprecedented growths in scale and dimensionality of data raise big computational challenges for traditional optimization algorithms; thus it becomes very important to develop efficient and effective optimization algorithms for solving numerous machine learning problems. Many traditional algorithms (e.g., gradient descent method) are black-box algorithms, which are simple to implement but ignore the underlying geometrical property of the objective function. Recent trend in accelerating these traditional black-box algorithms is to leverage geometrical properties of the objective function such as strong convexity. However, most existing methods rely too much on the knowledge of strong convexity, which makes them not applicable to problems without strong convexity or without knowledge of strong convexity. To bridge the gap between traditional black-box algorithms without knowing the problem's geometrical property and accelerated algorithms under strong convexity, how can we develop adaptive algorithms that can be adaptive to the objective function's underlying geometrical property? To answer this question, in this dissertation we focus on convex optimization problems and propose to explore an error bound condition that characterizes the functional growth condition of the objective function around a global minimum. Under this error bound condition, we develop algorithms that (1) can adapt to the problem's geometrical property to enjoy faster convergence in stochastic optimization; (2) can leverage the problem's structural regularizer to further improve the convergence speed; (3) can address both deterministic and stochastic optimization problems with explicit max-structural loss; (4) can leverage the objective function's smoothness property to improve the convergence rate for stochastic optimization. We first considered stochastic optimization problems with general stochastic loss. We proposed two accelerated stochastic subgradient (ASSG) methods with theoretical guarantees by iteratively solving the original problem approximately in a local region around a historical solution with the size of the local region gradually decreasing as the solution approaches the optimal set. Second, we developed a new theory of alternating direction method of multipliers (ADMM) with a new adaptive scheme of the penalty parameter for both deterministic and stochastic optimization problems with structured regularizers. With LEB condition, the proposed deterministic and stochastic ADMM enjoy improved iteration complexities of $\widetilde O(1/\epsilon^{1-\theta})$ and $\widetilde O(1/\epsilon^{2(1-\theta)})$ respectively. Then, we considered a family of optimization problems with an explicit max-structural loss. We developed a novel homotopy smoothing (HOPS) algorithm that employs Nesterov's smoothing technique and accelerated gradient method and runs in stages. Under a generic condition so-called local error bound (LEB) condition, it can improve the iteration complexity of $O(1/\epsilon)$ to $\widetilde O(1/\epsilon^{1-\theta})$ omitting a logarithmic factor with $\theta\in(0,1]$. Next, we proposed new restarted stochastic primal-dual (RSPD) algorithms for solving the problem with stochastic explicit max-structural loss. We successfully got a better iteration complexity than $O(1/\epsilon^2)$ without bilinear structure assumption, which is a big challenge of obtaining faster convergence for the considered problem. Finally, we consider finite-sum optimization problems with smooth loss and simple regularizer. We proposed novel techniques to automatically search for the unknown parameter on the fly of optimization while maintaining almost the same convergence rate as an oracle setting assuming the involved parameter is given. Under the Holderian error bound (HEB) condition with $\theta\in(0,1/2)$, the proposed algorithm also enjoys intermediate faster convergence rates than its standard counterparts with only the smoothness assumption.

Keywords

Convex Optimization, Local Error Bound

Pages

xiv, 223 pages

Bibliography

Includes bibliographical references (pages 211-223).

Copyright

Copyright © 2019 Yi Xu

Share

COinS