Optimization For Machine Learning: Memory-Efficient And Tractable Solutions To Large-Scale Non-Convex Systems