Neural Language Representations and Scaling Semi-Supervised Learning for Speech Recognition