Scaling Language Models: Methods, Analysis & Insights from Training Gopher

arXiv v1: Scaling Language Models: Methods, Analysis & Insights from Training Gopher (120 pages)