Generative AI Data Labeling Startup Scale Raises $1B

Scale AI, a startup that identifies and labels AI training data, has raised $1 billion in a Series F funding round led by Accel. Scale AI employs machine learning boosted by human supervision to organize and tag the huge amounts of data necessary for training large language models and related generative AI projects.

Scale and Label

The eight-year-old Scale AI has firmly established itself as a key provider of packaged datasets ready to train generative AI models for companies in any number of industries. The startup counts autonomous vehicle makers, defense contractors, and a wide array of generative AI model developers among its clients, including Microsoft, Toyota, and Meta. Notably, Scale partnered with OpenAI on early reinforcement learning from human feedback (RLHF) experiments for GPT-2. Scale AI now helps OpenAI clients fine-tune GPT-3.5 models for composing text.

The new funding brings the company’s valuation to $13.8 billion, nearly double its 2021 valuation after a $325 million round of funding. The increase is all the more notable given that the company laid off a fifth of its staff last year. Scale AI pointed out that data demands have only grown with the size and complexity of models. Concerns about data scarcity helped fuel the eagerness of Scale’s investors and raise its valuation. Scale AI CEO Alexandr Wang said the future of AI requires not only data abundance but also frontier data that supports complex reasoning and multimodal interactions, along with effective measurement and evaluation systems.

“Data abundance is not the default; it’s a choice. It requires bringing together the best minds in engineering, operations, and AI,” Wang explained in a statement. “Our vision is one of data abundance, where we have the means of production to continue scaling frontier LLMs many more orders of magnitude. We should not be data-constrained in getting to GPT-10.”

