26 people. That's the full headcount at Arcee, a U.S. startup that built a large language model (LLM, the kind of AI technology that powers tools like ChatGPT) competitive with products from companies employing hundreds of researchers.
The model is open source, meaning the code and weights are publicly available: anyone can download, run, or modify it without paying licensing fees. For businesses that need to process sensitive data without sending it to a third-party API, being able to run the model on their own infrastructure is a practical requirement, not an ideological preference.
What sets Arcee apart from most small AI teams is scope. Companies this size typically focus on fine-tuning: taking an existing base model and training it further on specialized data. Fine-tuning a model on customer support conversations, for example, makes it better at customer support tasks. Arcee built a foundation model, which is the heavier lift: training at a scale where the model develops general language capabilities rather than task-specific ones.
The model is gaining traction among developers who have hit pricing or terms-of-service walls with the major commercial APIs. If OpenAI changes its pricing or deprecates a model version, you have to rebuild around whatever it decides to ship next. With an open-source model, you hold your own copy of the weights and can keep running the exact version you've tested against.
Most small AI companies don't stay competitive in the foundation model space long enough to matter. Compute costs alone tend to push teams toward either specialization or acquisition. Arcee's ability to ship something this capable with a team this small is a notable data point for what's possible when you optimize hard for efficiency rather than headcount.