DeepSeek is testing image understanding capabilities with a limited group of users. The rollout uses what's called grayscale testing - a standard practice where a new feature goes live for a small percentage of users before a broader release, letting the company catch problems under real conditions without full public exposure.
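How DeepSeek selects users for the test has not been disclosed. As a rough illustration of how percentage-based rollouts are commonly gated, the sketch below hashes a user ID into a stable bucket and enables the feature only for buckets under a threshold. The function name `in_rollout`, the feature key `"vision"`, and the 5% figure are all hypothetical, chosen for the example rather than taken from DeepSeek.

```python
import hashlib


def in_rollout(user_id: str, feature: str, percent: float) -> bool:
    """Return True if this user falls inside the rollout percentage.

    Hypothetical helper for illustration; not DeepSeek's actual mechanism.
    """
    # Hash feature + user ID so the assignment is stable across requests
    # and independent from one feature to the next.
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a bucket in [0, 10000).
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    # percent=5 enables buckets 0..499, i.e. about 5% of users.
    return bucket < percent * 100


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(10_000)]
    enabled = sum(in_rollout(u, "vision", 5) for u in users)
    print(f"{enabled} of {len(users)} users see the feature")
```

Because the bucket is derived from the user ID rather than drawn at random on each request, a given user stays consistently in or out of the test, and widening the rollout only requires raising the percentage.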
The vision feature lets users upload images alongside text prompts, though the full scope of what the model can do with those images - reading documents, describing scenes, reasoning across complex diagrams - hasn't been confirmed at this stage.
DeepSeek made its name with R1, a reasoning model that matched GPT-4-class performance at a fraction of the development cost. Adding vision would bring it closer to parity with ChatGPT, which has offered multimodal input - the ability to process text, images, and other data types together - since late 2023. Chinese AI labs have been aggressive about closing that capability gap over the past year, and this fits that pattern.
No public launch date has been announced. Grayscale tests can run anywhere from a few days to several months depending on what issues surface.