Sanctioned Chinese AI Firm SenseTime Releases Image Model Built for Speed

SenseTime, a Chinese AI company best known for its facial recognition technology, released a new open source model on Tuesday that it claims can both generate and interpret images far faster than top models developed by US competitors. SenseNova-U1 could help the company reclaim lost ground after it slipped from its place among the leading players in China’s AI development race.

The model’s secret sauce is its ability to “read” images without translating them to text first, speeding up the process and reducing the amount of computing power required. “The model’s entire reasoning process is no longer limited to text. It can reason with images as well,” Dahua Lin, cofounder and chief scientist at SenseTime, said in an interview with WIRED.

Lin, who is also a professor of information engineering at the Chinese University of Hong Kong, says that models capable of processing images directly will enable robots to better understand the physical world in the future.

Like DeepSeek’s latest flagship model, SenseTime says U1 can be powered by Chinese-made chips. “Several Chinese domestic chipmakers have finished optimizing compatibility with our new model,” Lin says. On release day, 10 Chinese chip designers, including Cambricon and Biren Technology, announced their hardware supports U1.

That flexibility matters because US export controls restrict Chinese firms from accessing the world’s most advanced AI chips, particularly those used for training, which at this point are primarily developed by Western companies like Nvidia. “We will continue to push for training on more different chips,” Lin says. But he also acknowledges that SenseTime “may still need to use the best chips to ensure the speed of our iteration.”

SenseTime released U1 for free on Hugging Face and GitHub, another sign of how Chinese companies are becoming some of the most active contributors to open source AI.

SenseTime was founded in 2014 and became a world leader in computer vision, which is used in applications like facial recognition and autonomous driving. But when ChatGPT and other AI systems powered by natural language processing became the hottest thing in the tech industry, SenseTime began struggling to turn a profit and fell behind newer Chinese startups like DeepSeek and MiniMax.

SenseTime says it hopes that releasing SenseNova-U1 publicly for anyone to use will help it catch up with both domestic and Western AI players. Lin says the company finally made the decision last year to focus on open source because of the helpful feedback it gets from researchers, which enables the company to iterate faster. “In this day and age, being open source or closed source is not the winning factor; the speed of iteration is,” Lin explains.

Going open source also helps SenseTime continue collaborating with international researchers without the interference of geopolitics. The company has been sanctioned repeatedly by the US government in recent years over allegations that its facial recognition technology helped power surveillance systems used to monitor and detain Uyghurs and other minority groups in China’s Xinjiang region. As a result, US firms are restricted from investing in SenseTime and selling certain technologies to it without a license. (SenseTime has denied the allegations.)

Seeing Clearly

In an accompanying technical report, SenseTime claims that SenseNova-U1 generates higher-quality images than all other open source models currently on the market. Its performance is comparable to leading Chinese closed source models like Alibaba’s Qwen and ByteDance’s Seedream, but it still lags behind industry leaders like GPT-Image-2.0, which came out just a week ago.

But the model’s main selling point is its ability to generate images much faster than all of those models. It relies on an innovative technical structure called NEO-Unify that SenseTime previewed earlier this year.

The model’s new architecture, which could improve efficiency and performance, is what sets U1 apart, says Adina Yakefu, an AI researcher at Hugging Face. “This is a more ambitious approach, as it still faces significant practical challenges,” she says. “It’s good that they decided to open source it, so the community can explore and test it more widely.” The model is also small enough to run on PCs and phones, making it potentially useful in many scenarios.

Lin says the technique SenseTime developed will be especially useful in robotics. When a robot tries to process the visual world, it needs to sort through an enormous amount of information. “It has to think, ‘how should I deal with all the clutter in this room? If there is a complicated machine in front of me, which button should I press?’ All of these are forms of information, and they need to be integrated into the model’s internal judgment,” he says. Because it can understand images natively, Lin is hopeful that SenseTime’s technology will help robots act faster and make fewer mistakes in complex environments.

China is in the midst of a humanoid robot boom. While SenseTime doesn’t currently develop its own robots, Lin says it is working closely with ACE Robotics, a startup led by another SenseTime cofounder. It’s also developing models that specialize in geospatial understanding, or creating simulations of the real world.

Publisher: wired.com