SenseTime unveils latest large model SenseNova 5.0, full-stack large model matrix
By Zheng Zheng | chinadaily.com.cn | Updated: 2024-04-25 19:25
SenseTime, a Chinese artificial intelligence pioneer, unveiled its upgraded SenseNova 5.0 large model system and a "cloud-edge-device" full-stack large model product matrix at its Tech Day event on Tuesday.
"In our pursuit to push the boundaries of SenseNova's capabilities, SenseTime remains guided by the Scaling Law as we build upon our large model based on this three-tier architecture: knowledge, reasoning, and execution," said Xu Li, co-founder and CEO of SenseTime.
Since its initial release last April, SenseTime's "SenseNova" large model system has gone through five major iterations. Trained on over 10TB of token training data and extensive synthetic data, SenseNova 5.0 adopts a mixture-of-experts (hybrid expert) architecture, with an effective context window of up to 200K for inference. This update primarily enhances knowledge, mathematics, reasoning and coding capabilities.
In the humanities, SenseNova 5.0 shows improved creative writing, reasoning, and summarization capabilities. With Chinese knowledge injection, it provides better understanding, summarization, and question answering, supporting verticals such as education and the content industry, according to the company.
In science and technology, SenseNova 5.0 delivers stronger quantitative reasoning, coding, and logical reasoning performance, supporting finance, data analysis, and other domains.
A highlight of SenseNova 5.0 is its multimodal capability: its multimodal large model ranked first by aggregate score on the authoritative multimodal benchmark MMBench, and achieved high scores on other multimodal benchmarks such as MathVista, AI2D and ChartQA.
At the application level, SenseNova 5.0 supports parsing and understanding of high-resolution long images, interactive text-to-image generation, knowledge extraction and summarization across complex documents, question-answering display, and rich multimodal interactions.
With computing demand extending from centralized facilities to edge devices, and with enterprise AI needs growing, SenseTime introduced an edge-side full-stack large model product matrix. This includes the SenseTime Edge-side Large Model for terminal devices and the SenseTime Integrated Large Model (Enterprise) edge device.
The SenseNova Edge-side Large Language Model can reach an inference speed of 18.3 words per second on mid-range platforms, and 78.3 words per second on flagship platforms.
Tests also show that inference with the edge-side LDM-AI image diffusion technology takes less than 1.5 seconds on a mainstream platform. The technology supports the output of high-definition images with a resolution of 12 million pixels and above, as well as image editing functions such as proportional, free-form, and rotation image expansion.
The SenseTime Integrated Large Model (Enterprise) edge device targets growing enterprise AI needs in finance, coding, healthcare, government, and other sectors. The device performs accelerated searches at only 50 percent CPU utilization and can reduce inference costs by approximately 80 percent, according to data from SenseTime.
SenseTime has also been exploring large model applications in domains including office software, finance, and transportation with leading companies in those industries, such as Kingsoft Office, Haitong Securities and Xiaomi.
At the event's finale, three videos generated by the large model were presented, showcasing the company's progress on its text-to-video platform and pointing to the wider potential of the large model.