Release time: 2023-08-09
Reprinted from 36Kr, author: Ben@36KR
Ruoyu-Jiutian achieves multimodal fusion of text, images, audio and video
36Kr has learned that a team from the Computing and Intelligence Research Institute of Harbin Institute of Technology (Shenzhen) has founded Shenzhen Ruoyu Technology Co., Ltd. (hereinafter "Ruoyu Technology"), a company that develops multimodal large models, to commercialize its research through the school's Harbin Asset Management Co., Ltd. Ruoyu Technology's first multimodal large model, "Ruoyu-Jiutian", topped the OpenCompass multimodal large model leaderboard on its first submission.
Multimodal large model MMBench test list
01 "Ruoyu-Jiutian"
"12.3 billion parameters", "120 million image-text pairs", "5.5 million Chinese-English bilingual corpus samples", "1.2 million fine-tuning data samples", "500,000 enhanced data samples"... These scaled-up core figures have brought a qualitative change in model capabilities. The Ruoyu-Jiutian multimodal large model performs strongly in logical reasoning, relational reasoning, and perception. With more than 10 billion parameters, it fuses text, images, audio, and video. Its understanding and response capabilities not only span natural language processing, computer vision, and speech recognition, but also break down the information barriers between modalities more effectively, integrating them into "Jiutian".
Multimodal large model MMBench dev list
"The Nine Heavens represents the highest heaven in ancient Chinese mythology, and symbolizes our infinite pursuit of technological progress and our yearning for an intelligent future. With its powerful understanding and response capabilities, this model transcends the boundaries between modalities such as text, images, audio, and video, and achieves true multimodal fusion," said Dr. Sun Teng, CEO of Ruoyu Technology.
02 Establishing a top team for large models
Harbin Institute of Technology (Shenzhen) has set up an asset management company to encourage faculty and staff to commercialize their research results, and it offers policy support for industry-university-research cooperation. When Ruoyu Technology was founded, the school took part as a founding shareholder, providing strong support for the company's development.
Recently, IEEE Intelligent Systems, a well-known magazine in the field of artificial intelligence, announced its 2022 "AI's 10 to Watch" list. Professor Nie Liqiang was named to the list for his contributions to multimodal research. Professor Nie is a winner of the Damo Academy Young Orange Award and the TR35 China Award. He said that HIT-Shenzhen's achievements in artificial intelligence must not stay in the laboratory; they must be put to use in serving national defense, aerospace, and society.
Another AI expert at Ruoyu Technology is co-founder Professor Zhang Min. Professor Zhang is a specially appointed assistant to the president of Harbin Institute of Technology (Shenzhen), the first outstanding young scholar in the field of NLP in China, one of the national "Million Talents", a young and middle-aged expert with outstanding contributions to the country, and enjoys a special allowance from the State Council. Harbin Institute of Technology ranks first among Chinese research institutions in the field of NLP in the authoritative computer science list CSRankings (2022-2023), and Professor Zhang is the person who has made the greatest contribution to this field at Harbin Institute of Technology.
Harbin Institute of Technology ranks first among institutions in mainland China in the field of NLP in CSRankings
Professor Zhang Min ranks first in the academic contribution list
Dr. Sun Teng, co-founder and CEO of Ruoyu Technology, is also a core expert on the company's R&D team. Dr. Sun's research has long focused on multimedia computing, and his results have been published at CCF Class A conferences and in IEEE/ACM Transactions journals. He has prior experience as a successful entrepreneur, covering both company management and the full process of applying artificial intelligence technology in vertical fields. Geng Chen, another co-founder of Ruoyu Technology, serves as the company's strategic advisor. He has been named best technology analyst by New Fortune many times and has built up extensive industry resources over his years in research. He is responsible for the company's investment and financing and for connecting the company with industry resources and putting them to use.
03 Core Competencies of Ruoyu Technology
"Ruoyu Technology was established at this point in time with its historical mission and ideals. As cutting-edge R&D personnel, we can deeply feel the changes that artificial intelligence will bring to the future society. The productivity explosion brought about by generative artificial intelligence will redefine the production relations in all walks of life. It is our honor and mission to have the opportunity to participate in it."
Computing power, data, and talent are the three major barriers to entry for large models. Ruoyu Technology has had all three since its founding. Its in-house R&D team, which trains its own leading talent, can iterate on the model independently. In the future, "Ruoyu-Jiutian" will continue to iterate under the leadership of technical experts.
With its top entrepreneurial team, core capabilities of self-developed multimodal large models, and successful implementation experience, Ruoyu Technology says it will bring a touch of brilliance to the "Battle of 100 Models".
04 Build a universal AI large model foundation
Reshaping every sector around large model capabilities has become an industry consensus. OpenAI's development path suggests that when a model is large enough, new capabilities emerge, including some that have never been seen before.
Ruoyu-Jiutian will keep iterating. Dr. Sun Teng said: "Ruoyu-Jiutian is iterating in two opposite directions: bigger and smaller. On the one hand, we are increasing the parameter count and looking for the thresholds at which the capabilities of a general multimodal large model emerge; on the other hand, to meet the needs of industry users and get the greatest effect from the least computing power, we must compress large models into lightweight versions and ultimately combine them with edge computing devices."
Because it is built on the "Ruoyu-Jiutian" multimodal large model base, Ruoyu's business model is fundamentally different from that of the AI 1.0 era. In the past, algorithms had to be redeveloped for each new requirement, which made the business entirely project-based. "Ruoyu-Jiutian" is a unified multimodal large model base that does not need to be redesigned. It only needs to be fine-tuned on an industry's data to produce the corresponding industry model. Customers can even fine-tune it a second time on their own data to meet the needs of a specific segment.
The difficulty of multimodal large models lies in fusing information across modalities. Common fusion methods include relatively crude techniques such as linear superposition and cascading, but the fused result often performs worse than a single modality alone. This is because some technical teams lack experience in tuning multimodal data and in fusing and aligning multimodal features. Ruoyu-Jiutian has a self-developed, end-to-end training framework covering multimodal feature extraction, alignment, fusion, and reasoning, along with a comprehensive and detailed multimodal data collection and cleaning process. The model's first place on the multimodal large model leaderboard demonstrates the team's strength in this area.
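To make the two "crude" fusion methods named above concrete, here is a minimal sketch in plain Python. It is a generic illustration only and is not Ruoyu-Jiutian's framework, which the article does not describe in detail. The function names, the sample vectors, and the equal weights are all hypothetical, and reading "cascading" as concatenation of feature vectors is an assumption.

```python
# Illustrative only: two simple ways to fuse per-modality feature vectors.
# Names, vectors, and weights are hypothetical, not taken from Ruoyu-Jiutian.

def fuse_by_weighted_sum(features, weights):
    """Linear superposition: element-wise weighted sum of equal-length vectors."""
    if len(features) != len(weights):
        raise ValueError("need one weight per modality")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all feature vectors must share one dimension")
    return [
        sum(w * f[i] for f, w in zip(features, weights))
        for i in range(dim)
    ]


def fuse_by_concatenation(features):
    """Concatenation: join the vectors end to end, keeping every component."""
    fused = []
    for f in features:
        fused.extend(f)
    return fused


# Toy features standing in for a text encoder and an image encoder output.
text_vec = [1.0, 0.0, 2.0]
image_vec = [0.0, 4.0, 2.0]

summed = fuse_by_weighted_sum([text_vec, image_vec], [0.5, 0.5])
joined = fuse_by_concatenation([text_vec, image_vec])

print(summed)  # same length as each input vector
print(joined)  # length is the sum of the input lengths
```

Neither function aligns the two feature spaces before combining them; each simply merges whatever vectors it is given, which is why the passage above treats alignment as a separate step in the training chain.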
Robots are system-level products in the industrial field and a key application area for the "Ruoyu-Jiutian" multimodal large model base. Harbin Institute of Technology has a deep record of industry-university-research cooperation in robotics. In the future, embodied robots will need to integrate multimodal information such as voice, vision, decision-making, and control into a closed loop. The "Ruoyu-Jiutian" base will be integrated further with Harbin Institute of Technology's accumulated robotics research, and Ruoyu Technology is already working closely with many large listed companies in the consumer electronics and automotive fields.
With the "Ruoyu-Jiutian" multimodal large model base, Ruoyu Technology can fine-tune its existing base to provide customized services to users in different fields. It offers pre-trained language models, pre-trained multimodal models, pre-trained vertical-field models, and other capabilities, and is committed to building the general-purpose AI platform and infrastructure of the future.