
      Chinese developer launches multimodal model unifying video, image, text

      (Xinhua) 09:27, October 22, 2024

      BEIJING, Oct. 21 (Xinhua) -- The Beijing Academy of Artificial Intelligence (BAAI) on Monday released Emu3, a multimodal world model that unifies the understanding and generation of text, image, and video modalities with next-token prediction.

      Emu3 successfully validates that next-token prediction can serve as a powerful paradigm for multimodal models, scaling beyond language models and delivering state-of-the-art performance across multimodal tasks, said Wang Zhongyuan, director of BAAI, in a press release.

      "By tokenizing images, text, and videos into a discrete space, we train a single transformer from scratch on a mixture of multimodal sequences," Wang said, adding that Emu3 eliminates the need for diffusion or compositional approaches entirely.
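As a rough illustration of the idea in Wang's description, the sketch below maps text tokens and image tokens into one shared discrete vocabulary and derives next-token prediction targets from the combined sequence. It is a toy sketch under stated assumptions, not BAAI's implementation: the vocabulary sizes, the `BOI`/`EOI` marker tokens, and the function names are all hypothetical, and the image token ids stand in for the output of a visual tokenizer that is not shown.

```python
from typing import List, Tuple

# Hypothetical vocabulary sizes, chosen for illustration only (not Emu3's).
TEXT_VOCAB = 1000
VISION_VOCAB = 512

# Hypothetical marker tokens placed after both vocabularies.
BOI = TEXT_VOCAB + VISION_VOCAB  # begin-of-image
EOI = BOI + 1                    # end-of-image


def vision_to_shared(vision_ids: List[int]) -> List[int]:
    """Shift vision token ids so they do not collide with text token ids."""
    return [TEXT_VOCAB + i for i in vision_ids]


def build_sequence(text_ids: List[int], vision_ids: List[int]) -> List[int]:
    """Concatenate text and image tokens into one flat sequence."""
    return list(text_ids) + [BOI] + vision_to_shared(vision_ids) + [EOI]


def next_token_pairs(seq: List[int]) -> List[Tuple[List[int], int]]:
    """For every position, pair the prefix (context) with the token to predict."""
    return [(seq[:i], seq[i]) for i in range(1, len(seq))]


if __name__ == "__main__":
    seq = build_sequence(text_ids=[5, 7], vision_ids=[0, 3])
    for context, target in next_token_pairs(seq):
        print(context, "->", target)
```

Because every modality ends up as integers in the same id space, a single sequence model can in principle be trained on these (context, target) pairs with one objective, which is the property the quote highlights.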

      Emu3 outperforms several well-established task-specific models in both generation and perception tasks, according to BAAI, which has open-sourced the key technologies and models of Emu3 to the international technology community.

      Technology practitioners have said that a new opportunity has emerged to explore multimodality through a unified architecture, eliminating the need to combine complex diffusion models with large language models (LLMs).

      "In the future, the multimodal world model will promote scenario applications such as robot brains, autonomous driving, multimodal dialogue and inference," Wang said.

      (Web editor: Zhang Kaiwei, Liang Jun)
