Chinese AI company MiniMax releases new model it claims can compete with the industry's best

Chinese companies continue to release artificial intelligence models that rival the capabilities of systems developed by OpenAI and other U.S. artificial intelligence companies.

This week, MiniMax, an Alibaba- and Tencent-backed startup that has raised about $850 million in venture capital and is valued at more than $2.5 billion, launched three new models: MiniMax-Text-01, MiniMax-VL-01 and T2A-01-HD. MiniMax-Text-01 is a text-only model, while MiniMax-VL-01 can understand both images and text. T2A-01-HD, meanwhile, generates audio, specifically speech.

MiniMax claims that MiniMax-Text-01, which has 456 billion parameters, outperforms models such as Google's recently launched Gemini 2.0 Flash on benchmarks such as MATH and SimpleQA, which test a model's ability to answer math problems and fact-based questions. Parameters roughly correspond to a model's problem-solving ability, and models with more parameters generally perform better than those with fewer.

As for MiniMax-VL-01, MiniMax says it is comparable to Anthropic's Claude 3.5 Sonnet on evaluations that require multimodal understanding, such as ChartQA, which tasks models with answering questions about graphs and charts (e.g., a question about the orange line in a given graph). Granted, MiniMax-VL-01 doesn't quite outperform Gemini 2.0 Flash on many of these tests. OpenAI's GPT-4o and Meta's Llama 3.1 also beat it on multiple metrics.

Notably, MiniMax-Text-01 has an extremely large context window. A model's context, or context window, refers to the input (such as text) that the model considers before generating output (additional text). MiniMax-Text-01's context window is 4 million tokens, which means it can analyze approximately 3 million words at once, or just over five copies of "War and Peace."

For context (no pun intended), MiniMax-Text-01's context window is approximately 31 times the size of GPT-4o's and Llama 3.1's.
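The arithmetic behind these comparisons is easy to check. The short Python sketch below uses the 4 million token and 3 million word figures reported above. The 128,000-token window for GPT-4o and Llama 3.1 and the rough word count for "War and Peace" are assumptions brought in for illustration, not figures from MiniMax.

```python
# Figures reported in the article.
MINIMAX_CONTEXT_TOKENS = 4_000_000
WORDS_ANALYZED = 3_000_000

# Assumptions, not from the article: the commonly cited 128,000-token
# context window for GPT-4o and Llama 3.1, and a rough word count for an
# English translation of "War and Peace".
COMPARISON_CONTEXT_TOKENS = 128_000
WAR_AND_PEACE_WORDS = 587_000

# How many times larger MiniMax-Text-01's window is.
context_ratio = MINIMAX_CONTEXT_TOKENS / COMPARISON_CONTEXT_TOKENS

# Implied average number of words per token.
words_per_token = WORDS_ANALYZED / MINIMAX_CONTEXT_TOKENS

# Approximate number of copies of the novel that fit in the window.
novel_copies = WORDS_ANALYZED / WAR_AND_PEACE_WORDS

print(f"Context ratio: {context_ratio:.2f}x")
print(f"Words per token: {words_per_token:.2f}")
print(f"Copies of the novel: {novel_copies:.1f}")
```

Under these assumptions, the ratio works out to a little over 31 and the novel count to a little over five, in line with the figures quoted above. A different translation or tokenizer would shift the last two numbers somewhat.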

The last of the models MiniMax released this week, T2A-01-HD, is an audio generator optimized for speech. T2A-01-HD can generate synthetic speech with adjustable rhythm, tone and intonation in around 17 languages, including English and Chinese, and can clone a voice from as little as 10 seconds of recorded audio.

MiniMax has not released benchmark results comparing T2A-01-HD with other audio generation models. But to this reporter's ear, T2A-01-HD's output sounds comparable to that of audio models from Meta and from startups such as PlayAI.

With the exception of T2A-01-HD, which is available only through MiniMax's API and its Hailuo AI platform, MiniMax's new models can be downloaded from GitHub and the AI development platform Hugging Face.

However, just because these models are "publicly" available doesn't mean they aren't locked down in some ways. MiniMax-Text-01 and MiniMax-VL-01 are not truly open source, as MiniMax has not released the components (such as training data) required to recreate them from scratch. In addition, they are subject to MiniMax's restrictive license, which prohibits developers from using the models to improve competing AI models and requires platforms with more than 100 million monthly active users to apply for a special license from MiniMax.

MiniMax was founded in 2021 by former employees of SenseTime, one of China's largest artificial intelligence companies. The company's projects include apps like Talkie, an AI-driven role-playing platform similar to Character AI, and text-to-video models that MiniMax has released in Hailuo.

Some of MiniMax's products have been the subject of minor controversy.

Talkie, which was pulled from Apple's App Store in December for unspecified "technical" reasons, featured AI avatars of public figures including Donald Trump, Taylor Swift, Elon Musk and LeBron James, none of whom appear to have agreed to appear in the app.

In December, Broadcast magazine reported that MiniMax's video generator could replicate the logos of British television channels, suggesting that MiniMax's models were trained on content from those channels. MiniMax has also reportedly been sued by Chinese video streaming service iQiyi, which alleges that MiniMax illegally trained its models on iQiyi's copyrighted recordings.

MiniMax's new models come just days after the outgoing Biden administration proposed tougher export rules and restrictions on artificial intelligence technology for Chinese companies. Companies in China are already barred from buying advanced AI chips, but if the new rules take effect as written, they will face tighter restrictions on both the semiconductor technology and the models needed to develop sophisticated AI systems.

On Wednesday, the Biden administration announced additional measures focused on keeping advanced chips out of China. Chip foundries and packaging companies that want to export certain chips will be subject to broader licensing requirements unless they apply stricter scrutiny and due diligence to prevent their products from reaching Chinese customers.