According to a report by The Information, Microsoft is developing a new large-scale AI language model called MAI-1 that could compete with state-of-the-art models from Google, Anthropic, and OpenAI. It would be the largest AI model Microsoft has trained in-house since it invested more than $10 billion in OpenAI in exchange for rights to use the startup's models; OpenAI's GPT-4 currently powers both Microsoft Copilot and ChatGPT.
Development of MAI-1 is being led by Mustafa Suleyman, a former Google AI executive who most recently served as CEO of the AI startup Inflection until March, when Microsoft paid $650 million to absorb most of the company's staff and intellectual property. According to two Microsoft employees familiar with the project, MAI-1 is an entirely new large language model (LLM), though it may draw on techniques brought over by former Inflection staffers.
At over 500 billion parameters, MAI-1 will require far more computing power and training data than Microsoft's earlier open-source models (such as Phi-3, which we covered last month). That reportedly places MAI-1 well above smaller open models like Meta's and Mistral's 70-billion-parameter offerings, and in the same league as OpenAI's GPT-4, which is believed to contain roughly 1 trillion parameters (in a mixture-of-experts configuration).
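To put these parameter counts in perspective, here is a back-of-the-envelope sketch of the raw storage the model weights alone would occupy. It assumes 2 bytes per parameter (fp16/bf16 precision) and treats all of the sizes as the reported or estimated figures above, ignoring optimizer state, activations, and any mixture-of-experts sparsity:

```python
def weights_gib(params_billions: float, bytes_per_param: int = 2) -> float:
    """Approximate size of the model weights alone, in GiB,
    assuming fp16/bf16 storage (2 bytes per parameter)."""
    return params_billions * 1e9 * bytes_per_param / 2**30

# Parameter counts are the reported/estimated figures from the article.
for name, size_b in [
    ("70B-class (Meta/Mistral)", 70),
    ("MAI-1 (reported)", 500),
    ("GPT-4 (estimated)", 1000),
]:
    print(f"{name}: ~{weights_gib(size_b):,.0f} GiB of weights")
```

Even at half precision, a 500-billion-parameter model needs on the order of 900 GiB just to hold its weights, which is why training and serving models of this size requires large multi-GPU clusters rather than a single machine.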
The development of MAI-1 signals a dual AI strategy at Microsoft: small, locally run language models for mobile devices alongside larger, state-of-the-art models powered by the cloud. Apple is reportedly exploring a similar approach. The news also underscores the company's willingness to pursue AI development independently of OpenAI, whose technology currently powers Microsoft's most ambitious generative AI features, including a chatbot built into Windows.
According to one of The Information's sources, MAI-1's exact purpose has not yet been determined (even within Microsoft), and its best application will depend on how well it performs. Microsoft has reportedly set aside a large cluster of servers equipped with Nvidia GPUs to train the model and has been compiling training data from various sources, including text generated by OpenAI's GPT-4 and publicly available Internet data.
According to one of the sources The Information quoted, Microsoft could preview MAI-1 as early as its Build developer conference later this month, depending on progress in the coming weeks.