Surpassing GPT-4, Former OpenAI Executives Launch “Strongest” Large Model
On Monday, March 4, U.S. time, the AI startup Anthropic released its latest AI model, Claude 3. The company claims that Claude 3 is its fastest and most powerful model to date. Claude 3 comes in three versions: Opus, Sonnet, and Haiku.
According to Anthropic, Opus is the strongest of the three, outperforming OpenAI’s GPT-4 and Google’s Gemini Ultra on multiple industry benchmarks. These benchmarks cover undergraduate-level knowledge, graduate-level reasoning, and basic mathematics.
Claude 3 introduces multimodal support for the first time, allowing users to upload photos, charts, documents, and other unstructured data for analysis and corresponding answers.
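To make the image-analysis capability concrete, here is a minimal sketch of how a chart might be sent to Claude 3 through Anthropic’s Python SDK and Messages API. It is illustrative only: the file name quarterly_revenue_chart.png and the prompt text are hypothetical, and exact model identifiers and SDK details may differ from what a given account has access to.

```python
import base64
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Read a local chart image (hypothetical file) and base64-encode it for the request.
with open("quarterly_revenue_chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data,
                    },
                },
                {
                    "type": "text",
                    "text": "Summarize the main trend shown in this chart.",
                },
            ],
        }
    ],
)

# The response is a list of content blocks; the first block holds the text answer.
print(message.content[0].text)
```

In this pattern the image travels in the same user message as the question, which is how unstructured inputs such as photos, charts, and scanned documents are paired with the analysis request.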
Additionally, while Sonnet and Haiku are smaller models than Opus, they are also cheaper to use. Opus and Sonnet became available in 159 countries starting Monday, with Haiku to follow soon. Anthropic has not disclosed how long Claude 3 took to train or what it cost, but it said that several enterprises, including Airtable and Asana, helped A/B test the models.
A year ago, Anthropic was still an emerging AI startup founded by former OpenAI research executives; despite having completed Series A and B financing, its products had not yet attracted widespread market attention. Within twelve months, however, it has become one of the most closely watched AI startups in the industry, backed by tech giants including Google, Salesforce, and Amazon.
The company’s products not only compete fiercely with generative AI models like ChatGPT in the enterprise domain but have also gradually penetrated the consumer market. In the past year, the startup has completed five different rounds of financing, totaling approximately $7.3 billion.
According to data from PitchBook, the generative AI field has seen explosive growth over the past year, with investments reaching a record $29.1 billion in 2023 across nearly 700 deals, an increase of over 260% year-over-year. Generative AI has also become a hot topic on corporate earnings calls. Despite concerns from scholars and ethicists about its tendency to propagate bias, it is rapidly making its way into fields such as education, online travel, healthcare, and online advertising.
In an interview, Anthropic co-founder Daniela Amodei described the company’s internal team size and division of labor. She said that about 60 to 80 people work on the core AI model’s research and development, while 120 to 150 people work on related technical tasks. In July, Amodei had said that a team of 30 to 35 people worked directly on the latest iteration of the Claude model at the time, with a total of about 150 people supporting it.
Anthropic says Claude 3 can process up to about 150,000 words at a time, roughly the length of a long novel such as “Moby Dick” or “Harry Potter and the Deathly Hallows”. Previous versions could handle only about 75,000 words. Users can feed large datasets into the model and request summaries in the form of memos, letters, or stories. ChatGPT, by comparison, handles about 3,000 words per request.
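As a rough illustration of the long-document use case described above, the sketch below sends an entire text file to Claude 3 in a single request and asks for a memo-style summary, again using Anthropic’s Python SDK. The file name annual_report.txt and the prompt wording are hypothetical placeholders, and very long inputs are ultimately bounded by the model’s context limit rather than a fixed word count.

```python
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set in the environment

# Load a long document; Claude 3's large context window allows a book-length text
# (on the order of 150,000 words) to be passed in one request.
with open("annual_report.txt", "r", encoding="utf-8") as f:
    long_document = f.read()

message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=2048,
    messages=[
        {
            "role": "user",
            "content": (
                "Here is a long document:\n\n"
                + long_document
                + "\n\nSummarize it as a one-page memo for executives."
            ),
        }
    ],
)

print(message.content[0].text)
```

The same request could just as easily ask for the summary in the form of a letter or a story, as the article notes; only the final instruction changes.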
On risk mitigation, Amodei said Claude 3 is a significant improvement over previous versions. She explained, “In our effort to build a highly harmless model, Claude 2 would sometimes decline to answer; especially when someone raised sensitive or controversial topics, its responses could be overly conservative.”
Anthropic also noted that Claude 3 has a deeper understanding of user prompts. Multimodality, the addition of inputs such as photos and videos to generative AI, whether uploaded by users or created by the models themselves, has quickly become one of the industry’s hottest topics.
OpenAI’s COO, Brad Lightcap, said in an interview last year, “The real world is multimodal. Considering how we humans process information and interact with the world, such as what we see, hear, and talk about, it’s clear that the world is much richer than just text. Therefore, relying solely on text and code as the single interface to showcase a model’s functionality and role is far from enough.”
However, as multimodal technology and AI models grow more complex, the potential risks grow as well. Google recently had to suspend the image-generation feature of its Gemini chatbot after users found historically inaccurate images and questionable responses, which spread quickly on social media.
By contrast, Anthropic’s Claude 3 does not generate images; it only allows users to upload images and other files for analysis. Amodei said, “No model is perfect. We are always striving to ensure the model achieves the best balance between functionality and safety. Nevertheless, the model may still produce inaccurate outputs in certain situations.”