OpenAI Introduces o1 Series: Advanced AI Models for Complex Reasoning

September 12, 2024 Joey Pedras

OpenAI has unveiled a new series of artificial intelligence models called the OpenAI o1 series, designed to enhance the way AI systems think and reason. These models are engineered to spend more time deliberating before providing responses, enabling them to tackle complex problems more effectively than their predecessors in areas such as science, coding, and mathematics.

Enhanced Thinking Process

The o1 models mimic human problem-solving by taking additional time to contemplate questions thoroughly before answering. This extended reasoning allows the models to refine their thought processes, experiment with different strategies, and identify and correct mistakes. Through this approach, the models demonstrate a significant improvement in handling complex tasks.

Impressive Performance Metrics

In rigorous testing, the upcoming update to the o1 series performed comparably to PhD students on challenging benchmarks in physics, chemistry, and biology. In mathematics, the reasoning model achieved an 83% success rate on a qualifying exam for the International Mathematics Olympiad (IMO), a substantial leap from GPT-4o's 13% success rate. The models also excelled in coding, reaching the 89th percentile in Codeforces programming competitions.

Safety Enhancements

OpenAI has developed a new safety training method that leverages the advanced reasoning abilities of the o1 models. By enabling the AI to reason about safety rules within context, the models adhere more effectively to alignment guidelines and are better equipped to prevent misuse. In stringent jailbreak tests designed to bypass safety protocols, the o1-preview model scored 84 out of 100, significantly outperforming GPT-4o's score of 22.

To support these advancements, OpenAI has strengthened its safety measures, internal governance, and collaboration with federal entities. This includes comprehensive testing using their Preparedness Framework, extensive red teaming efforts, and oversight by their Safety & Security Committee. OpenAI has also formalized partnerships with the U.S. and U.K. AI Safety Institutes, providing them with early access to research versions of the model for evaluation and testing.

Introducing OpenAI o1-mini

Alongside the o1-preview model, OpenAI is releasing OpenAI o1-mini, a faster and more cost-effective model optimized for coding tasks. The o1-mini model is 80% cheaper than the o1-preview, offering a powerful yet economical option for developers who require advanced reasoning capabilities without the need for extensive world knowledge.

Access and Availability

Starting today, ChatGPT Plus and Team users can access both o1-preview and o1-mini models through the model selector in ChatGPT, with initial weekly rate limits of 30 messages for o1-preview and 50 messages for o1-mini. OpenAI plans to increase these limits and enable ChatGPT to automatically select the most suitable model based on the user's prompt.

Next week, ChatGPT Enterprise and Edu users will gain access to both models. Developers with API usage tier 5 can begin prototyping with the models via the API, though some features like function calling and streaming are not yet available. OpenAI also intends to extend o1-mini access to all ChatGPT Free users in the near future.

Future Developments

This release is an early preview of the o1 reasoning models within ChatGPT and the API. OpenAI plans to continue updating these models and adding features such as web browsing and file and image uploading to enhance their utility. The company will also persist in developing and releasing models within the GPT series alongside the new o1 series.

Who Can Benefit

The o1 models are particularly useful for professionals tackling complex problems in fields like science, coding, and mathematics. For instance, healthcare researchers can use o1 to annotate cell sequencing data, physicists can generate complex mathematical formulas for quantum optics, and developers can build and execute multi-step workflows with greater efficiency.