Last month, Anthropic announced the newest generation of the Claude family, Claude 3. In typical AI fashion, speculation broke out over how Claude 3 Opus compares to GPT-4. Benchmarks show Claude to be a serious competitor; however, there are other factors to consider, such as price and functionality at both the service and model level.

Note—we’ve previously touched on Claude vs OpenAI in our Amazon Bedrock vs Azure OpenAI blog, but this post goes more in-depth on the Claude and OpenAI GPT models specifically.

Anthropic Claude

In 2023, Claude made waves by being the first to introduce a 100k context window at a time when other prominent models were limited to around 32k. However, it drew flak for overly aggressive content moderation, an issue Anthropic prioritized for improvement in Claude 3. The response to Claude 3 has been extremely positive: users have noticed significant improvements over the prior content moderation, and benchmarks show Claude 3 Opus to be one of the top-performing language models across a variety of tasks. Claude models particularly excel at writing, summarizing, and coding.

One way to access the Claude models, and the one we will focus on in this blog, is through Amazon Bedrock. Bedrock is a marketplace of API-accessible models from different providers, including Anthropic’s Claude, where Amazon adds its own APIs and security.
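As a minimal sketch of what Bedrock access looks like in practice, the snippet below builds a Claude 3 request body and shows (commented out, since it requires AWS credentials and granted model access) how it would be sent with boto3. The model ID and prompt are illustrative examples, not a recommendation.

```python
import json

def build_claude_request(prompt, max_tokens=512):
    """Build the Anthropic Messages-API body that Bedrock's Claude 3 models expect."""
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })

# Sending the request requires boto3 and configured AWS credentials (not run here):
# import boto3
# client = boto3.client("bedrock-runtime", region_name="us-east-1")
# response = client.invoke_model(
#     modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # example model ID
#     body=build_claude_request("Summarize this document."),
# )
```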

Claude Models

| Model | Stated Use Cases | Max Tokens |
| --- | --- | --- |
| Claude Instant | Casual dialogue, document comprehension, summarization, and text analysis. | 100K |
| Claude 2.0 | Thoughtful dialogue, content creation, complex reasoning, creativity, and coding. | 100K |
| Claude 2.1 | Claude Instant and Claude 2.0 capabilities. Analysis, comparing documents, Q&A, summarization, and trend forecasting. | 200K |
| Claude 3 Haiku | Quick and accurate support in live interactions, content moderation, inventory management, knowledge extraction, logistics optimization, and translations. | 200K |
| Claude 3 Sonnet | RAG/search and retrieval, code generation, forecasting, text from images, quality control, product recommendations, and targeted marketing. | 200K |
| Claude 3 Opus | Advanced analysis of charts and graphs, brainstorming and hypothesis generation, coding, financials and market trends, forecasting, task automation, and research review. | 200K |

Table of supported Claude models (as of 4/11/2024).
Note—Claude 3 Opus is not yet available through Bedrock.

OpenAI GPT

OpenAI’s GPT models are widely used, with many users and benchmarks still considering them the best. However, as we saw in our Gemini vs OpenAI blog, the competition has been catching up quickly. Still, GPT-4 is widely regarded as the most advanced and capable language model available today.

Azure and OpenAI are closely related. Microsoft has a strategic partnership with OpenAI, which includes Azure being the exclusive cloud provider for OpenAI’s models and services. This close collaboration allows Azure to offer seamless integration and access to the latest OpenAI models (with additional features and security).

GPT Models

| Model | Stated Use Cases | Max Tokens |
| --- | --- | --- |
| GPT-3.5 Turbo | Advanced complex reasoning and chat, code understanding and generation, and traditional completions tasks. | 4k and 16k |
| GPT-4 | Advanced complex reasoning and chat, advanced problem solving, code understanding and generation, and traditional completions tasks. | 8k and 32k |
| GPT-4 Turbo | GPT-4 capabilities, instruction following, and parallel function calling. | 128k |
| GPT-4 Turbo With Vision | GPT-4 Turbo capabilities, image analysis, and Q&A. | 128k |

Table of supported GPT models (as of 4/11/2024).

Claude vs OpenAI GPT Models Functionality

See here for a service-level comparison of Azure and Bedrock documentation/community support and no-code playgrounds. See below for the model-level comparison:

  • Max Tokens: As we mentioned previously, Claude was the first to reach a 100k token limit. OpenAI then leapfrogged it with an impressive 128k max, and Claude has since retaken the lead with a 200k max, which corresponds to roughly 500 pages of information. Given the rapid pace of advancement, such as Gemini’s 1M-10M context window, we expect these limits to keep increasing.
  • Supported Regions: This applies specifically to Bedrock and Azure. Availability may be model- and feature-specific, and many regions are not covered. Check the Bedrock and Azure documentation to see whether your region is supported.
  • Supported Languages: The GPT models are optimized for English but are usable with other languages. Claude supports multiple languages, including English, Spanish, and Japanese. Neither provider publishes a full list of supported languages.
  • Training Data Date: According to Anthropic, the Claude 3 models were trained up to August 2023. GPT-3.5 Turbo and GPT-4 were trained up to September 2021 and the GPT-4 Turbo versions were trained until April 2023.
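The context-window figures above can be made concrete with a back-of-the-envelope estimate. Assuming roughly 1.3 tokens per English word and about 300 words per page (both rough rules of thumb, not official figures):

```python
TOKENS_PER_WORD = 1.3   # rough English average (assumption)
WORDS_PER_PAGE = 300    # typical single-spaced page (assumption)

def pages_that_fit(context_tokens):
    """Estimate how many pages of text fit in a given context window."""
    return context_tokens / (TOKENS_PER_WORD * WORDS_PER_PAGE)

print(round(pages_that_fit(200_000)))  # Claude's 200k window: ~513 pages
print(round(pages_that_fit(128_000)))  # GPT-4 Turbo's 128k window: ~328 pages
```

Under these assumptions, 200k tokens works out to roughly the "500 pages" cited above.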

Claude Pricing

With Bedrock, you have two pricing options: On-Demand and Provisioned Throughput. Fine-tuning is not available. Prices shown are for the US East region.

On-Demand

On-Demand is the no-commitment, pay-per-usage option. Charges are per input token processed and per output token generated.

| Model | Price per 1000 Input Tokens | Price per 1000 Output Tokens |
| --- | --- | --- |
| Claude Instant | $0.0008 | $0.0024 |
| Claude 2.0 | $0.008 | $0.024 |
| Claude 2.1 | $0.008 | $0.024 |
| Claude 3 Haiku | $0.00025 | $0.00125 |
| Claude 3 Sonnet | $0.003 | $0.015 |
| Claude 3 Opus | $0.015 | $0.075 |

Claude On-Demand pricing table. (Updated 4/11/24).
Note—Claude 3 Opus pricing is not publicly available; however, we can make an educated assumption about what it will be, since Bedrock's Claude pricing has historically matched the pricing offered directly through Anthropic.
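As a quick sanity check on the table above, a small helper (a sketch for estimation, not an official pricing API) can turn token counts into an On-Demand cost:

```python
# (input, output) price in USD per 1,000 tokens, US East, as of 4/11/24
ON_DEMAND_PRICES = {
    "claude-instant": (0.0008, 0.0024),
    "claude-2.0": (0.008, 0.024),
    "claude-2.1": (0.008, 0.024),
    "claude-3-haiku": (0.00025, 0.00125),
    "claude-3-sonnet": (0.003, 0.015),
    "claude-3-opus": (0.015, 0.075),
}

def on_demand_cost(model, input_tokens, output_tokens):
    """Cost in USD of a single request at On-Demand rates."""
    in_price, out_price = ON_DEMAND_PRICES[model]
    return input_tokens / 1000 * in_price + output_tokens / 1000 * out_price

# 10k input + 2k output tokens on Claude 3 Sonnet costs about $0.06
print(round(on_demand_cost("claude-3-sonnet", 10_000, 2_000), 4))
```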

Provisioned Throughput

You have the option to buy model units (specific throughput measured by the maximum number of input/output tokens processed per minute). Pricing is charged hourly and you can choose a one-month or six-month term. This pricing model is best suited for “large consistent inference workloads that need guaranteed throughput.”

Prices below are per hour per model unit.

| Model | No Commitment (Max One Custom Model Unit Inference) | One-Month Commitment (Includes Inference) | Six-Month Commitment (Includes Inference) |
| --- | --- | --- | --- |
| Claude Instant | $44.00 | $39.60 | $22.00 |
| Claude | $70.00 | $63.00 | $35.00 |

Claude Provisioned Throughput pricing table. (Updated 4/11/24).
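Since provisioned throughput is billed hourly around the clock, the monthly scale of the commitment is worth spelling out. A sketch, assuming one model unit running continuously and an average of 730 hours per month:

```python
HOURS_PER_MONTH = 730  # 24 * 365 / 12, on average

def monthly_provisioned_cost(hourly_rate, model_units=1):
    """Approximate monthly cost of running provisioned model units continuously."""
    return hourly_rate * HOURS_PER_MONTH * model_units

# Claude Instant with a one-month commitment: $39.60/hour is roughly $28,908/month
print(round(monthly_provisioned_cost(39.60)))
```

This is why the option is aimed at "large consistent inference workloads" rather than sporadic traffic.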

OpenAI GPT Pricing

Charges for GPT models through Azure are fairly simple: pay-as-you-go, with no commitment. There are additional charges for customization (fine-tuning). Prices vary by region and are shown for the US East and US East 2 regions (GPT-4 Turbo and fine-tuning are not available in US East).

Pay-As-You-Go

Charges vary for different model types and contexts.

| Model | Context | Price per 1k Input Tokens | Price per 1k Output Tokens |
| --- | --- | --- | --- |
| GPT-3.5 Turbo | 4k | $0.0015 | $0.002 |
| GPT-3.5 Turbo | 16k | $0.0015 | $0.0015 |
| GPT-4 | 8k | $0.03 | $0.06 |
| GPT-4 | 32k | $0.06 | $0.12 |
| GPT-4 Turbo | 128k | $0.01 | $0.03 |
| GPT-4 Turbo With Vision | 128k | $0.01 | $0.03 |

OpenAI GPT model pricing table (as of 4/11/2024).

Fine-Tuning

Fine-tuning charges are based on training time and hosting time.

| Model | Context | Price for Training per Compute Hour | Price for Hosting per Hour |
| --- | --- | --- | --- |
| GPT-3.5 Turbo | 4k | $45 | $3 |
| GPT-3.5 Turbo | 16k | $68 | $3 |

OpenAI GPT fine-tuning pricing table (as of 4/11/2024).
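Fine-tuning cost can be sketched as training compute hours times the training rate, plus hosting hours times the hosting rate. The rates below come from the table above; the workload sizes are hypothetical:

```python
def finetune_cost(training_hours, hosting_hours, train_rate=45.0, host_rate=3.0):
    """Total fine-tuning cost in USD: training compute plus deployment hosting."""
    return training_hours * train_rate + hosting_hours * host_rate

# 3 compute hours of training on GPT-3.5 Turbo 4k, then one month (730h) of hosting:
print(finetune_cost(3, 730))  # 3*45 + 730*3 = 2325.0
```

Note that the hosting charge accrues for as long as the fine-tuned deployment exists, whether or not it is serving traffic.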

Pricing Comparison: Claude vs OpenAI

Based on benchmarks, user feedback, and use cases, the most comparable models are:

Claude 3 Opus vs GPT-4 Turbo

Claude 3 Opus is relatively expensive, making GPT-4 Turbo the more economical choice, especially for use cases that generate a large number of output tokens. Here the tradeoff is between cost on the one hand and advanced capabilities plus a larger context window on the other.

| Model | Price per 1000 Input Tokens | Price per 1000 Output Tokens |
| --- | --- | --- |
| Claude 3 Opus | $0.015 | $0.075 |
| GPT-4 Turbo | $0.01 | $0.03 |
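To put the tradeoff in dollars, here is a sketch comparing the two at a hypothetical workload of one million input and one million output tokens:

```python
def token_cost(price_per_1k_in, price_per_1k_out, input_tokens, output_tokens):
    """Cost in USD given per-1k-token prices and token counts."""
    return input_tokens / 1000 * price_per_1k_in + output_tokens / 1000 * price_per_1k_out

WORKLOAD = (1_000_000, 1_000_000)  # hypothetical monthly token volume

opus = round(token_cost(0.015, 0.075, *WORKLOAD), 2)   # 90.0 dollars
turbo = round(token_cost(0.01, 0.03, *WORKLOAD), 2)    # 40.0 dollars
print(round(opus / turbo, 2))  # 2.25: Opus costs 2.25x as much at this workload
```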

Claude 3 Sonnet vs GPT-4

Claude 3 Sonnet is much less expensive than GPT-4 and has similar use cases and benchmarks. Its input tokens are priced 95% lower than GPT-4’s and its output tokens are priced 87.5% lower.

| Model | Price per 1000 Input Tokens | Price per 1000 Output Tokens |
| --- | --- | --- |
| Claude 3 Sonnet | $0.003 | $0.015 |
| GPT-4 | $0.06 | $0.12 |
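The percentage claims above check out with simple arithmetic on the table prices:

```python
def percent_cheaper(cheap, expensive):
    """How much cheaper `cheap` is than `expensive`, as a percentage."""
    return round((expensive - cheap) / expensive * 100, 1)

print(percent_cheaper(0.003, 0.06))   # 95.0: Sonnet input tokens vs GPT-4
print(percent_cheaper(0.015, 0.12))   # 87.5: Sonnet output tokens vs GPT-4
```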

Conclusion

Claude 3 Opus has the potential to surpass GPT-4 Turbo as the most capable LLM available. However, there is a price tradeoff: its output tokens in particular are 150% more expensive than GPT-4 Turbo’s. The advanced capabilities and larger context window of Claude 3 Opus may justify the higher price, but for customers looking to balance cost while still using a highly capable LLM, GPT-4 Turbo is the answer.