OpenAI GPT4.1 Mini System Card

Generated by Gemini 2.5 Pro and fact checked by perplexity

A detailed report on the gpt-4.1-mini LLM from OpenAI, based on publicly available information, is provided below. This report focuses on quantitative details and sourced information. Overview GPT-4.1-mini is a small, efficient, and capable language model from OpenAI, released on April 14, 2025. It is designed to be a faster and more cost-effective alternative to larger models like GPT-4o, while still offering strong performance across a variety of tasks, including instruction-following, coding, and vision understanding. It is part of the larger GPT-4.1 family of models, which also includes the larger GPT-4.1 and the smaller GPT-4.1-nano. Key Specifications

Feature	Detail	Source
Model Name	gpt-4.1-mini	Box Developer Documentation
API Model Name	azure__openai__gpt_4.1_mini	Box Developer Documentation
Release Date	April 14, 2025	Box Developer Documentation, Wikipedia
Knowledge Cutoff	June 2024	Box Developer Documentation, Wikipedia
Input Context Window	1 million tokens	Box Developer Documentation, AI/ML API Documentation, OpenRouter
Maximum Output Tokens	32,768 (32k) tokens	Box Developer Documentation, AI/ML API Documentation, OpenRouter
Open Source	No	Box Developer Documentation, AI/ML API Documentation

Performance and Benchmarks

GPT-4.1-mini has demonstrated significant performance improvements over its predecessors and is competitive with larger models on various benchmarks.

Benchmark	Score	Source
MMLU	87.5%	DocsBot AI
Global MMLU	78.5%	DocsBot AI
AIME 2024	49.6%	DocsBot AI
IFEval	84.1%	OpenRouter, DocsBot AI
Hard Instruction Evals	45.1%	OpenRouter
MultiChallenge	35.8%	OpenRouter
Aider’s Polyglot Diff	31.6%	OpenRouter

Pricing

The pricing for GPT-4.1-mini is designed to be cost-effective, especially for applications with high-volume or real-time needs.

Cost Type	Rate	Source
Input Tokens	$0.40 per million tokens	OpenRouter, DocsBot AI
Output Tokens	$1.60 per million tokens	OpenRouter, DocsBot AI

It has been noted that there is a 75% discount for cached inputs.

Key Features and Capabilities

Multimodality: GPT-4.1-mini can process both text and image inputs and generate text outputs.
Coding: The model shows strong coding ability and is particularly reliable at following diff formats, which can reduce cost and latency. It has been noted to more than double GPT-4o’s score on Aider’s polyglot diff benchmark.
Instruction Following: GPT-4.1-mini is trained to follow instructions more literally than previous models, making it highly steerable and responsive to well-specified prompts.
Long Context: With a 1 million token context window, the model can process and reason over large amounts of text.
Availability: GPT-4.1-mini is available through the OpenAI API and is also used as a fallback model for free users of ChatGPT when GPT-4o usage limits are reached. It is also available on Microsoft Azure.

Use Cases

GPT-4.1-mini is well-suited for a variety of applications, including:

Interactive applications with tight performance constraints.
Agentic tasks that require the model to solve coding problems or use tools.
Content creation and summarization.
Information extraction from large documents.
Customer service bots and other conversational AI.