Qwen2.5 is the latest generation of the Qwen (Tongyi Qianwen) model family, featuring an advanced mixture-of-experts (MoE) flagship and strong performance in reasoning, mathematics, and coding tasks.
Model Information
- Model: Qwen2.5
- Author: Qwen
- Parameters:
- Architecture: transformer-based-moe
- Format: GGUF
- Size on disk: 7.20 GB
- Quantization:
- License:
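Since the card lists a GGUF build, one lightweight way to exercise the file locally is llama-cpp-python. The sketch below is a minimal example, not an official recipe; the model_path is a placeholder for whatever file name the downloaded artifact actually has.

```python
# Minimal sketch: chatting with a local Qwen2.5 GGUF file via llama-cpp-python.
# The model_path is a placeholder; point it at the GGUF file you downloaded.
from llama_cpp import Llama

llm = Llama(model_path="./qwen2.5.gguf", n_ctx=4096)  # context window; raise if RAM allows
result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Give a one-line summary of Qwen2.5."}],
    max_tokens=64,
)
print(result["choices"][0]["message"]["content"])
```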
Qwen2.5
Qwen2.5 is the latest generation of the Qwen (Tongyi Qianwen) model family, developed by Alibaba Cloud. The open-weight models were released in September 2024, followed by the flagship Qwen2.5-Max in January 2025. The generation marks a significant advance in large language model capabilities, particularly in reasoning, mathematics, and coding.
Model Variants
- Qwen2.5-Max: The flagship model that outperforms several leading foundation models in key benchmarks
- Qwen2.5-7B: Base model with 7 billion parameters
- Qwen2.5-14B: Intermediate model with 14 billion parameters
- Qwen2.5-72B: Large model with 72 billion parameters (the open-weight series also ships 0.5B, 1.5B, 3B, and 32B sizes; a loading sketch follows this list)
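The open-weight variants are published on the Hugging Face Hub. The sketch below is a minimal loading example, assuming the `Qwen/Qwen2.5-7B-Instruct` hub ID; other sizes follow the same naming pattern.

```python
# Minimal sketch: loading an open-weight Qwen2.5 variant with Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-7B-Instruct"  # assumed hub ID; swap the size suffix as needed
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"  # device_map="auto" needs `accelerate`
)

messages = [{"role": "user", "content": "What is 17 * 24?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```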
Key Features
- Enhanced Reasoning: Improved performance on complex reasoning tasks and mathematical problem-solving
- Extended Context Length: Supports long inputs, up to 128K tokens on the larger variants
- Mixture of Experts (MoE): The flagship Qwen2.5-Max uses a large-scale MoE architecture for better efficiency and performance at scale (a generic routing sketch follows this list)
- Multi-modal Capabilities: Text and code in the core series, with vision-language tasks handled by the companion Qwen2.5-VL models
- Multilingual Support: Strong performance in Chinese and English, with support for many additional languages
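To make the MoE bullet concrete: in a mixture-of-experts layer, a learned router scores a set of expert feed-forward networks and sends each token to only the top-k of them, so a small fraction of the parameters is active per token. The following is a generic top-k routing sketch in PyTorch, illustrative only and not Qwen's actual implementation; all sizes are made up.

```python
# Generic top-k mixture-of-experts layer (illustrative, not Qwen2.5's real code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model: int = 64, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts)  # router: one score per expert
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
            )
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, d_model)
        scores = self.gate(x)                             # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)        # keep only the top-k experts
        weights = F.softmax(weights, dim=-1)              # normalize over the chosen k
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                  # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

moe = TopKMoE()
print(moe(torch.randn(5, 64)).shape)  # torch.Size([5, 64])
```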
Technical Specifications
- Architecture: Transformer-based; the flagship Qwen2.5-Max adds a large-scale MoE design
- Training Data: Curated dataset including web text, code, and academic content
- License: Apache 2.0 for open-source variants
- Hardware Requirements: Varies by model size (detailed requirements in documentation; a rough footprint estimate follows this list)
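As a rough rule of thumb behind the hardware line above: a quantized model's weight footprint in bytes is about (parameter count × bits per weight) / 8, with KV-cache and runtime overhead on top. The helper below is only an illustration; the bits-per-weight values are assumptions that depend on the chosen quantization.

```python
# Back-of-envelope weight footprint for a quantized model, in decimal GB.
# Ignores KV-cache and runtime overhead, which add more memory in practice.
def approx_weight_gb(params_billions: float, bits_per_weight: float) -> float:
    return params_billions * bits_per_weight / 8

print(approx_weight_gb(7, 8.5))  # ~7.4 GB for an 8-bit-style quant of a 7B model
print(approx_weight_gb(7, 4.5))  # ~3.9 GB for a 4-bit-style quant of the same model
```

For comparison, this card lists 7.20 GB on disk.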
Performance Highlights
- Top-tier performance in Chinese language understanding and generation
- Strong capabilities in code generation and mathematical reasoning
- Competitive results against leading global models in standard benchmarks
- Improved efficiency in resource utilization through MoE architecture
Use Cases
- Natural Language Processing
- Code Generation and Analysis
- Mathematical Problem Solving
- Multi-modal Applications
- Enterprise AI Solutions
- Research and Development
Citation
If you use Qwen2.5 in your research, please cite:
```
@article{qwen2.5,
  title={Qwen2.5: Exploring the Intelligence of Large-scale MoE Model},
  author={Qwen Team},
  journal={arXiv preprint},
  year={2025}
}
```
License
The Qwen2.5 models are open-weight and licensed under Apache 2.0, except for the 3B and 72B variants, which are released under their own Qwen license terms.