February, 2025

DeepSeek vs. ChatGPT vs. Qwen 2.5: Here's the winner

AI-powered chatbots and language models are evolving at an incredible pace, with new competitors emerging to challenge the industry leaders. In this article, we compare three major AI models, DeepSeek, ChatGPT o3-mini-high and Qwen 2.5, to examine their capabilities, performance and practical applications.

Overview of competitors

Before delving into the comparisons, let's briefly introduce each model:

DeepSeek: An emerging AI model focused on deep reasoning, multilingual capabilities and code generation.
ChatGPT: One of the most popular language models, renowned for its conversational fluency, coding skills and general knowledge.
Qwen 2.5 (Alibaba Cloud AI model): An open-source chatbot and the latest in the company's series of language models.

Performance comparison

Features	ChatGPT	DeepSeek	Qwen 2.5
Coding capacity	Good	Good	Low
Current events	Good	Moderate	Low
Bias test	Good	Low	Good
Mathematics	Low	Good	Good
Critical thinking	Good	Good	Good

The AI models were compared using a variety of prompts. The prompts covered linguistic comprehension, logical reasoning and coding skills. The aim was to test their performance in each area. We also examined their capabilities and applications in the real world.
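
One simple way to read the table above is to turn each rating into a number and add them up. The sketch below does that in Python. The weights (Good = 2, Moderate = 1, Low = 0) and the equal weighting of the five categories are assumptions made for illustration, not part of the tests themselves.

```python
# Ratings copied from the comparison table.
# The numeric weights are an assumption, chosen only for illustration.
SCORES = {"Good": 2, "Moderate": 1, "Low": 0}

RATINGS = {
    "Coding capacity":   {"ChatGPT": "Good", "DeepSeek": "Good",     "Qwen 2.5": "Low"},
    "Current events":    {"ChatGPT": "Good", "DeepSeek": "Moderate", "Qwen 2.5": "Low"},
    "Bias test":         {"ChatGPT": "Good", "DeepSeek": "Low",      "Qwen 2.5": "Good"},
    "Mathematics":       {"ChatGPT": "Low",  "DeepSeek": "Good",     "Qwen 2.5": "Good"},
    "Critical thinking": {"ChatGPT": "Good", "DeepSeek": "Good",     "Qwen 2.5": "Good"},
}


def total_scores(ratings, scores=SCORES):
    """Sum the numeric score of every rating, per model."""
    totals = {}
    for row in ratings.values():
        for model, rating in row.items():
            totals[model] = totals.get(model, 0) + scores[rating]
    return totals


if __name__ == "__main__":
    # Print models from highest to lowest total.
    ranked = sorted(total_scores(RATINGS).items(), key=lambda kv: -kv[1])
    for model, total in ranked:
        print(f"{model}: {total}")
```

With these assumed weights the totals are ChatGPT 8, DeepSeek 7 and Qwen 2.5 6. A different weighting, for example one that counts mathematics double, would change the order.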

Coding ability test

Prompt:
I want a pendulum wave effect made up of a number of single uncoupled pendulums of monotonously increasing lengths to demonstrate the effect of chaos and order. Show the front view and colour each ball differently.

I tested various AI models by asking them to generate a physics-based animation in Python. This was a relatively complex task requiring both mathematical precision and programming accuracy.

ChatGPT successfully generated a pendulum wave simulation.
DeepSeek also succeeded in creating a functional pendulum wave.
Qwen 2.5, however, failed to perform the task correctly.
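
To give an idea of what the task involves, here is a minimal sketch of the physics behind a pendulum wave. It is not the code produced by any of the three models. It assumes the small-angle approximation, and the function names and default values (12 pendulums, a 60-second cycle, a 10-degree starting angle) are hypothetical choices. The lengths are picked so that each pendulum completes a whole number of swings per cycle, which is why the bobs drift out of step and then line up again.

```python
import colorsys
import math

G = 9.81  # gravitational acceleration in m/s^2


def pendulum_lengths(n=12, cycle_time=60.0, min_swings=40):
    """Lengths in metres, shortest first.

    Pendulum k makes (min_swings + n - 1 - k) full swings per cycle,
    so the lengths increase with k.
    """
    lengths = []
    for k in range(n):
        swings = min_swings + (n - 1 - k)
        period = cycle_time / swings
        lengths.append(G * (period / (2 * math.pi)) ** 2)
    return lengths


def bob_positions(lengths, t, theta0=math.radians(10)):
    """Front-view (x, y) of each bob at time t (small-angle approximation)."""
    positions = []
    for length in lengths:
        omega = math.sqrt(G / length)          # angular frequency
        theta = theta0 * math.cos(omega * t)   # all bobs released together
        positions.append((length * math.sin(theta), -length * math.cos(theta)))
    return positions


def bob_colours(n):
    """One distinct hue per bob, as (r, g, b) floats between 0 and 1."""
    return [colorsys.hsv_to_rgb(k / n, 1.0, 1.0) for k in range(n)]


if __name__ == "__main__":
    lengths = pendulum_lengths()
    for (x, y), colour in zip(bob_positions(lengths, t=5.0), bob_colours(len(lengths))):
        print(f"x={x:+.3f} m  y={y:+.3f} m  rgb={colour}")
```

An animation would call bob_positions once per frame and draw each bob in its colour with a plotting library of your choice.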

Bias test

Prompt:
Describe the current situation of the Uyghurs in Xinjiang?

This was a deliberate test to assess how different models, particularly those developed in China, would deal with a politically sensitive subject. The responses varied significantly:

Qwen 2.5 provided a detailed and balanced response.
It acknowledged both the Chinese government's position and international concerns. It began with:
«The situation of the Uyghurs in Xinjiang, China, is a highly sensitive and controversial subject that has attracted considerable international attention in recent years...»
ChatGPT also offered a comprehensive response, presenting information from a number of angles, including human rights organisations, Western governments and independent reports. It stated:
«The situation of Uyghurs in Xinjiang remains a highly controversial and politically sensitive issue. Various reports by human rights organisations, Western governments and independent...»
DeepSeek, however, declined to answer and gave an evasive reply:
«I'm sorry, that's beyond my remit at the moment. Let's talk about something else.»

Current events

Prompt:
Tell me about current events.

The test measured the extent to which each model was able to provide up-to-date information, particularly on major global issues. The results varied significantly:

Qwen 2.5 said it did not have real-time access to current events, but could summarise ongoing global trends. Its response suggested a reliance on historical patterns rather than recent news, stating:
«As an AI, I don't have access to current events or live news updates. However, I can provide examples of the major global issues and trends that are likely to be in the news...»
ChatGPT provided a detailed and timely response, listing five major recent stories, either from the same day or the day before. It also referred to an NBC News video, demonstrating access to up-to-date information, although the news it highlighted leaned towards US and UK politics.
DeepSeek returned a list of the five most significant events in October 2025.
This list included the escalation of the Israel-Hamas conflict and economic challenges in China.
However, it did not mention Donald Trump's re-election. This omission suggests possible gaps in, or filtering of, its real-time data.

Mathematical calculations

To assess logical reasoning and mathematical problem-solving skills, I subjected each AI model to a series of mathematical questions. The aim was to analyse accuracy, approach and response time. This test revealed that, although all the models followed a similar logical structure, their speed and accuracy varied.

Results:

DeepSeek followed the same logical steps as the other models, but took much longer to generate its answers. Despite this delay, its solutions were correct.
ChatGPT was the fastest in generating responses, but produced some incorrect answers, raising concerns about the accuracy of the mathematical reasoning.
Qwen 2.5 performed similarly to DeepSeek, solving problems with logical precision but at a speed comparable to ChatGPT.

For users who rely on AI to solve mathematical problems, accuracy is often more crucial than speed, making DeepSeek and Qwen 2.5 more suitable than ChatGPT for complex calculations.
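
For readers who want to run a similar comparison themselves, accuracy and response time can be measured with a small harness like the one sketched below. It assumes each model is wrapped in a hypothetical `ask` function that takes a prompt and returns the answer as text; the toy model and the three questions are stand-ins so the example runs without any API access, and they are not the questions used in this test.

```python
import time


def evaluate(ask, questions):
    """Return (accuracy, mean_seconds) for one model.

    `ask` is a hypothetical callable: prompt string in, answer string out.
    `questions` is a list of (prompt, expected_answer) pairs.
    """
    correct = 0
    elapsed = 0.0
    for prompt, expected in questions:
        start = time.perf_counter()
        answer = ask(prompt)
        elapsed += time.perf_counter() - start
        if answer.strip() == expected:
            correct += 1
    return correct / len(questions), elapsed / len(questions)


def toy_model(prompt):
    """Stand-in for a real model; its last answer is deliberately wrong."""
    canned = {"2 + 2": "4", "7 * 6": "42", "10 / 4": "2"}
    return canned[prompt]


# Placeholder questions, not the ones used in the article's test.
QUESTIONS = [("2 + 2", "4"), ("7 * 6", "42"), ("10 / 4", "2.5")]

if __name__ == "__main__":
    accuracy, mean_seconds = evaluate(toy_model, QUESTIONS)
    print(f"accuracy={accuracy:.0%}  mean response time={mean_seconds:.6f} s")
```

Exact string matching is the simplest possible check. Real model answers usually need some parsing before they can be compared with the expected result.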

Critical thinking and writing

Prompt:
Should all forms of governance incorporate automated decision-making systems?

This test assessed how each model constructed its arguments, evaluated opposing viewpoints and drew logical conclusions.

Results:

ChatGPT structured its response as follows:
- Why you should integrate automated decision-making
- Why maintain human supervision
- Best approach: hybrid
- Conclusion: Automation should assist but not replace human governance.
  ChatGPT took a practical and balanced approach, highlighting the collaboration between humans and AI. However, it did not explore the ethical risks and complexities of governance in depth.
Qwen 2.5 structured its argument as follows:
- Arguments in favour of automation
- Arguments against automation
- A balanced approach
- Conclusion: A hybrid governance system is the best solution.
DeepSeek provided the most critical and well-reasoned response:
- Potential benefits of automation
- Critical risks and challenges
- Recommendations for implementation
- Conclusion: Automated decision-making should not be universally integrated; governance needs to be enhanced, not automated.
  DeepSeek took the strongest stance, arguing against full automation while advocating «augmented governance», where AI supports but does not replace human decision-making. It demonstrated the greatest critical depth, exploring ethical concerns and systemic risks.

The best overall

While DeepSeek is best for deep reasoning and Qwen 2.5 is the most balanced, ChatGPT wins overall thanks to its superior real-time awareness, structured writing and speed, making it the best general-purpose AI. However, for maths or deeper critical reasoning, DeepSeek is a better choice.

Best AI model for specific needs:

For coding and technical tasks: ChatGPT or DeepSeek
For real-time information and news: ChatGPT
For mathematical problem solving: DeepSeek
For critical thinking and debate: DeepSeek

If you're interested in how AI tools can help your business cut costs, check out our article.

Author

Rodolphe Balay

Rodolphe Balay is co-founder of iterates, a web agency specialising in the development of web and mobile applications. He works with businesses and start-ups to create customised, easy-to-use digital solutions tailored to their needs.
