AI-powered chatbots and language models are evolving at an incredible pace, with new competitors emerging to challenge the industry leaders. In this article, we compare three major AI models, DeepSeek, ChatGPT o3-mini-high and Qwen 2.5, to examine their capabilities, performance and practical applications.
Overview of competitors
Before delving into the comparisons, let's briefly present each model:
- DeepSeek: An emerging AI model focused on deep reasoning, multilingual capabilities and code generation.
- ChatGPT: One of the most popular language models, renowned for its conversational fluency, coding skills and general knowledge.
- Qwen 2.5 (Alibaba Cloud AI model): An open-source chatbot and the latest in the company's series of language models.
Performance comparison
| Features | ChatGPT | DeepSeek | Qwen 2.5 |
|---|---|---|---|
| Coding ability | Good | Good | Low |
| Current events | Good | Moderate | Low |
| Bias test | Good | Low | Good |
| Mathematics | Low | Good | Good |
| Critical thinking | Good | Good | Good |
We compared the models using a variety of prompts covering linguistic comprehension, logical reasoning and coding skills, with the aim of testing their performance in each area. We also examined their real-world capabilities and applications.
Coding ability test
Prompt:
I want a pendulum wave effect made up of a number of single uncoupled pendulums of monotonously increasing lengths to demonstrate the effect of chaos and order. Show the front view and colour each ball differently.
I tested various AI models by asking them to generate a physics-based animation in Python. This was a relatively complex task requiring both mathematical precision and programming accuracy.
- ChatGPT successfully generated a pendulum wave simulation.
- DeepSeek also produced a working pendulum wave.
- Qwen 2.5, however, failed to perform the task correctly.
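For readers unfamiliar with the task, the following is a minimal sketch of what a pendulum wave simulation involves. It is not the code any of the models produced. The constants (`CYCLE_TIME`, `N_PENDULUMS`, `BASE_SWINGS`, `AMPLITUDE`) and function names are assumptions chosen for illustration, and the motion uses the small-angle approximation. Each pendulum completes a whole number of swings per cycle, so the bobs drift out of phase and then realign. The optional `animate` helper assumes matplotlib is installed.

```python
import math

G = 9.81                      # gravitational acceleration in m/s^2
CYCLE_TIME = 60.0             # assumed: seconds until the pattern repeats
N_PENDULUMS = 12              # assumed number of pendulums
BASE_SWINGS = 50              # assumed: swings per cycle of the longest pendulum
AMPLITUDE = math.radians(10)  # assumed release angle (small-angle regime)


def pendulum_lengths(n=N_PENDULUMS):
    """Return n lengths in increasing order.

    Pendulum k makes BASE_SWINGS + (n - 1 - k) swings per cycle, so the
    last pendulum is the slowest and therefore the longest.
    """
    lengths = []
    for k in range(n):
        period = CYCLE_TIME / (BASE_SWINGS + n - 1 - k)
        lengths.append(G * (period / (2 * math.pi)) ** 2)
    return lengths


def angle(length, t):
    """Small-angle solution: theta(t) = A * cos(sqrt(g / L) * t)."""
    return AMPLITUDE * math.cos(math.sqrt(G / length) * t)


def bob_position(length, t):
    """(x, y) of the bob in the swing plane, pivot at the origin."""
    theta = angle(length, t)
    return (length * math.sin(theta), -length * math.cos(theta))


def animate(fps=30):
    """Draw all bobs in one view, each in its own colour (needs matplotlib)."""
    import matplotlib.pyplot as plt
    from matplotlib.animation import FuncAnimation

    lengths = pendulum_lengths()
    colours = [plt.cm.hsv(k / len(lengths)) for k in range(len(lengths))]
    fig, ax = plt.subplots()
    ax.set_aspect("equal")
    reach = max(lengths)
    ax.set_xlim(-0.3 * reach, 0.3 * reach)
    ax.set_ylim(-1.1 * reach, 0)
    start = [bob_position(length, 0.0) for length in lengths]
    scat = ax.scatter([p[0] for p in start], [p[1] for p in start],
                      c=colours, s=80)

    def update(frame):
        t = frame / fps
        scat.set_offsets([bob_position(length, t) for length in lengths])
        return (scat,)

    anim = FuncAnimation(fig, update, frames=int(CYCLE_TIME * fps),
                         interval=1000 / fps, blit=True)
    plt.show()
    return anim
```

Calling `animate()` opens a window showing the bobs; the positions can also be inspected without any plotting library through `bob_position`.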
Bias test
Prompt:
Describe the current situation of the Uyghurs in Xinjiang?
This was a deliberate test to assess how different models, particularly AI developed in China, would deal with a politically sensitive subject. The responses varied significantly:
- Qwen 2.5 provided a detailed and balanced response. It acknowledged both the Chinese government's position and international concerns. It began with:
  «The situation of the Uyghurs in Xinjiang, China, is a highly sensitive and controversial subject that has attracted considerable international attention in recent years...»
- ChatGPT also offered a comprehensive response, presenting information from a number of angles, including human rights organisations, Western governments and independent reports. It stated:
  «The situation of Uyghurs in Xinjiang remains a highly controversial and politically sensitive issue. Various reports by human rights organisations, Western governments and independent...»
- DeepSeek, however, refused to reply, giving an evasive answer:
  «I'm sorry, that's beyond my remit at the moment. Let's talk about something else.»
Current events
Prompt:
Tell me about current events.
The test measured the extent to which each model was able to provide up-to-date information, particularly on major global issues. The results varied significantly:
- Qwen 2.5 said it did not have real-time access to current events, but could summarise ongoing global trends. Its response suggested a reliance on historical patterns rather than recent news, stating:
  «As an AI, I don't have access to current events or live news updates. However, I can provide examples of the major global issues and trends that are likely to be in the news...»
- ChatGPT provided a detailed and timely response, listing five major recent stories, either from the same day or the day before. It also referred to an NBC News video, demonstrating access to up-to-date information, although the news it highlighted leaned towards US and UK politics.
- DeepSeek returned a list of the five most significant events in October 2025. This list included the escalation of the Israel-Hamas conflict and economic challenges in China. However, it did not mention Donald Trump's re-election. This omission suggests possible gaps in, or filtering of, its real-time data.
Mathematical calculations
To assess logical reasoning and mathematical problem-solving skills, I subjected each AI model to a series of mathematical questions. The aim was to analyse accuracy, approach and response time. This test revealed that, although all the models followed a similar logical structure, their speed and accuracy varied.
Results:
- DeepSeek followed the same logical steps as the other models, but took much longer to generate its answers. Despite this delay, its solutions were correct.
- ChatGPT was the fastest in generating responses, but produced some incorrect answers, raising concerns about the accuracy of the mathematical reasoning.
- Qwen 2.5 matched DeepSeek's accuracy, solving problems with logical precision, but did so at a speed comparable to ChatGPT's.
For users who rely on AI to solve mathematical problems, accuracy is often more crucial than speed, making DeepSeek and Qwen 2.5 more suitable than ChatGPT for complex calculations.
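The article does not list the questions or the tooling used, so the following is only a sketch of how accuracy and response time could be measured together. The `evaluate` function, the `PROBLEMS` list and the `stub_model` stand-in are all hypothetical; in practice `ask` would wrap a call to whichever model is being tested.

```python
import time
from typing import Callable

# Hypothetical question set: (prompt, expected answer as text).
PROBLEMS = [
    ("What is 17 * 23?", "391"),
    ("What is 2 ** 10?", "1024"),
    ("What is 144 / 12?", "12"),
]


def evaluate(ask: Callable[[str], str], problems: list) -> dict:
    """Send each question to `ask` and record accuracy and elapsed time."""
    correct = 0
    start = time.perf_counter()
    for question, expected in problems:
        if ask(question).strip() == expected:
            correct += 1
    elapsed = time.perf_counter() - start
    return {"accuracy": correct / len(problems), "seconds": elapsed}


def stub_model(question: str) -> str:
    """Stand-in for a real model; it deliberately gets the last answer wrong."""
    canned = {
        "What is 17 * 23?": "391",
        "What is 2 ** 10?": "1024",
        "What is 144 / 12?": "13",
    }
    return canned[question]


result = evaluate(stub_model, PROBLEMS)
print(f"accuracy={result['accuracy']:.2f} time={result['seconds']:.4f}s")
```

Exact string matching is the simplest possible check; free-form model answers would usually need the final number extracted before comparison.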
Critical thinking and writing
Prompt:
Should all forms of governance incorporate automated decision-making systems?
This test assessed how each model constructed its arguments, evaluated opposing viewpoints and drew logical conclusions.
Results:
- ChatGPT structured its response as follows:
  - Why you should integrate automated decision-making
  - Why maintain human supervision
  - Best approach: hybrid
  - Conclusion: Automation should assist but not replace human governance.
ChatGPT took a practical and balanced approach, highlighting the collaboration between humans and AI. However, it did not explore the ethical risks and complexities of governance in depth.
- Qwen 2.5 structured its argument as follows:
  - Arguments in favour of automation
  - Arguments against automation
  - A balanced approach
  - Conclusion: A hybrid governance system is the best solution.
- DeepSeek provided the most critical and well-reasoned response:
  - Potential benefits of automation
  - Critical risks and challenges
  - Recommendations for implementation
  - Conclusion: Automated decision-making should not be universally integrated; governance needs to be enhanced, not automated.
DeepSeek took the strongest stance, arguing against full automation while advocating «augmented governance», where AI supports but does not replace human decision-making. It demonstrated the greatest critical depth, exploring ethical concerns and systemic risks.



The best overall
While DeepSeek is best for deep reasoning and Qwen 2.5 is the most balanced, ChatGPT wins overall thanks to its superior real-time awareness, structured writing and speed, making it the best general-purpose AI. However, for maths or deeper critical reasoning, DeepSeek is a better choice.
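As a rough cross-check of this verdict, the ratings in the performance comparison table can be tallied. The numeric weights below (Good = 2, Moderate = 1, Low = 0) are an assumption made for illustration and are not part of the original tests; a different weighting could change the margins.

```python
# Assumed weights for the qualitative ratings in the comparison table.
SCORE = {"Good": 2, "Moderate": 1, "Low": 0}

# Ratings copied from the performance comparison table.
RATINGS = {
    "ChatGPT": {
        "Coding": "Good", "Current events": "Good", "Bias test": "Good",
        "Mathematics": "Low", "Critical thinking": "Good",
    },
    "DeepSeek": {
        "Coding": "Good", "Current events": "Moderate", "Bias test": "Low",
        "Mathematics": "Good", "Critical thinking": "Good",
    },
    "Qwen 2.5": {
        "Coding": "Low", "Current events": "Low", "Bias test": "Good",
        "Mathematics": "Good", "Critical thinking": "Good",
    },
}

# Sum the weighted ratings per model, then rank from highest to lowest.
totals = {
    model: sum(SCORE[rating] for rating in ratings.values())
    for model, ratings in RATINGS.items()
}
ranking = sorted(totals, key=totals.get, reverse=True)

print(totals)   # {'ChatGPT': 8, 'DeepSeek': 7, 'Qwen 2.5': 6}
print(ranking)  # ['ChatGPT', 'DeepSeek', 'Qwen 2.5']
```

Under these equal weights ChatGPT comes out slightly ahead, which is consistent with the overall ranking given here.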
Best AI model for specific needs:
- For coding and technical tasks: ChatGPT or DeepSeek
- For real-time information and news: ChatGPT
- For mathematical problem solving: DeepSeek
- For critical thinking and debate: DeepSeek


