Hands-On With Qwen 3.7-Max: Every SVG Icon Hand-Coded in a macOS-Style Web OS With Native Window Management
On May 22, 2026, the AI testing channel WorldofAI published a hands-on test of Qwen 3.7-Max, Alibaba's agentic coding model. The model generated a macOS-style web OS with multiple apps and native window management, and every app icon was written from scratch as SVG code rather than pulled from static assets. In the channel's assessment, its performance is close to frontier models such as Claude and GPT, and the result has prompted discussion in the tech community about the model's code reasoning capabilities.
WorldofAI is a YouTube channel with 219,000 subscribers that focuses on hands-on testing of frontier AI models. Qwen 3.7-Max, the subject of this test, is positioned as an agentic coding model, meaning an AI system that can autonomously plan and execute long-running coding tasks rather than just output code snippets. Within six hours of release, the test video had passed 10,000 views and collected 414 likes and 30 comments.
Instead of relying on benchmark scores, the test asked the model directly to generate a complete macOS-style web OS. The resulting web page includes multiple runnable built-in apps, smooth native window management, and layout details that follow macOS conventions. What drew the most attention from the tech community is that every app icon is SVG code written by the model from scratch, rather than a reference to a static image resource.
Community user TokenPony pointed out that this detail matters well beyond UI fidelity: SVG is structured code that describes graphics, so generating it from scratch means the model is producing logical code structures rather than retrieving memorized, ready-made resources from its training data. In TokenPony's view, this reflects the model's code reasoning ability rather than simple resource stitching.
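To make the distinction concrete, here is a minimal sketch of what "an icon as code" looks like, as opposed to a link to an image file. It is not taken from the model's output: the function names, colors, and proportions are assumptions chosen for illustration. The icon is a short piece of structured markup that can be parsed and checked like any other code.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def make_app_icon(size: int = 64, fill: str = "#2f80ed", glyph: str = "#ffffff") -> str:
    """Return a rounded-square app icon as an SVG string (hypothetical design)."""
    radius = round(size * 0.22)  # corner radius picked for illustration only
    center = size / 2
    return (
        f'<svg xmlns="{SVG_NS}" width="{size}" height="{size}" '
        f'viewBox="0 0 {size} {size}">'
        f'<rect width="{size}" height="{size}" rx="{radius}" fill="{fill}"/>'
        f'<circle cx="{center}" cy="{center}" r="{size * 0.25}" fill="{glyph}"/>'
        "</svg>"
    )


def is_well_formed_svg(markup: str) -> bool:
    """Check that the markup parses as XML and that its root element is <svg>."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        return False
    return root.tag == f"{{{SVG_NS}}}svg"


if __name__ == "__main__":
    icon = make_app_icon()
    print(icon)
    print("well-formed:", is_well_formed_svg(icon))
```

Because the icon is plain text with a defined structure, a mistake such as an unclosed tag is a code error that a parser can catch, which is why generating it correctly is treated here as a coding task rather than an asset lookup.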
Official test data released by the Qwen team shows that the model completed a 35-hour autonomous optimization task, during which it made more than 1,100 tool calls. The model also has a 1 million token context window, which supports multi-file, long-running coding tasks. WorldofAI's full test also covered multi-file refactoring, 3D graphics generation with Three.js, and long-running workflow execution, and compared the model with frontier models including Claude Opus 4.6, Gemini 3.1, and DeepSeek v4. The channel stated that Qwen 3.7-Max has reached a level close to the top tier. The full test video is available here: [Qwen 3.7 Max: NEW Powerful AI Model! Beats Opus 4.6, Gemini 3.1, Deepseek v4! (Fully Tested)](https://youtu.be/UXar6lNCNcc)
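For a sense of scale, the two figures reported for the 35-hour run can be turned into an average tool-call rate. The short calculation below uses only those two numbers; since the count is given as "more than 1,100", the rate is a lower bound and the interval an upper bound. The function name is a hypothetical helper, and the result says nothing about how the calls were actually distributed over time.

```python
def average_tool_call_rate(hours: float, calls: int) -> tuple[float, float]:
    """Return (calls per hour, minutes between calls), averaged over the whole run."""
    calls_per_hour = calls / hours
    minutes_per_call = hours * 60 / calls
    return calls_per_hour, minutes_per_call


if __name__ == "__main__":
    # Figures from the article: a 35-hour task with more than 1,100 tool calls.
    rate, gap = average_tool_call_rate(hours=35, calls=1100)
    print(f"at least {rate:.1f} calls/hour, at most {gap:.1f} minutes between calls on average")
```

That works out to roughly 31 tool calls per hour, or about one call every two minutes on average.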
Feedback from the tech community centered on several points:
- Developer JMoon shared practical advice: after a model of this type generates a complex project, add a verification step in a fresh context to check whether the features the model claims to have built match the actual output
- User AiDdictive said that the model's instruction-following ability currently ranks second globally
- User Lalo asked about generation speed: Alibaba's earlier models have been criticized for slow responses, and this test did not report any speed figures
- Testing group tokensmind reported that, in cross-model tests using the same prompt, Qwen 3.7-Max performed exceptionally well on tasks that combine UI generation and coding
- Several other users said that the coding capability of open-source models is catching up faster than expected
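JMoon's suggestion of a fresh-context check can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not JMoon's actual workflow: the `Claim` type, the marker strings, and the prompt wording are all hypothetical. The first function is a crude static pre-check, and the second builds a review prompt from the generated code and the claims alone, with no conversation history carried over.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    feature: str  # what the model says it built
    marker: str   # hypothetical identifier expected somewhere in the generated source


def unverified_claims(claims: list[Claim], source: str) -> list[str]:
    """Return the features whose marker never appears in the generated source."""
    return [c.feature for c in claims if c.marker not in source]


def build_fresh_context_prompt(claims: list[Claim], source: str) -> str:
    """Build a review prompt from the artifact and claims only, with no chat history."""
    listed = "\n".join(f"- {c.feature}" for c in claims)
    return (
        "You have not seen how this project was produced.\n"
        "For each claimed feature, state whether the code below implements it.\n\n"
        f"Claimed features:\n{listed}\n\n"
        f"Generated code:\n{source}\n"
    )


if __name__ == "__main__":
    source = "function openWindow() {}\nfunction dragWindow() {}"
    claims = [
        Claim("window dragging", "dragWindow"),
        Claim("window resizing", "resizeWindow"),
    ]
    print("missing:", unverified_claims(claims, source))
    print(build_fresh_context_prompt(claims, source))
```

The string match only flags features that are obviously absent. The prompt would be sent to a new session (of the same or a different model) so that the reviewer judges the code itself rather than the generating model's own description of it.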
Relevant links:
- Official Qwen blog: [https://qwen.ai/blog?id=qwen3.7](https://qwen.ai/blog?id=qwen3.7)
- Qwen online chat entry: [https://chat.qwen.ai/](https://chat.qwen.ai/)
- Official X account of WorldofAI: [https://x.com/intheworldofai](https://x.com/intheworldofai)
Published: 2026-05-22 21:40