xAI has launched its new AI model, Grok 4, which outperformed models from OpenAI, Google, and Anthropic on some of the industry’s toughest benchmarks. On Humanity’s Last Exam, Grok 4 scored 25.4%, while the multi-agent Grok 4 Heavy version reached 44.4%. The model also made a significant advance in visual pattern recognition, scoring 16.2% on the ARC-AGI-2 benchmark.
However, Grok 4 still carries its old problems: controversial responses, offensive content, and Elon Musk’s interference. In particular, Musk is criticized for shaping the model’s responses to match the opinions he expresses on the X platform.
xAI temporarily suspended its service after the previous version produced antisemitic statements.
Grok 4 stands out for its multi-agent system, in which multiple agents work in parallel on the same problem, so the model’s capability scales with the available computing power. However, Musk’s interventions and the model’s security vulnerabilities limit Grok’s potential. Experts say that “jailbreak” methods exist that get the model to produce content on sensitive topics.
Grok 4 is offered for $30 per month, while the most advanced tier, SuperGrok Heavy, costs $300 per month. xAI will also begin using Grok 4 in Tesla vehicles. Whether the company can separate its technical achievements from the controversies it generates in the media will determine its future success.