OpenAI’s o1 vs. Sonnet 3.5: Round One in Comparing their Coding Abilities
I wrote before about comparing ChatGPT to Claude. Back then ChatGPT won for me due to the wider feature set it had, including Internet search, Custom GPTs, Code Interpreter, and image generation. But the quality of the Sonnet 3.5 model was impressive, and the UX of Artifacts was better than that of ChatGPT's Code Interpreter.
Today OpenAI announced and released a new family of models: o1-preview and o1-mini. These models are specifically trained for reasoning: more expensive and slower, but better at reasoning. How much better? According to the OpenAI blog, GPT-4o could only solve 13% of International Mathematics Olympiad qualifying problems. What about o1? How about 83%? That's roughly a 6x improvement — or, put another way, about 600%.
The Challenge: Building a Physics-Based Parking Simulator
I wanted to test the capabilities of this new model. But how? Well, the best way to understand something is to compare it against something else. So I tried to remember the most impressive thing I had done to…