“Notably, O3-MINI, despite being one of the best reasoning models, frequently skipped essential proof steps by labeling them as “trivial”, even when their validity was crucial.”

  • bitofhope@awful.systems
    22 hours ago

    Oh, sorry, I got so absorbed in reading the riveting material about features predicting state name tokens to predict state capital tokens that I missed that we were quibbling over the word “next”. Alright, they can predict tokens out of order, too. Very impressive, I guess.