

Yeah for some reason they never covered that in the stats lectures
I thought part of the schtick is that according to the rationalist theory of mind, a simulated version of you suffering is exactly the same as the real you suffering. This relies on their various other philosophical claims about the nature of consciousness, but if you believe this then empathy doesn’t have to be a concern.
If the growth is superexponential, we make it so that each successive doubling takes 10% less time.
(From AI 2027, as quoted by titotal.)
This is an incredibly silly sentence and is certainly enough to determine the output of the entire model on its own. It necessarily implies that the predicted value becomes infinite in a finite amount of time, disregarding almost all other features of how it is calculated.
To elaborate, suppose we take as our “base model” any function f which has the property that lim_{t → ∞} f(t) = ∞. Now I define a “super-f” function by saying that each subsequent block of “virtual time”, as seen by f, takes 10% less “real time” than the last. This will give us a function like g(t) = f(-log(1 - t)), obtained by inverting the exponential rate of convergence of a geometric series. Then g has a vertical asymptote at t = 1 regardless of what the function f is, simply because we have compressed an infinite amount of “virtual time” into a finite amount of “real time”.
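To make the compression concrete, here is a rough Python sketch of the construction. The function names and the choice of one unit of real time for the first block are my own assumptions for illustration; only the 10% shrink factor comes from the quoted sentence.

```python
import math

# Assumptions for illustration: the first block of virtual time takes
# `first_block` units of real time, and each later block takes `shrink`
# times as long as the one before it (shrink = 0.9 means "10% less").

def real_time_after(n_blocks, first_block=1.0, shrink=0.9):
    """Real time elapsed after n_blocks blocks of virtual time.

    This is the partial sum of a geometric series.
    """
    return first_block * (1 - shrink ** n_blocks) / (1 - shrink)

def virtual_time_at(t, first_block=1.0, shrink=0.9):
    """Invert the partial sum: virtual blocks elapsed at real time t."""
    limit = first_block / (1 - shrink)  # sum of the whole infinite series
    if t >= limit:
        return math.inf  # all of virtual time has already been used up
    return math.log(1 - t / limit) / math.log(shrink)

def super_f(f, t, **kwargs):
    """The "super-f" version of any base model f, evaluated at real time t."""
    return f(virtual_time_at(t, **kwargs))

if __name__ == "__main__":
    # Even a very slowly growing base model blows up as t approaches the limit.
    slow = math.log1p
    for t in (5.0, 9.0, 9.9, 9.999, 9.999999):
        print(t, super_f(slow, t))
```

With these defaults the real-time limit is 1 / (1 - 0.9) = 10 units, so everything f would ever do happens before t = 10, whatever f is.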
You’re totally misunderstanding the context of that statement. The problem of classifying an image as a certain animal is related to the problem of generating a synthetic picture of a certain animal. But classifying an image as a certain animal is totally unrelated to generating a natural-language description of “information about how to distinguish different species”. In any case, we know empirically that these LLM-generated descriptions are highly unreliable.
Yes - on the theoretical side, they do have an actual improvement, which is a non-asymptotic reduction in the number of multiplications required for the product of two 4x4 matrices over an arbitrary noncommutative ring. You are correct that the implied improvement to omega is moot since theoretical algorithms have long since reduced the exponent well below that of Strassen’s algorithm.
From a practical side, almost all applications use some version of the naive O(n^3) algorithm, since the asymptotically better ones tend to be slower in practice. However, occasionally Strassen’s algorithm has been implemented and used - it is still reasonably simple after all. There is possibly some practical value to the 48-multiplications result then, in that it could replace uses of Strassen’s algorithm.
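For a sense of scale, here is a quick back-of-the-envelope calculation in Python. It uses the standard fact that recursively applying a scheme which multiplies k×k block matrices with m multiplications gives an O(n^(log_k m)) algorithm; the helper name `exponent` is just something I made up for this sketch.

```python
import math

def exponent(block_size, multiplications):
    """Exponent log_k(m) obtained by recursing on a k x k block scheme
    that uses m multiplications per step."""
    return math.log(multiplications) / math.log(block_size)

naive = exponent(2, 8)            # schoolbook 2x2 blocks: exponent 3
strassen = exponent(2, 7)         # Strassen's 7 multiplications, about 2.807
strassen_twice = exponent(4, 49)  # two levels of Strassen on 4x4: same exponent
new_4x4 = exponent(4, 48)         # the 48-multiplication result, about 2.792

if __name__ == "__main__":
    for name, value in [
        ("naive", naive),
        ("Strassen", strassen),
        ("Strassen applied twice (4x4)", strassen_twice),
        ("48 multiplications (4x4)", new_4x4),
    ]:
        print(f"{name}: {value:.4f}")
```

So the implied exponent only moves in the second decimal place relative to Strassen, which is why any real value would come from the constant-factor saving (48 instead of 49 multiplications per 4x4 step) rather than from the asymptotics.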
And sure enough, just within the last day the user “Hand of Lixue” has rewritten large portions of the article to read more favorably to the rationalists.