Memory is All You Need.

In recent months, chat experiences like OpenAI's ChatGPT have added the ability for the models to build up a memory of you, helping them produce richer responses. Applied at scale, this pattern could unlock genuinely useful business intelligence.
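
To make the pattern concrete, here is a toy sketch of memory-augmented chat: persist facts about the user, retrieve the most relevant ones per message, and prepend them to the prompt. The `MemoryStore` class and its token-overlap retrieval are illustrative inventions, not ChatGPT's actual mechanism; a real system would use embedding similarity and a persistent store.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    facts: list[str] = field(default_factory=list)

    def remember(self, fact: str) -> None:
        self.facts.append(fact)

    def recall(self, query: str, k: int = 3) -> list[str]:
        # Toy relevance score: how many query tokens each fact shares.
        q = set(query.lower().split())
        scored = sorted(self.facts,
                        key=lambda f: len(q & set(f.lower().split())),
                        reverse=True)
        return scored[:k]


def build_prompt(store: MemoryStore, user_message: str) -> str:
    # Prepend recalled facts so the model can condition on them.
    context = "\n".join(f"- {m}" for m in store.recall(user_message))
    return f"Known about this user:\n{context}\n\nUser: {user_message}"


store = MemoryStore()
store.remember("Works as a data analyst at a retail company.")
store.remember("Prefers SQL examples over pandas.")
print(build_prompt(store, "Show me how to analyze sales data"))
```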

MoE is probably good enough.

After much deliberation on ways to make sparsity work in dense LLMs, I have come to the realization that, while MoE likely leaves a bit of intelligence on the table, it's probably good enough...
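
For reference, the trade-off in question looks roughly like the following minimal PyTorch sketch of a top-k gated mixture-of-experts layer (in the style of Shazeer et al., 2017): only k of the expert feed-forward networks run per token, so compute stays roughly flat as parameter count grows. Sizes and names here are illustrative, not any particular model's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoELayer(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        # Route each token to its top-k experts, weighted by
        # softmax over the selected gating scores.
        scores = self.gate(x)
        weights, idx = scores.topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out


layer = MoELayer()
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64])
```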

What should AI models actually do?

Currently, the largest frontier models are jacks of all trades: huge models with all the world's information imbued into their parameters. But is that how it should be?