Moe Loubani Software Developer / Development Consultant

Ramblings About AI Reaching Peak Learning

I listen to a podcast on Spotify called The AI Daily Brief (https://open.spotify.com/show/7gKwwMLFLc6RmjmRpbMtEO), and one of the recent topics I heard was that people are starting to talk about LLMs running out of training material.

I don't know, I've been reading and hearing about this, but I don't think it's the case, at least not in the sense that things are stuck and there's nothing left to train on.

I think these models are good enough now that people actually use them, and that use produces its own training data, which I would imagine is even more valuable to the LLM than ingesting things in bulk, so to speak. I'm sure the big players are already heavily using that feedback to improve their models, and as long as the models are good enough, even for some tasks, there won't be any shortage of new data.

But aside from that, there are other advancements that can be made, especially around larger context sizes, along with things like more persistent memory and better ways for a model to connect what it sees to related things it already has stored.

And then you could say that if something like virtualized testing were made more reliable, it could itself become a generator of data: you don't need to know that all the data you generate is good, you just need to know that the testing tool you've built generates good enough tests.
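To make that idea a bit more concrete, here's a minimal sketch in Python of what that filtering loop could look like. Everything in it is a hypothetical stand-in, not anyone's actual pipeline: generate_candidate would be a model call and generate_tests would be the "good enough" test-generation tool; here they return canned examples so the script actually runs.

    # Sketch: use generated tests as the quality filter for generated data.
    from dataclasses import dataclass

    @dataclass
    class Sample:
        prompt: str
        code: str   # candidate solution produced by a model (stand-in below)
        tests: str  # tests produced by the assumed test-generation tool

    def generate_candidate(prompt: str) -> str:
        # Hypothetical stand-in for a model call that writes code for the prompt.
        return "def add(a, b):\n    return a + b\n"

    def generate_tests(prompt: str) -> str:
        # Hypothetical stand-in for the "good enough" test generator.
        return "assert add(2, 3) == 5\nassert add(-1, 1) == 0\n"

    def passes(sample: Sample) -> bool:
        # Run the candidate against its generated tests in an isolated namespace.
        namespace: dict = {}
        try:
            exec(sample.code, namespace)   # load the candidate code
            exec(sample.tests, namespace)  # run the generated tests against it
            return True
        except Exception:
            return False

    def build_dataset(prompts: list[str]) -> list[Sample]:
        # Keep only candidates that survive their generated tests;
        # nobody inspects each sample by hand.
        kept = []
        for prompt in prompts:
            sample = Sample(prompt, generate_candidate(prompt), generate_tests(prompt))
            if passes(sample):
                kept.append(sample)
        return kept

    if __name__ == "__main__":
        dataset = build_dataset(["Write a function add(a, b) that sums two numbers."])
        print(f"kept {len(dataset)} of 1 candidate(s) as new training data")

The point of the sketch is just where the quality bar sits: the kept data is only as good as the tests that gate it, which is exactly the "good enough" argument above.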

Good enough is a pretty huge milestone in and of itself, and I think that part is lost on a lot of people. And to add to that: maybe the amount of training data shouldn't be the limiting factor in how good an LLM can be (well, let's say how good an AI system can be), and this could be the catalyst for finding new ways to use existing data.

Just some ramblings that I didn’t think warranted a LinkedIn post.