Last year, we planned to write a blog post about the impending war between fine-tuning LLMs and retrieval augmentation. We never quite got around to writing that post, and in the meantime, the world shifted under our feet. The current consensus is not either-or but both-and: fine-tune a model to teach it a skill, and use RAG to ensure freshness of data. This is (roughly) the technique we’re using in our own product at RunLLM.
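To make the both-and pattern concrete, here is a minimal sketch of how the two pieces fit together. Everything here is hypothetical: the documents, the retriever, and the prompt template are stand-ins, and a real system would use a vector store and an actual fine-tuned model endpoint rather than toy keyword matching.

```python
# Sketch of the "both-and" pattern: the fine-tuned model supplies the skill
# (via the prompt/response format it was trained on), while retrieval
# supplies fresh facts at query time. All names and data are illustrative.

DOCS = {
    "release-notes": "v2.3 shipped on 2024-05-01 with streaming support.",
    "pricing": "The Pro plan costs $49/month as of last quarter.",
}

def retrieve(query: str) -> str:
    """Toy retriever: pick the doc with the most keyword overlap."""
    q_tokens = set(query.lower().split())
    return max(DOCS.values(), key=lambda d: len(q_tokens & set(d.lower().split())))

def build_prompt(query: str) -> str:
    """Combine retrieved context with the query; a real system would send
    this prompt to the fine-tuned model instead of returning it."""
    context = retrieve(query)
    return f"Context: {context}\nQuestion: {query}\nAnswer:"

print(build_prompt("When did v2.3 ship?"))
```

The division of labor is the point: retrieval keeps the context current without retraining, while fine-tuning shapes how the model uses that context.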
Nice read. I think we need to clearly call out the quality of the training data behind the foundation models. Also, what are the limitations of the specific foundation models, and why are we fine-tuning the model in the first place? Is it for a small resource footprint or for domain-specific knowledge scenarios? We will see different mechanisms and target segments for these: one where we need smaller fine-tuned LLMs, versus fine-tuning for domain-specific knowledge. New architectures will emerge as we keep moving the bottleneck by bringing in additional layers for specific requirements around customization potential, cost structure, skills required, privacy, and ownership.
All great points — thanks for the comment! Generally, we've been thinking about fine-tuning as teaching the model a skill, which aligns with what you're suggesting as well.
Fully agree as well that we'll need/want smaller LLMs for fine-tuning purposes in particular. We've been making that argument for a while — excited to see what comes out this year on that front.