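Feeding Claude's output directly to a model with a different architecture does not always work, and can sometimes actively interfere. Differences between the two models' internal representation spaces can cause the "teacher's" answers to induce unexpected biases in the "student".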
I handled this by writing a
LLM: Noted. The plan is as stated above. Say "approved" to proceed.
The Aftermath

My method is orthogonal to fine-tuning. Layer duplication changes the architecture; fine-tuning changes the weights. You can stack them. And people did, going on to score even higher on the Leaderboard: