This is interesting and I would appreciate if you could elaborate with a few examples, and perhaps mention what inspired this, and/or what’s similar and/or analogous to your method.
Despite the sophisticated theoretical frameworks and algorithmic safeguards, the core vulnerability remains: in an autonomous workflow, the LLM can hallucinate the input or fabricate the 'proofs' for the verification sandbox. This is essentially building elaborate scaffolding on top of a fundamental flaw.
But I reckon that while this doesn't eliminate the need for human oversight, it might help the supervisors sitting at human-in-the-loop checkpoints.
> hallucinate the input
True
> fabricate the 'proofs' for the verification sandbox
Not sure about that: it seems the tool here allows non-LLM code to perform validation in between the steps.
So in this sense, it allows you to manage the LLM. Of course, if you run the workflow through an outer agentic loop, this would be different.
In this case, I think the reasonable approach would be to let the orchestrating tool write its results to a space that the agents themselves have no access to.
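To make the idea concrete, here is a minimal sketch (hypothetical, not the actual tool under discussion): the orchestrator runs deterministic, non-LLM validators between steps and writes its verdicts to an audit directory that the agents never read or write. All names here (`validate_step`, `AUDIT_DIR`, the example validators) are made up for illustration.

```python
# Hypothetical sketch: a deterministic validation gate between LLM steps.
# The validators are plain Python (non-LLM), and the verdicts are recorded
# in a location outside the agent's workspace.

import json
from pathlib import Path

AUDIT_DIR = Path("orchestrator_audit")  # assumed to be inaccessible to the agents


def validate_step(step_name, output, validators):
    """Run non-LLM checks on an LLM step's output and record the verdict
    where the agent cannot tamper with it. Raises if any check fails."""
    failures = [v.__name__ for v in validators if not v(output)]
    AUDIT_DIR.mkdir(exist_ok=True)
    record = {"step": step_name, "passed": not failures, "failures": failures}
    (AUDIT_DIR / f"{step_name}.json").write_text(json.dumps(record))
    if failures:
        raise ValueError(f"step {step_name!r} failed checks: {failures}")
    return output


# Example validators: deterministic code, not model self-reports.
def is_valid_json(text):
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


def is_nonempty(text):
    return bool(text.strip())


# Usage: gate a (mocked) LLM output before the next step is allowed to see it.
llm_output = '{"plan": ["parse", "transform", "emit"]}'
checked = validate_step("planning", llm_output, [is_nonempty, is_valid_json])
```

The point is only that the pass/fail decision is made by ordinary code the model cannot influence, and the record of that decision lives outside the agent's reach.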
Interesting. It seems you are automating my qwen workflow. Every output stage is verified through mathematical proof whenever possible before being fed to the next step in transforming ideas into code. Except for when qwen decides to go in a very unusual direction, it's working reasonably well at producing provably correct code. It's slowish, though, with lots of nested iterations, and when qwen goes strange it takes a lot of effort to get it back on task.
A couple of simplified examples for us mere mortals (not mathematicians) would be nice.