fix: decouple mocking from evals #1148
base: main
        "args": args,
        "kwargs": kwargs,
    },
    "agentInfo": {  # This is incomplete
@akshaylive why do we need this agentInfo?
The prompts were mostly copied over from URT (check L41). It does make sense for the mocker to know about the agent's context, especially the eval inputs. Maybe we need to create a separate class like this:
class MockItem(BaseModel):  # Equivalent to an evaluation item
    inputs: Any
    name: str = Field(default="debug", ..)
    mocking_strategy: MockingStrategy
a583ab4 to 0f86c8f
0f86c8f to 631956b
evaluation_context: ContextVar[EvaluationItem | None] = ContextVar(
    "evaluation", default=None
mocking_strategy_context: ContextVar[MockingStrategy | None] = ContextVar(
    "mocking_strategy", default=None
do we actually need this, or can we rely on mocker_context?
Yes. The mocking strategy can be stateful. For example, a mockito strategy could say: when the input is anything, return "foo" for the first invocation and "bar" for the second invocation. Without keeping a reference to the MockingStrategy in the context, this is hard to do.
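To illustrate the stateful case, here is a minimal sketch of a strategy object held in a ContextVar so that its call counter survives across invocations. SequenceStrategy, strategy_context, and mocked_tool are hypothetical names for illustration only; they are not the classes in this PR, and the real MockingStrategy may look quite different.

```python
from contextvars import ContextVar
from typing import Any, Optional


class SequenceStrategy:
    """Hypothetical stateful strategy: returns canned values in call order."""

    def __init__(self, *responses: Any) -> None:
        self._responses = list(responses)
        self._calls = 0

    def next_response(self) -> Any:
        # Once the sequence is exhausted, keep returning the last response.
        index = min(self._calls, len(self._responses) - 1)
        self._calls += 1
        return self._responses[index]


# Hypothetical stand-in for mocking_strategy_context from the diff.
strategy_context: ContextVar[Optional[SequenceStrategy]] = ContextVar(
    "mocking_strategy", default=None
)


def mocked_tool(*args: Any, **kwargs: Any) -> Any:
    """Ignore the inputs and ask the active strategy for the next value."""
    strategy = strategy_context.get()
    if strategy is None:
        raise RuntimeError("no mocking strategy is active")
    return strategy.next_response()


# The same strategy instance is reused for both calls, so its counter advances.
token = strategy_context.set(SequenceStrategy("foo", "bar"))
try:
    first = mocked_tool("anything")
    second = mocked_tool("anything")
finally:
    strategy_context.reset(token)
```

If the context only stored the strategy's configuration and a fresh strategy were built on every call, the counter would reset each time and both calls would return the first value.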
akshaylive left a comment
Looks great overall. We do need to figure out the details regarding agentInfo and evaluation_criterias in a debug context.
    self, eval_item: EvaluationItem, runtime: UiPathRuntimeProtocol
) -> EvaluationItem:
    """Use LLM to generate a mock input for an evaluation item."""
    expected_output = (
It's strange for the input mocker to use expectation values. The prompts were reused, so we didn't think much of this. Do you know what happens in prod during simulations? Is it {}?
Secondly, evaluation_criterias is a map of evaluator_id -> criterias. For URT/"legacy evaluation items", these are repeated during load, so this object will have a lot of repeated values.
My $0.02 is that we should completely get rid of these fields from input simulation but we can do that in a separate PR. @bai-uipath : could you follow up on this with the right POCs?
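A rough sketch of the repetition described above, assuming a legacy item carries one criteria payload that is copied to every evaluator on load. The evaluator IDs and the payload contents are made up for illustration; only the evaluator_id -> criterias shape comes from the comment.

```python
import json

# Hypothetical legacy criteria: a single payload shared by all evaluators.
legacy_criteria = {"expected_output": {"answer": "42"}}
evaluator_ids = ["evaluator-a", "evaluator-b", "evaluator-c"]

# On load, the same payload is repeated once per evaluator ID.
evaluation_criterias = {
    evaluator_id: dict(legacy_criteria) for evaluator_id in evaluator_ids
}

# Serialize each payload to count how many distinct ones there really are.
distinct_payloads = {
    json.dumps(criteria, sort_keys=True)
    for criteria in evaluation_criterias.values()
}
```

Here the map has one entry per evaluator but only one distinct payload, so passing the whole map into an input-simulation prompt would repeat the same values several times.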
Description
Decouples mocking from evaluation context.
Ref: #1117
Development Package
uipath pack --nolock to get the latest dev build from this PR (requires version range).