fix: decouple mocking from evals #1148

cristipufu · 2026-01-18T06:34:55Z

Description

Decouples mocking from evaluation context.

Separates the mocking infrastructure from the evaluation system, enabling mocking to be used independently.
Mocking can now be configured without requiring a full evaluation item.

Development Package

Use uipath pack --nolock to get the latest dev build from this PR (requires version range).
Add this package as a dependency in your pyproject.toml:

[project]
dependencies = [
  # Exact version:
  "uipath==2.5.10.dev1011483997",

  # Any version from PR
  "uipath>=2.5.10.dev1011480000,<2.5.10.dev1011490000"
]

[[tool.uv.index]]
name = "testpypi"
url = "https://test.pypi.org/simple/"
publish-url = "https://test.pypi.org/legacy/"
explicit = true

[tool.uv.sources]
uipath = { index = "testpypi" }

[tool.uv]
override-dependencies = [
    "uipath>=2.5.10.dev1011480000,<2.5.10.dev1011490000",
]

cristipufu · 2026-01-18T06:35:37Z

src/uipath/_cli/_evals/mocks/llm_mocker.py

                        "args": args,
                        "kwargs": kwargs,
                    },
                    "agentInfo": {  # This is incomplete


@akshaylive why do we need this agentInfo?

The prompts were mostly copied over from URT (check L41). It does make sense for the mocker to know about the agent's context, especially the eval inputs. Maybe we need to create a separate class like this:

class MockItem(BaseModel):  # Equivalent to evaluation item
    inputs: Any
    name: str = Field(default="debug", ..)
    mocking_strategy: MockingStrategy
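A minimal runnable sketch of the MockItem idea above, assuming a plain dataclass in place of the pydantic BaseModel so it runs with only the standard library. The `respond` method and the `FixedResponse` class are hypothetical stand-ins; the real MockingStrategy type and the elided Field arguments are not shown in this thread.

```python
from dataclasses import dataclass
from typing import Any, Protocol


class MockingStrategy(Protocol):
    """Placeholder for the real MockingStrategy type (shape assumed)."""

    def respond(self, *args: Any, **kwargs: Any) -> Any: ...


class FixedResponse:
    """Hypothetical strategy that always returns the same value."""

    def __init__(self, value: Any) -> None:
        self._value = value

    def respond(self, *args: Any, **kwargs: Any) -> Any:
        return self._value


@dataclass
class MockItem:
    """Carries only what a mocker needs, without a full evaluation item."""

    inputs: Any
    mocking_strategy: MockingStrategy
    name: str = "debug"  # default taken from the proposal above


# Usage: build a mock item in a debug context, with no evaluation criteria.
item = MockItem(inputs={"query": "hello"}, mocking_strategy=FixedResponse("ok"))
```

The point of the sketch is that the mocker still sees the inputs and a name, but nothing evaluation-specific has to be supplied.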

cristipufu · 2026-01-18T07:40:42Z

src/uipath/_cli/_evals/mocks/mocks.py

-evaluation_context: ContextVar[EvaluationItem | None] = ContextVar(
-    "evaluation", default=None
+mocking_strategy_context: ContextVar[MockingStrategy | None] = ContextVar(
+    "mocking_strategy", default=None


do we actually need this, or can we rely on mocker_context?

Yes, because the mocking strategy can be stateful. For example, a mockito strategy could say: when input is any, return "foo" for the first invocation and "bar" for the second invocation. Without keeping a reference to the MockingStrategy in the context, this is hard to support.
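To illustrate the stateful case described above, here is a small sketch using only the standard library. `SequenceStrategy`, `next_response`, `strategy_context` and `mocked_tool` are hypothetical names made up for this example; only the idea of holding the strategy object in a ContextVar comes from the diff.

```python
from contextvars import ContextVar
from typing import Any, Optional


class SequenceStrategy:
    """Hypothetical stateful strategy: returns canned values in order."""

    def __init__(self, *responses: Any) -> None:
        self._responses = list(responses)
        self._calls = 0

    def next_response(self) -> Any:
        # Once the sequence is exhausted, keep returning the last value.
        index = min(self._calls, len(self._responses) - 1)
        self._calls += 1
        return self._responses[index]


# Stand-in for mocking_strategy_context from the diff (assumed shape).
strategy_context: ContextVar[Optional[SequenceStrategy]] = ContextVar(
    "mocking_strategy", default=None
)


def mocked_tool(*args: Any, **kwargs: Any) -> Any:
    """Look up the active strategy and let it decide the response."""
    strategy = strategy_context.get()
    if strategy is None:
        raise RuntimeError("no mocking strategy configured")
    return strategy.next_response()


# The same strategy instance is reused across calls, so its call count survives.
token = strategy_context.set(SequenceStrategy("foo", "bar"))
try:
    first = mocked_tool("anything")
    second = mocked_tool("anything")
finally:
    strategy_context.reset(token)
```

If the context stored only a description of the strategy and rebuilt it on every call, the invocation counter would reset each time and both calls would return the first value.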

akshaylive

Looks great overall. We do need to figure out the details regarding agentInfo and evaluation_criterias in a debug context.

akshaylive · 2026-01-18T17:27:47Z

src/uipath/_cli/_evals/_runtime.py

        self, eval_item: EvaluationItem, runtime: UiPathRuntimeProtocol
    ) -> EvaluationItem:
        """Use LLM to generate a mock input for an evaluation item."""
+        expected_output = (


It's strange for the input mocker to be using expected values. The prompts were reused, so we didn't think much of this. Do you know what happens in prod during simulations -- is it {}?

Secondly, evaluation_criterias is a map of evaluator_id -> criterias. For URT/"legacy evaluation items", these are repeated during load, so this object will have a lot of repeated values.

My $0.02 is that we should completely get rid of these fields from input simulation, but we can do that in a separate PR. @bai-uipath: could you follow up on this with the right POCs?
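A rough illustration of the repetition concern, assuming "repeated during load" means every evaluator id ends up mapped to the same criteria for a legacy item. The evaluator ids and the criteria payload below are invented for the example and are not taken from the codebase.

```python
import json

# Hypothetical criteria shared by all evaluators of one legacy item.
shared_criteria = {"expected_output": {"result": 42}}

# Assumed shape: evaluator_id -> criteria, with the same payload copied per evaluator.
evaluation_criterias = {
    evaluator_id: dict(shared_criteria)
    for evaluator_id in ("evaluator-a", "evaluator-b", "evaluator-c")
}

# Serialize each value canonically to count the distinct payloads.
unique_payloads = {
    json.dumps(criteria, sort_keys=True)
    for criteria in evaluation_criterias.values()
}
```

Under that assumption, passing the whole map into an input-simulation prompt would send the same payload once per evaluator.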

akshaylive · 2026-01-18T17:43:34Z

src/uipath/_cli/_evals/mocks/llm_mocker.py

                        "args": args,
                        "kwargs": kwargs,
                    },
                    "agentInfo": {  # This is incomplete


The prompts were mostly copied over from URT (check L41). It does make sense for the mocker to know about the agent's context, especially the eval inputs. Maybe we need to create a separate class like this:

class MockItem(BaseModel):  # Equivalent to evaluation item
    inputs: Any
    name: str = Field(default="debug", ..)
    mocking_strategy: MockingStrategy

cristipufu requested a review from akshaylive January 18, 2026 06:34

cristipufu self-assigned this Jan 18, 2026

cristipufu added the build:dev (Create a dev build from the pr) label Jan 18, 2026

github-actions bot added the test:uipath-langchain (Triggers tests in the uipath-langchain-python repository) and test:uipath-llamaindex (Triggers tests in the uipath-llamaindex-python repository) labels Jan 18, 2026

cristipufu force-pushed the fix/decouple_mocking branch 3 times, most recently from a583ab4 to 0f86c8f January 18, 2026 06:43

cristipufu requested a review from mjnovice January 18, 2026 06:43

fix: decouple mocking from evals

631956b

cristipufu force-pushed the fix/decouple_mocking branch from 0f86c8f to 631956b January 18, 2026 07:33

akshaylive reviewed Jan 18, 2026
