Closed
Summary
When using pandas 3.0, Enum variable encoding fails with KeyError: 0. encode() reads the first element with array[0], which on a pandas Series with a StringDtype index performs a label-based lookup rather than positional access.
Error
policyengine_core/enums/enum.py:69: in encode
if len(array) > 0 and isinstance(array[0], Enum):
^^^^^^^^
pandas/core/series.py:959: in __getitem__
return self._get_value(key)
...
E KeyError: 0
Reproduction
This fails in policyengine-us CI when running county.yaml tests with pandas 3.0:
- PR: Add pandas 3.0 compatibility tests policyengine-us#7233
- CI logs show 6 failing tests in policyengine_us/tests/policy/baseline/household/demographic/geographic/county/county.yaml
Root Cause
In policyengine_core/enums/enum.py:69:
if len(array) > 0 and isinstance(array[0], Enum):
With pandas 3.0, string columns use StringDtype by default. When a Series has a string index, array[0] does a label-based lookup (searching the index for the label 0) instead of positional access. No such label exists, so the lookup raises KeyError: 0.
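The difference between label-based and positional access can be seen in isolation. The sketch below is not taken from policyengine-core: the Color enum and the index values are hypothetical. It uses an integer index that lacks the label 0 so that the KeyError reproduces on any pandas version, then shows the two positional forms on a string-indexed Series like the one described above.

```python
from enum import Enum

import numpy as np
import pandas as pd


class Color(Enum):
    # Hypothetical enum, standing in for a policyengine Enum variable.
    RED = 1
    BLUE = 2


# With an integer index, s[0] is a label lookup on every pandas version.
# The index has no label 0, so the lookup raises KeyError.
s = pd.Series([Color.RED, Color.BLUE], index=[10, 20])
try:
    s[0]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# Positional access ignores the index labels entirely.
first_by_iloc = s.iloc[0]
first_by_numpy = np.asarray(s)[0]

# A Series with a string index, as in the failing tests. Only the
# positional forms are used here, since they do not depend on how the
# installed pandas version interprets t[0].
t = pd.Series([Color.RED, Color.BLUE], index=["x", "y"])
first_of_t = t.iloc[0]
is_enum = len(t) > 0 and isinstance(first_of_t, Enum)
```

Both .iloc[0] and np.asarray(...)[0] return the first stored element regardless of the index, which is the behaviour the isinstance check in encode() relies on.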
Proposed Fix
Use .iloc[0] for positional access when dealing with pandas Series, checking the length first so that an empty input is never indexed:
first_elem = None
if len(array) > 0:
    first_elem = array.iloc[0] if hasattr(array, 'iloc') else array[0]
if isinstance(first_elem, Enum):
Or convert to numpy array first:
if hasattr(array, 'values'):
    array = np.asarray(array)
if len(array) > 0 and isinstance(array[0], Enum):
Related
- policyengine-us PR enabling pandas 3.0: Add pandas 3.0 compatibility tests policyengine-us#7233
- Previous pandas 3.0 fixes in core: StringDtype handling in filled_array, StringArray handling in VectorialParameterNodeAtInstant