Dependency installation failure in SKLearn 1.4-2 with SDK >=2.256.0 (v2) #5512

@maxiuboldi

PySDK Version

  • PySDK V2 (2.x)
  • PySDK V3 (3.x)

Describe the bug
The change introduced in PR #5406 (present in SDK versions >=2.256.0) breaks the installation of dependencies from requirements.txt when using the SKLearn 1.4-2 framework version. The generated runproc.sh script invokes the python command to install dependencies, but that command does not exist in the SKLearn 1.4-2 container image. The missing command is a known issue in the container (aws/sagemaker-scikit-learn-container#258).

The container has pip available, but the SDK-generated installation script tries to use python instead, causing the dependency installation to fail.
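To illustrate the kind of fallback that would avoid this failure, here is a minimal sketch that picks an install command based on which executables are actually on PATH. It is not the SDK's runproc.sh generation logic; the function names `resolve_interpreter` and `build_install_command` are hypothetical, and whether `python3` exists in the SKLearn 1.4-2 image is an assumption that this issue does not confirm.

```python
import shutil
from typing import Callable, List, Optional, Sequence

# Type of a lookup function such as shutil.which: name -> path or None.
Which = Callable[[str], Optional[str]]


def resolve_interpreter(
    candidates: Sequence[str] = ("python", "python3"),
    which: Which = shutil.which,
) -> Optional[str]:
    """Return the first candidate command found on PATH, or None."""
    for name in candidates:
        if which(name) is not None:
            return name
    return None


def build_install_command(
    requirements_path: str,
    which: Which = shutil.which,
) -> List[str]:
    """Build a pip install command without assuming `python` exists.

    Hypothetical helper: prefers `<interpreter> -m pip`, and falls back
    to the bare `pip` executable when no interpreter name resolves.
    """
    interpreter = resolve_interpreter(which=which)
    if interpreter is not None:
        return [interpreter, "-m", "pip", "install", "-r", requirements_path]
    return ["pip", "install", "-r", requirements_path]


if __name__ == "__main__":
    # Simulate a container that has `python3` and `pip` but no `python`.
    def fake_which(name: str) -> Optional[str]:
        return "/usr/bin/" + name if name in {"python3", "pip"} else None

    print(build_install_command("requirements.txt", which=fake_which))
```

Passing the lookup function as a parameter keeps the sketch testable without a real container; in actual use the default `shutil.which` would inspect the running environment.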

Expected behavior
Dependencies listed in requirements.txt should be installed successfully.

Screenshots or logs
The error occurs during the dependency installation phase when the generated runproc.sh script attempts to execute commands with python instead of pip.

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: >=2.256.0 (bug introduced); works on <=2.255.0
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): SKLearn
  • Framework version: 1.4-2
  • Python version: 3.10
  • CPU or GPU: CPU
  • Custom Docker image (Y/N): N

Additional context
The current workaround is to pin the SDK version to <=2.255.0, but this prevents using newer SDK features and fixes. The issue stems from the SKLearn 1.4-2 container lacking the python symlink, which the new runproc.sh generation logic assumes exists.
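For anyone applying the pin, the following sketch checks whether an installed SDK version falls in the range reported here. The helper names are hypothetical, and the range is an assumption taken from this report: only 2.x versions from 2.256.0 onward are treated as affected, since the report does not say whether 3.x behaves the same way.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# First SDK version reported as affected in this issue.
FIRST_AFFECTED = (2, 256, 0)


def parse_version(version: str) -> Tuple[int, ...]:
    """Extract the leading major.minor.patch numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: " + repr(version))
    return tuple(int(part) for part in match.groups())


def is_affected(version: str) -> bool:
    """Return True for 2.x versions at or above 2.256.0 (assumed range)."""
    parsed = parse_version(version)
    return parsed[0] == 2 and parsed >= FIRST_AFFECTED


def installed_sdk_version() -> Optional[str]:
    """Return the installed `sagemaker` package version, or None if absent."""
    try:
        return metadata.version("sagemaker")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    current = installed_sdk_version()
    if current is None:
        print("sagemaker is not installed")
    elif is_affected(current):
        # The pin itself would be applied with: pip install "sagemaker<=2.255.0"
        print(current + " is in the reported range; consider pinning <=2.255.0")
    else:
        print(current + " is outside the reported range")
```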
