CI: Random-versions CI Failure

Introduction

Continuous Integration (CI) is a crucial aspect of software development, ensuring that code changes do not break existing functionality. However, when a CI pipeline fails, it can be challenging to identify the root cause of the issue. In this article, we will investigate a CI failure related to random versions and provide a step-by-step guide to resolve the problem.

CI Failure Analysis

The provided stack trace points to a failure in the test_constructors function in v1_test.py. The error is raised while creating a new series with nw_v1.new_series. The specific message is:

DeprecationWarning: is_sparse is deprecated and will be removed in a future version. Check isinstance(dtype, pd.SparseDtype) instead.

Understanding the Error

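The message says that is_sparse is deprecated and will be removed in a future version of pandas. The function checks whether an array-like is a 1-D pandas sparse array. The recommended replacement is an isinstance check against the dtype: isinstance(dtype, pd.SparseDtype). Note that this is a warning rather than an exception, so it only fails the run when warnings are escalated to errors, which appears to be the case in this CI job.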
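A DeprecationWarning does not normally stop a program, so a failing run suggests that warnings are being treated as errors. The following is a minimal sketch of that behavior using only the standard library. The legacy_check function is a hypothetical stand-in for a deprecated library call, not pandas code, and the project's actual warning configuration is not shown in the stack trace.

```python
import warnings


def legacy_check() -> bool:
    # Hypothetical stand-in for a library function that emits a deprecation warning.
    warnings.warn("is_sparse is deprecated", DeprecationWarning, stacklevel=2)
    return False


def run_strict() -> str:
    """Call legacy_check with warnings escalated to exceptions."""
    with warnings.catch_warnings():
        # Comparable in effect to running Python with -W error.
        warnings.simplefilter("error")
        try:
            legacy_check()
        except DeprecationWarning as exc:
            return f"failed: {exc}"
    return "passed"


def run_lenient() -> str:
    """Call legacy_check with the warning silenced."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        legacy_check()
    return "passed"


print(run_strict())
print(run_lenient())
```

The same call passes or fails depending only on the active warning filter, which is why a deprecation in a dependency can break a test that is otherwise correct.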

Resolving the Issue

To resolve this issue, the code that calls is_sparse needs to use the recommended dtype check instead. One way to do this is a small helper that keeps the same name and docstring but performs the isinstance check:

import pandas as pd


def is_sparse(arr) -> bool:
    """
    Check whether an array-like is a 1-D pandas sparse array.

    Parameters
    ----------
    arr : array-like
        Array-like to check.

    Returns
    -------
    bool
        Whether or not the array-like is a pandas sparse array.
    """
    # Accept either an array-like with a .dtype attribute or a dtype itself.
    dtype = getattr(arr, "dtype", arr)
    return isinstance(dtype, pd.SparseDtype)
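As a quick illustration of the dtype-based check, the sketch below applies it to one sparse and one dense Series. It assumes pandas is installed; the variable names are chosen for this example only.

```python
import pandas as pd

# A sparse-backed Series and an ordinary dense Series for comparison.
sparse = pd.Series(pd.arrays.SparseArray([0, 0, 1, 0]))
dense = pd.Series([0, 0, 1, 0])

# The dtype-based check suggested by the warning text.
sparse_is_sparse = isinstance(sparse.dtype, pd.SparseDtype)
dense_is_sparse = isinstance(dense.dtype, pd.SparseDtype)

print(sparse_is_sparse, dense_is_sparse)
```

Because the check looks only at the dtype, it does not go through the deprecated function and so does not emit the warning.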

Updating the CI Pipeline

The test that surfaced the warning, test_constructors, is shown below for reference. It does not call is_sparse directly, so the deprecated call sits further down the call stack, and the change belongs wherever that call is made rather than in the test body.

from typing import Any

import numpy as np
import pandas as pd
import pytest

import narwhals.stable.v1 as nw_v1

# Assumed location of the project's test helpers; not shown in the stack trace.
from tests.utils import PANDAS_VERSION, assert_equal_data


def test_constructors() -> None:
    pytest.importorskip("pyarrow")
    if PANDAS_VERSION < (2, 2):
        pytest.skip()
    assert nw_v1.new_series("a", [1, 2, 3], backend="pandas").to_list() == [1, 2, 3]
    arr: np.ndarray[tuple[int, int], Any] = np.array([[1, 2], [3, 4]])  # pyright: ignore[reportAssignmentType]
    assert_equal_data(
        nw_v1.from_numpy(arr, schema=["a", "b"], backend="pandas"),
        {"a": [1, 3], "b": [2, 4]},
    )
    assert_equal_data(
        nw_v1.from_dict({"a": [1, 2, 3]}, backend="pandas"), {"a": [1, 2, 3]}
    )
    assert_equal_data(
        nw_v1.from_arrow(pd.DataFrame({"a": [1, 2, 3]}), backend="pandas"),
        {"a": [1, 2, 3]},
    )
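Since the test body above contains no is_sparse call of its own, it can help to record where a warning is attributed before deciding what to change. The sketch below uses the standard library to capture warnings along with their reported file and line. Both deprecated_helper and code_under_test are hypothetical stand-ins; in practice the callable passed in would be the failing test or the call it wraps.

```python
import warnings


def deprecated_helper() -> None:
    # Hypothetical stand-in for the deprecated call buried in a dependency.
    warnings.warn("is_sparse is deprecated", DeprecationWarning, stacklevel=2)


def code_under_test() -> None:
    deprecated_helper()


def collect_warnings(func):
    """Run func and return (category name, message, filename, lineno) per warning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        func()
    return [(w.category.__name__, str(w.message), w.filename, w.lineno) for w in caught]


records = collect_warnings(code_under_test)
for category, message, filename, lineno in records:
    print(f"{category} at {filename}:{lineno}: {message}")
```

The reported location depends on the stacklevel chosen by whoever emits the warning, so it identifies the attributed caller, which is usually the place that needs to change.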

Conclusion

In this article, we investigated a CI failure in the random-versions job and walked through a fix. The deprecated is_sparse check is replaced with the recommended isinstance(dtype, pd.SparseDtype) check, after which the pipeline should be rerun to confirm that test_constructors passes without the warning.

Additional Information

The provided stacktrace indicates that the CI pipeline is running on Python 3.9.22, with the following dependencies:

  • attrs==25.3.0
  • covdefaults==2.3.0
  • coverage==7.8.0
  • exceptiongroup==1.2.2
  • hypothesis==6.131.9
  • importlib-metadata==8.6.1
  • iniconfig==2.1.0
  • -e file:///home/runner/work/narwhals/narwhals
  • numpy==1.26.4
  • packaging==25.0
  • pandas==2.2.2
  • pip==25.0.1
  • pluggy==1.5.0
  • polars==0.20.18
  • pyarrow==12.0.0
  • pytest==8.3.5
  • pytest-cov==6.1.1
  • pytest-env==1.1.5
  • pytest-randomly==3.16.0
  • python-dateutil==2.9.0.post0
  • pytz==2025.2
  • setuptools==58.1.0
  • six==1.17.0
  • sortedcontainers==2.4.0
  • tomli==2.2.1
  • tzdata==2025.2
  • zipp==3.21.0

CI Pipeline Configuration

The CI pipeline is configured to run on Python 3.9.22, with the following environment variables:

  • PANDAS_VERSION=2.2.2
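The test compares PANDAS_VERSION against the tuple (2, 2), which implies that the version string is turned into a tuple of integers somewhere. How the project does this is not shown here; the sketch below is one possible approach, and parse_version is a hypothetical helper written for this example.

```python
import re


def parse_version(version: str) -> tuple[int, ...]:
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. "post0"
        parts.append(int(match.group()))
    return tuple(parts)


pandas_version = parse_version("2.2.2")
should_skip = pandas_version < (2, 2)
print(pandas_version, should_skip)
```

With pandas 2.2.2 the comparison is false, so the test is not skipped and the deprecated call is exercised.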

CI Pipeline Logs

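The CI pipeline logs indicate that the failure occurred in the test_constructors function, with the following error message:

DeprecationWarning: is_sparse is deprecated and will be removed in a future version. Check isinstance(dtype, pd.SparseDtype) instead.

Frequently Asked Questions

Q: What is the cause of the CI failure?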

A: The failure is caused by a DeprecationWarning raised by the pandas is_sparse function, which checks whether an array-like is a 1-D pandas sparse array. The function is deprecated and will be removed in a future version.

Q: What is the recommended alternative to the is_sparse function?

A: The recommended alternative is an isinstance check against the dtype: isinstance(dtype, pd.SparseDtype).

Q: How do I update the is_sparse function to use the recommended alternative?

A: Replace the function body with the isinstance-based check, as in the is_sparse definition given under "Resolving the Issue" above.

Q: How do I update the CI pipeline to use the new is_sparse function?

A: The test_constructors function shown above does not call is_sparse itself, so the test body needs no change. Apply the isinstance-based check where is_sparse is actually called, then rerun the pipeline to confirm that the warning is gone.

Q: What are the dependencies required to run the CI pipeline?

A: The run used Python 3.9.22 with the pinned packages listed under "Additional Information" above, including numpy==1.26.4, pandas==2.2.2, polars==0.20.18, pyarrow==12.0.0, and pytest==8.3.5.

Q: What is the environment variable required to run the CI pipeline?

A: The environment variable required to run the CI pipeline is:

  • PANDAS_VERSION=2.2.2

Q: What are the CI pipeline logs indicating?

A: The CI pipeline logs indicate that the failure occurred in the test_constructors function, with the following error message:

DeprecationWarning: is_sparse is deprecated and will be removed in a future version. Check isinstance(dtype, pd.SparseDtype) instead.

Q: How do I resolve the CI failure?

A: Replace the deprecated is_sparse call with the isinstance(dtype, pd.SparseDtype) check wherever it is made, rerun the test_constructors test, and confirm that the dependency versions and environment variables match the ones listed above.
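If the deprecated call turns out to live in code that cannot be changed right away, a narrowly scoped warning filter is a possible interim measure. This is an assumption about how one might proceed, not something the stack trace prescribes. The sketch below uses the standard library only, and emit is a hypothetical stand-in for library code.

```python
import warnings


def emit(message: str) -> None:
    # Hypothetical stand-in for library code that raises deprecation warnings.
    warnings.warn(message, DeprecationWarning, stacklevel=2)


def surviving_messages() -> list[str]:
    """Return the warnings that remain after ignoring only the is_sparse one."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # Filters added later are checked first, so this rule wins for matching messages.
        warnings.filterwarnings(
            "ignore",
            message=r"is_sparse is deprecated",
            category=DeprecationWarning,
        )
        emit("is_sparse is deprecated and will be removed in a future version.")
        emit("some_other_api is deprecated")
    return [str(w.message) for w in caught]


print(surviving_messages())
```

Matching on the message prefix keeps every other deprecation visible, so the filter hides only the known issue until the underlying call is replaced.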