[Bug] When using function calling (tools), the service throws an error and exits


Introduction

We are running into an issue where the service crashes and exits whenever a request includes tools; without tools it works fine. We are unsure whether we are using something incorrectly.

Describe the Bug

After starting the service with the following command, it works fine as long as we don't use tools. However, if we pass in tools, the service crashes and exits.

```bash
docker run \
    --gpus "\"$GPUS\"" \
    --shm-size "$SHM_SIZE" \
    -it --rm \
    -p "$SERVER_PORT:$SERVER_PORT" \
    -v "$MODEL_PATH":/models \
    "$IMAGE" \
    python3 -m sglang.launch_server \
    --model-path "/models" \
    --host "0.0.0.0" \
    --port "$SERVER_PORT" \
    --tp "$TP" \
    --trust-remote-code \
    --served-model-name "$MODEL_NAME" \
    --enable-metrics \
    --enable-torch-compile \
    --tool-call-parser 'qwen25'
```
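For concreteness, any chat-completions request that includes a function tool triggers the crash. The payload below is a minimal example in the standard OpenAI-compatible format that the server exposes (the `get_weather` tool and its schema are hypothetical, just for illustration):

```python
import json

# Hypothetical tool definition -- any function tool in the request
# is enough to reproduce the crash.
payload = {
    "model": "QwQ-32B",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

# POSTing this JSON to the running server's /v1/chat/completions
# endpoint crashes the scheduler, e.g.:
#   curl -s http://localhost:30020/v1/chat/completions \
#        -H 'Content-Type: application/json' -d @payload.json
print(json.dumps(payload, indent=2))
```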

The error message is as follows:

[2025-04-30 11:01:39 TP0] Scheduler hit an exception: Traceback (most recent call last):
  File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 2013, in run_scheduler_process
    scheduler.event_loop_overlap()
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 640, in event_loop_overlap
    batch = self.get_next_batch_to_run()
  File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 1145, in get_next_batch_to_run
    new_batch = self.get_new_batch_prefill()
  File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 1271, in get_new_batch_prefill
    self.log_prefill_stats(adder, can_run_list, running_bs)
  File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 1010, in log_prefill_stats
    total_queue_latency += req.queue_time_end - req.queue_time_start
TypeError: unsupported operand type(s) for -: 'float' and 'NoneType'

[2025-04-30 11:01:39] Received sigquit from a child process. It usually means the child failed.
[2025-04-30 11:01:39] ERROR:    Traceback (most recent call last):
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "uvloop/loop.pyx", line 1512, in uvloop.loop.Loop.run_until_complete
  File "uvloop/loop.pyx", line 1505, in uvloop.loop.Loop.run_until_complete
  File "uvloop/loop.pyx", line 1379, in uvloop.loop.Loop.run_forever
  File "uvloop/loop.pyx", line 557, in uvloop.loop.Loop._run
  File "uvloop/handles/poll.pyx", line 216, in uvloop.loop.__on_uvpoll_event
  File "uvloop/cbhandles.pyx", line 83, in uvloop.loop.Handle._run
  File "uvloop/cbhandles.pyx", line 66, in uvloop.loop.Handle._run
  File "uvloop/loop.pyx", line 399, in uvloop.loop.Loop._read_from_self
  File "uvloop/loop.pyx", line 404, in uvloop.loop.Loop._invoke_signals
  File "uvloop/loop.pyx", line 379, in uvloop.loop.Loop._ceval_process_signals
  File "/sgl-workspace/sglang/python/sglang/srt/entrypoints/engine.py", line 474, in sigquit_handler
    kill_process_tree(os.getpid())
  File "/sgl-workspace/sglang/python/sglang/srt/utils.py", line 686, in kill_process_tree
    sys.exit(0)
SystemExit: 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.10/asyncio/locks.py", line 214, in wait
    await fut
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/http/h11_impl.py", line 403, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/applications.py", line 112, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 165, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/cors.py", line 85, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 714, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 734, in app
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 288, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 76, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 73, in app
    response = await f(request)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 301, in app
    raw_response = await run_endpoint_function(
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 212, in run_endpoint_function
    return await dependant.call(**values)
  File "/sgl-workspace/sglang/python/sglang/srt/entrypoints/http_server.py", line 570, in openai_v1_chat_completions
    return await v1_chat_completions(_global_state.tokenizer_manager, raw_request)
  File "/sgl-workspace/sglang/python/sglang/srt/openai_api/adapter.py", line 1699, in v1_chat_completions
    ret = await tokenizer_manager.generate_request(
  File "/sgl-workspace/sglang/python/sglang/srt/managers/tokenizer_manager.py", line 385, in generate_request
    async for response in self._wait_one_response(obj, request):
  File "/sgl-workspace/sglang/python/sglang/srt/managers/tokenizer_manager.py", line 580, in _wait_one_response
    await asyncio.wait_for(state.event.wait(), timeout=4)
  File "/usr/lib/python3.10/asyncio/tasks.py", line 432, in wait_for
    await waiter
asyncio.exceptions.CancelledError
[2025-04-30 11:01:39] INFO:     192.168.252.3:59006 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
[2025-04-30 11:01:39] ERROR:    Exception in ASGI application
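The first traceback pinpoints the failure: in `log_prefill_stats`, `req.queue_time_start` is still `None` for these requests, so `queue_time_end - queue_time_start` raises the `TypeError` that takes the scheduler down. A minimal sketch of the failure mode with a `None` guard follows (the `Req` class and function names are illustrative stand-ins, not the actual SGLang code or patch):

```python
from typing import List, Optional

class Req:
    """Minimal stand-in for a scheduler request (illustrative only)."""
    def __init__(self, queue_time_start: Optional[float],
                 queue_time_end: Optional[float]):
        self.queue_time_start = queue_time_start
        self.queue_time_end = queue_time_end

def total_queue_latency(reqs: List[Req]) -> float:
    """Sum queue latencies, skipping requests whose timestamps were never set."""
    total = 0.0
    for req in reqs:
        # The crash: queue_time_start can be None for some requests,
        # and `float - None` raises TypeError, killing the scheduler.
        if req.queue_time_start is None or req.queue_time_end is None:
            continue
        total += req.queue_time_end - req.queue_time_start
    return total
```

Since the crash is inside the prefill-stats logging path, one workaround to try until a fixed version is available is launching without `--enable-metrics`; this is an assumption based on the traceback, not something we have verified.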

Reproduction

We can reproduce the issue by running the following command:

```bash
#!/bin/bash

GPUS=${GPUS:-"device=0"}
SERVER_PORT=${SERVER_PORT:-30020}
MODEL_PATH=${MODEL_PATH:-"/mnt/data1/models/QwQ-32B"}
MODEL_NAME=${MODEL_NAME:-"QwQ-32B"}
IMAGE=${IMAGE:-"docker.io/lmsysorg/sglang:v0.4.5.post3-cu124"}
TP=${TP:-1}
SHM_SIZE=${SHM_SIZE:-"32g"}

# Print the current configuration
echo "Launching container with:"
echo "  GPU: $GPUS"
echo "  DIST_INIT_ADDR: $DIST_INIT_ADDR"
echo "  SERVER PORT: $SERVER_PORT"
echo "  Model Path: $MODEL_PATH"
echo "  Model Name: $MODEL_NAME"
echo "  Image: $IMAGE"
echo "  TP: $TP"
echo "  Nodes: $NNODES"
echo "  Node Rank: $NODE_RANK"
echo "  SHM Size: $SHM_SIZE"

# Launch the container
docker run \
    --gpus "\"$GPUS\"" \
    --shm-size "$SHM_SIZE" \
    -it --rm \
    -p "$SERVER_PORT:$SERVER_PORT" \
    -v "$MODEL_PATH":/models \
    "$IMAGE" \
    python3 -m sglang.launch_server \
    --model-path "/models" \
    --host "0.0.0.0" \
    --port "$SERVER_PORT" \
    --tp "$TP" \
    --trust-remote-code \
    --served-model-name "$MODEL_NAME" \
    --enable-metrics \
    --enable-torch-compile \
    --tool-call-parser 'qwen25'
```