[Bug] When using function calls, the service throws an error and exits
## Introduction

The service crashes and exits whenever a request includes tools; as long as we don't use tools, it works fine. We are unsure whether we are using something incorrectly.
## Describe the Bug

After starting the service with the following command, it works fine as long as we don't use tools. However, once we pass in tools, the service crashes and exits.

```bash
docker run \
  --gpus "\"$GPUS\"" \
  --shm-size "$SHM_SIZE" \
  -it --rm \
  -p "$SERVER_PORT:$SERVER_PORT" \
  -v "$MODEL_PATH":/models \
  "$IMAGE" \
  python3 -m sglang.launch_server \
    --model-path "/models" \
    --host "0.0.0.0" \
    --port "$SERVER_PORT" \
    --tp "$TP" \
    --trust-remote-code \
    --served-model-name "$MODEL_NAME" \
    --enable-metrics \
    --enable-torch-compile \
    --tool-call-parser 'qwen25'
```
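For reference, any request carrying a `tools` array in the OpenAI-compatible schema triggers the crash on our setup; the same request without `tools` succeeds. A minimal sketch of such a request (the `/v1/chat/completions` path comes from the log below; the `get_weather` tool definition is a hypothetical example, not from our workload):

```python
import json
import urllib.request

# Hypothetical tool definition in the OpenAI-compatible function-calling
# schema; on our setup, any non-empty "tools" array triggers the crash.
payload = {
    "model": "QwQ-32B",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

def send(base_url: str) -> bytes:
    """POST the payload to the server's OpenAI-compatible endpoint."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```

Calling `send("http://<host>:$SERVER_PORT")` against the container above reproduces the 500 response; dropping the `"tools"` key from the payload makes the same call succeed.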
The error message is as follows:

```
[2025-04-30 11:01:39 TP0] Scheduler hit an exception: Traceback (most recent call last):
  File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 2013, in run_scheduler_process
    scheduler.event_loop_overlap()
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 640, in event_loop_overlap
    batch = self.get_next_batch_to_run()
  File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 1145, in get_next_batch_to_run
    new_batch = self.get_new_batch_prefill()
  File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 1271, in get_new_batch_prefill
    self.log_prefill_stats(adder, can_run_list, running_bs)
  File "/sgl-workspace/sglang/python/sglang/srt/managers/scheduler.py", line 1010, in log_prefill_stats
    total_queue_latency += req.queue_time_end - req.queue_time_start
TypeError: unsupported operand type(s) for -: 'float' and 'NoneType'
[2025-04-30 11:01:39] Received sigquit from a child process. It usually means the child failed.
[2025-04-30 11:01:39] ERROR: Traceback (most recent call last):
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "uvloop/loop.pyx", line 1512, in uvloop.loop.Loop.run_until_complete
  File "uvloop/loop.pyx", line 1505, in uvloop.loop.Loop.run_until_complete
  File "uvloop/loop.pyx", line 1379, in uvloop.loop.Loop.run_forever
  File "uvloop/loop.pyx", line 557, in uvloop.loop.Loop._run
  File "uvloop/handles/poll.pyx", line 216, in uvloop.loop.__on_uvpoll_event
  File "uvloop/cbhandles.pyx", line 83, in uvloop.loop.Handle._run
  File "uvloop/cbhandles.pyx", line 66, in uvloop.loop.Handle._run
  File "uvloop/loop.pyx", line 399, in uvloop.loop.Loop._read_from_self
  File "uvloop/loop.pyx", line 404, in uvloop.loop.Loop._invoke_signals
  File "uvloop/loop.pyx", line 379, in uvloop.loop.Loop._ceval_process_signals
  File "/sgl-workspace/sglang/python/sglang/srt/entrypoints/engine.py", line 474, in sigquit_handler
    kill_process_tree(os.getpid())
  File "/sgl-workspace/sglang/python/sglang/srt/utils.py", line 686, in kill_process_tree
    sys.exit(0)
SystemExit: 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.10/asyncio/locks.py", line 214, in wait
    await fut
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/http/h11_impl.py", line 403, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/usr/local/lib/python3.10/dist-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/applications.py", line 112, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 165, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/cors.py", line 85, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 714, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 734, in app
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 288, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 76, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/usr/local/lib/python3.10/dist-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 73, in app
    response = await f(request)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 301, in app
    raw_response = await run_endpoint_function(
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 212, in run_endpoint_function
    return await dependant.call(**values)
  File "/sgl-workspace/sglang/python/sglang/srt/entrypoints/http_server.py", line 570, in openai_v1_chat_completions
    return await v1_chat_completions(_global_state.tokenizer_manager, raw_request)
  File "/sgl-workspace/sglang/python/sglang/srt/openai_api/adapter.py", line 1699, in v1_chat_completions
    ret = await tokenizer_manager.generate_request(
  File "/sgl-workspace/sglang/python/sglang/srt/managers/tokenizer_manager.py", line 385, in generate_request
    async for response in self._wait_one_response(obj, request):
  File "/sgl-workspace/sglang/python/sglang/srt/managers/tokenizer_manager.py", line 580, in _wait_one_response
    await asyncio.wait_for(state.event.wait(), timeout=4)
  File "/usr/lib/python3.10/asyncio/tasks.py", line 432, in wait_for
    await waiter
asyncio.exceptions.CancelledError
[2025-04-30 11:01:39] INFO: 192.168.252.3:59006 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
[2025-04-30 11:01:39] ERROR: Exception in ASGI application
```
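The first traceback shows the immediate failure: `log_prefill_stats` computes `req.queue_time_end - req.queue_time_start` while `queue_time_end` is still `None` for some request, which raises the `TypeError` and takes the scheduler down. A minimal sketch of the failing pattern and a `None`-guard that avoids it (hypothetical `Req`-like objects, not sglang's actual class):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Req:
    # Mirrors the two timestamps the traceback subtracts; queue_time_end
    # may still be None if the request was never marked as dequeued.
    queue_time_start: float
    queue_time_end: Optional[float] = None

def total_queue_latency(reqs: list[Req]) -> float:
    """Sum queue latencies, skipping requests whose end time was never set.

    The unguarded `end - start` on a None end time is exactly what raises
    `TypeError: unsupported operand type(s) for -: 'float' and 'NoneType'`.
    """
    total = 0.0
    for req in reqs:
        if req.queue_time_end is not None:
            total += req.queue_time_end - req.queue_time_start
    return total
```

For example, `total_queue_latency([Req(1.0, 3.5), Req(2.0, None), Req(4.0, 6.0)])` returns `4.5`, whereas the unguarded subtraction would crash on the second request.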
## Reproduction

We can reproduce the issue by running the following script:
```bash
#!/bin/bash
GPUS=${GPUS:-"device=0"}
SERVER_PORT=${SERVER_PORT:-30020}
MODEL_PATH=${MODEL_PATH:-"/mnt/data1/models/QwQ-32B"}
MODEL_NAME=${MODEL_NAME:-"QwQ-32B"}
IMAGE=${IMAGE:-"docker.io/lmsysorg/sglang:v0.4.5.post3-cu124"}
TP=${TP:-1}
SHM_SIZE=${SHM_SIZE:-"32g"}
# Print the current configuration
echo "Launching container with:"
echo "  GPU: $GPUS"
echo "  DIST_INIT_ADDR: $DIST_INIT_ADDR"
echo " SERVER PORT: $SERVER_PORT"
echo " Model Path: $MODEL_PATH"
echo " Model Name: $MODEL_NAME"
echo " Image: $IMAGE"
echo " TP: $TP"
echo " Nodes: $NNODES"
echo " Node Rank: $NODE_RANK"
echo " SHM Size: $SHM_SIZE"
# Launch the container
docker run \
  --gpus "\"$GPUS\"" \
  --shm-size "$SHM_SIZE" \