Exception Of `ngx.print` And `ngx.flush` In Access Phase Under Boundary Conditions
Current Behavior
In OpenResty 1.27.1.1, sending a streaming response with `ngx.print` and `ngx.flush` in the access phase can fail with an error. The error is raised by the yieldable-phase check in `ngx.print` or `ngx.flush`, which incorrectly assumes that it is running in the `body_filter` phase instead of the access phase.
Expected Behavior
There should be no error when using `ngx.print` and `ngx.flush` in the access phase.
Steps to Reproduce
To reproduce this issue, follow these steps:
- Start OpenResty with the following nginx.conf:
    master_process on;
    worker_processes 1;
    error_log logs/error.log warn;
    worker_rlimit_nofile 20480;
    events {
        accept_mutex off;
        worker_connections 10620;
    }
    worker_rlimit_core 16G;
    worker_shutdown_timeout 240s;
    http {
        server {
            http2 on;
            listen 0.0.0.0:9080 default_server;
            location / {
                access_by_lua_block {
                    while true do
                        ngx.print(("hello world"):rep(100000))
                        ngx.flush(true)
                    end
                }
                body_filter_by_lua_block {
                    ngx.log(ngx.DEBUG, "body_filter_by_lua_block")
                }
            }
        }
    }
- Access `127.0.0.1:9080`.
The server sends part of the response and then closes the connection. The error described above then appears in error.log.
Debugging
To debug this issue, I used GDB to analyze the call stack and identify the root cause of the problem.
The error is triggered by the yieldable-phase check in `ngx.print` or `ngx.flush`, which incorrectly assumes that it is in the `body_filter` phase instead of the access phase. This happens because the saved context is not restored correctly after the body filter's Lua code has run, so the request is left marked as being in the `body_filter` phase.
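To illustrate why a stale context trips the check, here is a minimal, self-contained C sketch. It is a toy model and not code from lua-nginx-module: the `toy_` names, the flag values, and the set of yieldable phases are all assumptions made for illustration. It also assumes that the check is a bitmask test of the recorded phase against the phases that are allowed to yield.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical phase flags, for illustration only. */
enum toy_phase {
    TOY_PHASE_ACCESS      = 0x01,
    TOY_PHASE_CONTENT     = 0x02,
    TOY_PHASE_BODY_FILTER = 0x04
};

/* Phases in which this toy model permits yielding output calls. */
#define TOY_YIELDABLE_PHASES (TOY_PHASE_ACCESS | TOY_PHASE_CONTENT)

/* Hypothetical per-request state: only the recorded phase matters here. */
struct toy_phase_ctx {
    unsigned context;
};

/* Returns 1 when the recorded phase may call a yielding output API,
 * and 0 when the call would be rejected. */
static int
toy_check_yieldable(const struct toy_phase_ctx *ctx)
{
    assert(ctx != NULL);
    return (ctx->context & TOY_YIELDABLE_PHASES) != 0;
}
```

With `context` set to `TOY_PHASE_ACCESS` the check passes. If the field is left at `TOY_PHASE_BODY_FILTER`, the same call made from access-phase Lua code is rejected, which matches the symptom described above.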
Fix
To fix this issue, I propose adding `ctx->context = old_context` at the relevant location in the `ngx_http_lua_bodyfilterby.c` file. This will ensure that the context is restored correctly even when the `ngx_http_lua_body_filter` call is triggered by the fd's epoll writable event.
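The save-and-restore pattern, and what goes wrong when one return path skips the restore, can be sketched in plain C. This is a simplified model under stated assumptions, not the actual patch: `toy_filter_ctx`, the `TOY_CTX_*` values, and both functions are hypothetical, and the early return stands in for the path taken while earlier output is still pending.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical phase values, for illustration only. */
enum { TOY_CTX_ACCESS = 0x01, TOY_CTX_BODY_FILTER = 0x04 };

/* Hypothetical per-request state. */
struct toy_filter_ctx {
    unsigned context;    /* phase the Lua API believes it is running in */
    int      busy_bufs;  /* nonzero while earlier output is still unsent */
};

/* Buggy variant: the early return for pending output skips the restore,
 * so the caller keeps seeing the body_filter phase afterwards. */
static int
toy_body_filter_buggy(struct toy_filter_ctx *ctx)
{
    unsigned old_context;

    assert(ctx != NULL);
    old_context = ctx->context;
    ctx->context = TOY_CTX_BODY_FILTER;

    if (ctx->busy_bufs) {
        return 1;                     /* context is left stale here */
    }

    ctx->context = old_context;
    return 0;
}

/* Fixed variant: every return path restores the saved context. */
static int
toy_body_filter_fixed(struct toy_filter_ctx *ctx)
{
    unsigned old_context;

    assert(ctx != NULL);
    old_context = ctx->context;
    ctx->context = TOY_CTX_BODY_FILTER;

    if (ctx->busy_bufs) {
        ctx->context = old_context;   /* the added restore */
        return 1;
    }

    ctx->context = old_context;
    return 0;
}
```

Starting from `TOY_CTX_ACCESS` with `busy_bufs` set, the buggy variant leaves `context` at `TOY_CTX_BODY_FILTER`, while the fixed variant returns with `context` back at `TOY_CTX_ACCESS`.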
Conclusion
In conclusion, the issue of `ngx.print` and `ngx.flush` in the access phase under boundary conditions is caused by the yieldable-phase check incorrectly assuming that it is in the `body_filter` phase instead of the access phase. To fix this issue, we need to restore the context correctly even when the `ngx_http_lua_body_filter` call is triggered by the fd's epoll writable event.
Additional Information
- The issue is triggered by the `ngx_http_lua_wev_handler` function, which causes an output_filter (body_filter) call with an `in` of NULL to be made because `ctx->busy_bufs` is not empty in `ngx_http_lua_flush_pending_output`.
- The issue is reproducible when sending a large response with a slow downstream, resulting in data accumulating in the kernel TCP buffer or inside NGINX.
- The fix involves adding `ctx->context = old_context` at the relevant location in the `ngx_http_lua_bodyfilterby.c` file to restore the context correctly.
Related Issues
- #2411: Fix the issue of `ngx.print` and `ngx.flush` in the access phase under boundary conditions.
Code Changes
To fix this issue, the following code changes are proposed:
    // In ngx_http_lua_bodyfilterby.c
    static ngx_int_t
    ngx_http_lua_body_filter(ngx_http_request_t *r, ngx_chain_t *in)
    {
        // ...
        if (ctx->busy_bufs) {
            // ...
            ctx->context = old_context; // Add this line to restore the context correctly
            // ...
        }
        // ...
    }
Q: What is the issue with `ngx.print` and `ngx.flush` in the access phase?
A: The issue is that `ngx.print` and `ngx.flush` incorrectly assume that they are in the `body_filter` phase instead of the access phase, resulting in an error.
Q: What is the expected behavior?
A: The expected behavior is that there should be no error when using `ngx.print` and `ngx.flush` in the access phase.
Q: How can I reproduce this issue?
A: Follow the steps under Steps to Reproduce above: start OpenResty with the nginx.conf shown there, then access `127.0.0.1:9080`. The server sends part of the response and then closes the connection, and the error appears in error.log.
Q: What is the root cause of the issue?
A: The root cause of the issue is that the yieldable-phase check in `ngx.print` or `ngx.flush` incorrectly assumes that it is in the `body_filter` phase instead of the access phase.
Q: How can I fix this issue?
A: To fix this issue, you need to restore the context correctly even when the `ngx_http_lua_body_filter` call is triggered by the fd's epoll writable event. This can be done by adding `ctx->context = old_context` at the relevant location in the `ngx_http_lua_bodyfilterby.c` file.
Q: What is the fix for this issue?
A: The fix involves adding `ctx->context = old_context` at the relevant location in the `ngx_http_lua_bodyfilterby.c` file to restore the context correctly.
Q: What are the related issues?
A: The related issues are:
- #2411: Fix the issue of `ngx.print` and `ngx.flush` in the access phase under boundary conditions.
Q: What are the code changes required to fix this issue?
A: The code changes required to fix this issue are:
    // In ngx_http_lua_bodyfilterby.c
    static ngx_int_t
    ngx_http_lua_body_filter(ngx_http_request_t *r, ngx_chain_t *in)
    {
        // ...
        if (ctx->busy_bufs) {
            // ...
            ctx->context = old_context; // Add this line to restore the context correctly
            // ...
        }
        // ...
    }
Note that the above code changes are proposed as a fix for the issue, and the actual code changes may vary depending on the specific implementation and requirements.