sql/tests: TestRandomSyntaxFunctions failed
Introduction
The CockroachDB team has encountered a failure of the TestRandomSyntaxFunctions test in the sql/tests suite. The failure was observed on the release-25.1.6-rc branch: the test timed out after 1 hour of execution and was aborted with a fatal error.
Error Details
The error message indicates that the test timed out after 1 hour, with the following stacktrace:
goroutine 5306095 [running]:
testing.(*M).startAlarm.func1()
GOROOT/src/testing/testing.go:2366 +0x385
created by time.goFunc
GOROOT/src/time/sleep.go:177 +0x2d
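This stack trace comes from the timeout alarm in Go's testing package (startAlarm), which fires when the test binary exceeds its time limit. It shows where the fatal error was raised, not where the test was stuck. The goroutines in the log preceding the fatal error are the better starting point.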
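To illustrate why the alarm's stack trace says nothing about the stalled code, here is a simplified model of a test timeout alarm. It is a sketch, not the source of the testing package: armAlarm and alarmFires are hypothetical names, and the only assumption is that the alarm is a timer whose callback runs in its own goroutine once the limit expires.

```go
package main

import (
	"fmt"
	"time"
)

// armAlarm is a hypothetical stand-in for a test binary's timeout alarm.
// It schedules onTimeout to run in its own goroutine after d, unless the
// returned timer is stopped first.
func armAlarm(d time.Duration, onTimeout func()) *time.Timer {
	return time.AfterFunc(d, onTimeout)
}

// alarmFires reports whether the alarm fires before the simulated work,
// which takes the given duration, has finished.
func alarmFires(timeout, work time.Duration) bool {
	fired := make(chan struct{})
	alarm := armAlarm(timeout, func() { close(fired) })
	defer alarm.Stop()

	select {
	case <-fired:
		// The callback ran on the timer's goroutine, so its stack contains
		// only timer frames, never the frames of the slow work.
		return true
	case <-time.After(work):
		return false
	}
}

func main() {
	fmt.Println(alarmFires(10*time.Millisecond, 200*time.Millisecond)) // prints "true"
	fmt.Println(alarmFires(200*time.Millisecond, 10*time.Millisecond)) // prints "false"
}
```

Because the callback runs on a separate goroutine, the stack at the point of the fatal error contains only the alarm itself, which matches the two-frame trace above.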
Log Preceding Fatal Error
The log preceding the fatal error provides additional context:
* goroutine 3139799 [chan send, 10 minutes]:
* github.com/cockroachdb/cockroach/pkg/sql.(*planner).fingerprintSpanFanout.(*planner).fingerprintSpanFanout.func1.func2({0x8a9b538, 0xc01f45ef50})
* pkg/sql/fingerprint_span.go:178 +0x3b9
* github.com/cockroachdb/cockroach/pkg/sql.(*planner).fingerprintSpanFanout.Group.GoCtx.func3()
* pkg/util/ctxgroup/ctxgroup.go:189 +0x8b
* golang.org/x/sync/errgroup.(*Group).Go.func1()
* external/org_golang_x_sync/errgroup/errgroup.go:78 +0x56
* created by golang.org/x/sync/errgroup.(*Group).Go in goroutine 3139751
* external/org_golang_x_sync/errgroup/errgroup.go:75 +0x96
*
* goroutine 2726155 [select]:
* github.com/cockroachdb/cockroach/pkg/util/admission.initWorkQueue.func2()
* pkg/util/admission/work_queue.go:424 +0x85
* created by github.com/cockroachdb/cockroach/pkg/util/admission.initWorkQueue in goroutine 2725791
* pkg/util/admission/work_queue.go:421 +0x3be
*
* goroutine 3657096 [chan send, 9 minutes]:
* github.com/cockroachdb/cockroach/pkg/sql.(*planner).fingerprintSpanFanout.(*planner).fingerprintSpanFanout.func1.func2({0x8a9b538, 0xc0232205f0})
* pkg/sql/fingerprint_span.go:178 +0x3b9
* github.com/cockroachdb/cockroach/pkg/sql.(*planner).fingerprintSpanFanout.Group.GoCtx.func3()
* pkg/util/ctxgroup/ctxgroup.go:189 +0x8b
* golang.org/x/sync/errgroup.(*Group).Go.func1()
* external/org_golang_x_sync/errgroup/errgroup.go:78 +0x56
* created by golang.org/x/sync/errgroup.(*Group).Go in goroutine 3656915
* external/org_golang_x_sync/errgroup/errgroup.go:75 +0x96
*
* goroutine 5174980 [semacquire, 2 minutes]:
* sync.runtime_Semacquire(0xcb444a0?)
* GOROOT/src/runtime/sema.go:62 +0x25
* sync.(*WaitGroup).Wait(0x8a3d7e0?)
* GOROOT/src/sync/waitgroup.go:116 +0x48
* github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*Server).serveImpl(0xc029f20a90, {0x8a9b538, 0xc02c6f0140}, 0xc01116e008, 0xc00c707f08, {0x0, 0x4, {0x1, {0x7190c4e, 0x3}, ...}, ...}, ...)
* pkg/sql/pgwire/server.go:1452 +0xa7d
* github.com/cockroachdb/cockroach/pkg/sql/pgwire.(*Server).ServeConn(0xc029f20a90, {0x8a9b538, 0xc02c6f0140}, {0x8aebb20, 0xc014544a88}, {0x0, 0x4, 0x0, 0xc00c707ea8, {{{{...}}, ...}, ...}})
* pkg/sql/pgwire/server.go:958 +0xb36
* github.com/cockroachdb/cockroach/pkg/server.(*systemServerWrapper).serveConn(0xc01654a6f8, {0x8a9cb88, 0xc006f3b680}, {0x8aebb20, 0xc014544a88}, {0x0, 0x4, 0x0, 0xc00c707ea8, {{{{...}}, ...}, ...}})
* pkg/server/server_controller_sql.go:170 +0xf7
* github.com/cockroachdb/cockroach/pkg/server.(*serverController).sqlMux(0xc0091b4000, {0x8a9cb88, 0xc006f3b680}, {0x8aebb20, 0xc014544a88}, {0x0, 0x4, 0x0, 0xc00c707ea8, {{{{...}}, ...}, ...}})
* pkg/server/server_controller_sql.go:90 +0x2f8
* github.com/cockroachdb/cockroach/pkg/server.startServeSQL.func1.1({0x8a9cb88, 0xc023059040}, {0x8aec290, 0xc0000fdc08})
* pkg/server/server_sql.go:1989 +0x30c
* github.com/cockroachdb/cockroach/pkg/util/netutil.(*TCPServer).ServeWith.func1({0x8a9cb88, 0xc023059040})
* pkg/util/netutil/net.go:186 +0x102
* github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2({0x71b96c0?, 0x0?})
* pkg/util/stop/stopper.go:498 +0x1f0
* created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx in goroutine 2727844
* pkg/util/stop/stopper.go:488 +0x47c
*
*
Help
For more information on how to investigate a Go test failure, please refer to the following resource:
Same Failure on Other Branches
This test failure has also been observed on other branches, including:
- #146421 sql/tests: TestRandomSyntaxFunctions failed [C-test-failure O-robot T-sql-foundations branch-release-25.1 release-blocker]
CC
@cockroachdb/sql-foundations
Related Issues
- Jira issue: CRDB-50510
Additional Resources
- This test on roachdash
Q&A: sql/tests: TestRandomSyntaxFunctions failed
Q: What is the cause of the TestRandomSyntaxFunctions test failure?
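A: The immediate cause is a timeout: the test ran for 1 hour, at which point the Go test timeout alarm aborted the test binary with a fatal error. The report does not establish the root cause. The log preceding the fatal error shows two goroutines blocked on a channel send in fingerprintSpanFanout (pkg/sql/fingerprint_span.go:178) for 9 and 10 minutes, and a pgwire connection goroutine waiting on a sync.WaitGroup for 2 minutes. That makes a stalled goroutine a more likely explanation than a problem in the testing package itself.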
Q: What is the significance of the stacktrace?
A: The stack trace in the error details shows only testing.(*M).startAlarm.func1(), the function that the testing package runs when the timeout expires. It tells us that the timeout fired, not what the test was doing at the time. The goroutine dump in the preceding log is more informative: it lists each goroutine's state (for example chan send or semacquire) and how long the goroutine has been blocked.
Q: What are the possible causes of the timeout?
A: The report does not pin down a cause. Common candidates for a Go test that runs until its timeout include:
- Infinite loops: The test may be stuck in a loop that never terminates, so it runs until it times out.
- Deadlocks or blocked goroutines: Two or more goroutines may be waiting on each other, or a goroutine may be blocked on a channel send that no receiver will ever accept. The chan send goroutines in the log are consistent with this.
- Resource leaks: A resource such as a connection or a goroutine may never be released, so the test slows down or waits on shutdown until it times out.
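The blocked-goroutine case can be sketched in a few lines. This example is not taken from fingerprint_span.go. The names sendOrGiveUp and sendGivesUp are hypothetical, and the example only assumes the general pattern of a producer sending on a channel whose consumer has stopped reading. It shows how selecting on the context alongside the send lets the producer exit instead of hanging.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// sendOrGiveUp is a hypothetical helper. It sends v on out, but returns the
// context's error instead of blocking forever if nobody receives.
func sendOrGiveUp(ctx context.Context, out chan<- int, v int) error {
	select {
	case out <- v:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// sendGivesUp sends on a channel that has no receiver and reports whether
// the send was abandoned once the timeout expired.
func sendGivesUp(timeout time.Duration) bool {
	// Unbuffered and never read: a bare `out <- 1` here would block forever
	// and appear as "[chan send, N minutes]" in a goroutine dump.
	out := make(chan int)

	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	return sendOrGiveUp(ctx, out, 1) != nil
}

func main() {
	fmt.Println(sendGivesUp(50 * time.Millisecond)) // prints "true"
}
```

Whether the real code path already guards its send this way cannot be determined from the log alone. The sketch only shows what the chan send state means and one common way to avoid it.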
Q: How can I investigate the issue further?
A: To investigate the issue further, you can:
- Review the code: Start with the frames where goroutines are blocked, such as pkg/sql/fingerprint_span.go:178, and check what is expected to receive from that channel.
- Use a debugger: Use a debugger to step through the test and see where it stalls.
- Collect more logs: Capture the full goroutine dump and any earlier errors from the run to see what else was blocked when the timeout fired.
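When a goroutine dump is long, it can help to filter it for goroutines that have been blocked for a while. The following is a minimal sketch under that assumption. The function name longBlocked and the blocked type are hypothetical, and the only thing relied on is the header format visible in the log, such as "goroutine 3139799 [chan send, 10 minutes]:".

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// header matches goroutine dump headers that carry a wait time, for example
// "goroutine 3139799 [chan send, 10 minutes]:". It is not anchored, so a
// log prefix such as "* " before the header is tolerated.
var header = regexp.MustCompile(`goroutine (\d+) \[([^,\]]+), (\d+) minutes\]:`)

// blocked describes one goroutine that has been waiting for some time.
type blocked struct {
	ID      int
	State   string
	Minutes int
}

// longBlocked returns the goroutines in dump that have been blocked for at
// least minMinutes. Headers without a wait time (e.g. "[select]") are skipped.
func longBlocked(dump string, minMinutes int) []blocked {
	var out []blocked
	for _, line := range strings.Split(dump, "\n") {
		m := header.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		id, _ := strconv.Atoi(m[1])   // digits only, guaranteed by the regexp
		mins, _ := strconv.Atoi(m[3]) // digits only, guaranteed by the regexp
		if mins >= minMinutes {
			out = append(out, blocked{ID: id, State: m[2], Minutes: mins})
		}
	}
	return out
}

func main() {
	// Headers copied from the log in this report.
	dump := `* goroutine 3139799 [chan send, 10 minutes]:
* goroutine 2726155 [select]:
* goroutine 3657096 [chan send, 9 minutes]:
* goroutine 5174980 [semacquire, 2 minutes]:`

	for _, g := range longBlocked(dump, 5) {
		fmt.Printf("goroutine %d blocked in %s for %d minutes\n", g.ID, g.State, g.Minutes)
	}
}
```

With a threshold of 5 minutes, the sample input reports the two chan send goroutines and skips the others.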
Q: What are the next steps to resolve the issue?
A: The next steps to resolve the issue are:
- Identify the root cause: Determine why the test did not finish within 1 hour, starting from the blocked goroutines in the log.
- Fix the issue: Change the test code, the test configuration, or the code under test, depending on where the root cause lies.
- Verify the fix: Verify that the fix has resolved the issue by re-running the test.
Q: How can I prevent similar issues in the future?
A: To prevent similar issues in the future, you can:
- Implement better testing practices: Implement better testing practices, such as bounding long-running steps with their own timeouts so that a stall fails quickly and close to its source.
- Use a code review process: Use a code review process to catch issues before they make it into production.
- Monitor test performance: Monitor test performance to catch issues before they become major problems.
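One way to catch a stall well before a global 1-hour timeout is to give each step its own time budget. The sketch below assumes a test can wrap each step in a function that accepts a context. The names runWithBudget, stuckStep and fastStep are hypothetical and do not describe how TestRandomSyntaxFunctions is actually structured.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runWithBudget runs step with a context that expires after budget and
// returns an error if the step has not finished by then.
func runWithBudget(budget time.Duration, step func(ctx context.Context) error) error {
	ctx, cancel := context.WithTimeout(context.Background(), budget)
	defer cancel()

	// Buffered so the step's goroutine can always deliver its result and
	// exit, even if we have already returned because of the deadline.
	done := make(chan error, 1)
	go func() { done <- step(ctx) }()

	select {
	case err := <-done:
		return err
	case <-ctx.Done():
		return fmt.Errorf("step exceeded its %v budget: %w", budget, ctx.Err())
	}
}

// stuckStep simulates a step that never finishes on its own. It returns
// only once its context is cancelled.
func stuckStep(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

// fastStep simulates a step that completes immediately.
func fastStep(ctx context.Context) error {
	return nil
}

func main() {
	fmt.Println(runWithBudget(50*time.Millisecond, stuckStep) != nil) // prints "true"
	fmt.Println(runWithBudget(time.Second, fastStep))                 // prints "<nil>"
}
```

A step that ignores its context keeps running in the background after runWithBudget returns, so this pattern reports a stall early but does not by itself stop the stalled work.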
Q: What resources are available to help me resolve the issue?
A: The following resources are available to help you resolve the issue:
- CockroachDB documentation: The CockroachDB documentation provides information on how to troubleshoot and resolve common issues.
- CockroachDB community: The CockroachDB community is a great resource for getting help and advice from other users and experts.
- CockroachDB support: CockroachDB support is available to help resolve issues and answer questions.