Merge remote-tracking branch 'origin/topic/awelzel/fix-flaky-terminate-while-queueing'

* origin/topic/awelzel/fix-flaky-terminate-while-queueing:
  cluster/websocket: Stop and wait for reply thread during Terminate()
Arne Welzel 2025-05-07 13:21:39 +02:00
commit 135acc7c6d
4 changed files with 19 additions and 2 deletions

CHANGES

@@ -1,3 +1,15 @@
+8.0.0-dev.50 | 2025-05-07 13:21:39 +0200
+  * cluster/websocket: Stop and wait for reply thread during Terminate() (Arne Welzel, Corelight)
+    The terminate-while-queueing test added for #4428 failed spuriously
+    indicating that sometimes WebSocket clients receive code 1000 instead of 1001.
+    This happens if the ixwebsocket server is shutdown before the reply thread had a
+    chance to process queued close messages.
+    Fix by signaling and waiting for the dispatcher's reply thread to terminate
+    before returning from Terminate().
 8.0.0-dev.46 | 2025-05-06 14:20:54 +0200
   * testing/btest: Fix double commented @TEST- lines (Arne Welzel, Corelight)

@@ -1 +1 @@
-8.0.0-dev.46
+8.0.0-dev.50

@@ -249,6 +249,11 @@ void WebSocketEventDispatcher::Terminate() {
     clients.clear();
     onloop->Close();
+    // Wait for the reply_msg_thread to process any outstanding
+    // WebSocketReply messages before returning.
+    reply_msg_thread->SignalStop();
+    reply_msg_thread->WaitForStop();
 }
 void WebSocketEventDispatcher::QueueForProcessing(WebSocketEvent&& event) {
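
The stop-and-wait ordering in this hunk can be illustrated with a minimal Python sketch. It is not the Zeek implementation: `ReplyThread`, `queue_reply`, `signal_stop`, `wait_for_stop`, and `terminate` are hypothetical names, and the sketch assumes the stop signal is queued behind any pending replies so that they are delivered first.

```python
import queue
import threading


class ReplyThread:
    """Hypothetical stand-in for the dispatcher's reply thread."""

    _STOP = object()  # sentinel marking the end of the work queue

    def __init__(self):
        self.sent = []  # replies that were actually delivered
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def queue_reply(self, reply):
        self._queue.put(reply)

    def _run(self):
        while True:
            reply = self._queue.get()
            if reply is self._STOP:
                return
            self.sent.append(reply)

    def signal_stop(self):
        # The sentinel is queued behind pending replies (an assumption
        # of this sketch), so those replies are processed before exit.
        self._queue.put(self._STOP)

    def wait_for_stop(self):
        self._thread.join()


def terminate(reply_thread):
    """Queue a close reply, then stop and wait before tearing down."""
    reply_thread.queue_reply("close 1001")
    reply_thread.signal_stop()
    reply_thread.wait_for_stop()
    # Only after the join is it safe to shut down the server itself.
    return list(reply_thread.sent)
```

Without the `wait_for_stop()` call, `terminate()` could return while the close reply is still sitting in the queue, which mirrors the race described in the CHANGES entry.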

@@ -89,7 +89,7 @@ def run(ws_url):
         tc.send_json(wstest.build_event_v1("/test/pings/", "ping", [f"tc{idx}", i]))
     except websockets.exceptions.ConnectionClosedOK as e:
         print("connection closed ok")
-        assert e.code == 1001 # Remote going away
+        assert e.code == 1001, f"expected code 1001, got {e.code} - {e}" # Remote going away
         i -= 1
         saw_closed_ok.add(idx)
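
The effect of the added assertion message can be seen in a small standalone sketch. `ClosedOK` is a hypothetical stand-in for `websockets.exceptions.ConnectionClosedOK`, and `check_close_code` and `failure_message` are illustrative helpers that are not part of the test.

```python
class ClosedOK(Exception):
    """Hypothetical stand-in for websockets.exceptions.ConnectionClosedOK."""

    def __init__(self, code, reason):
        super().__init__(f"received {code} ({reason})")
        self.code = code


def check_close_code(e, expected=1001):
    # Same shape as the assertion in the hunk above: the message
    # reports the code that was actually received.
    assert e.code == expected, f"expected code {expected}, got {e.code} - {e}"


def failure_message(e):
    """Return the assertion message for a close event, or None if it passes."""
    try:
        check_close_code(e)
    except AssertionError as err:
        return str(err)
    return None
```

With a bare `assert e.code == 1001`, a spurious failure shows only the failing expression. With the message, the log shows which code arrived, which is what distinguishes 1000 (normal closure) from 1001 (going away).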