
Azure App service sticky session #1302

Closed · BrennanConroy opened this issue Jun 14, 2021 · 13 comments

@BrennanConroy (Collaborator)


Issue moved from dotnet/aspnetcore#33440


From @Zquiz on Thursday, June 10, 2021 6:53:05 PM

Hello,
I hope someone can help, or just tell me that's how it works. We have moved our Blazor Server-side project from an on-premises hosted environment to Azure App Service.

If we try to use these settings:

- Autoscale when memory hits x% usage (works)
- Always On set to true
- ARR affinity off
- Web sockets ON

We followed this documentation:

https://docs.microsoft.com/en-us/aspnet/core/blazor/host-and-deploy/server?view=aspnetcore-5.0

We're using an Azure SignalR Service, as recommended by Microsoft for this. We added Microsoft.Azure.SignalR.ServerStickyMode.Required in our code, so it should be a sticky session.
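For reference, the wiring is roughly along these lines (a simplified sketch of our Startup.cs rather than the exact code; names follow the default Blazor Server template):

```csharp
using Microsoft.Azure.SignalR;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        services.AddRazorPages();
        services.AddServerSideBlazor();

        // Route the Blazor circuit traffic through Azure SignalR Service and
        // require sticky server mode, as the Blazor Server hosting docs describe.
        services.AddSignalR().AddAzureSignalR(options =>
        {
            options.ServerStickyMode = ServerStickyMode.Required;
        });
    }
}
```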

The issue we're seeing is that after 10-15 minutes a browser fan will disconnect if the user is not on that fan but working on another.

[screenshot]

But if we enable ARR affinity cookies, it seems to negotiate just fine with SignalR. However, we then lose the effect of moving people onto a new instance when autoscale creates one, so the load won't be evenly distributed.

Is it something we're doing wrong on our end, or is there some information we've missed about a failsafe built into SignalR?

Bonus info: on-premises we were using a sticky-session load balancer, and that worked fine most of the time.

Currently we auto-scale to 9 instances in the morning to even out the load, since the ARR cookie will let users stay on the same instance for the whole workday.

Documentation I have been following
https://docs.microsoft.com/en-us/aspnet/core/signalr/publish-to-azure-web-app?view=aspnetcore-5.0
https://docs.microsoft.com/en-us/aspnet/core/blazor/host-and-deploy/server?view=aspnetcore-5.0

@BrennanConroy (Collaborator, Author)


Issue moved from dotnet/aspnetcore#33440


From @BrennanConroy on Thursday, June 10, 2021 8:02:23 PM

What is a browser fan? What version of Blazor are you using?
Are you saying that if you enable ARR affinity cookies the issue doesn't happen after 10-15 minutes?

@BrennanConroy (Collaborator, Author)


Issue moved from dotnet/aspnetcore#33440


From @Zquiz on Thursday, June 10, 2021 8:09:11 PM

It happens in all browsers that the users have available, so that's Chrome, Edge, and Firefox, all fully updated.
We're using the .NET 5 version of Blazor and all packages are updated.

Edit:
Yes, with the ARR affinity cookie enabled the user will only disconnect if I restart the App Service.

@BrennanConroy (Collaborator, Author)


Issue moved from dotnet/aspnetcore#33440


From @BrennanConroy on Thursday, June 10, 2021 8:19:43 PM

Can you gather client-side logs to see why the client disconnected?

@BrennanConroy (Collaborator, Author)


Issue moved from dotnet/aspnetcore#33440


From @Zquiz on Friday, June 11, 2021 11:31:09 AM

Yeah, sure.
Just the logs from SignalR, or everything that happens up to when the user disconnects?

@BrennanConroy (Collaborator, Author)


Issue moved from dotnet/aspnetcore#33440


From @BrennanConroy on Friday, June 11, 2021 5:05:25 PM

Ideally SignalR only, but we can probably parse out the other logs if it's not too bad. Also, debug or trace logs please.
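For example, one way to bump the server-side SignalR categories up to Trace in a .NET 5 app (a sketch; the category names come from the ASP.NET Core SignalR logging docs, and browser-side circuit logs would need Blazor's JavaScript configuration instead):

```csharp
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

public class Program
{
    public static void Main(string[] args) => CreateHostBuilder(args).Build().Run();

    public static IHostBuilder CreateHostBuilder(string[] args) =>
        Host.CreateDefaultBuilder(args)
            .ConfigureLogging(logging =>
            {
                // Raise SignalR and transport logging to Trace so disconnect
                // reasons show up in the collected logs.
                logging.AddFilter("Microsoft.AspNetCore.SignalR", LogLevel.Trace);
                logging.AddFilter("Microsoft.AspNetCore.Http.Connections", LogLevel.Trace);
            })
            // Startup here is the standard template class (assumed).
            .ConfigureWebHostDefaults(webBuilder => webBuilder.UseStartup<Startup>());
}
```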

@BrennanConroy (Collaborator, Author)


Issue moved from dotnet/aspnetcore#33440


From @Zquiz on Monday, June 14, 2021 3:25:10 PM

Sorry for the late reply.
It's not that easy to collect a live log while in the middle of moving from on-premises to Azure and learning how to dump logs and so on.
I had to use the Live Trace tool in Azure SignalR Service:
[screenshots: SignalRTrace002, SignalRTrace001]
If there is a better way to get the trace out of App Service, please show us :)

One of the trace entries:

| Field | Value |
| --- | --- |
| Id | 33390606 |
| Message | Received message null for connection "AdTghTJ49Xd8eAVzP3nJZQ865a96eb1" which does not exist. |
| MessageTemplate | Received message {tracingId} for connection {TransportConnectionId} which does not exist. |
| Level | Warning |
| TimeStamp | 2021-06-14 16:21:32.020 |
| Exception | NULL |
| Properties | tracingId = (empty); TransportConnectionId = AdTghTJ49Xd8eAVzP3nJZQ865a96eb1; EventId = 10 (ReceivedMessageForNonExistentConnection); SourceContext = Microsoft.Azure.SignalR.ServiceConnection |

@JialinXin (Contributor)

@Zquiz when you're using Azure SignalR Service, the App Service side "ARR affinity" and "Web Sockets" settings are not required.

And if you're enabling auto-scale, then when the server scales down, the related clients will be affected and disconnected, so this is likely expected behavior. Please check whether these things were happening together when you saw the issue.

@Zquiz commented Jun 15, 2021

Thanks for the reply, and thanks for the clarification. Both in production and in our test environment it is happening after the auto-scale has happened.
Even if I set the scale-out to 2 manually in test and wait 15 minutes afterward, it still happens.

@Zquiz commented Jun 16, 2021

Silly question: could this behavior be caused by the health check in the App Service?

@JialinXin (Contributor)

It's not related to the App Service health check.

Client disconnects are by design if the related server connection is dropped, e.g. due to auto-scale or some server-side connectivity issue. Basically, with client auto-reconnect logic enabled, the impact on the client should be minor: clients will reconnect and be routed to a usable server connection.

If you do care about client disconnects, please disable auto-scale for stability.
And if you want graceful client migration logic, you can look into the GracefulShutdown options.
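A rough sketch of what that could look like (option names as in the Microsoft.Azure.SignalR SDK; treat it as an illustration rather than a drop-in config):

```csharp
using System;
using Microsoft.Azure.SignalR;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        services.AddSignalR().AddAzureSignalR(options =>
        {
            options.ServerStickyMode = ServerStickyMode.Required;

            // When this server instance shuts down (for example during a scale-in),
            // ask the service to migrate its clients to other server connections
            // instead of dropping them, and wait up to 30 seconds for that.
            options.GracefulShutdown.Mode = GracefulShutdownMode.MigrateClients;
            options.GracefulShutdown.Timeout = TimeSpan.FromSeconds(30);
        });
    }
}
```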

@Zquiz commented Jun 16, 2021

Okay. However, this behavior happens when no auto-scaling is happening but the instance count is higher than 1.
So I set this up on our test App Service to check whether it was due to the scale-down. But it wasn't.

Setup:

[screenshot] Waited 30 minutes to make sure both instances were on, and double-checked with our app, where I can see the instance name.

[screenshot] I can see the instance name in our app, and it changed with every F5.

[screenshot] The code to add SignalR with sticky session enabled, plus some other things to handle an issue with some document system.

[screenshot] The setup in App Service according to what the documentation said to do.

And I still see the disconnect when not on that browser tab after 10-15 minutes.

If, however, I just turn on ARR affinity, then it will stay connected until the user closes that tab. That's not ideal for us, since we have multiple instances between 6am and 8pm, when most of the users are online. So is it some default timeout setting, or is it just by design when using Azure SignalR as a backplane?

@JialinXin (Contributor) commented Jun 16, 2021

Those settings are for locally hosted SignalR. When using Azure SignalR Service, clients set up their WebSocket connection to Azure SignalR Service directly, so the ARR/Web Sockets settings are not in use.

Would you share your resourceId with me (jixin[at]microsoft.com) so I can take a further look?

@JialinXin (Contributor)

Synced offline: the issue does not apply to a project created from the default Blazor Server template and should be related to the customer's server-side logic, so I'm closing the issue.
