Part I - Troubleshooting 4xx Errors
Debugging and Troubleshooting Overview
API Management (APIM) is a proxy that forwards requests from the client to the destination API service. It can modify or process a request, based on inputs from the client, before the request reaches the destination. In an ideal scenario, APIs configured within an APIM service return successful responses (mostly 200 OK) along with the data expected from the API.
When a call fails, you would normally see an error response code along with a precise error message describing what went wrong during the API call.
However, there may be scenarios where you may observe API requests failing with generic 4xx or 5xx errors without a detailed error message, and it could be difficult to narrow down or isolate the source of the error.
In such cases, the first step is to isolate whether the error code is thrown by APIM or by the backend configured in APIM. This matters because most error codes are generated by the backend, and APIM, being a proxy, forwards the response (including the error code) back to the user who initiated the request. This can make the user think that the error code was thrown by APIM.
Troubleshooting Azure APIM Failed Requests
Let's suppose you have initiated an API request to your APIM service and the request eventually fails with an “HTTP 500 – Internal Server Error” message.
With generic error messages such as above, it becomes very difficult to isolate the cause or the source of the failed API request since there are several internal and external components that participate during an API invocation process.
Inspector Trace
If the issue is reproducible on demand, then your best option would be to enable tracing for your APIM API requests. Azure APIM services have the option of enabling the “Ocp-Apim-Trace” for your API requests. This generates a descriptive trace containing detailed information that helps you inspect the request processing step-by-step in detail and gives you a head-start on the source of the error.
Reference: https://docs.microsoft.com/en-us/azure/api-management/api-management-howto-api-inspector
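As a minimal sketch of the tracing step described above, the Python below builds a request that carries the Ocp-Apim-Trace header and reads the trace location from the response headers. The function names and the example URL are hypothetical, and the sketch assumes that the subscription key in use is allowed to trace and that the gateway reports the trace in an Ocp-Apim-Trace-Location response header; check the referenced article for the behaviour of your service tier.

```python
import urllib.request
from typing import Mapping, Optional


def build_traced_request(url: str, subscription_key: str) -> urllib.request.Request:
    """Build a GET request that asks the APIM gateway to record a trace."""
    request = urllib.request.Request(url, method="GET")
    request.add_header("Ocp-Apim-Subscription-Key", subscription_key)
    # Tracing is opt-in per request.
    request.add_header("Ocp-Apim-Trace", "true")
    return request


def trace_location(response_headers: Mapping[str, str]) -> Optional[str]:
    """Return the trace URL from the response headers, if the gateway sent one."""
    for name, value in response_headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "ocp-apim-trace-location":
            return value
    return None


# Hypothetical usage (performs a real network call, so it is left commented out):
# request = build_traced_request("https://example.azure-api.net/echo/resource", "<your-key>")
# with urllib.request.urlopen(request) as response:
#     print(trace_location(dict(response.headers)))
```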
Diagnostic Logging to Azure Monitor Log Analytics
You could also enable diagnostic logging for your APIM services. Diagnostic Logs can be archived to a storage account, streamed to an Event Hub resource, or be sent to Azure Monitor Log Analytics logs which could be further queried as per the scenario and requirement.
These logs provide rich information about operations and errors that are important for auditing as well as troubleshooting purposes. The best part about the diagnostic logs is that they provide you with granular level per-request logs for each of your API requests and assist you with further troubleshooting.
Reference Article: https://docs.microsoft.com/en-us/azure/api-management/api-management-howto-use-azure-monitor#resourc...
While storage accounts and event hubs work as single targeted destinations for diagnostic log collection/streaming, if you choose to enable APIM diagnostic settings with a Log Analytics Workspace as the destination, you are offered two modes of resource log collection: Azure diagnostics, which writes to the shared AzureDiagnostics table, and Resource specific, which writes to a dedicated table per log category.
Reference Article: https://docs.microsoft.com/en-us/azure/azure-monitor/platform/resource-logs#send-to-log-analytics-wo...
If you want the resource logs to be sent to the ApiManagementGatewayLogs table, choose the ‘Resource specific’ option in the diagnostic setting.
The diagnostic logs generated in the Log Analytics Workspace provide granular details for your API requests, such as the timestamp, request status, API/operation ID, time taken values, caller/client IP, method, URL invoked, backend URL invoked, response code, backend response code, request size, response size, error source, error reason, error message, et cetera.
NOTE: Post initial configuration, it may take a couple of hours for the diagnostic logs to be streamed to the destination by the resource provider.
Depending on your mode of log collection, here are a few sample queries that could be used for querying the logs pertaining to diagnostic data for your API requests. You can also choose to filter through the logs by fine-tuning the query to retrieve data specific to an API ID or specific to a response code, et cetera.
Navigate to Azure Portal → APIM service → Logs blade under the “Diagnostic Settings” section to execute the queries.
Azure diagnostics mode:
AzureDiagnostics
| where TimeGenerated > ago(24h)
| where _ResourceId contains "apim-service-name"
| limit 100
Resource specific mode:
ApiManagementGatewayLogs
| where TimeGenerated > ago(24h)
| limit 100
Log to Application Insights
Another option is to integrate APIM service with Application Insights for generating diagnostic log data.
Integration of APIM with Application Insights - https://docs.microsoft.com/en-us/azure/api-management/api-management-howto-app-insights
Below is a sample query against the “requests” table that retrieves the diagnostic data for Azure APIM API requests.
Navigate to the respective Application Insights resource → click Logs under the “Monitoring” section.
requests
| where timestamp > ago(24h)
| limit 100
Alternatively, the error handling in APIM can be carried out using the API management error handling policy - https://docs.microsoft.com/en-us/azure/api-management/api-management-error-handling-policies
Now that we have enabled diagnostic logs in order to retrieve details about the different types of errors and error messages for failed API requests, let’s walk through a couple of commonly observed 4xx and 5xx errors with APIM services.
This troubleshooting series focuses on 4xx and 5xx errors; this part covers the 4xx errors.
Troubleshooting 4xx and 5xx errors with APIM services
The very first pivotal step with troubleshooting failed API requests is to investigate the source of the response code that is being returned.
If you have enabled diagnostic logging for your APIM service, then the columns “ResponseCode” and “BackendResponseCode” would divulge this primary information.
If the 4xx or 5xx response returned to the client originates from the backend API (review the “BackendResponseCode” column), then the issue usually has to be troubleshot from the backend perspective, since the APIM service simply forwards the same response back to the client without contributing to the issue.
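The comparison of the two columns can be sketched as a small helper. This is an illustrative Python function, not part of APIM; the function name is hypothetical, and it assumes that “BackendResponseCode” is empty or 0 when the gateway never received a response from the backend.

```python
from typing import Optional


def classify_error_source(response_code: int, backend_response_code: Optional[int]) -> str:
    """Guess where a failed request originated from the two log columns.

    Returns "none" for successful calls, "backend" when the backend itself
    answered with an error, and "apim" when the gateway produced the error.
    """
    if response_code < 400:
        return "none"
    # Assumption: no backend code means the request failed inside the gateway
    # (for example in an inbound policy) or the backend never responded.
    if not backend_response_code:
        return "apim"
    if backend_response_code >= 400:
        return "backend"
    # The backend succeeded but the client still received an error,
    # so something in the gateway (for example an outbound policy) changed it.
    return "apim"


print(classify_error_source(500, 500))   # backend
print(classify_error_source(401, None))  # apim
```

Rows classified as "backend" are the ones to take up with the backend owners first.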
4xx Errors:
Error code: 400
Scenario 1
Symptoms:
The API Management has been working fine during its implementation. It is now throwing a ‘400 Bad Request’ when invoked using the ‘Test’ option under the API Management in Azure portal. While accessing it using a client app or application, the desired result is yielded.
Troubleshooting:
From the above scenario, we understand that the API throws a ‘400 Bad Request’ only when invoked from the ‘Test’ option under API Management in the Azure portal, while the other methods of invoking it yield results.
The error message clearly states that the endpoint could not be resolved. If the endpoint itself were at fault, the issue would occur across all methods of invoking the API. Since that is not the case here, let us verify the endpoint. You can either try to resolve the endpoint from the same machine using the command prompt or try a ping test.
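The name-resolution check can also be scripted. The sketch below uses only Python's standard library; the function name and the example hostname are hypothetical. Run it from the machine where the call fails, because the result depends on the DNS servers that machine uses.

```python
import socket


def can_resolve(hostname: str, port: int = 443) -> bool:
    """Return True if the hostname resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(hostname, port)) > 0
    except socket.gaierror:
        # Raised when the resolver cannot find the name.
        return False


# Hypothetical usage: replace the hostname with your gateway URL's host.
# print(can_resolve("example.azure-api.net"))
```

If the gateway hostname resolves from a machine inside the virtual network but not from a machine on the public internet, that points to a name that is not registered in public DNS.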
Resolution:
In this kind of scenario, it is always recommended to check whether the API Management service is deployed within a Virtual Network and, if so, whether it is configured in internal mode.
As per the official documentation, “The Test console available on the Azure Portal will not work for Internal VNET deployed service, as the Gateway Url is not registered on the Public DNS. You should instead use the Test Console provided on the Developer portal.”
Scenario 2
Symptoms:
While invoking the API present under the API Management, we encounter ‘Error: The remote server returned an error: (400) Invalid client certificate’.
Troubleshooting:
Let us analyze the scenario,
This issue occurs when mutual client certificate authentication has been implemented. In this case, the client must pass a valid certificate that satisfies the condition written in the policy:
<policies>
    <inbound>
        <base />
        <choose>
            <when condition="@(context.Request.Certificate == null || !context.Request.Certificate.Verify() || !context.Request.Certificate.Issuer.Contains("*.azure-api.net") || !context.Request.Certificate.SubjectName.Name.Contains("*.azure-api.net") || context.Request.Certificate.Thumbprint != "4BB206E17EE41820B36112FD76CAE3E0F7104F36")">
                <return-response>
                    <set-status code="403" reason="Invalid client certificate" />
                </return-response>
            </when>
        </choose>
    </inbound>
    <backend>
        <base />
    </backend>
    <outbound>
        <base />
    </outbound>
    <on-error>
        <base />
    </on-error>
</policies>
To check whether the certificate is passed or not, we can enable Ocp-Apim-Trace. In this case, the trace shows that no client certificate was received.
Resolution:
The issue is resolved once the client sends a valid client certificate with the request.
Similar Scenarios:
Scenario 3
Error Reason: OperationNotFound
Error message: Unable to match incoming request to an operation.
Error Section: Backend
Resolution:
Make sure that the operation which is invoked for the API is configured or present in the API Management. If not, add the operation or modify the request accordingly.
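To illustrate what "unable to match incoming request to an operation" means, here is a simplified Python sketch that matches an HTTP method and path against a list of operation URL templates. It is not APIM's actual matching algorithm; the function names and the URL templates are hypothetical, and only the operation name GetSpeakers is borrowed from this article.

```python
import re
from typing import List, Optional, Tuple


def template_to_regex(url_template: str) -> "re.Pattern[str]":
    """Turn a template such as /speakers/{id} into a regular expression."""
    parts = re.split(r"(\{[^/{}]+\})", url_template)
    # Each {parameter} matches exactly one path segment; literals match as-is.
    pattern = "".join("[^/]+" if p.startswith("{") else re.escape(p) for p in parts)
    return re.compile("^" + pattern + "/?$")


def find_operation(operations: List[Tuple[str, str, str]], method: str, path: str) -> Optional[str]:
    """Return the name of the first operation matching method and path, else None."""
    for name, op_method, url_template in operations:
        if op_method.upper() == method.upper() and template_to_regex(url_template).match(path):
            return name
    return None


# Hypothetical operation list for illustration only.
operations = [
    ("GetSpeakers", "GET", "/speakers"),
    ("GetSpeaker", "GET", "/speakers/{id}"),
]
print(find_operation(operations, "GET", "/speakers/42"))  # GetSpeaker
print(find_operation(operations, "POST", "/speakers"))    # None
```

A result of None corresponds to the OperationNotFound case: either the method or the path of the request does not line up with any configured operation.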
Scenario 4
Error Reason: ExpressionValueEvaluationFailure
Error message: Expression evaluation failed. EXPECTED400: URL cannot contain query parameters. Provide root site url of your project site (Example: https://sampletenant.sharepoint.com/teams/sampleteam )
Error Section: inbound
Resolution:
Ensure that the URL contains only the query parameter defined in the API according to the configuration in the API Management. Any mismatch might lead to such error messages. For example, if the expected input value is integer and we supply a string, this scenario might lead to the error.
Error code: 401 - Unauthorized issues
Scenario 1
Symptoms: The Echo API suddenly started throwing HTTP 401 - Unauthorized error while invoking the operations under it.
Message:
HTTP/1.1 401 Unauthorized
{
"statusCode": 401,
"message": "Access denied due to missing subscription key. Make sure to include subscription key when making requests to an API."
}
or
{
"statusCode": 401,
"message": "Access denied due to invalid subscription key. Make sure to provide a valid key for an active subscription."
}
You may get the following error message:
HTTP/1.1 401 Unauthorized
Content-Length: 152
Content-Type: application/json
Date: Sun, 29 Jul 2018 14:29:50 GMT
Vary: Origin
WWW-Authenticate: AzureApiManagementKey realm="https://pratyay.azure-api.net/echo",name="Ocp-Apim-Subscription-Key",type="header"

{
"statusCode": 401,
"message": "Access denied due to missing subscription key. Make sure to include subscription key when making requests to an API."
}
Resolution:
Developers must first subscribe to a product to get access to the API. When they subscribe, they get a subscription key that is good for any API in that product. Pass that key with every request, either in the Ocp-Apim-Subscription-Key header or in the subscription-key query parameter. If you created the APIM instance, you are an administrator already, so you are subscribed to every product by default.
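The two 401 messages above call for different fixes, so it can help to tell them apart programmatically. The following Python sketch parses the response body and reports which case applies; the function name and the returned labels are hypothetical, and the matching relies on the message wording quoted in this article.

```python
import json


def diagnose_subscription_401(body: str) -> str:
    """Classify an APIM 401 response body as 'missing-key', 'invalid-key' or 'unknown'."""
    try:
        payload = json.loads(body)
    except ValueError:
        return "unknown"
    if not isinstance(payload, dict) or payload.get("statusCode") != 401:
        return "unknown"
    message = str(payload.get("message", "")).lower()
    if "missing subscription key" in message:
        # The request carried no key at all: add the header or query parameter.
        return "missing-key"
    if "invalid subscription key" in message:
        # A key was sent, but it does not belong to an active subscription.
        return "invalid-key"
    return "unknown"


body = '{ "statusCode": 401, "message": "Access denied due to missing subscription key. Make sure to include subscription key when making requests to an API." }'
print(diagnose_subscription_401(body))  # missing-key
```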
Scenario 2
Symptoms:
The Echo API has OAuth 2.0 user authorization enabled in the Developer Console. Before calling the API, the Developer Console obtains an access token on behalf of the user and passes it in the Authorization header of the request. The call nevertheless fails with HTTP 401 - Unauthorized.
Policy:
<inbound>
    <base />
    <validate-jwt header-name="Authorization" failed-validation-httpcode="401" failed-validation-error-message="Unauthorized. Access token is missing or invalid.">
        <openid-config url="https://login.microsoftonline.com/common/v2.0/.well-known/openid-configuration" />
        <required-claims>
            <claim name="aud">
                <value>bf795850-70c6-4f22- </value>
            </claim>
        </required-claims>
    </validate-jwt>
</inbound>
Resolution:
The value of the aud claim in the required-claims section does not match the application registered in Azure Active Directory (AAD).
Provide the Application ID of the registered client app as the claim value to fix the authorization error.
After providing the valid Application ID, the request returns HTTP/1.1 200 OK.
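To see which audience a token actually carries before editing the policy, you can decode its payload locally. The Python sketch below is for debugging only: it does not verify the signature, and the function names are hypothetical. It assumes a standard three-part JWT (header.payload.signature).

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Decode the payload of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload = parts[1]
    # base64url encoding in JWTs omits padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def audience_matches(token: str, expected_audience: str) -> bool:
    """Check whether the token's aud claim contains the expected value."""
    aud = decode_jwt_claims(token).get("aud")
    # The aud claim may be a single string or a list of strings.
    audiences = aud if isinstance(aud, list) else [aud]
    return expected_audience in audiences
```

Compare the decoded aud value with the value configured in the required-claims section of the validate-jwt policy; a mismatch reproduces the 401 described above.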
Error code: 403 - Forbidden issues
Symptoms:
The GetSpeakers API operation fetches the details of speakers based on the value provided in the parameter. After a few days of use, the operation started throwing an HTTP 403 - Forbidden error, whereas the other operations work as expected.
Message:
HTTP/1.1 403 Forbidden
{
"statusCode": 403,
"message": "Forbidden"
}
We notice an “ip-filter” policy that filters (allows/denies) calls from specific IP addresses or address ranges.
<inbound>
    <base />
    <choose>
        <when condition="@(context.Operation.Name.Equals("GetSpeakers"))">
            <ip-filter action="allow">
                <address-range from="13.66.140.128" to="13.66.140.143" />
            </ip-filter>
        </when>
    </choose>
</inbound>
Resolution:
An HTTP 403 - Forbidden error can be thrown when an access restriction policy is in place.
Because the caller's IP address is not in the allowed range of the policy, we need to allow that IP address in the policy to make the call work.
Before:
<ip-filter action="allow">
<address-range from="13.66.140.128" to="13.66.140.143" />
</ip-filter>
After:
<ip-filter action="allow">
<address>13.91.254.72</address>
<address-range from="13.66.140.128" to="13.66.140.143" />
</ip-filter>
Once we allow the IP address in the IP-Filter Policy we would be able to receive the response.
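The effect of the before and after policies can be reproduced with a short Python sketch that checks a caller IP against single addresses and address ranges. This only mirrors the allow logic for illustration and is not how APIM evaluates the policy internally; the function name is hypothetical, while the addresses are the ones used in the policy above.

```python
import ipaddress
from typing import Iterable, Tuple


def is_allowed(caller_ip: str, addresses: Iterable[str], ranges: Iterable[Tuple[str, str]]) -> bool:
    """Return True if caller_ip equals an allowed address or falls inside an allowed range."""
    ip = ipaddress.ip_address(caller_ip)
    if any(ip == ipaddress.ip_address(address) for address in addresses):
        return True
    # Ranges are inclusive on both ends, like from/to in the policy.
    return any(
        ipaddress.ip_address(start) <= ip <= ipaddress.ip_address(end)
        for start, end in ranges
    )


allowed_range = [("13.66.140.128", "13.66.140.143")]
print(is_allowed("13.91.254.72", [], allowed_range))                # False (before)
print(is_allowed("13.91.254.72", ["13.91.254.72"], allowed_range))  # True (after)
```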
Error code: 404
Symptoms:
The Demo API is being invoked by any of the means below:
- Developer portal
- ‘Test’ option under API Management
- Client app like Postman
- Using user code
The result of the call is a 404 Not Found error code.
Troubleshooting:
Confirm that the issue is reproducible before proceeding with the troubleshooting steps.
Note: The API Management is not present in any Virtual Network which eliminates the option of Network elements causing the issue.
According to the API Management configuration, below are the settings:
Name of the API – Demo API
Web Service URL - http://echoapi.cloudapp.net/api
Subscription Required – Yes
Below is the error scenario for the 404 error code using API Management and Postman.
Postman:
API Management portal:
Based on the trace file, we can see that the error code is thrown from the forward-request section, but we do not obtain much insight from it.
The configured web service URL is also reachable and displays visible content.
Web Service URL:
Hence, we proceed on collecting the browser trace while replicating the issue in the API Management section in Azure portal.
Steps to collect browser trace:
- Replicate the issue in the browser (Chrome; steps for other browsers might differ slightly)
- Press F12 and navigate to the network tab.
- Make sure that the actions are recorded.
- Right click on any one of the actions and select the last option (Save all as HAR with content).
From the trace, we could see the below information, which is shown in preview.
The requested URL does not lead to proper content under the mentioned Web Service URL. This is why, although the Web Service URL is reachable, the API still threw a 404 Not Found error when it was invoked.
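The difference between "the Web Service URL is reachable" and "the requested URL exists" can be sketched in Python. The helper below composes the URL that the gateway would forward to and probes it directly. It assumes that no policy (such as a URL rewrite or a backend override) changes the forwarded URL; the function names and the operation path /resource are hypothetical.

```python
import urllib.error
import urllib.request


def backend_url(web_service_url: str, operation_path: str) -> str:
    """Join the configured Web Service URL with the operation's path."""
    return web_service_url.rstrip("/") + "/" + operation_path.lstrip("/")


def probe_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the URL answers with (performs a network call)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses are raised as exceptions; the code is what we want.
        return error.code


print(backend_url("http://echoapi.cloudapp.net/api", "/resource"))
# http://echoapi.cloudapp.net/api/resource
```

Probing both the base URL and the composed URL with probe_status shows whether the 404 comes from the path appended to a reachable base.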
Resolution:
Make sure that the Web Service URL leads to a valid destination. The best approach is to first create a proper backend structure that hosts the APIs and then map it to the respective API in API Management, and not vice versa.
The following pointers are the main reasons for encountering a 404 Not Found error from API Management.
In our case, the error corresponds to the second point, where the configured URL is not pointing to the destination. The browser trace confirms this, and hence correcting the URL/path will resolve the issue.