Issues with WSO2 API Manager 3.2.0 high availability endpoints - wso2-api-manager

We have configured high availability for API endpoints as described here: https://apim.docs.wso2.com/en/3.2.0/learn/design-api/endpoints/high-availability-for-endpoints/#configuring-load-balancing-endpoints
Under Load balance and Failover Configurations we chose "Load Balanced" as the "EndpointType". Requests are routed to the load-balanced endpoints successfully. However, when we stop any one of the endpoint nodes, 2 requests are still routed to the stopped node before the remaining requests are routed to the active node. This happens again and again as new requests arrive. The failed endpoint is not marked as inactive or down.
The error response is:
{"fault":{"code":101503,"type":"Status report","message":"Runtime Error","description":"Error connecting to the back end"}}
The entries from carbon logs are attached below
TID: [-1234] [] [2022-12-14 09:51:23,307] WARN {org.wso2.carbon.apimgt.gateway.handlers.throttling.ThrottleHandler} - Error while getting throttling information for resource and http verb
TID: [-1] [] [2022-12-14 09:51:23,308] WARN {org.apache.synapse.transport.passthru.ConnectCallback} - Connection refused or failed for : /100.66.2.32:7010
TID: [-1234] [] [2022-12-14 09:51:23,309] WARN {org.apache.synapse.endpoints.EndpointContext} - Endpoint : NewMCMInboundChannel-RESTAPIService--vv2_APIproductionEndpoint_1 with address http://100.66.2.32:7010/mcm-provider will be marked SUSPENDED as it failed
TID: [-1234] [] [2022-12-14 09:51:23,309] WARN {org.apache.synapse.endpoints.EndpointContext} - Suspending endpoint : NewMCMInboundChannel-RESTAPIService--vv2_APIproductionEndpoint_1 with address http://100.66.2.32:7010/mcm-provider - current suspend duration is : 30000ms - Next retry after : Wed Dec 14 09:51:53 UTC 2022
TID: [-1234] [] [2022-12-14 09:51:23,310] WARN {org.apache.synapse.endpoints.LoadbalanceEndpoint} - Endpoint [NewMCMInboundChannel-RESTAPIService--vv2_APIproductionEndpoint] Detect a Failure in a child endpoint : Endpoint [NewMCMInboundChannel-RESTAPIService--vv2_APIproductionEndpoint_1]
TID: [-1234] [] [2022-12-14 09:51:23,310] INFO {org.apache.synapse.mediators.builtin.LogMediator} - {api:admin--NewMCMInboundChannel-RESTAPIService:vv2} STATUS = Executing default 'fault' sequence, ERROR_CODE = 101503, ERROR_MESSAGE = Error connecting to the back end

What you are experiencing is the default behavior of endpoint suspension. Any endpoint created with API Manager can be in one of 3 states:
Active
Timeout
Suspended
Since you have configured two endpoints in a load-balanced manner, both endpoints are initially in the active state and share the load. Once endpoint 2 is stopped, the next request routed to it fails with an error code, and that moves the endpoint from the active state to the suspended state.
There are four configurations you can set for the suspended state:
Error code (The error code which puts the endpoint from active to suspended state)
Initial suspension duration.
Maximum duration
Progression Factor
In the default configuration, the initial suspension duration is 30 seconds. This means the server lifts the suspended state 30 seconds after the endpoint failure and puts the endpoint back into the active state. That is why you see the endpoint becoming active again from time to time, and why requests reach the stopped node again after each suspension period. This is expected, as the server is trying to determine whether the endpoint is available again.
You can increase the suspension time through these configurations. The suspension time is calculated from the other 3 configurations:
Endpoint suspension time = Min(current suspension duration * progressionFactor, maximumDuration)
With each failed attempt, the progression factor will increase the suspension duration until the maximum time. This will reset once the endpoint has served at least one successful request.
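To make the formula concrete, here is a minimal Python sketch of how the suspension duration would grow across consecutive failures. It only illustrates the Min(...) formula above and is not the Synapse implementation. The function name and the example values for the progression factor (2) and maximum duration (120000 ms) are assumptions chosen for illustration; 30000 ms is the default initial duration mentioned above.

```python
def suspension_durations(initial_ms, progression_factor, maximum_ms, failures):
    """Return the suspension duration applied after each consecutive failure.

    Illustrative only: applies
    next = min(current * progression_factor, maximum_ms)
    starting from the initial suspension duration.
    """
    durations = []
    current = min(initial_ms, maximum_ms)
    for _ in range(failures):
        durations.append(current)
        # Grow the duration for the next failure, capped at the maximum.
        current = min(int(current * progression_factor), maximum_ms)
    return durations


# Assumed example values: factor 2, maximum 2 minutes, 5 failures in a row.
print(suspension_durations(30000, 2, 120000, 5))
# [30000, 60000, 120000, 120000, 120000]
```

With a progression factor of 1, the duration stays at the initial value for every failure, so a permanently stopped node would be retried at the same interval each time.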
You can configure all of these in the Publisher UI, in the endpoints section.
More information on endpoint suspension can be found here [1].
[1] https://docs.wso2.com/display/EI660/Endpoint+Error+Handling

Related

Enable second wso2 Publisher/Store instance

I have had a WSO2 API Manager 2.6.0 setup with 2 instances running for a while.
Instance one - Traffic Manager, Store, Publisher, Key Manager
Instance two - gateway
For some reason I need to start using the Store and Publisher on the gateway node for one particular case. According to the manuals, there should not be any issue with that, since I haven't done any optimization or profile-specific startup. It is just like an HA setup, but without heavy traffic on the second node's Store/Publisher.
The issue is that when I open https://Instance two:9444/store (offset +1 on that server), I see only 2 of the 7 APIs published on Instance one.
The DB datasources are set in the config, and the synapse files are on the same server. In the Instance two startup logs I see that the APIs are initialized.
Any ideas?
TID: [-1234] [] [2019-08-20 09:53:56,759] INFO {org.wso2.carbon.mediation.dependency.mgt.DependencyTracker} - Endpoint : URApiProxy--v2.0.0_APIproductionEndpoint was added to the Synapse configuration successfully {org.wso2.carbon.mediation.dependency.mgt.DependencyTracker}
TID: [-1234] [] [2019-08-20 09:53:56,760] INFO {org.wso2.carbon.mediation.dependency.mgt.DependencyTracker} - Endpoint : URApiProxy--v2.0.0_APIsandboxEndpoint was added to the Synapse configuration successfully {org.wso2.carbon.mediation.dependency.mgt.DependencyTracker}
further
TID: [-1234] [] [2019-08-20 09:53:57,528] INFO {org.wso2.carbon.mediation.dependency.mgt.DependencyTracker} - API : admin--URApiProxy:v2.0.0 was added to the Synapse configuration successfully {org.wso2.carbon.mediation.dependency.mgt.DependencyTracker}
further
TID: [-1234] [] [2019-08-20 09:54:20,692] INFO {org.apache.synapse.rest.API} - Initializing API: admin--URApiProxy:v2.0.0 {org.apache.synapse.rest.API}
Make sure registry.xml on instance 2 has the same configuration as on instance 1. Optionally, you can try reindexing the server. Please refer to "WSO2 API Manager issues with solr".

AWS SES receipt rules for something#example.com not being triggered as expected

EDIT: My client has returned with an error message (2 day timeout) `Delivery attempt history for your mail:
Sun, 02 Sep 2018 21:20:35 +0000 (GMT)
TCP active open: Failed connect() to TCP port 25 of 104.27.139.56 104.27.138.56 last Error: Connection timed out`
The IP it mentions belongs to Cloudflare. I've posted in their forum, but thought it worth asking here as well, since I think this is the problem I'm experiencing and I may need to remove Cloudflare altogether.
My scenario is:
SES is no longer in sandbox mode.
I've verified my domain and my personal email address.
I've added the TXT / Name records to cloudflare (I'm wondering if they are the issue)
I've set up rules such that 'something#example.com' and 'anything#example.com' should trigger by sending the email to an S3 bucket.
S3 isn't my ideal solution but as a proof of concept it'll do.
However, emails sent to 'something#example.com' are not getting through, nothing is in S3 and I'm receiving (via my personal email) a
Delivery Status Notification (Failure)
3 hours ago at 12:58 PM
From
MAILER-DAEMON#amazonses.com
To
myemail#me.com
An error occurred while trying to deliver the mail to the following recipients:
something#example.com
Can anyone perhaps shed some light on what I may be doing wrong?
MX records (in Cloudflare) have been updated to these that I obtained from SES:
amazonses - 10 feedback-smtp.us-east-1.amazonses.com - proxy is automatic

Cannot borrow client for ssl://10.10.183.27:9714. org.wso2.carbon.databridge.agent.exception

I'm trying to connect my WSO2 API Manager to an external LDAP/Active Directory. The LDAP connection is fine, but I'm getting this error while starting the server.
9714 is the SSL port.
[2018-07-13 14:52:03,250] ERROR - JMSListener Unable to continue server startup as it seems the JMS Provider is not yet started. Please start the JMS provider now.
[2018-07-13 14:52:03,251] ERROR - JMSListener Connection attempt : 5 for JMS Provider failed. Next retry in 320 seconds
[2018-07-13 14:52:27,889] WARN - DataEndpointGroup No receiver is reachable at reconnection, will try to reconnect every 30 sec
[2018-07-13 14:52:27,906] INFO - DataBridge user admin connected
[2018-07-13 14:52:27,914] ERROR - AuthenticationServiceImpl Invalid User : admin
[2018-07-13 14:52:27,915] ERROR - DataEndpointConnectionWorker Error while trying to connect to the endpoint. Cannot borrow client for ssl://10.10.183.27:9714.
org.wso2.carbon.databridge.agent.exception.DataEndpointLoginException: Cannot borrow client for ssl://10.10.183.27:9714.
at org.wso2.carbon.databridge.agent.endpoint.DataEndpointConnectionWorker.connect(DataEndpointConnectionWorker.java:134)
at org.wso2.carbon.databridge.agent.endpoint.DataEndpointConnectionWorker.run(DataEndpointConnectionWorker.java:59)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.wso2.carbon.databridge.agent.exception.DataEndpointLoginException: Error while trying to login to data receiver :/10.10.183.27:9714
at org.wso2.carbon.databridge.agent.endpoint.binary.BinaryDataEndpoint.login(BinaryDataEndpoint.java:50)
at org.wso2.carbon.databridge.agent.endpoint.DataEndpointConnectionWorker.connect(DataEndpointConnectionWorker.java:128)
... 6 more
This error is usually caused by the Analytics configuration or the Throttling configuration.
If the Analytics publisher is enabled in the api-manager.xml file, try disabling it, or review the connection details under the DASServerURL element.
<Analytics>
    <!-- Enable Analytics for API Manager -->
    <Enabled>false</Enabled>
    <!-- Server URL of the remote DAS/CEP server used to collect statistics. Must
         be specified in protocol://hostname:port/ format.
         An event can also be published to multiple Receiver Groups each having 1 or more receivers. Receiver
         Groups are delimited by curly braces whereas receivers are delimited by commas.
         Ex - Multiple Receivers within a single group
         tcp://localhost:7612/,tcp://localhost:7613/,tcp://localhost:7614/
         Ex - Multiple Receiver Groups with two receivers each
         {tcp://localhost:7612/,tcp://localhost:7613},{tcp://localhost:7712/,tcp://localhost:7713/} -->
    <DASServerURL>{tcp://localhost:7612}</DASServerURL>
    <!--DASAuthServerURL>{ssl://localhost:7712}</DASAuthServerURL-->
    <!-- Administrator username to login to the remote DAS server. -->
    <DASUsername>${admin.username}</DASUsername>
    <!-- Administrator password to login to the remote DAS server. -->
    <DASPassword>${admin.password}</DASPassword>
    <!-- remaining Analytics settings omitted from this excerpt -->
</Analytics>
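The receiver URL format described in the comment above (groups in curly braces, receivers separated by commas) can be illustrated with a small Python sketch. This is a hypothetical helper for checking a configured value by eye, not the parser WSO2 uses; the function name parse_receiver_groups is an assumption.

```python
import re


def parse_receiver_groups(url_config):
    """Split a DASServerURL-style value into groups of receiver URLs.

    Groups are delimited by curly braces, receivers by commas.
    A value without braces is treated as a single group.
    """
    text = url_config.strip()
    groups = re.findall(r"\{([^{}]*)\}", text)
    if not groups:
        groups = [text]
    return [
        [url.strip() for url in group.split(",") if url.strip()]
        for group in groups
    ]


# The two-group example from the configuration comment.
print(parse_receiver_groups(
    "{tcp://localhost:7612/,tcp://localhost:7613},"
    "{tcp://localhost:7712/,tcp://localhost:7713/}"
))
```

Listing the parsed host and port pairs makes it easier to confirm that each one points at a receiver that is actually running.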
You can also try disabling the Advanced Throttling feature, or review its ReceiverUrlGroup and AuthUrlGroup configuration.
<ThrottlingConfigurations>
    <EnableAdvanceThrottling>false</EnableAdvanceThrottling>
    <DataPublisher>
        <Enabled>true</Enabled>
        <Type>Binary</Type>
        <ReceiverUrlGroup>tcp://${carbon.local.ip}:${receiver.url.port}</ReceiverUrlGroup>
        <AuthUrlGroup>ssl://${carbon.local.ip}:${auth.url.port}</AuthUrlGroup>
        <Username>${admin.username}</Username>
        <Password>${admin.password}</Password>
        <!-- remaining DataPublisher settings omitted from this excerpt -->
    </DataPublisher>
    <!-- remaining throttling settings omitted from this excerpt -->
</ThrottlingConfigurations>
I faced the same issue. Check the traffic manager log for exceptions. Killing and restarting the traffic manager helped me.

Routing failure when resuming instance - BizTalk

I have created a small BizTalk application which has a single dynamic send port with Delivery Notification == Transmitted.
The send port is configured to a folder path, and when the folder does not exist it suspends the orchestration. When I try to resume the orchestration after creating the folder, I get two instances in the BizTalk query results. The instance error messages are:
Status : Suspend(Not resumable)
Error code : 0xC0C01B4e (Routing Failure Report)
Routing Failure Report for "Routing Failure Report for """
Status : Suspend(resumable)
Error code : 0xc0c01b02
Error Description: The published message could not be routed because no subscribers were found.
NOTE:
I'm getting this error message only when I set the Delivery notification == transmitted
This works fine in some environment.
It might be because the pipeline defined in the receive location was not set to an XML pipeline. Change the pipeline and it should work.
There will always be 2 failure instances in case of routing failure:
First is a non resumable, information only "Routing failure report".
Second is the actual message, which was not routed because no subscriber was found. Hence "the published message could not be routed because no subscribers were found." This is similar to the "Exception occurred when persisting state to the database." error message we see in case of orchestration routing failures.
Now, what is the difference?
The Suspended (resumable) instance will have the message body present, and you will be able to see the contents (body) of the message. However, the context properties of that message won't help you find the reason for the routing failure.
The Suspended (not resumable) instance, on the other hand, will have the required "Context properties", which will help you determine the reason for the routing failure. That's why we see the following:
"This service instance exists to help debug routing failures for instance "{your message instance id}". The context of the message associated with this instance contains all the promoted properties at the time of the routing failure."

Request timed out - Request timed out

My website is running on ASP.NET v4, IIS 7, Windows Server 2008.
CPU usage is at 20-30% and the site responds quickly.
Every 2-5 minutes I receive the following error:
Event code: 3001
Event message: The request has been aborted.
Exception type: HttpException
Exception message: Request timed out. ,
Request information:
Request URL: http://www.xxxx.com/Services/AxRefresh.asmx/AxUpdate
Request path: /Services/AxRefresh.asmx/AxUpdate
User host address: 84.110.251.198
User:
Is authenticated: False
Authentication Type:
Thread account name: NT AUTHORITY\NETWORK SERVICE
I read that the error is related to the maximum concurrent requests limit:
http://support.microsoft.com/kb/821268
But then I found out that on IIS 7 this limitation has changed and is not relevant:
http://msdn.microsoft.com/en-us/library/dd560842(VS.100).aspx
Any other ideas what the problem could be, or where to start looking?
Update:
I found another link saying that all of the parameters below:
maxWorkerThreads
minWorkerThreads
maxIoThreads
minFreeThreads
minLocalRequestFreeThreads
maxconnection
executionTimeout
are NOT relevant for IIS 7 + ASP.NET 4.
Here is the link:
http://learn.iis.net/page.aspx/381/aspnet-20-breaking-changes-on-iis-70
**I still get hundreds of these errors daily on my IIS 7.**
The second link you found is unrelated to the first one, so you still need to apply the changes from the first article and see if that helps.
