Proxy APIs timeout on API Key due to Cassandra connection issue - apigee

I'd appreciate help triaging and solving this:
I'm getting frequent periods where all the proxy APIs hang. The trace shows "???" as the HTTP status code for the requests, and after 30 seconds I get this response:
Status Code: 504 Gateway Timeout
Content-Length: 177
Content-Type: text/xml; charset=UTF-8
<?xml version='1.0' encoding='UTF-8'?><fault><faultstring>The Service is temporarily unavailable</faultstring><detail><errorcode>SERVICE_UNAVAILABLE</errorcode></detail></fault>
Here's what I see in system.log on all three Cassandra servers:
> 2014-04-01 14:29:20,124 org: env: Apigee-Main-36 ERROR m.p.c.c.c.HThriftClient - HThriftClient.close() : Could not flush
> transport (to be expected if the pool is shutting down) in close for
> client: CassandraClient<10.49.192.52:9160-829>
> org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
> at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
> ~[libthrift-0.7.0.jar:0.7.0]
> at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:156)
> ~[libthrift-0.7.0.jar:0.7.0]
> at me.prettyprint.cassandra.connection.client.HThriftClient.close(HThriftClient.java:125)
> [hector-core-1.1-3.jar:na]
> at me.prettyprint.cassandra.connection.client.HThriftClient.close(HThriftClient.java:38)
> [hector-core-1.1-3.jar:na]
> at me.prettyprint.cassandra.connection.HConnectionManager.closeClient(HConnectionManager.java:325)
> [hector-core-1.1-3.jar:na]
> at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:273)
> [hector-core-1.1-3.jar:na]
> at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:113)
> [hector-core-1.1-3.jar:na]
> at me.prettyprint.cassandra.service.template.ThriftColumnFamilyTemplate.sliceInternal(ThriftColumnFamilyTemplate.java:88)
> [hector-core-1.1-3.jar:na]
> at me.prettyprint.cassandra.service.template.ThriftColumnFamilyTemplate.doExecuteSlice(ThriftColumnFamilyTemplate.java:46)
> [hector-core-1.1-3.jar:na]
> at me.prettyprint.cassandra.service.template.ColumnFamilyTemplate.queryColumns(ColumnFamilyTemplate.java:113)
> [hector-core-1.1-3.jar:na]
> at com.apigee.datastore.client.CassandraClient.get(CassandraClient.java:169)
> [datastore-1.0.0.jar:na]
> at com.apigee.keymanagement.dao.nosql.impl.AppDaoImpl.getCredential(AppDaoImpl.java:123)
> [keymanagement-1.0.0.jar:na]
> at com.apigee.keymanagement.dao.nosql.impl.AppDaoImpl.getConsumerKeyStatus(AppDaoImpl.java:77)
> [keymanagement-1.0.0.jar:na]
> at com.apigee.keymanagement.util.ResourceUtil.validateConsumerKey(ResourceUtil.java:490)
> [keymanagement-1.0.0.jar:na]
> at com.apigee.keymanagement.util.ResourceUtil.validateConsumerKey(ResourceUtil.java:475)
> [keymanagement-1.0.0.jar:na]
> at com.apigee.keymanagement.util.ResourceUtil.getConsumerDetails(ResourceUtil.java:526)
> [keymanagement-1.0.0.jar:na]
> at com.apigee.keymanagement.util.ResourceUtil.getConsumerDetailsForApiKey(ResourceUtil.java:596)
> [keymanagement-1.0.0.jar:na]
> at com.apigee.keymanagement.service.OAuth2RuntimeServiceImpl.getConsumerForApiKey(OAuth2RuntimeServiceImpl.java:81)
> [keymanagement-1.0.0.jar:na]
> at com.apigee.oauth.v2.connectors.LocalOAuthServiceConnector.getClientAttributesForApiKey(LocalOAuthServiceConnector.java:173)
> [oauthV2-1.0.0.jar:na]
> at com.apigee.oauth.v2.OAuthServiceImpl.getClientAttributesForApiKey(OAuthServiceImpl.java:506)
> [oauthV2-1.0.0.jar:na]
> at com.apigee.steps.oauth.v2.OAuthStepExecution.execute(OAuthStepExecution.java:401)
> [oauthV2-1.0.0.jar:na]
> at com.apigee.messaging.runtime.steps.StepExecution.execute(StepExecution.java:97)
> [message-processor-1.0.0.jar:na]
> at com.apigee.flow.execution.AsyncExecutionStrategy$AsyncExecutionTask.call(AsyncExecutionStrategy.java:69)
> [message-flow-1.0.0.jar:na]
> at com.apigee.flow.execution.AsyncExecutionStrategy$AsyncExecutionTask.call(AsyncExecutionStrategy.java:51)
> [message-flow-1.0.0.jar:na]
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> [na:1.6.0_32]
> at java.util.concurrent.FutureTask.run(FutureTask.java:138) [na:1.6.0_32]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> [na:1.6.0_32]
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> [na:1.6.0_32]
> at java.util.concurrent.FutureTask.run(FutureTask.java:138) [na:1.6.0_32]
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> [na:1.6.0_32]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> [na:1.6.0_32]
> at java.lang.Thread.run(Thread.java:662) [na:1.6.0_32]
> Caused by: java.net.SocketException: Broken pipe
> at java.net.SocketOutputStream.socketWrite0(Native Method) ~[na:1.6.0_32]
> at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
> ~[na:1.6.0_32]
> at java.net.SocketOutputStream.write(SocketOutputStream.java:136) ~[na:1.6.0_32]
> at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
> ~[libthrift-0.7.0.jar:0.7.0]
> ... 31 common frames omitted
> 2014-04-01 14:29:20,126 org: env: Apigee-Main-36 ERROR m.p.c.c.HConnectionManager - HConnectionManager.markHostAsDown() :
> MARK HOST AS DOWN TRIGGERED for host 10.49.192.52(10.49.192.52):9160
> 2014-04-01 14:29:20,126 org: env: Apigee-Main-36 ERROR m.p.c.c.HConnectionManager - HConnectionManager.markHostAsDown() :
> Pool state on shutdown:
> <ConcurrentCassandraClientPoolByHost>:{10.49.192.52(10.49.192.52):9160};
> IsActive?: true; Active: 1; Blocked: 0; Idle: 2; NumBeforeExhausted: 9
> 2014-04-01 14:29:20,127 org: env: Apigee-Main-36 ERROR m.p.c.c.c.HThriftClient - HThriftClient.close() : Could not flush
> transport (to be expected if the pool is shutting down) in close for
> client: CassandraClient<10.49.192.52:9160-828>
> org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection timed out
> at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
> ~[libthrift-0.7.0.jar:0.7.0]
> at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:156)
> ~[libthrift-0.7.0.jar:0.7.0]
> at me.prettyprint.cassandra.connection.client.HThriftClient.close(HThriftClient.java:125)
> [hector-core-1.1-3.jar:na]

This can happen when Cassandra is under load and the node is marked as down.
During this process, the connections to it in the pool are closed.
Please restart the Message Processors to re-establish the connections.
Let me know how it goes.
Regards,
Jagjyot.

Since I'm a paying Apigee customer, I also opened a support case.
Originally, they weren't sure whether there was a keep-alive function or a connection TTL that would force a connection to be dropped and re-established.
Here's what I got back:
On your Router/Message Processor nodes, set /proc/sys/net/ipv4/tcp_keepalive_time to 1800 seconds (30 minutes).
To do this:
echo 1800 > /proc/sys/net/ipv4/tcp_keepalive_time
BE AWARE: This change does not persist across reboots, so you should also add the line net.ipv4.tcp_keepalive_time = 1800 to your /etc/sysctl.conf file.
Then run:
sysctl -p
to load the values from that file.
You can use the following to check whether the value was updated:
sysctl net.ipv4.tcp_keepalive_time
Finally, restart your Message Processors.
The fix that was put in place is a keep-alive probe in the Hector client in the Message Processor.
The probe sends a keep-alive ping at the interval set by the tcp_keepalive_time OS setting.
The 30-minute value was chosen because your firewall's idle timeout is 3600 seconds.
The keep-alive probes must occur more often than the firewall's idle timeout so that the connection stays in the established state.
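The relationship between the two timers can be checked with a short script. The following is a minimal sketch, assuming a Linux host and the 3600-second firewall idle timeout quoted above; the function names are illustrative and are not part of Apigee or Hector.

```python
from pathlib import Path

# Linux exposes the idle time before the first keep-alive probe here.
KEEPALIVE_PATH = Path("/proc/sys/net/ipv4/tcp_keepalive_time")


def read_keepalive_seconds(path: Path = KEEPALIVE_PATH) -> int:
    """Return the kernel's tcp_keepalive_time in seconds."""
    return int(path.read_text().strip())


def keepalive_beats_firewall(keepalive_s: int, firewall_idle_s: int) -> bool:
    """True if a probe is sent before the firewall would drop an idle connection."""
    return keepalive_s < firewall_idle_s


if __name__ == "__main__":
    firewall_idle_s = 3600  # assumption: idle timeout from the support case
    keepalive_s = read_keepalive_seconds()
    ok = keepalive_beats_firewall(keepalive_s, firewall_idle_s)
    verdict = "ok" if ok else "too long"
    print(f"tcp_keepalive_time={keepalive_s}s, firewall idle={firewall_idle_s}s: {verdict}")
```

The Linux default for tcp_keepalive_time is 7200 seconds, which fails this check against a 3600-second firewall timeout, while the recommended 1800 seconds passes.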


Karate fails to close previously started Chrome processes

In my project I use Karate 0.9.5 with Oracle JDK 8.
From time to time in our pipeline, Karate fails to close Chrome processes that it started earlier. Is there any solution to this problem?
I tried to solve it by explicitly calling close() and quit(), but that didn't help. After the run finished, I found a couple of Chrome processes still active on the server.
Here's a log:
12:51:45.227 preferred port 9222 not available, will use: 52436
12:51:47.524 request:
1 > GET http://localhost:52436/json
1 > Accept-Encoding: gzip,deflate
1 > Connection: Keep-Alive
1 > Host: localhost:52436
1 > User-Agent: Apache-HttpClient/4.5.11 (Java/1.8.0_181)
12:51:47.584 response time in milliseconds: 58,31
1 < 200
1 < Content-Length: 361
1 < Content-Type: application/json; charset=UTF-8
[ {
"description": "",
"devtoolsFrontendUrl": "/devtools/inspector.html?ws=localhost:52436/devtools/page/4BD6A5C19E01B01D88995CD69367F81F",
"id": "4BD6A5C19E01B01D88995CD69367F81F",
"title": "",
"type": "page",
"url": "about:blank",
"webSocketDebuggerUrl": "ws://localhost:52436/devtools/page/4BD6A5C19E01B01D88995CD69367F81F"
} ]
12:51:47.584 root frame id: 4BD6A5C19E01B01D88995CD69367F81F
12:51:47.632 >> {"method":"Target.activateTarget","params":{"targetId":"4BD6A5C19E01B01D88995CD69367F81F"},"id":1}
12:51:47.638 << {"id":1,"result":{}}
12:51:47.639 >> {"method":"Page.enable","id":2}
12:51:47.883 << {"id":2,"result":{}}
12:51:47.885 >> {"method":"Runtime.enable","id":3}
12:51:47.889 << {"method":"Runtime.executionContextCreated","params":{"context":{"id":1,"origin":"://","name":"","auxData":{"isDefault":true,"type":"default","frameId":"4BD6A5C19E01B01D88995CD69367F81F"}}}}
12:51:47.890 << {"id":3,"result":{}}
12:51:47.890 >> {"method":"Target.setAutoAttach","params":{"autoAttach":true,"waitForDebuggerOnStart":false,"flatten":true},"id":4}
12:51:47.892 << {"id":4,"result":{}}
12:52:02.894 << timed out after milliseconds: 15000 - [id: 4, method: Target.setAutoAttach, params: {autoAttach=true, waitForDebuggerOnStart=false, flatten=true}]
12:52:02.917 driver config / start failed: failed to get reply for: [id: 4, method: Target.setAutoAttach, params: {autoAttach=true, waitForDebuggerOnStart=false, flatten=true}], options: {type=chrome, showDriverLog=true, httpConfig={readTimeout=60000}, headless=true, target=null}
I think the easiest way to solve it is to extend the root web driver interface com.intuit.karate.driver.Driver with an additional method such as getPID(), which would return the PID of the launched process.
Can you log a bug and also provide some information about what happened before this, for example whether you are using the Docker image? It looks like the previous Scenario did not shut down cleanly.
But before that, do you mind upgrading to 0.9.6.RC3 (just released) and trying that? There have been a couple of tweaks to the life-cycle, especially trying to close Chrome without waiting for a response. It may also be worth adding a couple of changes: a) send Target.setAutoAttach without waiting, and b) add a configure option to not start Chrome if the port is in use.
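Until such an option exists, a pipeline step could check the debugging port itself before the tests start. This is a minimal sketch in Python, run outside Karate; port_in_use is a hypothetical helper, and 9222 is the preferred port shown in the log above.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    if port_in_use(9222):
        print("port 9222 is busy: a Chrome from an earlier run may still be alive")
    else:
        print("port 9222 is free")
```

A busy port before the first Scenario starts suggests that an earlier run left a Chrome process behind, which matches the "preferred port 9222 not available" line in the log.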

Use RSelenium to deal with pop-ups

How can I navigate to a link, click a link, and scrape data from there?
I tried this code, without success.
library("RSelenium")
startServer()
mybrowser <- remoteDriver()
mybrowser$open()
mybrowser$navigate("https://finance.yahoo.com/quote/SBUX/balance-sheet?p=SBUX")
# click 'Quarterly' button...
Something that is kind of close, is this.
Testing updated code now; results below.
> rm(list=ls())
>
>
> library("RSelenium")
> startServer()
Error: startServer is now defunct. Users in future can find the function in
file.path(find.package("RSelenium"), "examples/serverUtils"). The
recommended way to run a selenium server is via Docker. Alternatively
see the RSelenium::rsDriver function.
> mybrowser <- remoteDriver()
> mybrowser$open()
[1] "Connecting to remote server"
Error in checkError(res) :
Undefined error in httr call. httr output: Failed to connect to localhost port 4444: Connection refused
> mybrowser$navigate("https://finance.yahoo.com/quote/SBUX/balance-sheet?p=SBUX")
Error in checkError(res) :
Undefined error in httr call. httr output: length(url) == 1 is not TRUE
> mybrowser$findElement("xpath", "//button[text() = '
+
+ OK
+ ']")$clickElement()
Error in checkError(res) :
Undefined error in httr call. httr output: length(url) == 1 is not TRUE
> mybrowser$findElement("xpath", "//span[text() = 'Quarterly']")$clickElement()
Error in checkError(res) :
Undefined error in httr call. httr output: length(url) == 1 is not TRUE
>
I think you may be running into a pop-up on the website.
You can just "click" the OK button via:
mybrowser$findElement("xpath", "//button[text() = '
OK
']")$clickElement()
And then you can click "quarterly" via:
mybrowser$findElement("xpath", "//span[text() = 'Quarterly']")$clickElement()
(Hint: To identify this kind of error, it can be helpful to check the current state of the browser via: mybrowser$screenshot(TRUE).)
I am not sure it is up to date, but certain data is also available via the API; you could check the quantmod package for easier access.
Full example:
library("RSelenium")
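# startServer() is defunct in current RSelenium releases; rsDriver() starts the server and opens a browser.
# Assumption: Firefox is installed; pass another browser name if needed.
rD <- rsDriver(browser = "firefox")
mybrowser <- rD$client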
mybrowser$navigate("https://finance.yahoo.com/quote/SBUX/balance-sheet?p=SBUX")
mybrowser$findElement("xpath", "//button[text() = '
OK
']")$clickElement()
mybrowser$findElement("xpath", "//span[text() = 'Quarterly']")$clickElement()

Asterisk warning sound not being played to callee

I am connecting two channels with the Bridge application: Bridge(Local/XXXXXXXX#sipname, FL(180000,120000)x). The warning sound should be played 120 seconds before the call ends. In the past this worked perfectly, but now the sound is suddenly not being played to the callee. I checked everything but can't find the reason for it.
The CLI shows everything correctly, but the sound is still not being played to the callee.
> Limit Data for this call:
> timelimit = 180000 ms (180.000 s)
> play_warning = 120000 ms (120.000 s)
> play_to_caller = yes
> play_to_callee = yes
> warning_freq = 0 ms (0.000 s)
> start_sound = enter
> warning_sound = gong
> end_sound =
I added /n to the context in the Dial command, which fixed it for me. The /n option stops the Local channel from being optimized out of the call path.
https://www.voip-info.org/asterisk-local-channels/

rJava: failing to load awt.events

I am trying to use some event methods from java.awt.event, but I cannot get it to load: I always get a ClassNotFoundException.
> library(rJava)
> .jinit()
[1] 0
> jEvents <- .jnew("java.awt.event")
Error in .jnew("java.awt.event") : java.lang.ClassNotFoundException
Edit:
Even if I try a specific class, I get an error message:
> library(rJava)
> .jinit()
> jEvents <- .jnew("java.awt.event.ActionEvent")
Error in .jnew("java.awt.event.ActionEvent") :
java.lang.NoSuchMethodError: <init>
It seems you are trying to instantiate a package instead of a class:
https://docs.oracle.com/javase/7/docs/api/java/awt/event/package-summary.html
Maybe you are looking for:
java.awt.event.ActionEvent
You can try this one:
> library(rJava)
> .jinit()
> EVT <- J("java.awt.event.ActionEvent")
> aEVT <- new(EVT, "StringObject", 1001L, "Hello")
> aEVT
[1] "Java-Object{java.awt.event.ActionEvent[ACTION_PERFORMED,cmd=Hello,when=0,modifiers=] on Str}"
You have to call the constructor with the given parameters. Note that ActionEvent doesn't have a default constructor.
You can find a nice resource here:
http://www.deducer.org/pmwiki/pmwiki.php?n=Main.Development#wwjoir

CAPI2 The Proxy Auto-configuration URL was not found

I have an ASP.NET MVC application running on Windows Server 2008 R2 with .NET Framework 4.5.
In one controller, I use HttpClient to post XML to a remote SOAP server over HTTPS, but it is very slow: the request takes 6 seconds to finish. I enabled the CAPI2 log and found this error:
> System
>
> - Provider
>
> [ Name] Microsoft-Windows-CAPI2
> [ Guid] {5bbca4a8-b209-48dc-a8c7-b23d3e5216fb}
>
> EventID 53
>
> Version 0
>
> Level 2
>
> Task 53
>
> Opcode 2
>
> Keywords 0x4000000000000036
>
> - TimeCreated
>
> [ SystemTime] 2013-03-06T00:09:12.342400000Z
>
> EventRecordID 226
>
> Correlation
>
> - Execution
>
> [ ProcessID] 4716
> [ ThreadID] 3620
>
> Channel Microsoft-Windows-CAPI2/Operational
>
> Computer WIN-PFVACQLB4A9
>
> - Security
>
> [ UserID] S-1-5-82-100694679-852442941-408577778-1461352480-452402374
>
>
>- UserData
>
> - CryptRetrieveObjectByUrlWire
>
> - URL http://gtssldv-aia.geotrust.com/gtssldv.crt
>
> [ scheme] http
>
> - Object
>
> [ type] CONTEXT_OID_CERTIFICATE
> [ constant] 1
>
> Timeout PT15S
>
> - Flags
>
> [ value] 286005
> [ CRYPT_RETRIEVE_MULTIPLE_OBJECTS] true
> [ CRYPT_WIRE_ONLY_RETRIEVAL] true
> [ CRYPT_LDAP_SCOPE_BASE_ONLY_RETRIEVAL] true
> [ CRYPT_OFFLINE_CHECK_RETRIEVAL] true
> [ CRYPT_AIA_RETRIEVAL] true
> [ CRYPT_PROXY_CACHE_RETRIEVAL] true
>
> - AuxInfo
>
> [ maxUrlRetrievalByteCount] 100000
> [ fProxyCacheRetrieval] true
>
> - AdditionalInfo
>
> - NetworkConnectivityStatus
>
> [ value] 1
> [ _SENSAPI_NETWORK_ALIVE_LAN] true
>
> - Action
>
> [ name] Call_WinHttpGetProxyForUrl
> - Error The Proxy Auto-configuration URL was not found.
>
> [ value] 2F94
>
>
> - Action
>
> [ name] NoProxy
>
> - Action
>
> [ name] Call_WinHttpGetProxyForUrl
> + Error The Proxy Auto-configuration URL was not found.
>
> [ value] 2F94
>
>
> - Action
>
> [ name] NoProxy
>
> - HTTPRequestHeadersInfo
>
> Header GET /gtssldv.crt HTTP/1.1
>
> Header Accept: */*
>
> Header User-Agent: Microsoft-CryptoAPI/6.1
>
> Header Connection: Keep-Alive
>
>
> - HTTPResponseHeadersInfo
>
> Header HTTP/1.0 200 OK
>
> Header Connection: Keep-Alive
>
> Header Date: Tue, 05 Mar 2013 23:43:39 GMT
>
> Header Via: NS-248
>
> Header Via: 1.0 hostname:80 (squid/2.6.STABLE21)
>
> Header Content-Length: 1022
>
> Header Content-Type: text/plain
>
> Header Last-Modified: Wed, 21 Jul 2010 18:25:03 GMT
>
> Header Accept-Ranges: bytes
>
> Header Age: 540
>
> Header ETag: "25f7fb-3fe-48be9eb969dc0"
>
> Header Server: Apache/2.2.3 (Red Hat)
>
> Header X-Pad: avoid browser bug
>
> Header X-Cache: MISS from hostname
>
> Header X-Cache-Lookup: HIT from hostname:80
>
>
> - Action
>
> [ name] WriteToCache
> - Error The system cannot find the file specified.
>
> [ value] 80070002
>
>
>
> - CacheInfo
>
> [ lastSyncTime] 2013-03-06T00:09:12.342Z
> - URLCacheResponseInfo
>
> [ responseType] CRYPTNET_URL_CACHE_RESPONSE_HTTP
> [ lastModifiedTime] 2010-07-21T18:25:03Z
> [ eTag] "25f7fb-3fe-48be9eb969dc0"
> [ proxyId] DEAE4085
>
>
> - RetrievedObjects
>
> - Certificate
>
> [ fileRef] BAE30B15DBB1544CF194D076B75B7BB9E3D6B760.cer
> [ subjectName] GeoTrust DV SSL CA
>
>
> - EventAuxInfo
>
> [ ProcessName] w3wp.exe
>
> - CorrelationAuxInfo
>
> [ TaskId] {2B4FC750-01D1-4D7E-BEEF-A3F9DD09B0E8}
> [ SeqNumber] 4
>
> - Result
>
> [ value] 0
I also unchecked Internet Explorer -> Internet Options -> Connections -> LAN Settings -> Automatically detect settings. Nothing changed.
Looking for help.
Thanks
It's probably too late, but the way I solved the exact same problem was to look up the user ID, in your case:
[ UserID] S-1-5-82-100694679-852442941-408577778-1461352480-452402374
inside the registry, in the following location:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\ProfileList .
It will tell you which user is invoking the Crypto API 2 (CAPI2); in my case it was "Classic .NET Application Pool" or similar.
I fixed it by impersonation (running the process as another user with sufficient rights). After that I no longer had these write errors, and therefore no red error messages in the event log.
For this, I created a user with administrator rights on the local computer (I know it is not the most secure option, but I had to try, and it worked). In the end, it seemed that the problem was not the proxy but this particular error:
> [ name] WriteToCache
> - Error The system cannot find the file specified.
>
> [ value] 80070002
>
It had no write rights somewhere, but the Administrator does :P
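The two error values in the event can be decoded by hand, which helps separate the proxy warning from the cache write failure. This is a minimal sketch; decode_hresult is a hypothetical helper that follows the standard 32-bit HRESULT layout, and the values are copied from the CAPI2 event above.

```python
def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into its severity, facility and code fields."""
    return {
        "failure": bool((value >> 31) & 1),  # top bit set means failure
        "facility": (value >> 16) & 0x7FF,   # 11-bit facility field
        "code": value & 0xFFFF,              # low 16 bits: the error code
    }


if __name__ == "__main__":
    # WriteToCache error: facility 7 is FACILITY_WIN32, code 2 is ERROR_FILE_NOT_FOUND
    print(decode_hresult(0x80070002))
    # Call_WinHttpGetProxyForUrl error, logged as a plain hex WinHTTP error number
    print(int("2F94", 16))
```

The second value, 12180, is ERROR_WINHTTP_AUTODETECTION_FAILED: proxy auto-detection found no configuration URL, after which the log shows the NoProxy action and a successful HTTP 200 response.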
Another thing I did (as the Crypto API was always requesting the CRL) was to change a setting in the Local Group Policy Editor, under the path:
Computer Configuration > Windows Settings > Security Settings > Public Key Policies > Certificate Path Validation Settings > "Revocation" tab. Enable "Define these policy settings" and also enable "Allow CRL and OCSP responses to be valid longer than their lifetime".
Then I extended the validity of the CRL to one year (8760 hours). I know that is also too long, but it depends on your needs and security concerns; in my case it is safe enough.
I hope it helps you out.
My answer is taken from here: http://social.technet.microsoft.com/Forums/en-US/winserversecurity/thread/82c52c69-1a53-4f48-8c7e-37bc7364dca3/
Set up an IIS server so that it serves JavaScript at http://wpad.mylab.local/wpad.dat, and don't forget to add the MIME type for wpad.dat (application/x-javascript is fine). If you don't need a proxy, the JavaScript in your wpad.dat file should look like the following:
function FindProxyForURL(url, host)
{
    return "DIRECT";
}
After that, you should no longer have event 53 in the CAPI2 logs when using certutil.
