Is there a recommended/best way to generate a report for time spent on each policy for an API proxy?
Currently my approach is to use JS to collect the timestamps and calculate the delay around each policy, and then report it using the stats collection policy.
That's too invasive for performance checks and my data collection alone adds time to the overall response.
What would be the best non-invasive way to report on the time taken for each step when analyzing the data across many requests? (The UI, in trace mode, does show the time for each policy on an individual-request basis.)
Thanks,
Ricardo
There isn't a supported public API that calculates this information and returns a clean, aggregated view of policy execution times. Your best bet is to use Analytics reports with the request_processing_latency and response_processing_latency measures (http://apigee.com/docs/content/analytics-reference), and then, if needed, use trace to identify individual policy execution times.
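For example, a rough sketch of pulling those measures from the management stats API (the dimension name, query parameters, and time-range format below are assumptions to adapt to your Edge version):
# Hypothetical: average request/response processing latency per API proxy over a time range
curl -u $ae_username:$ae_password \
"https://api.enterprise.apigee.com/v1/organizations/{org}/environments/{env}/stats/apis?select=avg(request_processing_latency),avg(response_processing_latency)&timeRange=01/01/2015%2000:00~01/02/2015%2000:00&timeUnit=hour"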
Alternatively, you can try downloading the trace session and parsing the timestamps between policies to build that information yourself, but the trace view in the UI does this already.
You can consider using the debug API: http://apigee.com/docs/api/debug-sessions
First you'll need to start a session, for example:
curl -H "Content-type:application/octet-stream" -X POST https://api.enterprise.apigee.com/v1/organizations/{org}/environments/{env}/apis/{api_name}/revisions/{revision #}/debugsessions?"session=MySession" \
-u $ae_username:$ae_password
Get info from session:
curl -X GET -H "Accept:application/json" \
https://api.enterprise.apigee.com/v1/organizations/{org}/environments/{env}/apis/{api_name}/revisions/{revision #}/debugsessions/MySession/data \
-u $ae_username:$ae_password
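The data call returns a list of transaction IDs; assuming that (it isn't shown above), you can then fetch an individual transaction to see the per-policy execution points and their timestamps:
curl -X GET -H "Accept:application/json" \
https://api.enterprise.apigee.com/v1/organizations/{org}/environments/{env}/apis/{api_name}/revisions/{revision #}/debugsessions/MySession/data/{transaction_id} \
-u $ae_username:$ae_password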
The time spent in each policy can be found using the debug trace in the UI.
Also, as Diego said, you can use the debugsessions API call to create a debug session.
When creating the debug session you can also set a timeout for how long you want the session to run. So if you are running your performance test for an hour, you can create a debug session that covers that same window.
curl -v -u jhans@apigee.com "http://management:8080/v1/organizations/weatherapi/environments/prod/apis/ForeCast/revisions/6/debugsessions?session=ab&timeout=300" -X POST
From the UI you can download the trace session, which contains an XML file with the timestamp for each policy:
<Point id="Condition">
<DebugInfo>
<Timestamp>05-02-14 04:38:14:088</Timestamp>
<Properties>
<Property name="ExpressionResult">true</Property>
</Point>
<Point id="StateChange">
The above is an example of the timestamps recorded for each policy in the debug trace downloaded from the UI.
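If you want to compare the gaps between policies outside the UI, a rough sketch (assuming the downloaded trace is saved as trace.xml; the file name and the exact XML layout are assumptions) is:
# List each execution point with its timestamp so consecutive entries can be compared,
# either by eye or by a small script that computes the deltas.
grep -oE '<Point id="[^"]+"|<Timestamp>[^<]+' trace.xml | paste - -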
Ricardo,
here is what I suggest.
Disclaimer: It is very meticulous and time consuming. I would recommend this approach only when you are really blocked on a performance issue and there is no other solution.
Let us say your proxy has a few policies, a service callout to an external service, and a backend.
So the total latency would be the sum of the time taken by the policies (p1, p2, p3, ...) + the service callout target + the time taken by your backend.
The very first step is to stub out the external dependencies. You can use a null target (a stub proxy on Apigee Edge without any logic) to do so.
Now disable all the policies (enabled="false" on each policy's root element). Conduct a load test and benchmark your proxy's performance against the stubbed endpoints, with no policies active.
Start activating the policies one by one or a few at a time, and re-run the load test each time.
Finally, you can run the load test against the real backends (removing the stubs).
At the end of this series of load tests you will know which policy or backend has the most significant performance impact.
My original carbon storage-schema config was set to 10s:1w, 60s:1y and was working fine for months. I've recently updated it to 1s:7d, 10s:30d, 60s:1y. I've resized all my whisper files to reflect the new retention schema using the following bit of bash:
collectd_dir="/opt/graphite/storage/whisper/collectd/"
retention="1s:7d 1m:30d 15m:1y"
find $collectd_dir -type f -name '*.wsp' | parallel whisper-resize.py \
  --nobackup {} $retention
I've confirmed that they've been updated using whisper-info.py with the correct retention and data points. I've also confirmed that the storage-schema is valid using a storage-schema validation script.
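For example, on one of the resized files (the path here is just an illustration of the typical collectd whisper layout):
# whisper-info.py prints the archive retentions and point counts for a single .wsp file
whisper-info.py /opt/graphite/storage/whisper/collectd/{host}/cpu-0/cpu-user.wsp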
The carbon-cache{1..8}, carbon-relay, carbon-aggregator, and collectd services have been stopped before the whisper resizing, then started once the resizing was complete.
However, when checking a Grafana dashboard, I'm seeing empty graphs on the collectd plugin charts (the points are at the correct per-second interval, but contain no data); and on the graphs that do show data, the data points are every 10s (the old retention) instead of 1s.
The /var/log/carbon/console.log is looking good, and the collectd whisper files all have carbon user access, so no permission denied issues when writing.
When running an ngrep on port 2003 on the graphite host, I'm seeing connections to the relay, along with metrics being sent. Those metrics are then getting relayed to a pool of 8 caches to their pickle port.
Has anyone else experienced similar issues, or can possibly help me diagnose the issue further? Have I missed something here?
So it took me a little while to figure this out. It had nothing to do with the local_settings.py file, as some of the older answers suggested; it was the Interval setting in collectd.conf.
A lot of the older answers say to include 'Interval 1' inside each Plugin block, which would have been nice for per-metric control, but that produced config errors in my logs and broke the metrics. Setting 'Interval 1' at the top level of the config resolved my issues.
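For illustration, a minimal sketch of what that looks like in collectd.conf (the plugins listed are just examples):
# collectd.conf: set the interval once, globally, at the top of the file
Interval 1

LoadPlugin cpu
LoadPlugin memory
# (no per-plugin "Interval 1" inside <Plugin> blocks)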
I'd like to put icCube under solid monitoring so that we know when a) a cube load fails or b) the cube's last update time exceeds what we expect.
Is there an API I can use to integrate with standard monitoring tools? REST, command line, etc.
Thanks in advance, Assaf
Regarding schema load failures, you can check the notification service (www); you can, for example, receive an email on failure. Note that you can implement (in Java) your own transport service to receive notifications. There is no "notification" for the last update time being exceeded, but you could use an external LOAD command (www) for loading your schema; in that case you will know the last update time and can perform whatever logic is required.
Edit: XMLA commands can be sent via any tool (e.g., from Bash).
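For illustration, a rough sketch of sending an XMLA request from Bash with curl; the endpoint URL, credentials, and SOAP body below are assumptions to adapt to your icCube installation:
# Hypothetical: POST a minimal XMLA Discover request to the server's XMLA endpoint
curl -u {user}:{password} -H "Content-Type: text/xml" -X POST "http://{iccube-host}:{port}/xmla" -d '
<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">
  <Body>
    <Discover xmlns="urn:schemas-microsoft-com:xml-analysis">
      <RequestType>DBSCHEMA_CATALOGS</RequestType>
      <Restrictions/>
      <Properties/>
    </Discover>
  </Body>
</Envelope>'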
Hope that helps.
I have read the example of scrapy-redis but still don't quite understand how to use it.
I have run the spider named dmoz and it works well. But when I start another spider named mycrawler_redis it just got nothing.
Besides, I'm quite confused about how the request queue is set up. I didn't find any piece of code in the example project that illustrates the request queue setting.
And if the spiders on different machines are to share the same request queue, how can I get that done? It seems that I should first make the slave machine connect to the master machine's redis, but I'm not sure where to put the relevant code: in spider.py, or do I just type it on the command line?
I'm quite new to scrapy-redis and any help would be appreciated!
If the example spider is working and your custom one isn't, there must be something that you have done wrong. Update your question with the code, including all relevant parts, so we can see what went wrong.
Besides, I'm quite confused about how the request queue is set up. I didn't find any piece of code in the example project that illustrates the request queue setting.
As far as your spider is concerned, this is done by appropriate project settings, for example if you want FIFO:
# Enables scheduling storing requests queue in redis.
SCHEDULER = "scrapy_redis.scheduler.Scheduler"
# Don't cleanup redis queues, allows to pause/resume crawls.
SCHEDULER_PERSIST = True
# Schedule requests using a queue (FIFO).
SCHEDULER_QUEUE_CLASS = 'scrapy_redis.queue.SpiderQueue'
As far as the implementation goes, queuing is done via RedisSpider, which your spider must inherit from. You can find the code for enqueuing requests here: https://github.com/darkrho/scrapy-redis/blob/a295b1854e3c3d1fddcd02ffd89ff30a6bea776f/scrapy_redis/scheduler.py#L73
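One practical note, based on the scrapy-redis example project (the key name below is that example's default and may differ for your spider): a RedisSpider idles until start URLs are pushed onto its redis key, which may be why mycrawler_redis appears to get nothing. You can seed it from the command line:
# Push a start URL onto the list the mycrawler_redis spider listens on
redis-cli lpush mycrawler:start_urls http://www.dmoz.org/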
As for the connection, you don't need to manually connect to the redis machine, you just specify the host and port information in the settings:
REDIS_HOST = 'localhost'
REDIS_PORT = 6379
And the connection is configured in connection.py: https://github.com/darkrho/scrapy-redis/blob/a295b1854e3c3d1fddcd02ffd89ff30a6bea776f/scrapy_redis/connection.py
Examples of usage can be found in several places, e.g.: https://github.com/darkrho/scrapy-redis/blob/a295b1854e3c3d1fddcd02ffd89ff30a6bea776f/scrapy_redis/pipelines.py#L17
I was trying to fetch the resources and resource usage of an instance using the Ceilometer API. I have used v2/meters/instance, v2/meters/cpu_util and v2/meters/memory. The results these APIs return are too large, and I'm not able to identify the parameter that needs to be used to find the resource usage. I need to find the CPU utilization, bandwidth and memory usage of an instance using the Ceilometer API. Can anyone please tell me which API I need to use, and which parameter to look at, to get the CPU utilization, bandwidth and memory usage of an instance?
Thanks for any help in advance.
Regards ,
Lokesh.S
If you use the CLI, you can limit the number of samples with the -l/--limit parameter, as in the example below:
`ceilometer sample-list -m cpu_util -l 10`
A more complete example, restricted to a single instance:
`ceilometer --debug sample-list -m cpu_util -l 1 -q resource={your_vm_id}`
Note that:
--debug lets you observe which REST API call was actually made; you can learn from it and write your own REST request, or just use the CLI. This option also shows the full response with detailed sample information, which the CLI would otherwise format and partially drop.
-l 1 limits the output to a single result, so you are not flooded by a huge amount of data.
-q resource={your_vm_id} returns only the cpu_util samples for that VM.
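If you'd rather call the REST API directly, a rough sketch (the host, port, token and instance ID are placeholders; see the v2 API document linked below for the exact query syntax):
# Hypothetical: average cpu_util for one instance, aggregated into hourly periods
curl -H "X-Auth-Token: $OS_TOKEN" \
  "http://{ceilometer-host}:8777/v2/meters/cpu_util/statistics?q.field=resource_id&q.op=eq&q.value={your_vm_id}&period=3600"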
You can read the official document at http://docs.openstack.org/developer/ceilometer/webapi/v2.html, or read my post at http://zqfan.github.io/assets/doc/ceilometer-havana-api-v2.html (written in Chinese).
I am on an OPDK installation of Apigee Edge. I have a zombie API proxy, meaning I can't delete the API proxy in the UI (and usually not via MS API, either). I get the following error:
What is the best way to ensure Apigee Edge is cleared of this zombie API proxy so that I can redeploy this API proxy again?
To clean this up, you will need to execute some manual steps:
1) Check /o/{orgname}/e/{envname}/apiproxies via the management API ("curl http(s)://{mgmt-host}:{port}/v1/o/{orgname}/e/{envname}/apiproxies"). This will give you the actual response info that the UI is trying to parse.
2) Delete /o/{orgname}/e/{envname}/apiproxies/{apiproxy_name} via the management API ("curl -X DELETE http(s)://{mgmt-host}:{port}/v1/o/{orgname}/e/{envname}/apiproxies/{apiproxy_name}"). Re-check step 1 to see if it is cleaned up.
3) If it is clean, try your deployment again. If it succeeds, you are good.
4) If it does not, then:
5) Go to ZooKeeper (/opt/apigee//share/zookeeper) and run the CLI (./zkCli.sh).
6) Find /organizations/{orgname}/environments/{envname}/apiproxies/ and see if {apiproxy_name} is there.
7) If so, execute "rmr /organizations/{orgname}/environments/{envname}/apiproxies/{apiproxy_name}" at the zkCli prompt (see the example session after these steps).
8) Repeat the checks above; the proxy should be all clean.
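For illustration, a hypothetical zkCli session (the prompt, port and node names will differ in your installation):
[zk: localhost:2181(CONNECTED) 0] ls /organizations/{orgname}/environments/{envname}/apiproxies
[{apiproxy_name}]
[zk: localhost:2181(CONNECTED) 1] rmr /organizations/{orgname}/environments/{envname}/apiproxies/{apiproxy_name}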
Note: There are a few circumstances that may require some additional steps, such as actually incorrect server configurations or conflicting config data.
Hope that helps.