This is probably a newbie question, but I'm having a hard time finding the answers, so I hope you guys can help me here.
I have a running logstash instance shipping logs from one server to another server which is running graphite.
Here is my output config:
output {
  stdout { codec => rubydebug }
  graphite {
    host => "xxxxxxx.yyyy.amazonaws.com"
    port => 2003
    type => "logstash-metrics"
    metrics => ["logstash.%{remote_addr}", "logstash.%{status}"]
  }
}
I have checked that the firewall is not blocking TCP 2003 on xxxxxxx.yyyy.amazonaws.com, which is the host running graphite. However, when I go to graphite's UI I cannot find any of my metrics. I am wondering what could be the reason?
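One thing I am not sure about (just a guess on my part, I have not confirmed it against the docs): whether the metrics option is supposed to be a metric name paired with a value, rather than a flat list of names. If so, I suppose it would look something more like this:
graphite {
  host => "xxxxxxx.yyyy.amazonaws.com"
  port => 2003
  # first entry = metric name, second entry = metric value (assuming name/value pairs)
  metrics => [ "logstash.%{remote_addr}.status", "%{status}" ]
}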
Thanks!
The following may not look like an answer, but since your question is of a debugging nature, this is the best form I can come up with.
Make sure the graphite stack is working. The easiest way is to run this in a shell a few times and verify that the corresponding graph appears in graphite:
echo "test.first 10 `date +%s`"| nc graphite.example.com 2003.
Since you do not seem to have statsd in the stack, you don't have to check whether it is relaying correctly.
Next, check graphite's logs: it logs whatever it receives. The default location is /opt/graphite/storage/log/carbon-cache/carbon-cache-a/.
listener.log - logs whenever network connections are opened and closed.
06/12/2013 06:09:58 :: MetricLineReceiver connection with 127.0.0.1:59766 established
06/12/2013 06:10:00 :: MetricLineReceiver connection with 127.0.0.1:59766 closed cleanly
updates.log - logs metric updates.
06/12/2013 06:15:39 :: wrote 1 datapoints for stats.message.service.time_taken.std in 0.00017 seconds
06/12/2013 06:15:39 :: wrote 1 datapoints for exchange.message.job.service.time_taken.sum in 0.00016 seconds
creates.log - logs the creation of new .wsp files for new metrics.
06/12/2013 06:17:31 :: new metric event.response.time_taken.sum_80 matched schema com
06/12/2013 06:17:31 :: new metric event.response.time_taken.sum_80 matched aggregation schema timers_fall_here
06/12/2013 06:17:31 :: creating database file /opt/graphite/storage/whisper/event/response/time_taken/sum_80.wsp (archive=[(300, 105120)] xff=0.0 agg=average)
Going through these, you can find out whether the connection is not being created (a network issue) or the .wsp file creation isn't happening (a file system permission issue). If sending metrics to graphite using nc works, then it is the logstash end that needs to be looked into.
My original carbon storage-schema config was set to 10s:1w, 60s:1y and was working fine for months. I've recently updated it to 1s:7d, 10s:30d, 60s:1y. I've resized all my whisper files to reflect the new retention schema using the following bit of bash:
collectd_dir="/opt/graphite/storage/whisper/collectd/"
retention="1s:7d 1m:30d 15m:1y"
find $collectd_dir -type f -name '*.wsp' | parallel whisper-resize.py \
--nobackup {} $retention \;
I've confirmed that they've been updated using whisper-info.py with the correct retention and data points. I've also confirmed that the storage-schema is valid using a storage-schema validation script.
The carbon-cache{1..8}, carbon-relay, carbon-aggregator, and collectd services have been stopped before the whisper resizing, then started once the resizing was complete.
However, when checking a Grafana dashboard, I'm seeing empty graphs on the collectd plugin charts (data points per second, but no data); and on the graphs that do show data, the data points come every 10s (the old retention) instead of every 1s.
The /var/log/carbon/console.log is looking good, and the collectd whisper files all have carbon user access, so no permission denied issues when writing.
When running an ngrep on port 2003 on the graphite host, I'm seeing connections to the relay, along with metrics being sent. Those metrics are then getting relayed to a pool of 8 caches to their pickle port.
Has anyone else experienced similar issues, or can possibly help me diagnose the issue further? Have I missed something here?
So it took me a little while to figure this out. It had nothing to do with the local_settings.py file, as some of the older answers suggested; it had to do with the Interval setting in collectd.conf.
A lot of the older answers mentioned that you needed to include 'Interval 1' inside each Plugin block. I think that would have been nice for the per-plugin control it gives over each metric; however, it created config errors in my logs and broke the metrics. Setting 'Interval 1' at the top level of the config resolved my issues.
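For reference, a minimal sketch of what I mean in collectd.conf (the plugins listed here are just examples, not my full config):
# /etc/collectd/collectd.conf
Interval 1

LoadPlugin cpu
LoadPlugin load
LoadPlugin memory
With the global Interval set to 1, every plugin polls once a second without needing its own Interval line.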
First off, I have keypairs; this is not a passphrase question, though ssh is involved.
I also have MPICH, Hydra, SLURM and lamd ... this is a cluster computing question.
Node0 will boot but node1 hangs. I have had this problem for days now. My NFS mirror works just fine and I can run Game Of Life on 8 cores on node2 ... that is really cool too, just ask me about it...
BUT, when I want to run on all three nodes together, I hit a password request from each node as node0 uses ssh to send out the processes. Again, not a passphrase problem: HYDRA (slurm and lamd as well) wants my user password from node1, basically my login credential. I can change that to an MPICHuser account; however, the dilemma will remain.
Unless I create MPICHusers on all three nodes without passwords at all ... can that be done? It seems like the epitome of security risk.
So the question is, can I automate the password credential whenever # pops up in a way that won't hang lamboot?
It is late; looking at what I have makes me wonder if Slurm is the new culprit.
Here is more or less what I am looking at:
me#wherever:/mirror/GameOfLife$ mpiexec.hydra -f /mirror/machinefile -n 10 ./life 10 10 30
[mpiexec#wherever] HYDU_process_mfile_token (utils/args/args.c:296): token node0 not supported at this time
[mpiexec#wherever] HYDU_parse_hostfile (utils/args/args.c:343): unable to process token
[mpiexec#wherever] mfile_fn (ui/mpich/utils.c:336): error parsing hostfile
[mpiexec#wherever] match_arg (utils/args/args.c:152): match handler returned error
[mpiexec#wherever] HYDU_parse_array (utils/args/args.c:174): argument matching returned error
[mpiexec#wherever] parse_args (ui/mpich/utils.c:1596): error parsing input array
[mpiexec#wherever] HYD_uii_mpx_get_parameters (ui/mpich/utils.c:1648): unable to parse user arguments
[mpiexec#wherever] main (ui/mpich/mpiexec.c:153): error parsing parameters
me#wherever:/mirror/GameOfLife$
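As I understand it, Hydra just wants one host per line in the machinefile, with an optional process count after a colon; something like this (the counts here are purely illustrative):
node0:4
node1:4
node2:8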
That parse error, however, is not the real problem.
I am looking toward Slurm compatibility. Several things happen at nearly the same time in a specific order. The handler has to have terminal control in an instant so the master node can begin sending.
Before I added Slurm the hydra machinefile was working but node0 could not "grab" the keyboard.
Where should Slurm look for an equivalent file?
I am wondering if I should remove hydra.
We have been seeing the following 'warnings' in the event log of our BizTalk machine since upgrading to BTS 2006. They seem to occur randomly 6 or 8 times per day.
Does anyone know what this means and what needs to be done to clear it up?
We have only one BizTalk server, which is running on only one machine.
I am new to BizTalk, so I am unable to find how many tracking host instances are running for the BizTalk server. Also, can you please let me know whether we can configure only one instance for one server/machine?
Source: BAM EventBus Service
Event: 5
Warning Details:
Execute batch error. Exception information: TDDS failed to batch execution
of streams. SQLServer: bizprod, Database: BizTalkDTADb.Cannot insert
duplicate key row in object 'dta_MessageFieldValues' with unique index
'IX_MessageFieldValues'.
The statement has been terminated..
I see you got a partial answer in your MSDN post.
Go to the BizTalk Admin Console and check under Platform Settings -> Hosts; in the list of hosts on the right, confirm that only a single Host has the Tracking column marked as Yes.
As to your other question: yes, you can run a single Host Instance on a single server, although when your server starts to come under a bit of load you may want to consider setting up some more so you can balance the workload better.
I have read the example of scrapy-redis but still don't quite understand how to use it.
I have run the spider named dmoz and it works well. But when I start another spider named mycrawler_redis, I get nothing.
Besides, I'm quite confused about how the request queue is set up. I didn't find any piece of code in the example project which illustrates the request queue setting.
And if spiders on different machines want to share the same request queue, how can I get that done? It seems that I should first make the slave machine connect to the master machine's redis, but I'm not sure where to put the relevant code: in spider.py, or do I just type it on the command line?
I'm quite new to scrapy-redis and any help would be appreciated!
If the example spider is working and your custom one isn't, there must be something that you have done wrong. Update your question with the code, including all relevant parts, so we can see what went wrong.
Besides, I'm quite confused about how the request queue is set up. I didn't find any piece of code in the example project which illustrates the request queue setting.
As far as your spider is concerned, this is done by appropriate project settings, for example if you want FIFO:
# Enables scheduling storing requests queue in redis.
SCHEDULER = "scrapy_redis.scheduler.Scheduler"
# Don't cleanup redis queues, allows to pause/resume crawls.
SCHEDULER_PERSIST = True
# Schedule requests using a queue (FIFO).
SCHEDULER_QUEUE_CLASS = 'scrapy_redis.queue.SpiderQueue'
As far as the implementation goes, queuing is done via RedisSpider, which your spider must inherit from. You can find the code for enqueuing requests here: https://github.com/darkrho/scrapy-redis/blob/a295b1854e3c3d1fddcd02ffd89ff30a6bea776f/scrapy_redis/scheduler.py#L73
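As a rough sketch (the class name and redis_key value here are illustrative, not taken from the example project), a redis-fed spider looks something like this:
from scrapy_redis.spiders import RedisSpider

class MyCrawler(RedisSpider):
    name = 'mycrawler_redis'
    # the redis list this spider pops its start URLs from
    redis_key = 'mycrawler:start_urls'

    def parse(self, response):
        # minimal callback; replace with your own extraction logic
        self.log("crawled %s" % response.url)
The spider then sits idle until you push a URL onto that list, e.g. with redis-cli: lpush mycrawler:start_urls http://www.example.com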
As for the connection, you don't need to manually connect to the redis machine; you just specify the host and port information in the settings (on the slave machines, point these at the redis instance running on the master):
REDIS_HOST = 'localhost'
REDIS_PORT = 6379
And the connection is configured in connection.py: https://github.com/darkrho/scrapy-redis/blob/a295b1854e3c3d1fddcd02ffd89ff30a6bea776f/scrapy_redis/connection.py
Examples of usage can be found in several places, for instance: https://github.com/darkrho/scrapy-redis/blob/a295b1854e3c3d1fddcd02ffd89ff30a6bea776f/scrapy_redis/pipelines.py#L17
First off, I am a newbie when it comes to JMS & ActiveMQ.
I have been looking into a messaging solution to serve as middleware for a message producer that will insert XML messages into a queue via HTTP POST. The producer is an existing system written in C++ that cannot be modified (so Java and the C++ API are out).
Using the "demo" examples and some trial and error, I have cobbled together a working example of what I want to do (on a windows box).
The web.xml I configured in a test directory under "webapps" specifies that the HTTP POST messages received from the producer are to be handled by the MessageServlet.
I added a line for the test app in "activemq.xml" ('ow' is the test app dir):
I created a test script to "insert" messages into the queue which works well.
The problem I am running into is that as I continue to insert messages via REST/HTTP POST, the memory consumption and thread count used by ActiveMQ continue to rise (this happens whether I have timely consumers or slow/non-existent consumers).
When memory consumption gets to around 250 MB and the thread count exceeds 5000 (as shown in Windows Task Manager), ActiveMQ crashes and I see this in the log:
Exception in thread "ActiveMQ Transport Initiator: vm://localhost#3564" java.lang.OutOfMemoryError: unable to create new native thread
It is as if Jetty is spawning a new thread to handle each HTTP POST and the thread never dies.
I did look at this page:
http://activemq.apache.org/javalangoutofmemory.html
and tried it, but that didn't fix the problem (although I didn't fully understand the implications of the change either).
Does anyone have any ideas?
Thanks!
Bruce Loth
PS - I included the "test message producer" python script below for what it is worth. I created batches of 100 messages and continued to run the script manually from the command line while watching the memory consumption and thread count of ActiveMQ in task manager.
def foo():
    import httplib, urllib
    body = "<?xml version='1.0' encoding='UTF-8'?>\n \
    <ROOT>\n \
    [snip: xml deleted to save space]
    </ROOT>"
    headers = {"content-type": "text/xml",
               "content-length": str(len(body))}
    conn = httplib.HTTPConnection("127.0.0.1:8161")
    conn.request("POST", "/ow/message/RDRCP_Inbox?type=queue", body, headers)
    response = conn.getresponse()
    print response.status, response.reason
    data = response.read()
    conn.close()
## end method definition

## Begin test code
count = 0
while count < 100:
    # Test with batches of 100 msgs
    count += 1
    foo()
The error is not directly caused by ActiveMQ but by the Java runtime. Take a look here:
http://activemq.apache.org/javalangoutofmemory.html
It explains how you can increase the memory for the Java heap. There is also interesting material about why this happens and what you might do to prevent it. ActiveMQ is pretty good, but it needs some customizing here and there in the config files.
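As a rough example (the exact variable name depends on your ActiveMQ version and start script, so treat this as an assumption rather than gospel), the heap and per-thread stack size are usually set via the JVM options the startup script picks up:
# in bin/activemq, or exported in the environment before starting the broker
ACTIVEMQ_OPTS="-Xms256M -Xmx1024M -Xss256k"
A smaller -Xss in particular leaves room for more native threads, which is what the "unable to create new native thread" error is complaining about.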
You may want to add the following to the URL's query string:
JMSDeliveryMode=persistent
Otherwise, by definition (read "by default"), the messages would be kept in AMQ's memory.
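In the test script above, that just means changing the request line, e.g.:
conn.request("POST",
             "/ow/message/RDRCP_Inbox?type=queue&JMSDeliveryMode=persistent",
             body, headers)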