ValueError when trying to start carbon-cache - graphite

I have an issue with Graphite, specifically with carbon-cache. At some point I had it running; now, coming back after a few weeks, I tried to start Graphite again. The Django webapp runs fine, but it seems I have an issue with the carbon-cache backend. Graphite is installed in /opt/graphite and I run /opt/graphite/bin/carbon-cache.py start. This is the error I get:
root#stfutm01:/opt/graphite/bin# ./carbon-cache.py start
Starting carbon-cache (instance a)
Traceback (most recent call last):
File "./carbon-cache.py", line 30, in <module>
run_twistd_plugin(__file__)
File "/opt/graphite/lib/carbon/util.py", line 92, in run_twistd_plugin
runApp(config)
File "/usr/local/lib/python2.7/dist-packages/twisted/scripts/twistd.py", line 23, in runApp
_SomeApplicationRunner(config).run()
File "/usr/local/lib/python2.7/dist-packages/twisted/application/app.py", line 386, in run
self.application = self.createOrGetApplication()
File "/usr/local/lib/python2.7/dist-packages/twisted/application/app.py", line 446, in createOrGetApplication
ser = plg.makeService(self.config.subOptions)
File "/opt/graphite/lib/twisted/plugins/carbon_cache_plugin.py", line 21, in makeService
return service.createCacheService(options)
File "/opt/graphite/lib/carbon/service.py", line 127, in createCacheService
from carbon.writer import WriterService
File "/opt/graphite/lib/carbon/writer.py", line 34, in <module>
schemas = loadStorageSchemas()
File "/opt/graphite/lib/carbon/storage.py", line 123, in loadStorageSchemas
archives = [ Archive.fromString(s) for s in retentions ]
File "/opt/graphite/lib/carbon/storage.py", line 107, in fromString
(secondsPerPoint, points) = whisper.parseRetentionDef(retentionDef)
File "/usr/local/lib/python2.7/dist-packages/whisper.py", line 76, in parseRetentionDef
(precision, points) = retentionDef.strip().split(':')
ValueError: need more than 1 value to unpack
I see that it has an issue with the split retentionDef.strip().split(':'). My storage schema config file (/opt/graphite/conf/storage-schemas.conf) looks like:
[stats]
priority = 110
pattern = ^stats\..*
retentions = 10s:6h,1m:7d,10m:1y
[ts3]
priority = 100
pattern = ^skarp\.ts3\..*
retentions = 60s:1y,1h,:5y
Any hints on where I should be looking? Or does anybody know what I'm missing here?

I think the problem is the [ts3] retentions. "The retentions line can specify multiple retentions. Each retention of frequency:history is separated by a comma."
In ts3 there appear to be 3 retentions (comma-delimited), with the second not specifying a history and the last not specifying a frequency.
retentions = 60s:1y,1h,:5y
I think you may have meant:
retentions = 60s:1y,1h:5y
That would be 60-second data for 1 year and 1-hour data for 5 years after that.
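To see why carbon trips over that entry, here is a small standalone sketch that mimics what whisper.parseRetentionDef does per the traceback: each comma-separated retention is split on ':' and must yield exactly two parts (precision and points).
# '1h' has no ':' and ':5y' has an empty precision, so '60s:1y,1h,:5y' fails.
for retention_def in "60s:1y,1h,:5y".split(","):
    try:
        precision, points = retention_def.strip().split(":")
        print(retention_def, "->", (precision, points))
    except ValueError as err:
        print(retention_def, "->", err)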

Related

Gremlin/Python: run query as string

I have the following code, and it runs as expected. But I need to use the "g" traversal object to manipulate the graph.
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
g = traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','g'))
g.V().drop().iterate()
g.addV('my-label').property('k', 'v').next()
print(g.V().toList())
Instead of the "g" object, I want to run string query to modify the graph, and the following doesn't work.
from gremlin_python.driver import client
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
ws_conn = DriverRemoteConnection('ws://localhost:8182/gremlin','g')
gremlin_conn = client.Client(ws_conn, "g")
query = "g.V().groupCount().by(label).unfold().project('label','count').by(keys).by(values)"
response = gremlin_conn.submit(query)
print(response)
Gives the following error:
(venv) sh-3.2$ python /Users/demo-prj/tests/tools/neptune/local.py
[v[4280]]
Traceback (most recent call last):
File "/Users/demo-prj/tests/tools/neptune/local.py", line 24, in <module>
response = gremlin_conn.submit(query)
File "/Users/demo-prj/venv/lib/python3.8/site-packages/gremlin_python/driver/client.py", line 127, in submit
return self.submitAsync(message, bindings=bindings, request_options=request_options).result()
File "/Users/demo-prj/venv/lib/python3.8/site-packages/gremlin_python/driver/client.py", line 148, in submitAsync
return conn.write(message)
File "/Users/demo-prj/venv/lib/python3.8/site-packages/gremlin_python/driver/connection.py", line 55, in write
self.connect()
File "/Users/demo-prj/venv/lib/python3.8/site-packages/gremlin_python/driver/connection.py", line 45, in connect
self._transport.connect(self._url, self._headers)
File "/Users/demo-prj/venv/lib/python3.8/site-packages/gremlin_python/driver/tornado/transport.py", line 40, in connect
self._ws = self._loop.run_sync(
File "/Users/demo-prj/venv/lib/python3.8/site-packages/tornado/ioloop.py", line 576, in run_sync
return future_cell[0].result()
File "/Users/demo-prj/venv/lib/python3.8/site-packages/tornado/ioloop.py", line 547, in run
result = func()
File "/Users/demo-prj/venv/lib/python3.8/site-packages/gremlin_python/driver/tornado/transport.py", line 41, in <lambda>
lambda: websocket.websocket_connect(url, compression_options=self._compression_options))
File "/Users/demo-prj/venv/lib/python3.8/site-packages/tornado/websocket.py", line 1333, in websocket_connect
conn = WebSocketClientConnection(request,
File "/Users/demo-prj/venv/lib/python3.8/site-packages/tornado/websocket.py", line 1122, in __init__
scheme, sep, rest = request.url.partition(':')
AttributeError: 'DriverRemoteConnection' object has no attribute 'partition'
This works, because Client expects a URL string (or a tornado HTTPRequest), not a DriverRemoteConnection.
from gremlin_python.driver import client
from tornado import httpclient
ws_url = 'ws://localhost:8182/gremlin'
ws_conn = httpclient.HTTPRequest(ws_url)
gremlin_conn = client.Client(ws_conn, "g")
query = "g.V().groupCount().by(label).unfold().project('label','count').by(keys).by(values)"
response = gremlin_conn.submit(query)
print(response)
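For what it's worth, Client also accepts the websocket URL as a plain string, and submit() returns a ResultSet, so reading the actual rows looks roughly like this (a sketch against the same local server):
from gremlin_python.driver import client

# Client takes the websocket URL directly; "g" is the traversal source name.
gremlin_conn = client.Client('ws://localhost:8182/gremlin', 'g')

query = "g.V().groupCount().by(label).unfold().project('label','count').by(keys).by(values)"
result_set = gremlin_conn.submit(query)   # returns a ResultSet
print(result_set.all().result())          # block until all results arrive
gremlin_conn.close()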

tuple index out of range when printing an index value

While executing the following code I'm getting the error below. Just for information, matchObj here returns a tuple value.
$ ./ftpParser3_re_dup.py
Traceback (most recent call last):
File "./ftpParser3_re_dup.py", line 13, in <module>
print("{0:<30}{1:<20}{2:<50}{3:<15}".format("FTP ACCOUNT","Account Type","Term Flag"))
IndexError: tuple index out of range
Code is below:
from __future__ import print_function
from signal import signal, SIGPIPE, SIG_DFL
signal(SIGPIPE,SIG_DFL)
import re

with open('all_adta', 'r') as f:
    for line in f:
        line = line.strip()
    data = f.read()

# Making description & termflag optional in the regex pattern as it's missing in the "data_test" file with several occurrences.
regex = (r"dn:(.*?)\nftpuser: (.*)\n(?:description:* (.*))?\n(?:termflag:* (.*))")
matchObj = re.findall(regex, data)

print("{0:<30}{1:<20}{2:<50}{3:<15}".format("FTP ACCOUNT","Account Type","Term Flag"))
print("{0:<30}{1:<20}{2:<50}{3:<15}".format("-----------","------------","--------"))

for index in matchObj:
    index_str = ' '.join(index)
    new_str = re.sub(r'[=,]', ' ', index_str)
    new_str = new_str.split()
    # In below print statement we are using "index[2]" as index is tuple here, this is because
    # findall() returns the matches as a list, However with groups, it returns it as a list of tuples.
    print("{0:<30}{1:<20}{2:<50}{3:<15}".format(new_str[1],new_str[8],index[2],index[3]))
In the line print("{0:<30}{1:<20}{2:<50}{3:<15}".format("FTP ACCOUNT","Account Type","Term Flag")) you have used 4 placeholders but supplied only 3 values, i.e. "FTP ACCOUNT", "Account Type", "Term Flag".
Remove the 4th placeholder or add a 4th value.
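For example, the simplest fix is to drop the unused placeholder; alternatively, keep four placeholders and pass a fourth header string (the name "Description" below is only a guess at the intended column, since the original doesn't say):
# Option 1: three placeholders to match the three header strings.
print("{0:<30}{1:<20}{2:<50}".format("FTP ACCOUNT", "Account Type", "Term Flag"))

# Option 2: keep four placeholders and supply a fourth header
# ("Description" is only a guess at the intended column name).
print("{0:<30}{1:<20}{2:<50}{3:<15}".format("FTP ACCOUNT", "Account Type",
                                            "Description", "Term Flag"))
Note that the separator line right below it has the same mismatch (four placeholders, three dash strings), so it needs the same fix.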

PySpark map datetime to DoW

I'm trying to map a column 'eventtimestamp' to its day of week with the following function:
from datetime import datetime
import calendar
from pyspark.sql.functions import UserDefinedFunction as udf

def toWeekDay(x):
    v = int(datetime.strptime(str(x),'%Y-%m-%d %H:%M:%S').strftime('%w'))
    if v == 0:
        v = 6
    else:
        v = v-1
    return calendar.day_name[v]
and for my df I'm trying to create a new column dow with a UDF.
udf_toWeekDay = udf(lambda x: toWeekDay(x), StringType())
df = df.withColumn("dow",udf_toWeekDay('eventtimestamp'))
Yet I'm getting an error I do not understand at all. At first it was complaining about passing datetime.datetime into strptime instead of a string, so I cast the value to str, and now I don't have a clue what's wrong.
Traceback (most recent call last):
File "/tmp/zeppelin_pyspark-9040214714346906648.py", line 267, in <module>
raise Exception(traceback.format_exc())
Exception: Traceback (most recent call last):
File "/tmp/zeppelin_pyspark-9040214714346906648.py", line 260, in <module>
exec(code)
File "<stdin>", line 10, in <module>
File "/usr/lib/spark/python/pyspark/sql/dataframe.py", line 429, in take
return self.limit(num).collect()
File "/usr/lib/spark/python/pyspark/sql/dataframe.py", line 391, in collect
port = self._jdf.collectToPython()
File "/usr/lib/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/usr/lib/spark/python/pyspark/sql/utils.py", line 63, in deco
return f(*a, **kw)
File "/usr/lib/spark/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
format(target_id, ".", name), value)
Py4JJavaError: An error occurred while calling o6250.collectToPython.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1107.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1107.0 (TID 63757, ip-172-31-27-113.eu-west-1.compute.internal, executor 819): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
Thanks a lot for clues!
We can use date_format to get the day of week:
from pyspark.sql.functions import date_format
df = df.withColumn("dow", date_format(df['eventtimestamp'], 'EEEE'))

Unable to combine specific lines in notepad++ files using nested for loops

I'm trying to compare portions of lines in two Notepad++ files against each other using two variables (vg_line and sn_line) in order to combine them if they are equal. Once it has found a pair it prints out certain information from each for loop, but it only finds the first pair and doesn't continue looping through the vg_lines file to compare the other lines with the sn_lines file.
input_file = open(input_VG_name)
input_Server_name = open(input_Server_name)

for line in input_file:
    line_data = line.strip()
    vg_line = line_data[0:44]
    volume_group = line_data[44:58]
    for line1 in input_Server_name:
        line_data = line1.strip()
        sn_line = line_data[0:44]
        server_name = line_data[46:64]
        if vg_line == sn_line:
            print(vg_line, volume_group, server_name)
First post, so any tips on how I can improve my coding or question-asking are much appreciated!
You are not reading the files
Try the following:
input_file = r'c:\file.txt'
input_Server_name = r'c:\server_file.txt'

with open(input_file, 'r') as file:
    for line in file.readlines():
        line_data = line.strip()
        vg_line = line_data[0:44]
        volume_group = line_data[44:58]
        with open(input_Server_name, 'r') as file1:
            for line1 in file1.readlines():
                line1_data = line1.strip()
                sn_line = line1_data[0:44]
                server_name = line1_data[46:64]
                if vg_line == sn_line:
                    print(vg_line, volume_group, server_name)
The thing is: this code will have to read the second file for every line in the first file (which is what I got from your original code).
There are other methods to match two files up; have a search around, there are plenty of answers. Don't forget to check "Code Review", which has some good examples as well.
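One such alternative, sketched below with the same hypothetical file paths: read the server file once into a dict keyed on the compared slice, then make a single pass over the volume-group file.
# Build a lookup from the server file: sn_line -> server_name.
servers = {}
with open(r'c:\server_file.txt', 'r') as file1:
    for line1 in file1:
        line1_data = line1.strip()
        servers[line1_data[0:44]] = line1_data[46:64]

# Single pass over the volume-group file, matching against the lookup.
with open(r'c:\file.txt', 'r') as file:
    for line in file:
        line_data = line.strip()
        vg_line = line_data[0:44]
        volume_group = line_data[44:58]
        if vg_line in servers:
            print(vg_line, volume_group, servers[vg_line])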

convert date column in a text file to float

I have a data file whose 2nd column contains dates in the format '01/01/2007'. I am trying to convert this column into a numeric format so that I can insert the data from the text file into a MySQL database. I keep getting these errors when I try to do so:
Traceback (most recent call last):
File "C:/Python27/numpy", line 5, in <module>
x = np.loadtxt(fname='xyz.txt', dtype=[('date', 'str', 12),('x','float')], converters={1:datestr2num}, delimiter=None, skiprows=0, usecols=None);
File "C:\Python27\lib\site-packages\numpy\lib\npyio.py", line 713, in loadtxt
X.append(tuple([conv(val) for (conv, val) in zip(converters, vals)]))
File "C:/Python27/numpy", line 4, in datestr2num
return datetime.datetime.strptime(s,'"%m/%d/%y"')
File "C:\Python27\lib\_strptime.py", line 325, in _strptime
(data_string, format))
ValueError: time data '"01/01/2007"' does not match format '"%m/%d/%y"'
Can anyone please help me with this? Here is the code I am trying:
import numpy as np
import datetime

def datestr2num(s):
    return datetime.datetime.strptime(s,'"%m/%d/%y"')

x = np.loadtxt(fname='xyz.txt', dtype= 'float', converters={1:datestr2num}, delimiter=None, skiprows=0, usecols=None);
print x;
'%y' is for two-digit years (e.g. '14'); you have four-digit years (e.g. '2007'), so you should be using '%Y' - see the documentation.
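A sketch of a corrected converter, assuming the quotes really are in the file values (as the error message suggests) and that any numeric encoding of the date is acceptable for the MySQL insert; toordinal() is just one choice:
import datetime
import numpy as np

def datestr2num(s):
    # The field arrives with its surrounding quotes, so strip them,
    # and use %Y because the years are four digits (e.g. 2007).
    d = datetime.datetime.strptime(s.strip('"'), '%m/%d/%Y')
    # loadtxt's dtype='float' needs a number back, not a datetime object.
    return float(d.toordinal())

x = np.loadtxt('xyz.txt', dtype='float', converters={1: datestr2num})
print(x)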
