How could I filter .csv generated by "report" plug-in in Frama-C? - frama-c

Currently I'm executing:
frama-c -wp -wp-rte -report-rules test_rules.json -wp-split -wp-fct max -wp-status-maybe -wp-status-invalid -wp-timeout 10 -wp-prover alt-ergo -wp-par 12 -warn-signed-overflow -warn-unsigned-overflow -warn-special-float non-finite test.c -then -report-csv test.csv
I read the documentation, but didn't find a good explanation of how this JSON file works. I found some code on GitHub, but it is still not trivial for a Frama-C novice.
I would like a CSV that contains only the rows whose status is different from Valid, and only those from the test.c file (without dependencies). Can this be done from the JSON config, or do I have to write a custom parser?

I think there is a misunderstanding here: the -report-rules option is meant to be used in conjunction with -report-json. It has no effect on -report-csv, which always outputs the whole list of properties. In fact, the very point of -report-csv is to let you import the resulting file into another tool and perform whatever operation you're interested in. For instance, you can simply open the file in your favorite spreadsheet editor and use its built-in filters. But there are a lot of programming options as well. Building upon the script written here, here is an example using the Python interpreter with the pandas library:
>>> import pandas
>>> df = pandas.read_csv("test.csv",sep="\t")
>>> print('There are ' + str(len(df)) + ' properties.')
There are 77 properties.
>>> df = df[df['function']=='merge']
>>> print('There are ' + str(len(df)) + ' properties.')
There are 39 properties.
>>> df = df[df['status']=='Unknown']
>>> print('There are ' + str(len(df)) + ' properties.')
There are 3 properties.
>>> df.to_csv(path_or_buf='res.csv',sep='\t')
This gives me the 3 Unknown properties related to function merge in file res.csv (I didn't have a multi-file example available right away, but of course you would just have to use the file field in your first query). Just keep in mind that the "csv" file is in fact tab-separated and not comma-separated (since commas tend to appear in ACSL formulas, the latter would not be very practical).
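For the exact filter asked about (status other than Valid, rows from test.c only), here is a minimal sketch using only the standard csv module. It assumes the report has columns named status and file, as the answer above suggests; check the header line of your own test.csv first. The function name filter_report is made up for this example.

```python
import csv
import io


def filter_report(src, dst, file_name="test.c", skip_status="Valid"):
    """Copy rows from src to dst, keeping only rows for `file_name`
    whose status differs from `skip_status`. Returns the number kept.

    Assumption: the header contains columns named 'status' and 'file'.
    """
    reader = csv.DictReader(src, delimiter="\t")  # the report is tab-separated
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    kept = 0
    for row in reader:
        if row["status"] != skip_status and row["file"] == file_name:
            writer.writerow(row)
            kept += 1
    return kept


if __name__ == "__main__":
    # Tiny made-up sample, only to show the call; real data comes from test.csv.
    sample = io.StringIO(
        "file\tfunction\tstatus\n"
        "test.c\tmax\tValid\n"
        "test.c\tmax\tUnknown\n"
        "lib.h\thelper\tUnknown\n"
    )
    out = io.StringIO()
    print(filter_report(sample, out), "row(s) kept")
    print(out.getvalue(), end="")
```

With real files you would pass open("test.csv", newline="") and open("res.csv", "w", newline="") instead of the in-memory buffers.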


Not able to access certain JSON properties in Autoloader

I have a JSON file that is loaded by two different Autoloaders.
One uses schema evolution and besides replacing spaces in the json property names, writes the json directly to a delta table, and I can see all the values are there properly.
In the second one I am mapping to a defined schema and only use a subset of the properties, so I use a lot of withColumn calls and then a select to narrow down to my defined column list.
Autoloader definition:
df = (spark
    .readStream
    .format('cloudFiles')
    .option('cloudFiles.format', 'json')
    .option('multiLine', 'true')
    .option('cloudFiles.schemaEvolutionMode','rescue')
    .option('cloudFiles.includeExistingFiles','true')
    .option('cloudFiles.schemaLocation', bronze_schema)
    .option('cloudFiles.inferColumnTypes', 'true')
    .option('pathGlobFilter','*.json')
    .load(upload_path)
    .transform(lambda df: remove_spaces_from_columns(df))
    .withColumn(...
Writer:
df.writeStream.format('delta') \
    .queryName(al_stream_name) \
    .outputMode('append') \
    .option('checkpointLocation', checkpoint_path) \
    .option('mergeSchema', 'true') \
    .trigger(once = True) \
    .table(bronze_table)
The issue is that some of the source columns load fine and I get their values, while others are always null in the output table.
For example:
    .withColumn('vl_rating', col('risk_severity.value')) # works
    .withColumn('status', col('status.name')) # always null
    ...
    .select(
        'rating',
        'status',
        ...
The JSON is quite simple: these are all string values, and they are always populated. The same code works against another similar JSON file in another Autoloader without issue.
I have run out of ideas for tracking down the fault. My imports are minimal, and outside of Autoloader the JSON loads fine.
For example:
%python
import pyspark.sql.functions as psf
jsontest = spark.read.option('inferSchema','true').json('dbfs:....json')
df = jsontest.withColumn('status', psf.col('status.name')).select('status')
display(df)
Results in the values of the status.name property of the json file
Any ideas would be greatly appreciated.
I have found generally what is causing this. Interesting cause!
I am scanning a whole directory of json files, and the schema evolves over time (as expected). But when I clear out the autoloader schema and checkpoint directories and only scan the latest json file it all works correctly.
So what I surmise is that something in schema evolution with the older json files causes Autoloader to get into a state where it will not put certain properties into the stream to the writer.
If anyone has any recommendation on how to implement some data quality analysis in an Autoloader I would be most appreciative if you would share.
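Since the cause appears to be the schema changing across older files, one way to find the offending files before they reach Autoloader is to scan the directory and record what shape a given property has in each file. The sketch below is plain Python, runs outside Spark, and assumes each file holds one JSON object or an array of objects; the helper names shape_of and property_shapes are hypothetical.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def shape_of(value):
    """Short description of a JSON value's type (objects list their keys)."""
    if isinstance(value, dict):
        return "object(" + ",".join(sorted(value)) + ")"
    if isinstance(value, list):
        return "array"
    if value is None:
        return "null"
    return type(value).__name__


def property_shapes(directory, prop):
    """Map each observed shape of top-level property `prop` to the file names showing it."""
    shapes = defaultdict(set)
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            doc = json.load(handle)
        records = doc if isinstance(doc, list) else [doc]
        for record in records:
            if not isinstance(record, dict):
                continue
            shape = shape_of(record[prop]) if prop in record else "missing"
            shapes[shape].add(path.name)
    return {shape: sorted(names) for shape, names in shapes.items()}


if __name__ == "__main__":
    # Made-up files that only illustrate a property changing shape over time.
    with tempfile.TemporaryDirectory() as folder:
        Path(folder, "old.json").write_text('{"status": "open"}', encoding="utf-8")
        Path(folder, "new.json").write_text('{"status": {"name": "open"}}', encoding="utf-8")
        for shape, names in property_shapes(folder, "status").items():
            print(shape, names)
```

If a property such as status shows up with more than one shape, those files are good candidates for the ones that confuse the inferred schema.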

Fortran90: Scripting of Standard In not working as expected

Working with Fortran90 in Unix...
I have a programme which needs to read its input parameters from a file "input-deck.par". This filename is currently hard-coded, but I want to do a number of runs using different input-deck files (input-deck01.par, input-deck02.par, input-deck03.par etc.), so I've set up the code to do a simple "read(*,*) inpfile" to let the user enter the name of this file at run time, with a view to scripting this later.
This works fine interactively. If I execute the programme it asks for the file name, you type it in and the filename is accepted, the file is opened and the programme picks up the parameters from that file.
The issue lies in scripting this. I've tried scripting using the "<" pipe command so:
myprog.e < input-deck01.par
But I get an error saying:
Fortran runtime error: Cannot open file '----------------': No such file or directory
If I print the filename right after the input line, it prints that the filename is '----------------' (I initialise the variable as having 16 characters hence the 16 hyphens I think)
It seems like the "<" pipe is not passing the keyboard input in correctly. I've tried with spaces and quotes around the filename (various combinations) but the errors are the same.
Can anyone help?
(Please be gentle- this is my first post on SO and Fortran is not my primary language....)
Fortran has the ability to read the command line arguments. A simple example is
program foo
  implicit none
  character(len=80) name
  logical available
  integer fd
  if (command_argument_count() == 1) then
    call get_command_argument(1, name)
  else
    call usage
  end if
  inquire(file=name, exist=available)
  if (.not. available) then
    call usage
  end if
  open(newunit=fd, file=name, status='old')
  ! Read file
contains
  subroutine usage
    write(*,'(A)') 'Usage: foo filename'
    write(*,'(A)') ' filename --> file containing input info'
    stop
  end subroutine usage
end program foo
Instead of piping the file into the executable you simply do
% foo input.txt
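If you would rather keep the read(*,*) version, note that "<" connects the contents of the file to standard input, so the program reads the first line of the deck as if it were the file name (which would explain the hyphens if the deck starts with a line of them). What has to arrive on standard input is the name itself. Here is a small driver sketch for that; the helper run_with_deck_name is made up for this example, and a tiny Python child process stands in for myprog.e so that the sketch runs anywhere.

```python
import subprocess
import sys


def run_with_deck_name(command, deck_name):
    """Run `command` and send `deck_name` plus a newline on its standard input,
    exactly as if the name had been typed at the prompt. Returns its stdout."""
    result = subprocess.run(command, input=deck_name + "\n",
                            text=True, capture_output=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Stand-in for ["./myprog.e"]: it just reports the name it read from stdin.
    stand_in = [sys.executable, "-c", "print('would open', input())"]
    for n in range(1, 4):
        print(run_with_deck_name(stand_in, f"input-deck{n:02d}.par"), end="")
```

In a shell the same idea is usually written as echo input-deck01.par | myprog.e.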

Pytest failing on file open command string assert - what's the best way to test this?

I am constructing a command to pass to the requests library to POST an attachment, as in
files = attachment = {"attachment": ("image.png", open(r"C:\tmp\sensor.png", "rb"), "image/png")}
The code works, but I cannot get pytest to test it as-is because the open call is executed as soon as the expression is evaluated. Here is a simplified version of the problem:
import pytest
def openfile():
    cmd = {"cmd": open(r"C:\tmp\sensor.png")}
    return cmd
def test_openfile():
    cmd = openfile()
    #assert str(cmd) == str({"cmd": open(r"C:\tmp\sensor.png")}) # this works
    assert cmd == {"cmd": open(r"C:\tmp\sensor.png")} # this does not
pytest complains that the two sides are different but then confirms they are the same in the diff panel!
Expected :{'cmd': <_io.TextIOWrapper name='C:\tmp\sensor.png' mode='r' encoding='cp1252'>}
Actual :{'cmd': <_io.TextIOWrapper name='C:\tmp\sensor.png' mode='r' encoding='cp1252'>}
'Click to see difference' - Opening diff panel reports 'Contents are identical'!
I can just stick with comparing the generated string with expected string but am wondering if there is a better way to do this.
Ideas?
You need to test the properties of the actual file buffer that is returned by the open call, instead of the references to that buffer, for example:
def test_openfile():
    cmd = openfile()
    expected_filename = r"C:\tmp\sensor.png"
    assert "cmd" in cmd
    file_cmd = cmd["cmd"]
    assert file_cmd.name == expected_filename
    with open(expected_filename) as f:
        contents = f.read()
    assert file_cmd.read() == contents
Note that in a test you may not have the file contents, or have them in another place like a fixture, so testing the file contents may have to be adapted, or may not be needed, depending on what you want to test.
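The reason the original == fails is that file objects do not define their own equality, so two separate open() calls never compare equal even when they point at the same path, while their repr strings look identical. A minimal sketch of that behaviour follows; same_target is a hypothetical helper, and a temporary file stands in for C:\tmp\sensor.png so the example runs on any machine.

```python
import os
import tempfile


def same_target(a, b):
    """Compare two open file objects by path and mode instead of by identity."""
    return (a.name, a.mode) == (b.name, b.mode)


def demo():
    """Return (plain equality, attribute comparison) for two handles on one file."""
    fd, path = tempfile.mkstemp(suffix=".png")  # stand-in for the real image
    os.close(fd)
    try:
        with open(path, "rb") as first, open(path, "rb") as second:
            return first == second, same_target(first, second)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())  # (False, True)
```

In a pytest test, the tmp_path fixture can play the role of the temporary file.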
After talking this through with a friend I think my original approach is perfectly valid. For anyone that trips over this question here's why:
I am using pytest to check the construction of a parameter that gets passed to another library for execution. The execution of the parameter is not relevant, only that it is correctly formatted. The test compares what is generated with the expected parameter (as if I had typed it).
Therefore casting to a string or JSON and comparing is appropriate, since that is what a human does to check the code manually!

Shortening .write commands

I am learning from the book Learn Python The Hard Way 3.6, by Zed Shaw
There is a series of 6 target.write commands towards the bottom of the script, and he wants me to simplify them into a single target.write command using string formats and escapes. However, I am stuck.
Here is the original code:
from sys import argv
script, filename = argv
print(f"We're going to erase {filename}")
print("If you don't want that, hit CTRL-C (^C).")
print("If you do want that, hit RETURN.")
input("?")
print("Opening the file...")
target = open(filename,'w')
print("Truncating the file. Goodbye!")
target.truncate()
print("Now I'm going to ask you for three lines")
line1 = input("line 1:")
line2 = input("line 2:")
line3 = input("line 3:")
print("Im going to write these to the file.")
target.write(line1)
target.write("\n")
target.write(line2)
target.write("\n")
target.write(line3)
target.write("\n")
print("And finnaly, we close it")
target.close()
So far I have tried
target.write(line1),(line2),(line3)
but this gives a logical error of only writing to one line not all three.
target.write(line1) + (line2) + (line3)
With this one I get the error:
TypeError: unsupported operand type(s) for +: 'int' and 'str'
target.write(line1),\n,(line2)\n(line3),\n
with this one I get error:
unexpected character after line continuation character
(<string>,line 22)
I have been googling and searching here for answers but have not found anything. One person posted a very similar question except for Zed's 2.7 book. However I am reading Zed's 3.6 book so the answers were no help to me unfortunately.
I'm not familiar with the book, so I'm not sure what you have and haven't covered so far, but one way to do what you want is to format the string first and then pass it to the write method, like this:
target.write("{0}\n{1}\n{2}\n".format(line1, line2, line3))
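As for why the attempts in the question fail: write takes exactly one string and returns the number of characters it wrote, so target.write(line1),(line2),(line3) only writes line1, and target.write(line1) + (line2) tries to add that number to a string. Below is a short sketch of two equivalent single-call forms, an f-string and a join; the write_lines helper is made up for this example, and an in-memory buffer stands in for the real file.

```python
import io


def write_lines(target, *lines):
    """Write every line followed by a newline using a single write call."""
    return target.write("".join(f"{line}\n" for line in lines))


if __name__ == "__main__":
    line1, line2, line3 = "first", "second", "third"

    target = io.StringIO()  # stand-in for open(filename, 'w')
    target.write(f"{line1}\n{line2}\n{line3}\n")  # f-string form
    print(repr(target.getvalue()))

    target = io.StringIO()
    count = write_lines(target, line1, line2, line3)  # join form
    print(count, repr(target.getvalue()))
```

Both forms produce the same text as the six original write calls.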

serial QR code generator

I need a QR code generator that produces 21,500 unique serial numbers, each with its QR code, and exports every 1,000 codes to one PDF file, so we'll have 22 PDF files.
How can I do that?
Some time ago I've done a similar thing using Python, qrencode and LaTeX. I've modified my old code to fit your needs. I assumed you want A4 pages. The contents of the QR Codes are the PMY00001 to PMY22000 ASCII strings.
#!/usr/bin/env python
import os, sys
width=7.7   # cell width in mm
height=7    # cell height in mm
print("\\documentclass[a4paper,10pt]{report}")
print("\\usepackage[absolute]{textpos}")
print("\\usepackage{nopageno}")
print("\\usepackage{graphicx}")
print("\\setlength{\\TPHorizModule}{1mm}")
print("\\setlength{\\TPVertModule}{1mm}")
print("\\textblockorigin{10mm}{10mm}")
print("\\setlength{\\parskip}{0pt}")
print("\\setlength{\\parindent}{0pt}")
print("\\setlength{\\fboxsep}{0pt}")
print("\\setlength{\\tabcolsep}{0pt}")
print("\\renewcommand{\\baselinestretch}{0.8}")
print("")
print("\\begin{document}")
idx=int(sys.argv[1])  # starting serial number
for i in range(0,25):  # 25 columns
    for j in range(0,40):  # 40 rows, i.e. 1000 codes per page
        b = 'PMY%05d' % idx
        f = os.path.join("codes", b + ".png")
        # feed the serial to the external qrencode program on its stdin
        ff = os.popen("qrencode -lH -o " + f, "w")
        ff.write(b)
        ff.close()
        print("\\begin{textblock}{" + str(width) + "}(" + str(width * i) + "," + str(height * j) + ")")
        print("\\includegraphics[height="+str(height)+"mm]{" + f + "}")
        print("\\end{textblock}")
        idx=idx+1
print("\\end{document}")
To use it, save it as e.g. qrgen.py, add execution permissions (chmod +x qrgen.py), create the codes directory (mkdir codes), run ./qrgen.py 0 >codes.tex to generate the codes.tex document, and then run pdflatex codes.tex to generate the codes.pdf file. The 0 argument is the starting serial number.
To get 22 such sheets it's best to use a loop:
for ((i=0;i<22;i++)); do ./qrgen.py $((i*1000+1)) >$i.tex; pdflatex $i.tex; done
Of course this is not the optimal solution - you can probably get a much faster one using Python qrencode library bindings instead of launching external qrencode program and some library for generating PDFs from Python directly instead of using pdflatex.
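Note that the loop above produces 22 full sheets (22,000 codes), while the question asks for 21,500. If the last file should stop at the requested total, the batching can be worked out up front. A minimal sketch of just that arithmetic follows; serial_batches is a hypothetical helper, and the PMY prefix and five-digit padding are taken from the script above.

```python
def serial_batches(total=21500, per_file=1000, prefix="PMY", start=1):
    """Yield lists of serial strings, at most `per_file` per list,
    covering `total` consecutive serial numbers beginning at `start`."""
    end = start + total  # one past the last serial number
    for first in range(start, end, per_file):
        last = min(first + per_file, end)
        yield [f"{prefix}{n:05d}" for n in range(first, last)]


if __name__ == "__main__":
    for number, batch in enumerate(serial_batches()):
        # Each batch would become one .tex/.pdf file in the workflow above.
        print(number, batch[0], batch[-1], len(batch))
```

This yields 21 batches of 1,000 serials and a final batch of 500, i.e. 22 files in total.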
You can write a script in your language of choice that uses Google's QR code generator in a loop to generate all the codes you'll need and save them to a pdf. You'll need to provide more details if you need a more specific answer.
