Tensorflow: How to convert .meta, .data and .index model files into one graph.pb file

In TensorFlow, training from scratch produced the following six files:
events.out.tfevents.1503494436.06L7-BRM738
model.ckpt-22480.meta
checkpoint
model.ckpt-22480.data-00000-of-00001
model.ckpt-22480.index
graph.pbtxt
I would like to convert them (or only the ones needed) into a single graph.pb file so that I can transfer it to my Android application.
I tried the freeze_graph.py script, but it already requires an input.pb file as input, which I do not have (I only have the six files mentioned above). How do I proceed to get this single frozen_graph.pb file? I saw several threads, but none of them worked for me.

You can use this simple script to do that. But you must specify the names of the output nodes.
import tensorflow as tf

meta_path = 'model.ckpt-22480.meta'  # Your .meta file
output_node_names = ['output']       # Output node names (without the ':0' tensor suffix)

with tf.Session() as sess:
    # Restore the graph
    saver = tf.train.import_meta_graph(meta_path)

    # Load weights
    saver.restore(sess, tf.train.latest_checkpoint('path/of/your/checkpoint/dir'))

    # Freeze the graph
    frozen_graph_def = tf.graph_util.convert_variables_to_constants(
        sess,
        sess.graph_def,
        output_node_names)

    # Save the frozen graph
    with open('output_graph.pb', 'wb') as f:
        f.write(frozen_graph_def.SerializeToString())
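As a quick sanity check (not part of the original answer), the resulting output_graph.pb can be loaded back into a fresh graph with the TF1 API. The sketch below only parses the file and lists its operation names, so it makes no assumptions about your actual input/output tensors.

import tensorflow as tf

# Minimal sketch: load the frozen graph written above and list its operations.
with tf.gfile.GFile('output_graph.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name='')
    for op in graph.get_operations():
        print(op.name)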
If you don't know the name of the output node or nodes, there are two ways:
You can explore the graph and find the name with Netron or with the console summarize_graph utility.
You can use all the nodes as output nodes, as shown below.
output_node_names = [n.name for n in tf.get_default_graph().as_graph_def().node]
(Note that you have to put this line just before the convert_variables_to_constants call.)
But I think that's an unusual situation, because if you don't know the output node, you cannot actually use the graph. A small heuristic for narrowing down candidate output nodes is sketched below.
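As a middle ground between those two options, a rough heuristic (a sketch, not from the original answer) is to look at the graph's sink nodes, i.e. nodes whose output is not consumed by any other node; the real output node is usually among them, alongside training and saver ops that you can ignore.

graph_def = tf.get_default_graph().as_graph_def()

# Every node name that appears as an input of some other node
# (strip the ':0' port suffix and the '^' control-dependency prefix).
consumed = {inp.split(':')[0].lstrip('^')
            for node in graph_def.node
            for inp in node.input}

# Nodes that nothing else consumes are candidate output nodes.
candidate_outputs = [node.name for node in graph_def.node
                     if node.name not in consumed]
print(candidate_outputs)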

As it may be helpful to others, I am also answering here after the answer on GitHub ;-).
I think you can try something like this (with the freeze_graph script in tensorflow/python/tools):
python freeze_graph.py \
  --input_graph=/path/to/graph.pbtxt \
  --input_checkpoint=/path/to/model.ckpt-22480 \
  --input_binary=false \
  --output_graph=/path/to/frozen_graph.pb \
  --output_node_names="the nodes that you want to output e.g. InceptionV3/Predictions/Reshape_1 for Inception V3"
The important flag here is --input_binary=false, as the file graph.pbtxt is in text format. I think it corresponds to the required graph.pb, which is the equivalent in binary format.
Concerning output_node_names, that is still a bit confusing for me, as I still have some problems with this part, but you can use the summarize_graph script in TensorFlow, which can take either the .pb or the .pbtxt as input.
Regards,
Steph

I tried the freeze_graph.py script, but the output_node_names parameter was totally confusing and the job failed.
So I tried the other one, export_inference_graph.py, and it worked as expected!
python -u /tfPath/models/object_detection/export_inference_graph.py \
  --input_type=image_tensor \
  --pipeline_config_path=/your/config/path/ssd_mobilenet_v1_pets.config \
  --trained_checkpoint_prefix=/your/checkpoint/path/model.ckpt-50000 \
  --output_directory=/output/path
The TensorFlow models repository I used is here:
https://github.com/tensorflow/models

First, use the following code to generate the graph.pb file.
import tensorflow as tf

# args.input / args.output come from your own argument parsing
with tf.Session() as sess:
    # Restore the graph
    _ = tf.train.import_meta_graph(args.input)

    # Save the graph file
    g = sess.graph
    gdef = g.as_graph_def()
    tf.train.write_graph(gdef, ".", args.output, True)
Then, use summarize_graph to get the output node names.
Finally, use
python freeze_graph.py \
  --input_graph=/path/to/graph.pbtxt \
  --input_checkpoint=/path/to/model.ckpt-22480 \
  --input_binary=false \
  --output_graph=/path/to/frozen_graph.pb \
  --output_node_names="the nodes that you want to output e.g. InceptionV3/Predictions/Reshape_1 for Inception V3"
to generate the frozen graph.

Related

Can I use the output of one DynamicResource after a map() for multiple solids?

I am doing something similar to the dynamic mapping and collect example in the documentation. The example lists files in a directory, maps each to a solid which computes the file size, and then collects the output for a summary of the overall size.
However, I would like to run multiple solids in parallel on each mapped file. So, to continue with the example: I would list the files in a directory; then map so that, for each file, I would compute the size, check the file permissions, and compute the md5sum all in parallel; and finally collect the output.
I can run these in sequence on each file with something like:
file_results = (list_files()
                .map(compute_size)
                .map(check_permissions)
                .map(compute_md5sum))
summarize(file_results.collect())
But if these aren't actually serial dependencies, it would be nice to parallelize the work on each file. Is there some syntax like this:
file_results = list_files().map(
    compute_md5sum(check_permissions(compute_size)))
summarize(file_results.collect())
If I understand correctly, something like this should accomplish what you are looking for:
def _process_file(file):
    size = compute_size(file)
    perms = check_permissions(file)
    hash = compute_md5sum(file)
    return summarize_file(size, perms, hash)

file_results = list_files().map(_process_file)
summarize(file_results.collect())
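To make the shape of this concrete, here is a minimal, self-contained sketch using Dagster's current op/job API (the question uses the older solid terminology); the op names, return values, and file list are placeholders, not from the original thread. Because the three per-file ops have no dependency on one another, they can execute in parallel for each mapped file under a multiprocess executor.

from dagster import DynamicOut, DynamicOutput, job, op


@op(out=DynamicOut())
def list_files():
    # Placeholder file list; each dynamic output needs a unique mapping_key.
    for path in ["a.log", "b.log"]:
        yield DynamicOutput(path, mapping_key=path.replace(".", "_"))


@op
def compute_size(path: str) -> int:
    return len(path)  # stand-in for a real size computation


@op
def check_permissions(path: str) -> str:
    return "rw"  # stand-in


@op
def compute_md5sum(path: str) -> str:
    return "placeholder-md5"  # stand-in


@op
def summarize_file(size: int, perms: str, md5: str) -> dict:
    return {"size": size, "perms": perms, "md5": md5}


@op
def summarize(results):
    return len(results)


@job
def file_pipeline():
    def _process_file(file):
        # These three ops do not depend on each other,
        # so they can run in parallel for each mapped file.
        return summarize_file(
            compute_size(file),
            check_permissions(file),
            compute_md5sum(file),
        )

    summarize(list_files().map(_process_file).collect())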

How can I download multiple objects from S3 simultaneously?

I have lots (millions) of small log files in S3, with the name (date/time) helping to identify each one, i.e. servername-yyyy-mm-dd-HH-MM. For example:
s3://my_bucket/uk4039-2015-05-07-18-15.csv
s3://my_bucket/uk4039-2015-05-07-18-16.csv
s3://my_bucket/uk4039-2015-05-07-18-17.csv
s3://my_bucket/uk4039-2015-05-07-18-18.csv
...
s3://my_bucket/uk4339-2015-05-07-19-23.csv
s3://my_bucket/uk4339-2015-05-07-19-24.csv
...
etc
From EC2, using the AWS CLI, I would like to simultaneously download all files from 2015 whose minute equals 16, and only for servers uk4339 and uk4338.
Is there a clever way to do this?
Also if this is a terrible file structure in s3 to query data, I would be extremely grateful for any advice on how to set this up better.
I can put a relevant aws s3 cp ... command into a loop in a shell/bash script to sequentially download the relevant files, but I was wondering if there was something more efficient.
As an added bonus, I would like to row-bind the results together as one CSV.
A quick example of a mock csv file can be generated in R using this line of R code
R> write.csv(data.frame(cbind(a1=rnorm(100),b1=rnorm(100),c1=rnorm(100))),file='uk4339-2015-05-07-19-24.csv',row.names=FALSE)
The csv that is created is uk4339-2015-05-07-19-24.csv. FYI, I will be importing the combined data into R at the end.
As you didn't answer my questions, nor indicate what OS you use, it is somewhat hard to make any concrete suggestions, so I will briefly suggest you use GNU Parallel to parallelise your S3 fetch requests to get around the latency.
Suppose you somehow generate a list of all the S3 files you want and put the resulting list in a file called GrabMe.txt, like this:
s3://my_bucket/uk4039-2015-05-07-18-15.csv
s3://my_bucket/uk4039-2015-05-07-18-16.csv
s3://my_bucket/uk4039-2015-05-07-18-17.csv
s3://my_bucket/uk4039-2015-05-07-18-18.csv
Then you can get them in parallel, say 32 at a time, like this:
parallel -j 32 echo aws s3 cp {} . < GrabMe.txt
or if you prefer reading left-to-right
cat GrabMe.txt | parallel -j 32 echo aws s3 cp {} .
You can obviously alter the number of parallel requests from 32 to any other number. At the moment, it just echoes the command it would run, but you can remove the word echo when you see how it works.
There is a good tutorial here, and Ole Tange (the author of GNU Parallel) is on SO, so we are in good company.

Output file numbering in graphicsmagick

I'm trying to convert pdf to images using this command:
gm convert ./file.pdf -scene 1 thumbs/thumb%02d.jpg
Although I specify the -scene argument, it does nothing: I get output files starting from thumb00.jpg, and I need them to start from thumb01.jpg.
I'm using GraphicsMagick 1.3.12.
What am I doing wrong here?
In order to ensure numbered output files, add the +adjoin option like:
gm convert ./file.pdf -scene 1 +adjoin thumbs/thumb%02d.jpg
This additional requirement was added in GraphicsMagick 1.3.15. It is fine to use the same option with all earlier releases.
There is still no way to specify the starting scene number; this is a known bug.

R, passing variables to a system command

Using R, I am looking to create a QR code and embed it into an Excel spreadsheet (hundreds of codes and spreadsheets). The obvious way seems to be to create a QR code using the command line and to use the "system" command in R. Does anyone know how to pass R variables through the "system" command? Google is not too helpful, as "system" is a bit generic, and ?system does not contain any examples of this.
Note - I am actually using data matrices rather than QR codes, but using the term "data matrix" in an R question will lead to havoc, so let's talk QR codes instead. :-)
system("dmtxwrite my_r_variable -o image.png")
fails, as do the variants I have tried with "paste". Any suggestions gratefully received.
Let's say we have a variable x that we want to pass on to dmtxwrite; you can pass it on like this:
x = 10
system(sprintf("dmtxwrite %s -o image.png", x))
or alternatively using paste:
system(paste("dmtxwrite", x, "-o image.png"))
but I prefer sprintf in this case.
Making use of base::system2 may also be worth considering, as system2 provides an args argument that can be used for this purpose. In your example:
my_r_variable <- "a"
system2(
  'echo',
  args = c(my_r_variable, '-o image.png')
)
would return:
a -o image.png
which is equivalent to running echo in the terminal. You may also want to redirect output to text files:
system2(
  'echo',
  args = c(my_r_variable, '-o image.png'),
  stdout = 'stdout.txt',
  stderr = 'stderr.txt'
)

Printing hard copies of code

I have to hand in a software project that requires either a paper or .pdf copy of all the code included.
One solution I have considered is grouping classes by context and doing cat *.extension > out.txt to collect all the code; then, by catting the resulting text files, I should have a single text file with classes grouped by context. This is not an ideal solution; there will be no page breaks.
Another idea I had was a shell script to inject LaTeX page breaks between the files to be joined; this would be more acceptable, although I'm not too adept at scripting or LaTeX.
Are there any tools that will do this for me?
Take a look at enscript (or nenscript), which will convert to PostScript, render in columns, add headers/footers, and perform syntax highlighting. If you want to print code in a presentable fashion, this works very nicely.
For example, here is my setting (within a zsh function):
# -2 = 2 columns
# -G = fancy header
# -E = syntax filter
# -r = rotated (landscape)
# syntax is picked up from .enscriptrc / .enscript dir
enscript -2GrE $*
For a quick solution, see a2ps, followed by ps2pdf. For a nicer, more complex solution, I would go for a simple script that puts each file in a LaTeX listings environment and combines the results; a rough sketch of such a script follows.
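Here is a minimal sketch of that last suggestion (my own sketch, not from the original answer). It assumes the listings LaTeX package and a pdflatex toolchain are available; the "src" directory and the ".java" extension are placeholders.

import pathlib

# Collect the source files to print, grouped however you like.
sources = sorted(pathlib.Path("src").rglob("*.java"))  # placeholder path/extension

header = r"""\documentclass[10pt]{article}
\usepackage[margin=2cm]{geometry}
\usepackage{listings}
\lstset{basicstyle=\ttfamily\small, breaklines=true, numbers=left}
\begin{document}
"""

with open("all_code.tex", "w") as tex:
    tex.write(header)
    for src in sources:
        # One section per file, each file starting on a fresh page.
        tex.write("\\section*{%s}\n" % str(src).replace("_", r"\_"))
        tex.write("\\lstinputlisting{%s}\n" % src)
        tex.write("\\clearpage\n")
    tex.write("\\end{document}\n")

# Then compile with: pdflatex all_code.tex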
