convert a `find` like output to a `tree` like output - unix

This question is a generalized version of the Output of ZipArchive() in tree format question.
Before I waste time writing this (*nix command-line) utility, it would be a good idea to find out if someone has already written it. I would like a utility that takes as its standard input a list such as the one returned by find(1) and outputs something similar to tree(1).
E.g.:
Input:
/fruit/apple/green
/fruit/apple/red
/fruit/apple/yellow
/fruit/banana/green
/fruit/banana/yellow
/fruit/orange/green
/fruit/orange/orange
/i_want_my_mommy
/person/men/bob
/person/men/david
/person/women/eve
Output
/
|-- fruit/
| |-- apple/
| | |-- green
| | |-- red
| | `-- yellow
| |-- banana/
| | |-- green
| | `-- yellow
| `-- orange/
| |-- green
| `-- orange
|-- i_want_my_mommy
`-- person/
|-- men/
| |-- bob
| `-- david
`-- women/
`-- eve
Usage should be something like:
list2tree --delimiter="/" < Input > Output
Edit0: It seems that I was not clear about the purpose of this exercise. I like the output of tree, but I want it for arbitrary input. It might not be part of any file-system namespace.
Edit1: Fixed person branch in the output. Thanks, @Alnitak.
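For reference, the transformation itself only takes a few lines in any scripting language. Here is a minimal Python sketch (the |-- / `-- prefixes follow tree(1)'s style; the delimiter is hard-coded to "/", and the demo list stands in for reading lines from standard input):

```python
from collections import defaultdict

def make_tree():
    # nested defaultdict: each node maps a name to a child node
    return defaultdict(make_tree)

def insert(tree, path):
    node = tree
    for part in path.strip().split("/"):
        if part:
            node = node[part]

def render(node, prefix=""):
    lines = []
    names = sorted(node)
    for i, name in enumerate(names):
        last = i == len(names) - 1
        lines.append(prefix + ("`-- " if last else "|-- ") + name)
        lines.extend(render(node[name], prefix + ("    " if last else "|   ")))
    return lines

# demo on a few of the question's input lines
root = make_tree()
for p in ["/fruit/apple/green", "/fruit/apple/red", "/i_want_my_mommy"]:
    insert(root, p)
print("/")
print("\n".join(render(root)))
```

To read from standard input as in the requested usage, replace the demo list with `sys.stdin`.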

In my Debian 10 I have tree v1.8.0. It supports --fromfile.
--fromfile
Reads a directory listing from a file rather than the file-system. Paths provided on the command line are files to read from rather than directories to search. The dot (.) directory indicates that tree should read paths from standard input.
This way I can feed tree with output from find:
find /foo | tree -d --fromfile .
Problems:
If tree reads /foo/whatever or foo/whatever, then foo will be reported as a subdirectory of `.`. Similarly with ./whatever: `.` will be reported as an additional level named `.` under the top-level `.`. So the results may not entirely meet your formal expectations; there will always be a top-level `.` entry, even if find finds nothing or throws an error.
Filenames with newlines will confuse tree. Using find -print0 is not an option because there is no corresponding switch for tree.

I whipped up a Perl script that splits the paths (on "/"), creates a hash tree, and then prints the tree with Data::TreeDumper. Kinda hacky, but it works:
#!/usr/bin/perl
use strict;
use warnings;
use Data::TreeDumper;

my %tree;
while (<>) {
    chomp;
    my $t = \%tree;
    foreach my $part (split m!/!, $_) {
        next if $part eq '';
        $t->{$part} ||= {};
        $t = $t->{$part};
    }
}

# Replace empty leaf hashes with undef so they dump as plain names.
sub check_tree {
    my $t = shift;
    foreach my $hash (values %$t) {
        undef $hash unless keys %$hash;
        check_tree($hash) if defined $hash;
    }
}
check_tree(\%tree);

my $output = DumpTree(\%tree);
$output =~ s/ = undef.*//g;
$output =~ s/ \[H\d+\].*//g;
print $output;
Here's the output:
$ perl test.pl test.data
|- fruit
| |- apple
| | |- green
| | |- red
| | `- yellow
| |- banana
| | |- green
| | `- yellow
| `- orange
| |- green
| `- orange
|- i_want_my_mommy
`- person
|- men
| |- bob
| `- david
`- women
`- eve

Another tool is treeify, written in Rust.
Assuming you have Rust installed get it with:
$ cargo install treeify

So, I finally wrote what I hope will become the python tree utils. Find it at http://pytree.org

I would simply use tree myself, but here's a simple thing that I wrote a few days ago that prints a tree of a directory. It doesn't expect input from find (which makes it different from your requirements) and doesn't do the |- display (which can be added with some small modifications). You call it as tree <base_path> <initial_indent>, where initial_indent is the number of characters the first "column" is indented.
function tree() {
    local root=$1
    local indent=$2
    cd "$root"
    for i in *
    do
        for j in $(seq 0 "$indent")
        do
            echo -n " "
        done
        if [ -d "$i" ]
        then
            echo "$i/"
            # recurse in a subshell so the cd does not leak out
            ( tree "$i" $((indent + 5)) )
        else
            echo "$i"
        fi
    done
}

Related

How to SRC_URI:append per $MACHINE

I have a yocto native recipe that should apply a different patch to the source code depending on the target ${MACHINE} that it bitbakes for. The folder struct looks like this:
recipe-folder
|-files
| |-machine1
| | |-p1.patch
| |
| |-machine2
| | |-p2.patch
| |
| |-common.patch
|
|-recipe-native_0.1.bb
then the important contents of the recipe are
inherit native

SRC_URI = <some git repo>
SRC_URI:append = "file://common.patch"
SRC_URI:append:machine1 = "file://p1.patch"
SRC_URI:append:machine2 = "file://p2.patch"

do_configure() {
    ./configure --static
}

do_compile() {
    oe_runmake tool1
}

do_install() {
    # Default sigtrace installation directory
    install -d ${D}${bindir}
    install -m 0755 ${S}/output/linux/${release}/tool1 ${D}/${bindir}/tool1
}
The above does not work - only the common patch gets applied.
I also tried
SRC_URI = <some git repo>
SRC_URI:append = "\
    file://common.patch \
    file://p1.patch \
    file://p2.patch \
"
which applies all patches in all targets. Also not what I aim for.
Am I using the override syntax wrong? Is there another way to achieve this?
Thank you in advance for your help.

Firebase security rules: Get a document with a space in the documentID

I am writing firebase security rules, and I am attempting to get a document that may have a space in its documentID.
I have the following snippet which works well when the document does not have a space
function isAdminOfCompany(companyName) {
    let company = get(/databases/$(database)/documents/Companies/$(companyName));
    return company.data.authorizedUsers[request.auth.uid].access == "ADMIN";
}
Under the collection, "Companies", I have a document called "Test" and another called "Test Company" - Trying to get the document corresponding to "Test" works just fine, but "Test Company" does not seem to work, as the company variable (first line into the function) is equal to null as per the firebase security rules "playground".
My thought is that this has something to do with URL encoding, but replacing the space in a documentID with "%20" or a "+" does not change the result. Perhaps spaces are illegal characters in documentIDs (https://cloud.google.com/firestore/docs/best-practices lists a few best practices).
Any help would be appreciated!
EDIT: As per a few comments, I will add some additional images/explanations below.
Here is the structure of my database
And here is what fields are present in the user documents
In short, the following snippet reproduces the problem (I am not actually using this, but it demonstrates the issue the same way)
rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {
    match /Users/{user} {
      allow update: if findPermission(resource.data.company) == "MASTER"
    }

    function findPermission(companyName) {
      let c = get(path("/databases/" + database + "/documents/Companies/" + companyName));
      return c.data.authorizedUsers[request.auth.uid].access;
    }
  }
}
When I try to update a user called test#email.com (which belongs to company "Test"), the operation is permitted, and everything works exactly as expected.
The issue arises when a user called test2#email.com, who belongs to company "Test Company", comes along and makes the same request (with the authorization email/uid updated in the playground to match what is actually found in the company structure). The request fails because the get() call (line 1 of the function) cannot find the Company document corresponding to "Test Company" - indicated by the variable "c" being null in the screenshot (see below) - it is not null when looking for "Test".
Below is a screenshot of the error message, as well as some of the relevant variables when the error occurs
Check what type of space it is, in case it is another non-printable character. You could convert it to Unicode code points and check what it might be. However, it is considered bad practice to use spaces in naming variables and data structures; there are many different types of space to consider:
| Unicode | HTML    | Description        | Example |
|---------|---------|--------------------|---------|
| U+0020  | &#32;   | Space              | [ ] |
| U+00A0  | &#160;  | No-Break Space     | [ ] |
| U+2000  | &#8192; | En Quad            | [ ] |
| U+2001  | &#8193; | Em Quad            | [ ] |
| U+2002  | &#8194; | En Space           | [ ] |
| U+2003  | &#8195; | Em Space           | [ ] |
| U+2004  | &#8196; | Three-Per-Em Space | [ ] |
| U+2005  | &#8197; | Four-Per-Em Space  | [ ] |
| U+2006  | &#8198; | Six-Per-Em Space   | [ ] |
| U+2007  | &#8199; | Figure Space       | [ ] |
| U+2008  | &#8200; | Punctuation Space  | [ ] |
| U+2009  | &#8201; | Thin Space         | [ ] |
| U+200A  | &#8202; | Hair Space         | [ ] |
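To identify which kind of space a document ID actually contains, you can dump each character's code point and Unicode name, e.g. in Python (a minimal sketch; the document ID here is made up, with a no-break space standing in for the suspect character):

```python
import unicodedata

def inspect_chars(doc_id):
    """Print the code point and Unicode name of every character."""
    for ch in doc_id:
        print(f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}")

inspect_chars("Test\u00a0Company")  # the "space" is really a no-break space
```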

Reduce network traffic for Native Client installation from Chrome Web Store

My Chrome app contains three .nexe files for arm, x86-32 and 64-bit processors. When I install this app from Chrome web store, the size of downloaded package is the same as the size of the app containing all .nexe files. Is it possible to optimize this network traffic?
My .nmf file bundled in the app looks like this:
{
  "program": {
    "arm": { "url": "arm.nexe" },
    "x86-32": { "url": "x86_32.nexe" },
    "x86-64": { "url": "x86_64.nexe" }
  }
}
Thanks
Yes, you can add a platform-specific section to your manifest.json. Then Chrome will only download the components specified for the user's CPU architecture.
There is documentation for that feature here: https://developer.chrome.com/native-client/devguide/distributing#reducing-the-size-of-the-user-download-package
And there is an example in the SDK as well: examples/tutorial/multi_platform
To summarize the documentation above:
First create a _platform_specific directory in your App package. For each architecture, create a subdirectory with that name:
|-- my_app_directory/
| |-- manifest.json
| |-- my_app.html
| |-- my_module.nmf
| +-- css/
| +-- images/
| +-- scripts/
| |-- _platform_specific/
| | |-- x86-64/
| | | |-- my_module_x86_64.nexe
| | |-- x86-32/
| | | |-- my_module_x86_32.nexe
| | |-- arm/
| | | |-- my_module_arm.nexe
| | |-- all/
| | | |-- my_module_x86_64.nexe
| | | |-- my_module_x86_32.nexe
| | | |-- my_module_arm.nexe
Then in the manifest.json file, specify the location of these directories:
...
"platforms": [
  {
    "nacl_arch": "x86-64",
    "sub_package_path": "_platform_specific/x86-64/"
  },
  {
    "nacl_arch": "x86-32",
    "sub_package_path": "_platform_specific/x86-32/"
  },
  {
    "nacl_arch": "arm",
    "sub_package_path": "_platform_specific/arm/"
  },
  {
    "sub_package_path": "_platform_specific/all/"
  }
]
You'll want your .nmf to point to the location of these .nexe files. The SDK build system has an option to do all of this for you automatically; I'd suggest using it.

CherryPy 3.6 - reading Multipart Post http request

I coded a Java client that sends a string of meta information and a byte array through a multipart POST HTTP request to my server running CherryPy 3.6.
I need to extract both values. I coded the following in Python 3 on the server side to work out how to manipulate the result, as I can't find any relevant documentation that explains how to read this HTTP part:
def controller(self, meta, data):
    print("meta", meta)
    print("data", type(data))
outputs:
my meta information
<class 'cherrypy._cpreqbody.Part'>
Note: the data part contains raw binary data.
How can I read the HTTP part content into a buffer or output it to a disk file?
Thanks for your help.
Thanks for your answer.
I've already read this doc, but unfortunately the methods read_into_file(), make_file(), read(), etc. don't work for me. For example, when trying to read a zip file sent from my Java client:
Assuming data is the HTTP POST parameter:
make_file():
fp = data.make_file()
print("fp type", type(fp))  # _io.BufferedRandom
zipFile = fp.read()
outputs:
AttributeError: 'bytes' object has no attribute 'seek'
line 651, in read_lines_to_boundary
    raise EOFError("Illegal end of multipart body.")
EOFError: Illegal end of multipart body.
read_into_file():
file = data.read_into_file()
print("file type", type(file))
zipFile = io.BytesIO(file.read())
# zipFile = file.read()  # => raises the same error
outputs:
line 651, in read_lines_to_boundary
    raise EOFError("Illegal end of multipart body.")
EOFError: Illegal end of multipart body.
I don't understand what happens ...
Actually, "data" is not a file-like object but a cherrypy._cpreqbody.Part. It holds a "file" attribute, an _io.BufferedRandom object, and its read() method returns the whole body content as bytes.
So, in the end, the straightforward solution is:
class BinReceiver(object):
    def index(self, data):
        # data.file is an _io.BufferedRandom; read() returns bytes
        zipFile = data.file.read()
        path = "/tmp/data.zip"
        fp = open(path, 'wb')
        fp.write(zipFile)
        fp.close()
        print("saved data into", path, "size", len(zipFile))
    index.exposed = True
and this works fine ...
FYI: I'm running Python 3.2.
It seems like data is a file-like object which you can call .read on. In addition CherryPy provides a method read_into_file.
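For instance, a hedged sketch of saving a part's body to disk with read_into_file() (save_part and dest_path are made-up names; the part object is expected to behave like cherrypy._cpreqbody.Part, whose read_into_file() returns a file-like object positioned at its end):

```python
import shutil

def save_part(part, dest_path):
    # read_into_file() reads the part body into make_file()'s temp file
    # (or a caller-supplied file) and returns that file-like object.
    fp = part.read_into_file()
    fp.seek(0)  # rewind before copying
    with open(dest_path, "wb") as out:
        shutil.copyfileobj(fp, out)
```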
See the full documentation by typing help(cherrypy._cpreqbody.Part) in your REPL.
class Part(Entity)
| A MIME part entity, part of a multipart entity.
|
| Method resolution order:
| Part
| Entity
| __builtin__.object
|
| Methods defined here:
|
| __init__(self, fp, headers, boundary)
|
| default_proc(self)
| Called if a more-specific processor is not found for the
| ``Content-Type``.
|
| read_into_file(self, fp_out=None)
| Read the request body into fp_out (or make_file() if None).
|
| Return fp_out.
|
| read_lines_to_boundary(self, fp_out=None)
| Read bytes from self.fp and return or write them to a file.
|
| If the 'fp_out' argument is None (the default), all bytes read are
| returned in a single byte string.
|
| If the 'fp_out' argument is not None, it must be a file-like
| object that supports the 'write' method; all bytes read will be
| written to the fp, and that fp is returned.
|
| ----------------------------------------------------------------------
| Class methods defined here:
|
| from_fp(cls, fp, boundary) from __builtin__.type
|
| read_headers(cls, fp) from __builtin__.type
|
| ----------------------------------------------------------------------
| Data and other attributes defined here:
|
| attempt_charsets = ['us-ascii', 'utf-8']
|
| boundary = None
|
| default_content_type = 'text/plain'
|
| maxrambytes = 1000
|
| ----------------------------------------------------------------------
| Methods inherited from Entity:
|
| __iter__(self)
|
| __next__(self)
|
| fullvalue(self)
| Return this entity as a string, whether stored in a file or not.
|
| make_file(self)
| Return a file-like object into which the request body will be read.
|
| By default, this will return a TemporaryFile. Override as needed.
| See also :attr:`cherrypy._cpreqbody.Part.maxrambytes`.
|
| next(self)
|
| process(self)
| Execute the best-match processor for the given media type.
|
| read(self, size=None, fp_out=None)
|
| readline(self, size=None)
|
| readlines(self, sizehint=None)
|
| ----------------------------------------------------------------------
| Data descriptors inherited from Entity:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
|
| type
| A deprecated alias for :attr:`content_type<cherrypy._cpreqbody.Entity.content_type>`.
|
| ----------------------------------------------------------------------
| Data and other attributes inherited from Entity:
|
| charset = None
|
| content_type = None
|
| filename = None
|
| fp = None
|
| headers = None
|
| length = None
|
| name = None
|
| params = None
|
| part_class = <class 'cherrypy._cpreqbody.Part'>
| A MIME part entity, part of a multipart entity.
|
| parts = None
|
| processors = {'application/x-www-form-urlencoded': <function process_u...

How to extract the name of immediate directory along with the filename?

I have a file whose complete path is like
/a/b/c/d/filename.txt
If I do a basename on it, I get filename.txt. But this filename is not too unique.
So, it would be better if I could extract the filename as d_filename.txt i.e.
{immediate directory}_{basename result}
How can I achieve this result?
file="/path/to/filename"
echo "$(basename "$(dirname "$file")")_$(basename "$file")"
or
file="/path/to/filename"
filename="${file##*/}"
dirname="${file%/*}"
dirname="${dirname##*/}"
filename="${dirname}_${filename}"
This code will recursively search through your hierarchy starting with the directory that you run the script in. I've coded the loop so that it will handle any filename you throw at it: file names with spaces, newlines, etc.
Note: the loop is currently written to not include any files in the directory the script resides in; it only looks at subdirectories below it. This was the easiest way to make sure the script does not include itself in its processing. If for some reason you must include the script's own directory, the find invocation can be changed to accommodate this.
Code
#!/bin/bash

while IFS= read -r -d $'\0' file; do
    dirpath="${file%/*}"
    filename="${file##*/}"
    temp="${dirpath}_${filename}"
    parent_file="${temp##*/}"
    printf "dir: %10s orig: %10s new: %10s\n" "$dirpath" "$filename" "$parent_file"
done < <(find . -mindepth 2 -type f -print0)
Test tree
$ tree -a
.
|-- a
| |-- b
| | |-- bar
| | `-- c
| | |-- baz
| | `-- d
| | `-- blah
| `-- foo
`-- parent_file.sh
Output
$ ./parent_file.sh
dir: ./a/b/c/d orig: blah new: d_blah
dir: ./a/b/c orig: baz new: c_baz
dir: ./a/b orig: bar new: b_bar
dir: ./a orig: foo new: a_foo
$ FILE=/a/b/c/d/f.txt
$ echo $FILE
/a/b/c/d/f.txt
$ echo $(basename ${FILE%%$(basename $FILE)})_$(basename $FILE)
d_f.txt
No need to call an external command:
s="/a/b/c/d/filename.txt"
t=${s%/*}
t=${t##*/}
filename=${t}_${s##*/}
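For comparison, the same "immediate directory + basename" result with Python's pathlib (a sketch):

```python
from pathlib import Path

p = Path("/a/b/c/d/filename.txt")
new_name = f"{p.parent.name}_{p.name}"
print(new_name)  # d_filename.txt
```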
Take the example:
/a/1/b/c/d/file.txt
/a/2/b/c/d/file.txt
The only reliable way to qualify file.txt and avoid conflicts is to build the entire path into the new filename, e.g.
/a/1/b/c/d/file.txt -> a_1_b_c_d_file.txt
/a/2/b/c/d/file.txt -> a_2_b_c_d_file.txt
You may be able to skip part of the beginning if you know for sure that it will be common to all files, e.g. if you know that all files reside somewhere underneath the directory /a above:
/a/1/b/c/d/file.txt -> 1_b_c_d_file.txt
/a/2/b/c/d/file.txt -> 2_b_c_d_file.txt
To achieve this on a per-file basis:
# file="/path/to/filename.txt"
new_file="`echo \"$file\" | sed -e 's:^/::' -e 's:/:_:g'`"
# new_file -> path_to_filename.txt
Say you want do do this recursively in a directory and its subdirectories:
# dir = /a/b
( cd "$dir" && find . | sed -e 's:^\./::' | while read file ; do
    new_file="`echo \"$file\" | sed -e 's:/:_:g'`"
    echo "rename $dir/$file to $new_file"
done )
Output:
rename /a/b/file.txt to file.txt
rename /a/b/c/file.txt to c_file.txt
rename /a/b/c/e/file.txt to c_e_file.txt
rename /a/b/d/e/file.txt to d_e_file.txt
...
The above is highly portable and will run on essentially any Unix system under any variant of sh (including bash, ksh, etc.).
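The same path-flattening idea in Python, for comparison (a sketch; flatten_name is a made-up helper name):

```python
import os

def flatten_name(path, root):
    """Build a collision-resistant file name by joining the path
    components below `root` with underscores."""
    rel = os.path.relpath(path, root)
    return rel.replace(os.sep, "_")

print(flatten_name("/a/b/c/e/file.txt", "/a/b"))  # c_e_file.txt
```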
