Stop Python3 creating module cache in system directory - unix

In Question 2918898, users discussed how to avoid caching because
modules were changing, and solutions focused on reloading. My question is
somewhat different; I want to avoid caching in the first place.
My application runs on Un*x and lives in /usr/local. It imports a
module with some shared code used by this application and another.
It's normally run as an ordinary user, and Python doesn't cache the
module in that case, because it doesn't have write permission for that
system directory. All good so far.
However, I sometimes need to run the application as superuser, and then
it does have write permission and it does cache it, leaving unsightly
footprints in a system directory. Do not want.
So ... any way to tell CPython 3.2 (or later, I'm willing to upgrade)
not to cache the module? Or some other way to solve the problem?
Changing the directory permissions doesn't work; root can still write,
root is all-powerful.
I looked through PEP 3147 but didn't see a way to prevent caching.
I don't recall any way to import code other than import. I suppose I
could read a simple text file and exec it, but that seems inelegant
and bug-prone.
The run-as-root is accomplished by calling the program with sudo in a
shell script, and I can have the shell script delete the cache after the
run, but I'm hoping for something more elegant that doesn't change the
directory's last-modified timestamp.
Implemented solution, based on Wander Nauta's answer:
Since I run the executable as a plain filename, not as python executablename, I went with the environment variable. First, the
sudoers file needs to be changed to allow setting environment
variables:
tom ALL=(ALL) SETENV: NOPASSWD: /usr/local/bkup/bin/mkbkup
Then, the invocation needs to include the variable:
/usr/bin/sudo PYTHONDONTWRITEBYTECODE=true /usr/local/bkup/bin/mkbkup "$@"

You can start python with the -B command-line flag to prevent it from writing cached bytecode.
$ ls
bar.py foo.py
$ cat foo.py
import bar
$ python -B foo.py; ls
bar.py foo.py
$ python foo.py; ls
bar.py foo.py __pycache__
Setting the PYTHONDONTWRITEBYTECODE environment variable to a non-empty string, or setting sys.dont_write_bytecode to True before the modules in question are imported, will have the same effect.
Of course, I'd say that the benefits in this case (faster loading times for your app, for free) vastly outweigh the perceived unsightliness you were talking about - but if you really want to disable caching, here's how.
Source: man python
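For the sys.dont_write_bytecode route, here is a minimal sketch of how it could look from inside the application. The helper name import_without_cache and the demo module name shared_demo are made up for illustration; the only thing taken from the answer above is that the flag has to be True at the moment the import happens.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_without_cache(module_dir, module_name):
    """Import module_name from module_dir without writing a __pycache__ entry.

    Hypothetical helper: it switches sys.dont_write_bytecode on for the
    duration of the import and restores the previous value afterwards.
    """
    previous = sys.dont_write_bytecode
    sys.dont_write_bytecode = True  # must be set before the import runs
    sys.path.insert(0, module_dir)
    try:
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(module_dir)
        sys.dont_write_bytecode = previous


if __name__ == "__main__":
    # Demo in a throwaway directory instead of a real system directory.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "shared_demo.py").write_text("VALUE = 42\n")
        module = import_without_cache(tmp, "shared_demo")
        print(module.VALUE, Path(tmp, "__pycache__").exists())
```

If the whole program should never write bytecode, setting the flag once at the very top of the main script, before any other import, is probably simpler than wrapping individual imports.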

Related

Folderstructure with rsync in bash

I searched the forum but didn't find an article that matches my problem. Maybe there is one, and you can help me out with it.
My problem is that I want to sync a folder with the command rsync -a -v. I have 5 different machines. On every machine there is a scratch folder that I want to sync into the folder ~/work_dir/scratch_maschines, and inside the scratch_maschines folder there should be a folder for maschine_a, maschine_b and so on.
On the machines it is always the same path: /scratch/my_name. So when I now use this command for the first two machines:
rsync -a -v --exclude='*.chk' --exclude='*.rwf' --exclude='*.fchk' --delete sp02:/scratch/my_name ~/work_dir/scratch_maschine01; rsync -a -v --exclude='*.chk' --exclude='*.rwf' --exclude='*.fchk' --delete maschine02:/scratch/my_name ~/work_dir/scratch_maschine02
I get folders scratch_maschine01 and scratch_maschine02 in my working directory, but my data is not directly inside these folders; there is first a folder named my_name, and that folder contains the data. So my question is: how can I use the rsync command to get the files from the scratch directories straight into the folder for each machine?
You might want to consider reformulating your commands similar to the following:
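START=$(pwd)
# An array keeps each exclude pattern as one argument, without literal quote characters
EXCLUDES=(--exclude='*.chk' --exclude='*.rwf' --exclude='*.fchk')
{ SOURCE="sp02:/scratch/my_name"   # remote source; a host:path spec cannot be cd'ed into
REMOTE="${HOME}/work_dir/scratch_maschine01"   # destination directory on the local side
# The trailing / on the source copies the directory's content, not the directory itself
rsync --recursive -v --delete "${EXCLUDES[@]}" "${SOURCE}/" "${REMOTE}/"
} >"${START}/job.log" 2>"${START}/job.err"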
The key elements there are
the --recursive option, which rsync expands to include all content and subdirectories of the SOURCE directory.
the / behind the ${SOURCE} tells rsync to limit itself to the content of the SOURCE directory, but not the directory itself.
the / behind the ${REMOTE} tells rsync to deposit content into that directory. Note that rsync creates a missing final destination directory on its own, so if the run should fail when that directory does not already exist, test for it before calling rsync; this ensures that files are not deposited somewhere other than expected.
The above approach lends itself to a function form that could be placed into a loop with pre-attempt condition checks, along with having a complementary case for all variable assignments grouped under a destination heading (i.e. case statements).
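As a rough illustration of that function-plus-loop idea, here is a sketch in Python rather than bash. The function names build_rsync_command and sync_all and the scratch_<host> folder naming are assumptions made for the example; the rsync options and exclude patterns are the ones already used above.

```python
import subprocess

# Exclude patterns taken from the question.
EXCLUDES = ["*.chk", "*.rwf", "*.fchk"]


def build_rsync_command(host, remote_dir, local_dir, excludes=EXCLUDES):
    """Return the rsync argument list for one machine (nothing is executed)."""
    command = ["rsync", "--recursive", "--verbose", "--delete"]
    command += [f"--exclude={pattern}" for pattern in excludes]
    # Trailing slash on the source: copy the directory's content, not the directory.
    command.append(f"{host}:{remote_dir.rstrip('/')}/")
    command.append(f"{local_dir.rstrip('/')}/")
    return command


def sync_all(machines, remote_dir, local_root):
    """Run one rsync per machine and return the hosts whose transfer failed."""
    failed = []
    for host in machines:
        # Assumed layout: one local folder per machine, named scratch_<host>.
        local_dir = f"{local_root}/scratch_{host}"
        result = subprocess.run(build_rsync_command(host, remote_dir, local_dir))
        if result.returncode != 0:
            failed.append(host)
    return failed
```

Because the arguments are passed as a list, the exclude patterns need no extra quoting, which sidesteps the quoting concerns a shell string would raise.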
Using such an approach with meaningful labels for variables lends itself to a type of implicit documentation, making the code more meaningful to someone not familiar with the code, as well as a refresher for yourself after a long period of not working or using the code.
I try to avoid the "~" because I prefer to always enclose definitions for variables in double quotes, to avoid issues that might arise from paths that may include unexpected characters or spaces. That way, you are sure to have your defined paths correctly interpreted by commands in scripts.
Lastly, I prefer to use the long form for the rsync options (and almost every other command) so that I don't have to refer to the manual every time to translate the single-character options when trying to understand what is coded, if the need arises for troubleshooting unexpected errors (I have always had poor memory).
My own backup command is as follows. The only reason why the
${PathMirror}${dirC}/
is not encapsulated in single quotes within the double quotes for COM is because I know those variables all evaluate to non-complex strings which cannot be misinterpreted.

How do I unzip a password protected file with Deflate64 compression? I have the password already. In python or vb.net

So I have a series of thousands of .zip files that need to be opened.
They are password protected, but I have the passwords.
Trying to automate the opening of these. The Deflate64 issue is causing a lot of pain.
Okay, so Deflate64 is proprietary, which is annoying as that stops you from using the normal zipfile library in Python. As a workaround I typically make a subprocess call to 7zip or similar. So something like:
import subprocess
# Every argument is a separate list item; 7z expects the password attached to the -p switch
subprocess.run(["7z", "e", filename, f"-o{destination}", "-y", f"-p{password}"], check=True)
Then naturally just run that in a loop over your files. Depending on how they are laid out you might want to just glob everything in a directory or pipe them via stdin etc.
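A sketch of what that loop might look like, assuming 7z is on the PATH and that you have some way of looking up the password for each archive. The names build_7z_command, extract_all and password_for are invented for this example, and it uses the x command (keep folder structure) instead of e; swap it back if you want everything flattened.

```python
import subprocess
from pathlib import Path


def build_7z_command(archive, destination, password, seven_zip="7z"):
    """Return the 7z argument list for one archive (nothing is executed)."""
    # -o and -p take their values attached to the switch; -y answers prompts with yes.
    return [seven_zip, "x", str(archive), f"-o{destination}", f"-p{password}", "-y"]


def extract_all(folder, password_for, seven_zip="7z"):
    """Extract every .zip below folder next to itself; return the archives that failed.

    password_for is a caller-supplied function mapping an archive path to its password.
    """
    failed = []
    for archive in sorted(Path(folder).rglob("*.zip")):
        command = build_7z_command(archive, archive.parent, password_for(archive), seven_zip)
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(archive)
    return failed
```

Collecting the failures instead of stopping at the first one is a deliberate choice here: with thousands of archives you probably want one pass and a list of the ones to look at by hand.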
Often tasks like this are well suited to shell scripts, so you might want to consider that. I'm not a Windows user, but I think something like the following script would work as well:
@echo off
set pass=[password]
set folder=[folder]
for /R "%folder%" %%I in ("*.zip") do (
"C:\somedirectory\7z.exe" x -p%pass% -y -o"%%~dpI" "%%~fI"
)

Is there a way to wrap arbitrary commands located under a subdirectory in a shell script

I have a bunch of customizations and would like to run my test program in a pristine environment.
Sure, I could use a tiny shell script to wrap the program and pass on the arguments, but it would be cool and useful if I could invoke a pre and possibly post script only for commands located under certain subdirectories. The shell I'm using is zsh.
I don't know what you include in your “pristine environment”.
If you want to isolate yourself from the whole system, then maybe chroot is what you're after. You can set up a complete new system, with its own /etc, /bin and so on, but sharing the kernel, networking and other non-filesystem stuff with your running system. Root's cooperation is required (the chroot system call is reserved to root).
If you want to isolate yourself from your dot files, run the program with a different value for the HOME environment variable:
HOME=~/test-environment /path/to/test-program
HOME=~/test-environment zsh
If this is specifically about zsh's configuration files, you can set the ZDOTDIR environment variable before starting it to tell zsh to run its own dot files from a directory other than $HOME (or zsh --no-rcs to not load any dot file).
If by pristine environment you mean a fully controlled set of environment variables, then the env program does this.
env -i PATH=$PATH HOME=$HOME program args
will run program args with only the environment variables you specified.
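If the test program is launched from a script anyway, the same idea as env -i can be sketched in Python by handing subprocess an explicit environment. The helper name run_clean and its parameters are made up for this example.

```python
import os
import subprocess
import sys


def run_clean(command, keep=("PATH", "HOME"), extra=None):
    """Run command with only the listed variables copied from the current environment.

    Hypothetical helper: roughly what `env -i PATH=$PATH HOME=$HOME command` does.
    """
    env = {name: os.environ[name] for name in keep if name in os.environ}
    env.update(extra or {})
    return subprocess.run(command, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # The child lists whatever variables it actually received.
    child = "import os; print(sorted(os.environ))"
    print(run_clean([sys.executable, "-c", child]).stdout)
```

The program being started may still add a few variables of its own at startup, so the child's view is "nothing inherited except what you listed" rather than a guaranteed exact set.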

How do I synchronize in both directions?

I want to use rsync to synchronize two directories in both directions.
I refer to synchronization in classical sense
(not how it is meant in rsync manuals):
I want to update the directories in both directions,
depending on which of them is newer.
Can this be done by rsync (preferable in a Linux-way)?
If not, what other solutions exist?
Just run it twice, with "newer" mode (-u or --update flag) plus -t (to copy file modified time), -r (for recursive folders), and -v (for verbose output to see what it is doing):
rsync -rtuv /path/to/dir_a/* /path/to/dir_b
rsync -rtuv /path/to/dir_b/* /path/to/dir_a
This won't handle deletes, but I'm not sure there is a good solution to that problem with only periodic sync'ing.
Do you know Unison File Synchronizer?
Unison is a file-synchronization tool
for Unix and Windows. It allows two
replicas of a collection of files and
directories to be stored on different
hosts (or different disks on the same
host), modified separately, and then
brought up to date by propagating the
changes in each replica to the other. ...
Note also that it is resilient to failure:
Unison is resilient to failure. It is
careful to leave the replicas and its
own private structures in a sensible
state at all times, even in case of
abnormal termination or communication failures.
You need to run rsync twice, and I recommend running it with -au:
rsync -au /local/source/* /remote/destination
rsync -au /remote/destination/* /local/source
-a (a for archive) is a shortcut for -rlptgoD:
-r Recurse into sub directories
-l Also sync symbolic links
-p Also sync file permissions
-t Also sync file modification times
-g Also sync file groups
-o Also sync file owner
-D Also sync special (not regular/meta) files
Basically whenever you want to create an identical one-to-one copy using rsync, you should always use -a as that's what most users expect to happen when they talk about "syncing". Other answers here seem to overlook that sometimes the content of a file stays unchanged but its owner may have changed or its access permissions may have changed and in that case rsync would not sync the file which could be fatal.
But you also require -u as that tells rsync to completely leave any file/folder alone, in case it exists already at the destination and has a newer last modification date. Without -u rsync would sync regardless if a file/folder is newer or not.
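To make the effect of -u concrete, here is a deliberately simplified model of the decision, not rsync's real code. It assumes rsync's default quick check (compare size and modification time) and ignores checksums, --modify-window and everything else; FileState and would_transfer are names invented for the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileState:
    size: int   # file size in bytes
    mtime: int  # modification time in whole seconds


def would_transfer(source: FileState, dest: Optional[FileState], update: bool) -> bool:
    """Simplified model: would the file be copied from source to dest?"""
    if dest is None:
        return True  # nothing at the destination yet
    if update and dest.mtime > source.mtime:
        return False  # -u: leave a newer destination file alone
    # Quick check: copy when size or modification time differ.
    return source.size != dest.size or source.mtime != dest.mtime
```

With a destination file that is newer than the source, would_transfer(..., update=True) says "skip" while update=False says "copy", which is exactly the overwrite you want to avoid in a two-way sync.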
Please note that this solution cannot handle deleted files. Handling deletes is not easy; consider the following situation: a file exists at the destination but is missing at the source. How shall rsync know whether that file once existed at the source and has been deleted (in which case it must be deleted at the destination as well) or whether it never existed at the source (in which case it must be copied from the destination)? These two situations look identical to rsync, so it cannot know how to react correctly. It won't help to sync the other way round, as that can lead to the same situation: a file exists at the source but not at the destination. Why? Has it never existed at the destination, or has it been deleted? Both cases look identical to rsync.
Sync tools that can reliably sync deleted files usually manage a sync log about all past sync operations. If that log reveals that there once was a file and has been synced but now it is missing, it's clear that it has been deleted. If there never was such a file according to the log, it must be synced. By storing all log entries with timestamps, it's even possible that a deleted file comes back and gets deleted multiple times yet the sync tool will always know what to do and the result is always correct. rsync has no such log, it only relies on the current file state of two sides of the operation.
You can, however, build yourself a sync command using rsync and a bit of POSIX shell scripting which already gets very close to a sync tool as described above. As I needed such a tool myself, here is an answer on Stackoverflow that guides you through the creation of such a script.
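The role of the sync log can be shown with a few lines of Python. This is only a sketch of the decision logic described above, not the script from the linked answer; plan_sync and the action strings are invented for the example, and synced_before stands for the set of names recorded after the previous successful sync.

```python
def plan_sync(files_a, files_b, synced_before):
    """Decide per file name what a log-based two-way sync would do.

    files_a, files_b: sets of names currently present on each side.
    synced_before: set of names that existed on both sides after the last sync.
    """
    plan = {}
    for name in sorted(files_a | files_b):
        in_a, in_b = name in files_a, name in files_b
        if in_a and in_b:
            plan[name] = "compare"  # both exist: let timestamps decide
        elif name in synced_before:
            # It was on both sides last time, so the missing side deleted it.
            plan[name] = "delete from A" if in_a else "delete from B"
        else:
            # Never synced before, so it is new on the side that has it.
            plan[name] = "copy A to B" if in_a else "copy B to A"
    return plan
```

Without synced_before the two middle branches collapse into one, which is the ambiguity rsync on its own cannot resolve.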
Thanks jsight
rsync -urv --progress dir_a dir_b && rsync -urv --progress dir_b dir_a
This would result in the second sync happening immediately after 1st sync is over. In case the directory structure is huge, this will save time, as one does not need to sit before the pc. If the structure is huge, remove the verbose and progress stuff
rsync -ur dir_a dir_b && rsync -ur dir_b dir_a
Use rsync <OPTIONS> [hostname:]source-dir [hostname:]dest-dir
for example:
rsync -pogtEtvr --progress --bwlimit=2000 xxx-files different-stuff
This will sync xxx-files to different-stuff/xxx-files. If different-stuff/xxx-files does not exist, rsync will create it, i.e. copy it.
-pogtEtv - just bunch of options to preserve file metadata, plus v - verbose and r - recursive
--progress - show progress of syncing in real time - super useful if you copy big files
--bwlimit=2000 - sets maximum speed of copying/syncing (bw = bandwidth)
P.S. rsync is critically important when you work over a network; on a local machine you can use commands like cp.
Good Luck!
What you need is Rclone. Rclone ("rsync for cloud storage") is a command line Linux program to sync files and directories to and from different cloud storage providers (box, dropbox, ftp etc.) and local filesystems. Rclone supports mirror syncing only.
Another, more graphical solution which includes real-time syncing would be to use FreeFileSync, which includes the program RealTimeSync. FreeFileSync supports 2-way bidirectional syncing, which includes handling deletes.
I had the same question and ended up using git. It might not fit your situation, but if anyone finds this topic and has the same question, you may consider a version control system.
I'm using rsync with inotifywait.
When you change any file, rsync will be executed.
inotifywait -m --exclude "$_LOG_FILE" -r -e create,delete,delete_self,modify,moved_to --format "%w%f" "$folder"
You need to run inotifywait on both hosts. Please check the inotifywait example.

rsync error: failed to set times on "/foo/bar": Operation not permitted

I'm getting a confusing error from rsync and the initial things I'm finding from web searches (as well as all the usual chmod'ing) are not solving it:
rsync: failed to set times on "/foo/bar": Operation not permitted (1)
rsync error: some files could not be transferred (code 23)
at /SourceCache/rsync/rsync-35.2/rsync/main.c(992) [sender=2.6.9]
It seems to be working despite that error, but it would be nice to get rid of that.
If /foo/bar is on NFS (or possibly some FUSE filesystem), that might be the problem.
Either way, adding -O / --omit-dir-times to your command line will avoid it trying to set modification times on directories.
The issue is probably due to /foo/bar not being owned by the writing process on a remote darwin (OS X) system.
A solution to the issue is to set adequate owner on the remote site.
Since this answer has been voted, and therefore has been hopefully useful to someone, I'm extending it to make it clearer.
The reason why this happens is that rsync is probably trying to set an arbitrary modification time (mtime) when copying files.
In order to do this darwin's system utime() function requires that the writing process effective uid is either the same as the file uid or super user's one, see opengroup utime's page.
Check this discussion on rsync mailing list as reference.
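A quick way to check whether this is what you are hitting is to try the same kind of call yourself as the user rsync logs in as. The sketch below uses Python's os.utime with explicit timestamps, which is subject to the ownership rule described above; the helper name can_set_times is made up for the example.

```python
import os


def can_set_times(path):
    """Return True if the current user may set explicit timestamps on path.

    The file's existing atime and mtime are written back, so nothing changes
    on success. Setting explicit times normally requires owning the file
    (or being superuser), which is the rule rsync runs into.
    """
    info = os.stat(path)
    try:
        os.utime(path, ns=(info.st_atime_ns, info.st_mtime_ns))
    except PermissionError:
        return False
    return True
```

Running can_set_times("/foo/bar") on the receiving machine, as the receiving user, should come back False in the situation described here; other failures, such as a read-only filesystem, raise a different OSError and are deliberately not swallowed.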
As @racl101 has commented on an answer, this problem might be related to the folder owner. The rsync command should be run by the same user that owns the folder. If it's not the same, you can change the owner.
chown -R userCorrect /remote/path/to/foo/bar
I had the same problem. For me the solution is to delete the remote file and let rsync create again.
The problem in my case was that the "receiver mountpoint" was incorrectly mounted. It was in read-only mode (for some strange reason).
It looked like rsync was copying the files, but it was not.
I checked my fstab file, changed the mount options to default, remounted the file system and executed rsync again. All fine then.
I've seen that problem when I'm writing to a filesystem which doesn't (properly) handle times -- I think SMB shares or FAT or something.
What is your target filesystem?
This happened to me on a partition of type xfs (rw,relatime,seclabel,attr2,inode64,noquota), where the directories were owned by another user in a group we were both members of. The group membership was already established before login, and the whole directory structure was group-writeable. I had manually run sudo chown -R otheruser.group directory and sudo chmod -R g+rw directory to confirm this.
I still have no idea why it didn't work originally, but taking ownership with sudo chown -R myuser.group directory fixed it. Perhaps SELinux-related?
I came across this problem as well, and the issue I was having was a permissions issue with the root folder that contained the files I was trying to send over. I don't care about that root folder being included with rsync; I just care about what's in it. The error was coming from my command, where I needed to specify an additional / at the end. If you do not have that trailing slash, rsync will attempt to set times on the folder.
Example:
This will attempt to set times on html
rsync /var/www/html/ ubuntu@xxx.xxx.xxx.xxx:html
This will not
rsync /var/www/html/ ubuntu@xxx.xxx.xxx.xxx:html/
This error might also pop up if you run the rsync process for files that are not recently modified in the source or destination, because it can't set the time for the recently modified files.
I ran into this error trying to fix timestamps on a new MacOS Monterey, after the Migration Assistant decided to set all of them to the time the copy operation occurred, instead of the original file's.
anddam's answer did not help me, as the remote user used in the rsync command did match the directories and files owner.
After further research, I realised that I had no access to the Mac's Documents directory over SSH (error ls: Documents: Operation not permitted).
I managed to fix the problem by opening System Preferences on the Mac, selecting Security & Privacy, going to the Privacy tab, selecting Full Disk Access and checking the box next to sshd-keygen-wrapper.
It could be that you don't have privileges to some of the files. From an administrator account, try "sudo rsync -av " Alternately, enable the root account and sign in as root. That should allow you to completely hose your system and brute force your rsync! ;-) I'm not sure if the above mentioned --extended-attributes will help, but I threw it in too, just for good measure.
