p4 changes ignore certain directories and files due to maxscanrows - unix

When I run p4 changes ... in a particular directory I am getting the following error:
Too many rows scanned (over 16000000); see 'p4 help maxscanrows'.
I have figured out which directory is causing the issues, but now I don't know how to exclude it from my p4 changes ... command.
I have tried several variations with no success:
p4 changes ... -//depot/.../baddir/...
p4 changes ... -baddir/...
Is this even possible with a single p4 changes command?
Below is a simple example of what I am trying to do:
I have some directories and files:
base/
    sub0/
    sub1/
    sub2/
    file0.txt
    file1.txt
    file2.txt
I want to run p4 changes ... in base and have it include sub0 and sub1 along with all files within base but exclude sub2.

Multiple arguments to a command aren't combined into a single mapping; they're evaluated independently, so exclusions don't do anything when they're specified on the command line. Instead, add those exclusions to your client view:
//depot/base/... //your_client/...
-//depot/base/sub2/... //your_client/sub2/...
If removing the directory from your client view is infeasible because you need to work with these files, your MaxScanRows limit should be set higher. The idea of MaxScanResults and MaxScanRows is to force you to limit your client view and/or queries to only the files and revisions that are necessary for your project; if your project actually contains more files than that, the limits are set too low.
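For completeness, a sketch of both routes, with the client and group names here as placeholders. With the exclusion in your client view, run the query through client syntax:
p4 changes //your_client/...
And if the limit itself needs raising, it is a field in the group spec that an administrator can edit (see p4 help maxscanrows):
p4 group dev-team
# in the spec form that opens, set for example:
#   MaxScanRows:  50000000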

Related

scp_download to download multiple files based on a pattern?

I need to download many files from a server (specifically Tectia), ideally using the ssh package. These files all follow a predictable pattern across multiple subfolders. The filepath is formatted like this:
/directory/subfolder/A001/abcde001.csv
Where A001 counts up alongside the last 3 digits of the filename (/A002/abcde002.csv and so on).
In the vignette for scp_download it states that the files parameter may contain wildcards, so I have tried to do something like
scp_download(session, "/directory/subfolder/A.*/abcde.*[.]csv", to=tempdir())
and
scp_download(session, "directory/subfolder/A\\d{3}/abcde\\d{3}[.]csv", to=tempdir())
but no matter which combination of patterns or wildcards I try (and I can't think of many), I only get something like
Warning: SSH warning: scp: /directory/subfolder/A\d{3}/abcde\d{3}[.]csv: No such file or directory
What I'm hoping to do is either find a way to do pattern matching here, or find a way to store the Tectia directories as a string to be read by scp_download. I've made sure that my session is connected properly, and downloading works as long as I don't attempt to pattern match.
I had the same problem. The problem is that when you use * in your pattern, it gets escaped when it is sent to the server. However, when you request a specific file name like /directory/subfolder/A001/abcde001.csv, it works fine.
In the end I changed my code to follow these steps:
Get the list of files/folders by running the ls command through the ssh_exec_wait function and store the output in a variable.
Download the files in that variable separately.
session <- ssh_connect("username#ip",passwd="password")
files<-capture.output(ssh_exec_wait(session, command = 'ls /directory/subfolder/A001/*'))
dnc1<- scp_download(session, files[1], to = paste0(getwd(),"/data/"))
dnc2<- scp_download(session, files[2], to = paste0(getwd(),"/data/"))
dnc3<- scp_download(session, files[3], to = paste0(getwd(),"/data/"))
The last three commands can be run in a loop, since there could be hundreds or thousands of files.
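A minimal sketch of that loop, assuming files holds one remote path per element and that a data/ directory already exists under the working directory:
for (f in files) {
  # download each remote file into ./data/
  scp_download(session, f, to = paste0(getwd(), "/data/"))
}
(scp_download also accepts a vector of paths, so a single scp_download(session, files, to = ...) call may work as well.)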

Datastage Sequence job - how to process one file at a time when the files are in 7 different folders

DataStage - There are 7 folders in a path, and in each folder there are 2 files. For example, the 2 files are in the following format: filename = test_s1_YYYYMMDD.txt, test_s1_YYYYMMDD.done. The paths for these files are:
user/test/test_s1/
user/test/test_s2/
...
user/test/test_s7/
(here s1, s2, ..., s7 represent the different folders)
Each of these folders contains the 2 files mentioned above, so how can I process each file in a sequence job?
First you need a job that processes a single file, with the filename as a parameter of that job.
At the Sequence level you need two levels: an inner one for the two files within each folder and an outer one for the different directories.
For the inner one you can either build a loop with two iterations or simply add the processing job twice to the sequence (which reduces complexity if there will always be two files).
The outer Sequence is a loop where you parameterize the path so that the loop counter can be used to generate your flexible 1-7 path suffix.
Check out more details on loops here
You can use the loop counter (stage_label.$Counter) to parameterize your job.
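For instance, if the loop activity were named Loop1 and ran from 1 to 7, a hypothetical derivation for the path parameter could splice the counter into the folder name (sequence expressions use : for string concatenation; the names here are assumptions):
"user/test/test_s" : Loop1.$Counter : "/"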
Depending on what you want to do with the files, how you process them is an important decision. Starting one or more jobs in a sequence for every single file can lead to heavy overhead just from starting the jobs. Try loading all files at once in a parallel job using the Sequential File stage.
In the Sequential File stage, set the appropriate Format. You can also set everything to none to just put each row into one column and process that in a later job. This makes the reading very flexible and forgiving. If your files all have the same structure, define your columns as needed.
To select the files, use File Patterns. In the Options of the Sequential File Stage, choose to have a File Name Column so you can process the filenames in a later job. You might also want to add a Row Number Column.
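As an illustration, a file pattern along these lines (paths assumed from the question) would pick up the .txt files from all seven folders in a single read:
user/test/test_s*/test_s*_*.txt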
This method works pretty fast.

Multiple diff outputs in one patchfile

I was trying to store multiple diff outputs in one .patch file to keep versioning in one file instead of running multiple
diff -u f1 f2 > f1.patch
commands. Preferably I'd keep running
diff -u[other params?] f1 f2 >> f1.patch
to have one file containing all changes, which would allow me to later run patch on those files to have an f1 file available at any given moment.
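To make the intended workflow concrete, a sketch with hypothetical snapshots of f1, where each diff is taken against the previous snapshot:
cp f1 f1.v1                   # snapshot before the first round of edits
# ...edit f1...
diff -u f1.v1 f1 >> f1.patch
cp f1 f1.v2                   # snapshot before the second round
# ...edit f1 again...
diff -u f1.v2 f1 >> f1.patch
# the hope is then to rebuild the latest f1 from the original:
patch f1.v1 < f1.patch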
Unfortunately patch fails with a file generated in this manner. It only seems to apply the first patch from the file and then quits with an error.
My question: is that possible with diff and patch? And if so, how?
Thank you in advance.

Is there a way to avoid recursive make with nobase?

I've got the following directory structure:
Makefile.am
src/
    mymod/
        mod.cc
        submod/
            submod.cc
inc/
    Makefile.am
    mymod/
        mod.hh
        submod/
            submod.hh
Using autotools, I'd like to distribute both a library made from src and the headers in inc. The top-level Makefile.am looks something like
lib_LTLIBRARIES = mylib.la
mylib_la_SOURCES = ./mymod/mod.cc \
    ./mymod/submod/submod.cc
SUBDIRS = inc
Then inc/Makefile.am has
mymod_includedir = $(includedir)
nobase_mymod_include_HEADERS = mymod/mod.hh \
    mymod/submod/submod.hh
This works OK. I end up with whatever library stuff, and my headers get installed appropriately. However, I'd like to eliminate the recursion involved in the Makefile. The problem is that if I move the lines in inc/Makefile.am to the root directory, then I have to update the paths as follows:
mymod_includedir = $(includedir)
nobase_mymod_include_HEADERS = inc/mymod/mod.hh \
    inc/mymod/submod/submod.hh
This results in my headers getting dumped as $PREFIX/include/inc/mymod/mod.hh and not $PREFIX/include/mymod/mod.hh like I want. I know I could do something like
mymodincludedir = $(includedir)/mymod
mymodinclude_HEADERS = inc/mymod/mod.hh
mysubmodincludedir = $(includedir)/mymod/submod
mysubmodinclude_HEADERS = inc/mymod/submod/submod.hh
but that's pretty painful, because there are a lot of subdirectories, and more subdirectories within those subdirectories (we're distributing a third party's code that our own headers need). What I'd like to be able to do is either tell automake to just copy the directories in inc/ to $(includedir) along with every subdirectory it encounters within, or tell it to strip only part of the path from the header files I'm listing. Is this possible?
I think the closest you can get is Karel Zak's Makemodule.am approach, with which nobase_ would work as you need.
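A rough sketch of that layout, with the file names as assumptions: the top-level Makefile.am textually includes per-directory fragments, so every path in a fragment is written relative to the top level:
# top-level Makefile.am
AUTOMAKE_OPTIONS = subdir-objects
lib_LTLIBRARIES = mylib.la
mylib_la_SOURCES =
include src/Makemodule.am
include inc/Makemodule.am

# src/Makemodule.am
mylib_la_SOURCES += src/mymod/mod.cc src/mymod/submod/submod.cc

# inc/Makemodule.am
mymod_includedir = $(includedir)
nobase_mymod_include_HEADERS = inc/mymod/mod.hh inc/mymod/submod/submod.hh
One caveat to verify: nobase_ installs exactly the relative path you list, so the fragment as written above would still yield $(includedir)/inc/mymod/...; the listed paths (or the tree itself) need to start at mymod/ for the headers to land at $(includedir)/mymod/... .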

How to `diff` files to create a "common" file?

I have a slew of CSS files to go through where someone just grunted through making alterations to various core stylesheets on a number of subsites. Obviously if the original developer had had some foresight they would have just included a master stylesheet and overridden the necessary elements…
I first started off with comm thinking that it might do the trick, but quickly found that it needed to receive a sorted input file.
I then switched over to diff and have gotten down to the following through some reading and research:
diff --unchanged-group-format="## %dn,%df%c'\012'%<" --old-group-format='' --new-group-format='' --changed-group-format='' file_1.css file_2.css
The previous command is obviously almost there, but:
A) I need to grep out the ## lines (which should be fine, right? At first glance this appears right, but does diff throw in any other unexpected lines that need to be yanked?) and then
B) I need to create two more files: the first with the leftover unique lines from file_1.css and the second with the leftover unique lines from file_2.css.
Obviously the first "in common" file will go into an include folder and then be pulled into the two files created next via @import url("common.css");
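For instance, the trimmed-down file_1.css would then start with something like this (sketch):
@import url("common.css");
/* followed by the rules unique to file_1 */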
I am thinking that the following simple alteration will create the latter two files to which I'm referring:
diff --unchanged-group-format='' --old-group-format="## %dn,%df%c'\012'%<" --new-group-format='' --changed-group-format='' file_1.css file_2.css
diff --unchanged-group-format='' --old-group-format='' --new-group-format="## %dn,%df%c'\012'%<" file_1.css file_2.css
Sample files:
file 1: https://gist.github.com/c13843972c47b5037704
file 2: https://gist.github.com/fff39eae386e8969dc10
So for example, upon executing a test of the following:
diff --unchanged-group-format="## %dn,%df%c'\012'%<" --old-group-format='' --new-group-format='' --changed-group-format='' file_1.css file_2.css | egrep -v "^##\d*" > common.css
diff --unchanged-group-format='' --old-group-format="## %dn,%df%c'\012'%<" --new-group-format='' --changed-group-format='' file_1.css file_2.css | egrep -v "^##\d*" > old.css
And then searching for body with egrep "^body" *css yielded a body rule only in common.css and none in old.css, whereas file_1.css and file_2.css contain two different body entries. So obviously this methodology is flawed.
How would one go about creating these two files that would ultimately become the common include and the override files?
@ylluminate, you have a couple of options:
use BeyondCompare to visually verify the differences. It does a fantastic job comparing similar files, and it allows saving common lines, left-only lines, and right-only lines. The only downside is that it is interactive, and if you have a lot of files it will take some time. On the positive side, it looks like you want to build trust first by testing it out a few times.
Add formatting text for --changed-group-format to also capture modified code (alongside the old code, as your command does now). You need to run one more comparison to get what is in the new file but not in the old one; see the sketch after this list. The downside here is that validation is going to be hard.
Saving all the lines in a database table and comparing columns is another option. Take care to store old and new line numbers. The downsides are that the data lines need to be unique and that blank lines will be chopped off.
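For option 2, a hedged sketch of those extra comparisons, reusing the group-format flags from the question (%> emits lines from the second file):
diff --unchanged-group-format='' --old-group-format='' --new-group-format='' --changed-group-format='%>' file_1.css file_2.css > changed_new.css
diff --unchanged-group-format='' --old-group-format='' --new-group-format='%>' --changed-group-format='' file_1.css file_2.css > new_only.css
The first captures the file_2 side of modified groups; the second captures lines that exist only in file_2.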
I would go with option 1 if I had fewer than 50 files.
Hope this helps.
PS: I am not associated with BeyondCompare in any way, just a happy user of the software.
