tcl deep recursive file search, search for files with *.c extension - recursion

Using an old answer to search for a file in Tcl:
https://stackoverflow.com/a/435094/984975
First, let's discuss what I am doing right now:
Using this function (credit to Jacson):
# findFiles
# basedir - the directory to start looking in
# pattern - A pattern, as defined by the glob command, that the files must match
proc findFiles { basedir pattern } {
    # Fix the directory name; this ensures the directory name is in the
    # native format for the platform and contains a final directory separator
    set basedir [string trimright [file join [file normalize $basedir] { }]]
    set fileList {}
    # Look in the current directory for matching files. -type {f r}
    # means only readable normal files are looked at; -nocomplain stops
    # an error being thrown if the returned list is empty
    foreach fileName [glob -nocomplain -type {f r} -path $basedir $pattern] {
        lappend fileList $fileName
    }
    # Now look for any subdirectories in the current directory
    foreach dirName [glob -nocomplain -type {d r} -path $basedir *] {
        # Recursively call the routine on the subdirectory and append any
        # new files to the results
        set subDirList [findFiles $dirName $pattern]
        if { [llength $subDirList] > 0 } {
            foreach subDirFile $subDirList {
                lappend fileList $subDirFile
            }
        }
    }
    return $fileList
}
And calling the following command:
findFiles some_dir_name *.c
The current result:
bad option "normalize": must be atime, attributes, channels, copy, delete, dirname, executable, exists, extension, isdirectory, isfile, join, lstat, mtime, mkdir, nativename, owned, pathtype, readable, readlink, rename, rootname, size, split, stat, tail, type, volumes, or writable
Now, if we run:
glob *.c
We get a lot of files, but all of them are in the current directory.
The goal is to get ALL the files in ALL sub-folders on the machine with their paths.
Can anyone help?
What I really want to do is find the directory with the highest count of *.c files.
However, if I could list all the files with their paths, I could count how many files are in each directory and get the one with the highest count.
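For that counting step, a minimal sketch, assuming findFiles works and returns full paths (and a Tcl modern enough for dict, i.e. 8.5+):
set counts {}
foreach f [findFiles some_dir_name *.c] {
    dict incr counts [file dirname $f]    ;# tally per containing directory
}
set bestDir ""
set bestCount 0
dict for {dir n} $counts {
    if {$n > $bestCount} {
        set bestDir $dir
        set bestCount $n
    }
}
puts "most .c files: $bestDir ($bestCount)"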

You are using an old version of Tcl. [file normalize] was introduced in Tcl 8.4, around 2002 or so. Upgrade already.
If you can't, then use glob anyway, but call it once just for files and then walk the directories yourself. See the glob -types option.
Here's a demo:
proc on_visit {path} {
    puts $path
}
proc visit {base glob func} {
    # glob -directory already returns paths with $base joined on,
    # so don't join $base again
    foreach f [glob -nocomplain -types f -directory $base $glob] {
        if {[catch {eval $func [list $f]} err]} {
            puts stderr "error: $err"
        }
    }
    foreach d [glob -nocomplain -types d -directory $base *] {
        visit $d $glob $func
    }
}
proc main {base} {
    visit $base *.c [list on_visit]
}
main [lindex $argv 0]
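Save it as, say, walk.tcl (any name will do) and pass the base directory as the first argument:
tclsh walk.tcl /path/to/tree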

I would use the ::fileutil::traverse function to do it.
Something like:
package require fileutil::traverse
proc check_path {path} {
    string equal [file extension $path] ".c"
}
# the traverse constructor needs the base directory to walk as its
# second argument; set basedir to wherever the search should start
set obj [::fileutil::traverse %AUTO% $basedir -filter check_path]
array set paths {}
$obj foreach file {
    if {[info exists paths([file dirname $file])]} {
        incr paths([file dirname $file])
    } else {
        set paths([file dirname $file]) 1
    }
}
# print the per-directory counts; the biggest can then be picked
# with a simple loop over [array get paths]
foreach {name value} [array get paths] {
    puts "$name : $value"
}

For quick (1 level) file pattern matching use:
glob **/*.c
If you want to recursively search, then use:
proc ::findFiles { baseDir pattern } {
    set dirs [glob -nocomplain -type d [file join $baseDir *]]
    set files {}
    foreach dir $dirs {
        lappend files {*}[findFiles $dir $pattern]
    }
    lappend files {*}[glob -nocomplain -type f [file join $baseDir $pattern]]
    return $files
}
# with basepath set to the directory to search, e.g. set basepath .
puts [join [findFiles $basepath "*.tcl"] \n]

Related

Recursion Logic

I've been working on a PowerShell script that will preserve directory structure when cabbing nested folders. I'm having a bit of difficulty getting the recursion logic right, though.
This is a rough draft, so there isn't any try/catch error code yet. I've commented out the Remove-Item to prevent accidental runs.
The logic I worked out for this is as follows.
Check & get base tree for subdirectories
Go one level in and check again
Continue until no directories and then return one level up.
Cab directory, remove directory, write log for automated extraction (file names of subdirectory cabs).
Repeat process next level up and continue until base directory
function Chkfordir ($clevel)
{
    $dir = dir $clevel | ? { $_.PSIsContainer -eq $true } # Does Current Level have Folders?
    if ($dir -ne $null) # Yes
    {
        Chkfordir $dir # Go Deeper
    }
    if ($dir -eq $null) # Deepest Branch
    {
        return # Go Back One Level and begin Cabbing
    }
    $dir | % {
        Compress-Directory $_.FullName (".\" + [string]$_.Name + ".cab")
        echo ($_.FullName + ".cab") >> .\cleaf.log
        #Remove-Item -Recurse $_.FullName
        return
    }
}
The function call Compress-Directory is from here.
Edit Changes:
Will Re-Post Code Soon (08/18)
Edit 08/18 So I finally had a chance to test it and the logic seems to work now. There were some problems.
Most of the difficulty came from a PowerShell gotcha and the unnoticed problem that Compress-Directory is not path independent. Looks like I'll need to rewrite this function later to be path independent.
The PowerShell gotcha was a type change for a value on the pipeline. Apparently, after returning from a function, directory items change from System.IO.FileInfo to System.IO.DirectoryInfo, with differently named member functions.
Echo was replaced with Add-Content, as the redirection operators weren't working as expected here.
There were some unaccounted-for states to contend with as well. A leaf directory with no files would cause Compress-Directory to error, or to complete silently with no file creation (thus not preserving the hierarchy).
The solution was to add an Add-Content for leaf folders before the return, and to move the Add-Content before the Compress-Directory so there is at least one file in each directory.
I've included my current version below but it is a work in progress.
function Chkfordir ($clevel)
{
    $dir = dir $clevel | ? { $_.PSIsContainer -eq $true } # Get Folders
    if ($dir -eq $null) { # Check if deepest branch
        Add-Content (Join-Path $_.PSPath "\leaf.log") ([string]$_.FullName + ".cab")
        return $_ # Return one level up and cab
    }
    $dir | % { # for each sub, go deeper
        Chkfordir $_.FullName
        Add-Content (Join-Path $_.PSParentPath "\branch.log") ([string]$_.FullName + ".cab")
        Compress-Directory $_.FullName ([string]$_.Name + ".cab")
        #Remove-Item $_.FullName -recurse
    }
}
You need to recurse for each subdirectory and compress it after the recursive call returns:
function Chkfordir($clevel) {
    Get-ChildItem $clevel |
        Where-Object { $_.PSIsContainer } |
        ForEach-Object {
            Chkfordir $_
            Compress-Directory ...
            ...
        }
}
That way you automatically descend first, then create the archives as you return.
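A runnable version of that skeleton, with Write-Host standing in for the external Compress-Directory so the visit order can be verified before wiring in the real call:
function Chkfordir($clevel) {
    Get-ChildItem $clevel |
        Where-Object { $_.PSIsContainer } |
        ForEach-Object {
            Chkfordir $_.FullName                  # descend first
            Write-Host "would cab: $($_.FullName)" # then act on the way back up
        }
}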

How to use awk for multiple file search in two directories, print records only from files with matching string in second directory

Remade a previous question so that it is more clear. I'm trying to search files in two directories and print matching character strings (plus the line immediately following) into a new file, from the second directory only, if they match a record in the first directory. I have found similar examples, but nothing quite the same. I don't know how to use awk for multiple files from different directories, and I've tortured myself trying to figure it out.
Directory 1, 28,000 files, formatted viz.:
>ABC
KLSDFIOUWERMSDFLKSJDFKLSJDSFKGHGJSNDKMVMFHKSDJFS
>GHI
OOILKJSDFKJSDFLMOPIWERIOUEWIRWIOEHKJTSDGHLKSJDHGUIYIUSDVNSDG
Directory 2, 15 files, formatted viz.:
>ABC
12341234123412341234123412341234123412341234123412341234123412341234
>DEF
12341234123412341234123412341234
>GHI
12341234123412341234123412341234123412341234123412341234123412341234123412341234
Desired output:
>ABC
12341234123412341234123412341234123412341234123412341234123412341234
>GHI
12341234123412341234123412341234123412341234123412341234123412341234123412341234
Directories 1 and 2 are located in my home directory: (./Test1 & ./Test2)
If anyone could advise a command to specify the different directories, I'd be immensely grateful! Currently, when I include a file path (e.g., /Test1/*.fa), I get the following error:
awk: can't open file /Test1/*.fa
You'll want something like this (untested):
awk '
FNR==1 {
    dirname = FILENAME
    sub("/.*","",dirname)
    if (NR==1) {
        dirname1 = dirname
    }
}
dirname == dirname1 {
    if (FNR % 2) {
        key = $0
    }
    else {
        map[key] = $0
    }
    next
}
FNR % 2 {
    key = $0
    next
}
(key in map) && !seen[key,$0]++ {
    print key ORS $0
}
' Test1/* Test2/*
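In short: the FNR==1 block derives each file's directory prefix from FILENAME; everything under the first directory seen (Test1) fills map with its record keys; and for the remaining (Test2) files, each > header is remembered and its following line printed only when that header exists in map, with seen[] suppressing duplicate records.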
Given that you're getting the error message /usr/bin/awk: Argument list too long, which means you're exceeding your shell's maximum argument length for a command, and that 28,000 of your files are in the Test1 directory, try this:
find Test1 -type f -exec cat {} \; |
awk '
NR == FNR {
    if (FNR % 2) {
        key = $0
    }
    else {
        map[key] = $0
    }
    next
}
FNR % 2 {
    key = $0
    next
}
(key in map) && !seen[key,$0]++ {
    print key ORS $0
}
' - Test2/*
Solution in TXR:
Data:
$ ls dir*
dir1:
file1 file2
dir2:
file1 file2
$ cat dir1/file1
>ABC
KLSDFIOUWERMSDFLKSJDFKLSJDSFKGHGJSNDKMVMFHKSDJFS
>GHI
OOILKJSDFKJSDFLMOPIWERIOUEWIRWIOEHKJTSDGHLKSJDHGUIYIUSDVNSDG
$ cat dir1/file2
>XYZ
SDOIWEUROIUOIWUEROIWUEROIWUEROIWUEROUIEIDIDIIDFIFI
>MNO
OOIWEPOIUWERHJSDHSDFJSHDF
$ cat dir2/file1
>ABC
12341234123412341234123412341234123412341234123412341234123412341234
>DEF
12341234123412341234123412341234
>GHI
12341234123412341234123412341234123412341234123412341234123412341234123412341234
$ cat dir2/file2
>STP
12341234123412341234123412341234123412341234123412341234123412341234123412341234
>MNO
123412341234123412341234123412341234123412341234123412341234123412341234
$
Run:
$ txr filter.txr dir1/* dir2/*
>ABC
12341234123412341234123412341234123412341234123412341234123412341234
>GHI
12341234123412341234123412341234123412341234123412341234123412341234123412341234
>MNO
123412341234123412341234123412341234123412341234123412341234123412341234
Code in filter.txr:
#(bind want #(hash :equal-based))
#(next :args)
#(all)
#dir/#(skip)
#(and)
# (repeat :gap 0)
#dir/#file
# (next `#dir/#file`)
# (repeat)
>#key
# (do (set [want key] t))
# (end)
# (end)
#(end)
#(repeat)
#path
# (next path)
# (repeat)
>#key
#datum
# (require [want key])
# (output)
>#key
#datum
# (end)
# (end)
#(end)
To separate the dir1 paths from the rest, we use an #(all) match (try multiple pattern branches, which must all match) with two branches. The first branch matches one #dir/#(skip) pattern, binding the variable dir to the text before the first slash and ignoring the rest. The second branch matches a whole consecutive sequence of #dir/#file patterns via #(repeat :gap 0). Because the same dir variable appears with a binding already established by the first branch of the #(all), the matches are constrained to the same directory name.
Inside this repeat we recurse into each file via next and gather the >-delimited keys into the want hash.
After that, we process the remaining arguments as path names of files to process; they don't all have to be in the same directory. We scan through each one for the >#key pattern followed by a line of #datum. The #(require ...) directive fails the match if key is not in the want hash; otherwise we fall through to the #(output).

Batch file to find a file, create a copy in the same location and run this on multiple directories

Here's what I am trying to do.
I have a few hundred users' My Documents folders, in which most (not all) have a file (key.shk, for the shortkeys program).
I need to upgrade the software, but doing so makes changes to the original file.
I would like to run a batch file on the server to find the file in each My Docs folder and make a copy of it there called backup.shk.
I can then use this for rollback.
The folder structure looks like this:
userA\mydocs
userB\mydocs
userC\mydocs
My tools are xcopy, robocopy or powershell.
Thanks in advance
This PowerShell script works... save it as a .ps1:
Function GET-SPLITFILENAME ($FullPathName) {
    $FILENAME = [System.IO.Path]::GetFileNameWithoutExtension($FullPathName)
    $DIRECTORYPATH = [System.IO.Path]::GetDirectoryName($FullPathName) + "\"
    return $FILENAME, $DIRECTORYPATH
}
$Directory = "\\PSFS03\MyDocs$\Abbojo\Insight Software"
Get-ChildItem $Directory -Recurse | Where { $_.Extension -eq ".txt" } | % {
    $details = GET-SPLITFILENAME($_.FullName)
    $name = $details[0]
    $path = $details[1]
    copy $_.FullName ($path + $name + "_backup.txt")
}

Tcl - Recursive walking and FTP upload

How can I do a recursive walk through a local folder in order to upload everything in it to the desired FTP folder? Here's what I have so far:
package require ftp
set host **
set user **
set pass **
set ftpdirectory **
set localdirectory **
proc upload {host user pass dir fileList} {
    set handle [::ftp::Open $host $user $pass]
    ftpGoToDir $handle $dir
    # some counters for our feedback string
    set j 1
    set k [llength $fileList]
    foreach i $fileList {
        upload:status "uploading ($j/$k) $i"
        ::ftp::Put $handle $i
        incr j
    }
    ::ftp::Close $handle
}
#---------------
# feedback
#---------------
proc upload:status {msg} {
    puts $msg
}
#---------------
# go to directory in server
#---------------
proc ftpGoToDir {handle path} {
    ::ftp::Cd $handle /
    foreach dir [file split $path] {
        if {![::ftp::Cd $handle $dir]} {
            ::ftp::MkDir $handle $dir
            ::ftp::Cd $handle $dir
        }
    }
}
proc watchDirChange {dir intv {script {}} {lastMTime {}}} {
    set nowMTime [file mtime $dir]
    if {$lastMTime eq ""} {
        set lastMTime $nowMTime
    } elseif {$nowMTime != $lastMTime} {
        # synchronous execution, so no other after event may fire in between
        catch {uplevel #0 $script}
        set lastMTime $nowMTime
    }
    after $intv [list watchDirChange $dir $intv $script $lastMTime]
}
watchDirChange $localdirectory 5000 {
    puts stdout "Directory $localdirectory changed!"
    upload $host $user $pass $ftpdirectory [glob -directory $localdirectory -nocomplain *]
}
vwait forever
Thanks in advance :)
You're already using the ftp package, so that means you've got tcllib installed. Good. That means in turn that you've got the fileutil package as well, and can do this:
package require fileutil
# How to do the testing; I'm assuming you only want to upload real files
proc isFile f {
    return [file isfile $f]
}
set filesToUpload [fileutil::find $dirToSearchFrom isFile]
The fileutil::find command is very much like a recursive glob, except that you specify the filter as a command instead of via options.
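For instance, wiring that into the upload proc from the question (a sketch; note it puts everything into one remote directory, so recreating the local tree on the server would still need ftpGoToDir calls per subdirectory):
set filesToUpload [fileutil::find $localdirectory isFile]
upload $host $user $pass $ftpdirectory $filesToUpload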
You might prefer to use rsync instead though; it's not a Tcl command, but it is very good and it will minimize the amount of data actually transferred.

Equivalent of *Nix 'which' command in PowerShell?

How do I ask PowerShell where something is?
For instance, "which notepad", and it returns the directory where notepad.exe is run from according to the current paths.
The very first alias I made once I started customizing my profile in PowerShell was 'which'.
New-Alias which get-command
To add this to your profile, type this:
"`nNew-Alias which get-command" | add-content $profile
The `n at the start of the last line is to ensure it will start as a new line.
Here is an actual *nix equivalent, i.e. it gives *nix-style output.
Get-Command <your command> | Select-Object -ExpandProperty Definition
Just replace <your command> with whatever you're looking for.
PS C:\> Get-Command notepad.exe | Select-Object -ExpandProperty Definition
C:\Windows\system32\notepad.exe
When you add it to your profile, you will want to use a function rather than an alias because you can't use aliases with pipes:
function which($name)
{
    Get-Command $name | Select-Object -ExpandProperty Definition
}
Now, when you reload your profile you can do this:
PS C:\> which notepad
C:\Windows\system32\notepad.exe
I usually just type:
gcm notepad
or
gcm note*
gcm is the default alias for Get-Command.
On my system, gcm note* outputs:
[27] » gcm note*
CommandType     Name             Definition
-----------     ----             ----------
Application     notepad.exe      C:\WINDOWS\notepad.exe
Application     notepad.exe      C:\WINDOWS\system32\notepad.exe
Application     Notepad2.exe     C:\Utils\Notepad2.exe
Application     Notepad2.ini     C:\Utils\Notepad2.ini
You get the directory and the command that matches what you're looking for.
Try this example:
(Get-Command notepad.exe).Path
My proposition for the Which function:
function which($cmd) { get-command $cmd | % { $_.Path } }
PS C:\> which devcon
C:\local\code\bin\devcon.exe
A quick-and-dirty match to Unix which is
New-Alias which where.exe
But it returns multiple lines if they exist, so then it becomes
function which { where.exe $args | select -first 1 }
I like Get-Command | Format-List, or, shorter, using aliases for the two (shown here for powershell.exe):
gcm powershell | fl
You can find aliases like this:
alias -definition Format-List
Tab completion works with gcm.
To have tab list all options at once:
set-psreadlineoption -editmode emacs
This seems to do what you want (I found it on http://huddledmasses.org/powershell-find-path/):
Function Find-Path($Path, [switch]$All = $false, [Microsoft.PowerShell.Commands.TestPathType]$type = "Any")
{
    ## You could comment out the function stuff and use it as a script instead, with this line:
    #param($Path, [switch]$All = $false, [Microsoft.PowerShell.Commands.TestPathType]$type = "Any")
    if($(Test-Path $Path -Type $type)) {
        return $path
    } else {
        [string[]]$paths = @($pwd);
        $paths += "$pwd;$env:path".split(";")
        $paths = Join-Path $paths $(Split-Path $Path -leaf) | ? { Test-Path $_ -Type $type }
        if($paths.Length -gt 0) {
            if($All) {
                return $paths;
            } else {
                return $paths[0]
            }
        }
    }
    throw "Couldn't find a matching path of type $type"
}
Set-Alias find Find-Path
Check this PowerShell Which.
The code provided there suggests this:
($Env:Path).Split(";") | Get-ChildItem -filter notepad.exe
Try the where command on Windows 2003 or later (or Windows 2000/XP if you've installed a Resource Kit).
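For example:
where.exe notepad
This prints each matching executable found on the PATH, one per line.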
BTW, this received more answers in other questions:
Is there an equivalent of 'which' on Windows?
PowerShell equivalent to Unix which command?
If you want a command that accepts input both from the pipeline and as a parameter, you should try this:
function which($name) {
    if ($name) { $input = $name }
    Get-Command $input | Select-Object -ExpandProperty Path
}
Copy-paste the function into your profile (notepad $profile).
Examples:
❯ echo clang.exe | which
C:\Program Files\LLVM\bin\clang.exe
❯ which clang.exe
C:\Program Files\LLVM\bin\clang.exe
I have this which advanced function in my PowerShell profile:
function which {
    <#
    .SYNOPSIS
    Identifies the source of a PowerShell command.
    .DESCRIPTION
    Identifies the source of a PowerShell command. External commands (Applications) are identified by the path to the executable
    (which must be in the system PATH); cmdlets and functions are identified as such and the name of the module they are defined in
    provided; aliases are expanded and the source of the alias definition is returned.
    .INPUTS
    No inputs; you cannot pipe data to this function.
    .OUTPUTS
    .PARAMETER Name
    The name of the command to be identified.
    .EXAMPLE
    PS C:\Users\Smith\Documents> which Get-Command
    Get-Command: Cmdlet in module Microsoft.PowerShell.Core
    (Identifies type and source of command)
    .EXAMPLE
    PS C:\Users\Smith\Documents> which notepad
    C:\WINDOWS\SYSTEM32\notepad.exe
    (Indicates the full path of the executable)
    #>
    param(
        [String]$name
    )
    $cmd = Get-Command $name
    $redirect = $null
    switch ($cmd.CommandType) {
        "Alias"          { "{0}: Alias for ({1})" -f $cmd.Name, (. { which $cmd.Definition }) }
        "Application"    { $cmd.Source }
        "Cmdlet"         { "{0}: {1} {2}" -f $cmd.Name, $cmd.CommandType, (. { if ($cmd.Source.Length) { "in module {0}" -f $cmd.Source } else { "from unspecified source" } }) }
        "Function"       { "{0}: {1} {2}" -f $cmd.Name, $cmd.CommandType, (. { if ($cmd.Source.Length) { "in module {0}" -f $cmd.Source } else { "from unspecified source" } }) }
        "Workflow"       { "{0}: {1} {2}" -f $cmd.Name, $cmd.CommandType, (. { if ($cmd.Source.Length) { "in module {0}" -f $cmd.Source } else { "from unspecified source" } }) }
        "ExternalScript" { $cmd.Source }
        default          { $cmd }
    }
}
Use:
function Which([string] $cmd) {
    $path = (($Env:Path).Split(";") | Select -Unique | Where { $_.Length } | Where { Test-Path $_ } | Get-ChildItem -filter $cmd).FullName
    if ($path) { $path.ToString() }
}
# Check if Chocolatey is installed
if (Which('cinst.bat')) {
    Write-Host "yes"
} else {
    Write-Host "no"
}
Or this version, calling the original where command.
This version also works better, because it is not limited to bat files:
function which([string] $cmd) {
    $where = iex $(Join-Path $env:SystemRoot "System32\where.exe $cmd 2>&1")
    $first = $($where -split '[\r\n]')
    if ($first.getType().BaseType.Name -eq 'Array') {
        $first = $first[0]
    }
    if (Test-Path $first) {
        $first
    }
}
# Check if Curl is installed
if (which('curl')) {
    echo 'yes'
} else {
    echo 'no'
}
You can install the which command from https://goprogram.co.uk/software/commands, along with all of the other UNIX commands.
If you have scoop you can install a direct clone of which:
scoop install which
which notepad
There is also always the option of using which itself. There are actually three ways to access which from Windows PowerShell.
The first (though not the best) is WSL (Windows Subsystem for Linux):
wsl -e which command
This requires installation of Windows Subsystem for Linux and a running distro.
Next is GnuWin32, a port of several GNU binaries in .exe format as standalone bundled launchers.
Third, install MSYS2 (a cross-compiler platform); if you go to where it installed, in /usr/bin you'll find many GNU utils that are more up-to-date. Most of them work as standalone exes and can be copied from the bin folder to somewhere on your home drive and added to your PATH.
