grep netlist and cell name from file - unix

Suppose, Directory A contains two files, fileA and netlist.tcl. Below is information of fileA
Source "netlist.tcl"
cell "chklist"
I want that when the user selects fileA in Combobox in GUI,
Automatically in netlist string field: - a real path of the netlist.tcl from fileA popup
and in cell string field:-cell name from fileA popup.
How to achieve the above result?
output:
A/netlist.tcl
chklist

There's several ways to parse such a file. Here's one of the nicer ones with a safe child interpreter:
interp create -safe i
i alias Source netlistSource
i alias cell netlistCell
proc netlistSource {filename} {
global fn
set fn $filename
return
}
proc netlistCell {cellname} {
global cn
set cn $cellname
return
}
i invokehidden source "fileA"
interp delete i
That will store netlist.tcl in fn and chklist in cn. I'm not sure where the A/ prefix comes from, so I've left that part out.
Real code might need more aliases setting up. I hope you can see how easy that is to do. Remember, the aliases are called in the child, but call into your nominated code in the parent interpreter; it's a bit like doing an OS system call but with much less overhead. (Safe interpreters have all commands that touch the OS disabled/hidden by default.)

If all you need is just basic file I/O and string matching, then this would be a start. You would just need to adapt this into whatever you're doing with the GUI.
# File I/O
set fp [open fileA "r"]
set lines [split [read $fp] "\n"]
close $fp
# Check lines for netlist and cell
foreach line $lines {
if {[string match "Source*" $line]} {
set netlist [lindex $line 1]
if {[file exists $netlist]} {
puts [file normalize "./$netlist"]
}
}
if {[string match "cell*" $line]} {
set cell [lindex $line 1]
puts "$cell"
}
}
There are multiple ways to do this kind of work. regexes vs string matching. opening the file in Tcl vs executing a system call to grep vs using the fileutil package...
There's nothing here that isn't covered in Tcl introduction, like https://wiki.tcl-lang.org/page/Tcl+Tutorial+Index. It would be helpful to understand what you've already tried.
Donal's previous answer using a safe interpreter is pretty cool too, if you understand what's happening.

Related

how to seach and print realpath version in tcl?

Suppose a path x/y/z has 5 directories i.e 1.1,1.2,1.3,1.4,1.5. now I want to print only that directories names which is greater than 1.1. if in another path a/b/c same directories is present, but 1.2 dir is missing, then it should print 1.3 as the next directory is higher than 1.1? How to do that in tclsh???
I assume you're talking about filenames here? Of directories?
To get a list of directories in a location that match a pattern like that, you might use:
# d for “directory”
set names [glob -directory a/b/c -type d {[0-9]*.[0-9]*}]
That's going to be in random order (well, it depends on a vast number of factors in the OS so pretending it is random is much simpler!) and might have some false positives in it. We need to filter and sort. Fortunately, we have package vsatisfies to do the parsing.
set filtered [lmap name $names {
try {
if {[package vsatisfies [file tail $name] 1.1]} {
set name
} else continue
} on error {} continue
}]
# You'll find that dictionary sorting does the Right Thing in this case
set sorted [lsort -dictionary $filtered]
All that's left now is to print the elements in the list out. The maximum item in it is [lindex $sorted end]…

How to delete a row in a csv file with powershell in R?

Good morning,
I'm new about powershell and I'd like to ask you if somebody can help me.
I have a big csv file around 3.5gb and my goal is to load it with fread (a data.table function) in R environment, but this function makes a error.
> n_a<-fread("C:/x/xy/xyz/name_file.csv",sep=";", fill = TRUE)
The error is:
Warning message:
In fread("C:/x/xy/xyz/name_file.csv") :
Stopped early on line 458945. Expected 29 fields but found 30. Consider fill=TRUE and comment.char=. First discarded non-empty line
I tried to use different way (I putted in my code fill=true, but doesn't work) to solve the problem, but I couldn't do it.
After different researches I found this kind of solution (always to do in R):
>system("powershell Get-Content C:/a/b/c/file.csv | Select -Index (0..458944 + 1000000) > output.csv")
The focus about the use of powershell in R is to delete a specific row and to load with fread the file.
My question is:
How I can delete a specific row in a csv in powershell but without specifying the length of the matrix?
Thank you in advance for every type of help.
Francesco
I'd hazard a guess that the invalid row's location is not known. In such a case, it might be sensible to read the original file and create a new file that contains only valid data. What's more, if the source data would benefit of manipulation, it can be done before reading it into R.
A file as large as 3,5 GiB is a bit on the large side to read in memory as such. Sure, it can be done in the days of 64 bit systems, but for simple row processing it's unwieldy. A scalable solution uses .Net methods and row-by-row approach.
To process a file on row-by-row basis, use .Net methods for efficient row reading. A StringBuilder is created to store rows that contain valid data, others are discarded. The StringBuilder is flushed on disk every so often. Even on days of SSDs, a write operation for each row is relatively slow in respect to writing in a bulk of, say, 10 000 rows a time.
$sb = New-Object Text.StringBuilder
$reader = [IO.File]::OpenText("MyCsvFile.csv")
$i = 0
$MaxRows = 10000
$colonCount = 30
while($null -ne ($line = $reader.ReadLine())) {
# Split the line on semicolons
$elements = $line -split ';'
# If there were $colonCount elements, add those to builder
if($elements.count -eq $colonCount) {
# If $line's contents need modifications, do it here
# before adding it into the builder
[void]$sb.AppendLine($line)
++$i
}
# Write builder contents into file every now and then
if($i -ge $MaxRows) {
add-content "MyCleanCsvFile.csv" $sb.ToString()
[void]$sb.Clear()
$i = 0
}
}
# Flush the builder after the loop if there's data
if($sb.Length -gt 0) {
add-content "MyCleanCsvFile.csv" $sb.ToString()
}
This is easy done in powershell: Read csv in generic list, remove line and write back:
Add-Type -AssemblyName System.Collections
[System.Collections.Generic.List[string]]$csvList = #()
$csvFile = 'C:\test\myfile.csv'
$csvList = [System.IO.File]::ReadLines( $csvFile )
$lineToDelete = 2
[void]$csvList.RemoveAt( $lineToDelete - 1 )
[System.IO.File]::WriteAllLines( $csvFile, $csvList ) | Out-Null
vonPryz's helpful answer offers the best solution, given the size of your input file.
The following works too, but will be slow - in general, due to the overhead of using a pipeline, but also because Get-Content itself is slow due to decorating each line read with additional properties (see green-lighted, but not yet implemented GitHub suggestion #7537):
# Exclude line number 458945 (0-based index 458944)
Get-Content C:/a/b/c/file.csv | Select-Object -SkipIndex 458944 > output.csv
The beneficial flip side of use of the pipeline is that it acts as a memory throttle, so the above command can be used to process arbitrarily large files (though it may take a long time).

Iterating through !DumpHeap output to read value at memory offset

I'm trying to come up with a WinDbg command line expression that takes the output of the !DumpHeap command and for each address, reads a 64-bit value from offset 0x08 after the address. I think this is possible (not sure about it) but every attempt I made so far fails with some error.
I searched a lot but most WinDbg articles show simple examples which I can try but my attempts fail.
I have a process dump of an ASP.NET worker process. The process has some memory growth but there's no clear offender so I'm trying to list a number of objects that appear many times in memory. I'm using sos.dll for the managed debugging WinDbg extensions.
Here's what I'm trying to do
.foreach(myaddress {!dumpheap -short -mt 000007fe998adea8})
{r #$t0=poi(myaddress+0x8);!do #$t0;.echo ************* myaddress}
Note, that the above command must be on a single line - I only added a line break for better readability here.
For the above line, WinDbg prints this error: Couldn't resolve error at 'myaddress+0x8);!do #$t0;.echo ************* 00000001003cb870'.
I'm trying to iterate through all addresses returned by !DumpHeap - each address should go into the myaddress variable. Then, for each address, I'm trying to set the $t0 user register to the value read from myaddress+0x8. The !do (!DumpObject) command would then dump the object at that address.
If I run only (again, on one line in WinDbg):
.foreach(myaddress {!dumpheap -short -mt 000007fe998adea8})
{!do myaddress;.echo ************* myaddress}
I get a list of object dumps but this is one level higher than what I need. I want to drill down one level deeper and dump a particular member of these top-level objects that I'm iterating through.
Is this possible or am I on the wrong track with this?
After further searching, I found that I was using the wrong syntax. According to question and to MSDN, variable names must be surrounded by spaces or must be enclosed in ${...} to work. After I used the ${} enclosure, my script started working.
For future reference, here's how to run the script (keep it on one line in WinDbg):
.foreach(myaddress {!dumpheap -short -mt 000007fe998adea8})
{r #$t0=poi(${myaddress}+0x8);!do #$t0;.echo ************* myaddress}
yes you need space around the aliases
.foreach ( place { .shell -ci "!DumpHeap -stat" sed 1,3d | awk "{print
$1 }" } ) { .foreach (plays { !DumpHeap -short -mt place } ) { r $t0 =
poi( plays + 8 ) ; !do #$t0 ; .echo
========================================= } }

How do I convert my 5GB 1 liner file to lines based on pattern?

I have a 5GB 1 liner file with JSON data and each line starts from this pattern "{"created". I need to be able to use Unix commands on my Mac to convert this monster of a 1 liner into as many lines as it deserves. Any commands?
ASCII English text, with very long lines, with no line terminators
If you have enough memory you can open the file once with the TextWrangler application (the free BBEdit cousin) and use regular search/replace on the whole file. Use \r in replace to add a return. Will be very slow at opening the file, may even hang if low on memory, but in the end it may probably work. No scripting, no commands,.. etc.. I did this with big SQL files and sometimes it did the job.
You have to replace your line-start string with the same string with \n or \r or \r\n in front of it.
Unclear how it can be a “one liner” file but then each line starts with "{"created", but perhaps python -mjson.tool can help you get started:
cat your_source_file.json | python -mjson.tool > nicely_formatted_file.json
Piping raw JSON through ``python -mjson.tool` will cleanly format the JSON to be more human readable. More info here.
OS X ships with both flex and bison, you can use those to write a parser for your data.
You can use PHP as a shell command (if PHP is installed), just save a text file with name "myscript" and appropriate code (I cannot test code now, but the idea is as follows)
UNTESTED CODE
#!/usr/bin/php
<?php
$REPLACE_STRING='{"created'; // anything you like
// open input file with fopen() in read mode
$inFp=fopen('big_in_file.txt', "r");
// open output file with fopen() in write mode
$outFp=fopen('big_out_file.txt', "w+");
// while not end of file
while (!feof($inFp)) {
// read file chunks here with fread() in variable $chunk
$chunk = fread($inFp, 8192);
// do a $chunk=str_replace($REPLACE_STRING,"\r".$REPLACE_STRING; // to add returns
// (or use \r\n for windows end of lines)
$chunk=str_replace($REPLACE_STRING,"\r".$REPLACE_STRING,$chunk);
// problem: if chunk contains half the string at the end
// easily solved if $REPLACE_STRING is a one char like '{'
// otherwise test for fist char { in the end of $chunk
// remove final part and save it in a var for nest iteration
// write $chunk to output file
fwrite($outFp, $chunk);
// End while
}
?>
After you save it you must make it executable whith sudo chmod a+x ./myscript
and then launch it as ./myscript in terminal
After this, the myscript file is a full unix command

Has there ever been a unix system call to create a link from an open file descriptor? [duplicate]

In Unix, it's possible to create a handle to an anonymous file by, e.g., creating and opening it with creat() and then removing the directory link with unlink() - leaving you with a file with an inode and storage but no possible way to re-open it. Such files are often used as temp files (and typically this is what tmpfile() returns to you).
My question: is there any way to re-attach a file like this back into the directory structure? If you could do this it means that you could e.g. implement file writes so that the file appears atomically and fully formed. This appeals to my compulsive neatness. ;)
When poking through the relevant system call functions I expected to find a version of link() called flink() (compare with chmod()/fchmod()) but, at least on Linux this doesn't exist.
Bonus points for telling me how to create the anonymous file without briefly exposing a filename in the disk's directory structure.
A patch for a proposed Linux flink() system call was submitted several years ago, but when Linus stated "there is no way in HELL we can do this securely without major other incursions", that pretty much ended the debate on whether to add this.
Update: As of Linux 3.11, it is now possible to create a file with no directory entry using open() with the new O_TMPFILE flag, and link it into the filesystem once it is fully formed using linkat() on /proc/self/fd/fd with the AT_SYMLINK_FOLLOW flag.
The following example is provided on the open() manual page:
char path[PATH_MAX];
fd = open("/path/to/dir", O_TMPFILE | O_RDWR, S_IRUSR | S_IWUSR);
/* File I/O on 'fd'... */
snprintf(path, PATH_MAX, "/proc/self/fd/%d", fd);
linkat(AT_FDCWD, path, AT_FDCWD, "/path/for/file", AT_SYMLINK_FOLLOW);
Note that linkat() will not allow open files to be re-attached after the last link is removed with unlink().
My question: is there any way to re-attach a file like this back into the directory structure? If you could do this it means that you could e.g. implement file writes so that the file appears atomically and fully formed. This appeals to the my compulsive neatness. ;)
If this is your only goal, you can achieve this in a much simpler and more widely used manner. If you are outputting to a.dat:
Open a.dat.part for write.
Write your data.
Rename a.dat.part to a.dat.
I can understand wanting to be neat, but unlinking a file and relinking it just to be "neat" is kind of silly.
This question on serverfault seems to indicate that this kind of re-linking is unsafe and not supported.
Thanks to #mark4o posting about linkat(2), see his answer for details.
I wanted to give it a try to see what actually happened when trying to actually link an anonymous file back into the filesystem it is stored on. (often /tmp, e.g. for video data that firefox is playing).
As of Linux 3.16, there still appears to be no way to undelete a deleted file that's still held open. Neither AT_SYMLINK_FOLLOW nor AT_EMPTY_PATH for linkat(2) do the trick for deleted files that used to have a name, even as root.
The only alternative is tail -c +1 -f /proc/19044/fd/1 > data.recov, which makes a separate copy, and you have to kill it manually when it's done.
Here's the perl wrapper I cooked up for testing. Use strace -eopen,linkat linkat.pl - </proc/.../fd/123 newname to verify that your system still can't undelete open files. (Same applies even with sudo). Obviously you should read code you find on the Internet before running it, or use a sandboxed account.
#!/usr/bin/perl -w
# 2015 Peter Cordes <peter#cordes.ca>
# public domain. If it breaks, you get to keep both pieces. Share and enjoy
# Linux-only linkat(2) wrapper (opens "." to get a directory FD for relative paths)
if ($#ARGV != 1) {
print "wrong number of args. Usage:\n";
print "linkat old new \t# will use AT_SYMLINK_FOLLOW\n";
print "linkat - <old new\t# to use the AT_EMPTY_PATH flag (requires root, and still doesn't re-link arbitrary files)\n";
exit(1);
}
# use POSIX qw(linkat AT_EMPTY_PATH AT_SYMLINK_FOLLOW); #nope, not even POSIX linkat is there
require 'syscall.ph';
use Errno;
# /usr/include/linux/fcntl.h
# #define AT_SYMLINK_NOFOLLOW 0x100 /* Do not follow symbolic links. */
# #define AT_SYMLINK_FOLLOW 0x400 /* Follow symbolic links. */
# #define AT_EMPTY_PATH 0x1000 /* Allow empty relative pathname */
unless (defined &AT_SYMLINK_NOFOLLOW) { sub AT_SYMLINK_NOFOLLOW() { 0x0100 } }
unless (defined &AT_SYMLINK_FOLLOW ) { sub AT_SYMLINK_FOLLOW () { 0x0400 } }
unless (defined &AT_EMPTY_PATH ) { sub AT_EMPTY_PATH () { 0x1000 } }
sub my_linkat ($$$$$) {
# tmp copies: perl doesn't know that the string args won't be modified.
my ($oldp, $newp, $flags) = ($_[1], $_[3], $_[4]);
return !syscall(&SYS_linkat, fileno($_[0]), $oldp, fileno($_[2]), $newp, $flags);
}
sub linkat_dotpaths ($$$) {
open(DOTFD, ".") or die "open . $!";
my $ret = my_linkat(DOTFD, $_[0], DOTFD, $_[1], $_[2]);
close DOTFD;
return $ret;
}
sub link_stdin ($) {
my ($newp, ) = #_;
open(DOTFD, ".") or die "open . $!";
my $ret = my_linkat(0, "", DOTFD, $newp, &AT_EMPTY_PATH);
close DOTFD;
return $ret;
}
sub linkat_follow_dotpaths ($$) {
return linkat_dotpaths($_[0], $_[1], &AT_SYMLINK_FOLLOW);
}
## main
my $oldp = $ARGV[0];
my $newp = $ARGV[1];
# link($oldp, $newp) or die "$!";
# my_linkat(fileno(DIRFD), $oldp, fileno(DIRFD), $newp, AT_SYMLINK_FOLLOW) or die "$!";
if ($oldp eq '-') {
print "linking stdin to '$newp'. You will get ENOENT without root (or CAP_DAC_READ_SEARCH). Even then doesn't work when links=0\n";
$ret = link_stdin( $newp );
} else {
$ret = linkat_follow_dotpaths($oldp, $newp);
}
# either way, you still can't re-link deleted files (tested Linux 3.16 and 4.2).
# print STDERR
die "error: linkat: $!.\n" . ($!{ENOENT} ? "ENOENT is the error you get when trying to re-link a deleted file\n" : '') unless $ret;
# if you want to see exactly what happened, run
# strace -eopen,linkat linkat.pl
Clearly, this is possible -- fsck does it, for example. However, fsck does it with major localized file system mojo and will clearly not be portable, nor executable as an unprivileged user. It's similar to the debugfs comment above.
Writing that flink(2) call would be an interesting exercise. As ijw points out, it would offer some advantages over current practice of temporary file renaming (rename, note, is guaranteed atomic).
Kind of late to the game but I just found http://computer-forensics.sans.org/blog/2009/01/27/recovering-open-but-unlinked-file-data which may answer the question. I haven't tested it, though, so YMMV. It looks sound.

Resources