creating multi-dimensional array from textfile in perl [closed]

The file temp.txt has contents like this:
ABC 1234 56 PQR
XYZ 8672 12 RQP
How do I store the contents of temp.txt in a two-dimensional array, so that I can access the fields through array indices?

use File::Slurp;
use Data::Dumper;
# Split each line on whitespace into its own anonymous array
my @arr = map [split], read_file("temp.txt");
print Dumper \@arr;
Output:
$VAR1 = [
          [
            'ABC',
            '1234',
            '56',
            'PQR'
          ],
          [
            'XYZ',
            '8672',
            '12',
            'RQP'
          ]
        ];

At a minimum, you could do this:
my @file = load_file($filename);

sub load_file {
    my $filename = shift;
    open my $fh, "<", $filename or die "load_file cannot open $filename: $!";
    my @file = map [ split ], <$fh>;
    return @file;
}
This will read the named file, split each line on whitespace into an array ref (one array ref per line), and return the list of array refs. On exiting the subroutine, the file handle is closed.
This is a somewhat clunky solution: it loads the entire file into memory, and looking up a specific value is not particularly fast. If you have a unique key in each row, you can use a hash instead of an array to make lookup faster:
my %file = map { my ($key, @vals) = split; $key => \@vals; } <$fh>;
Note that the keys must be unique, or they will overwrite each other.
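For example, with the sample file above, a lookup by the first column then becomes a single hash access (a sketch, assuming those first-column values are unique):

open my $fh, "<", "temp.txt" or die "Cannot open temp.txt: $!";
my %file = map { my ($key, @vals) = split; $key => \@vals } <$fh>;
print $file{ABC}[1];    # prints "56" without scanning every row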
Or you can use Tie::File to only look up the values you want:
use Tie::File;
tie my @file, 'Tie::File', $filename or die "Cannot tie file: $!";
my $line = [ split ' ', $file[0] ];
Or if the lines of your file use a specific delimiter and comply with the CSV format, you can use Tie::Array::CSV:
use Tie::Array::CSV;
tie my #file, 'Tie::Array::CSV', $filename, sep_char => ' '
or die "Cannot tie file: $!";
my $line = $file[0];
Note that using this module might be overkill, and it might cause problems if you do not have a strict CSV format. Also, Tie::File has a reputation for degrading performance. Which solution is best depends largely on your needs and preferences.

Related

Producing files in dagster without caring about the filename

In the dagster tutorial, in the Materializations section, we choose a filename (sorted_cereals_csv_path) for our intermediate output, and then yield it as a materialization:
@solid
def sort_by_calories(context, cereals):
    # Sort the data (removed for brevity)
    sorted_cereals_csv_path = os.path.abspath(
        'calories_sorted_{run_id}.csv'.format(run_id=context.run_id)
    )
    with open(sorted_cereals_csv_path, 'w') as fd:
        writer = csv.DictWriter(fd, fieldnames)
        writer.writeheader()
        writer.writerows(sorted_cereals)
    yield Materialization(
        label='sorted_cereals_csv',
        description='Cereals data frame sorted by caloric content',
        metadata_entries=[
            EventMetadataEntry.path(
                sorted_cereals_csv_path, 'sorted_cereals_csv_path'
            )
        ],
    )
    yield Output(None)
However, this relies on being able to use the local filesystem (which may not be true), the file will likely get overwritten by later runs (which is not what I want), and it also forces us to come up with a filename that will never be used.
What I'd like to do in most of my solids is just say "here is a file object, please store it for me", without concerning myself with where it's going to be stored. Can I materialize a file without considering all these things? Should I use python's tempfile facility for this?
Actually it seems this is answered in the output_materialization example.
You basically define a type:
@usable_as_dagster_type(
    name='LessSimpleDataFrame',
    description='A more sophisticated data frame that type checks its structure.',
    input_hydration_config=less_simple_data_frame_input_hydration_config,
    output_materialization_config=less_simple_data_frame_output_materialization_config,
)
class LessSimpleDataFrame(list):
    pass
This type has an output_materialization strategy that reads the config:
def less_simple_data_frame_output_materialization_config(
    context, config, value
):
    csv_path = os.path.abspath(config['csv']['path'])
    # Save data to this path
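A hypothetical completion of that save step (assuming value is a list of dicts, as in the tutorial's cereal data, and a dagster version whose materialization config functions return a Materialization):

    with open(csv_path, 'w') as fd:
        writer = csv.DictWriter(fd, fieldnames=sorted(value[0].keys()))
        writer.writeheader()
        writer.writerows(value)
    return Materialization(
        label='data_frame_csv',
        metadata_entries=[EventMetadataEntry.path(csv_path, 'csv_path')],
    )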
And you specify this path in the config:
execute_pipeline(
    output_materialization_pipeline,
    {
        'solids': {
            'sort_by_calories': {
                'outputs': [
                    {'result': {'csv': {'path': 'cereal_out.csv'}}}
                ],
            }
        }
    },
)
You still have to come up with a filename for each intermediate output, but you can do it in the config, which can differ per-run, instead of defining it in the pipeline itself.
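For instance, a hypothetical run config that embeds a unique id in the path, so successive runs don't overwrite each other:

import uuid

execute_pipeline(
    output_materialization_pipeline,
    {
        'solids': {
            'sort_by_calories': {
                'outputs': [
                    {'result': {'csv': {'path': 'cereal_out_%s.csv' % uuid.uuid4()}}}
                ],
            }
        }
    },
)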

Filemaker Pro - Using Script to populate report layout

I have a problem where I have a list of fields from a table (not static; it can be modified by the user), and I need to generate a report using these user-selected fields. The report can show all the rows; there is no need for aggregation or filtering.
I thought I could create a report layout and then use a FileMaker script to populate it, but I can't seem to find the right commands. Can someone let me know how I could achieve this?
I'm using FileMaker Pro 18 Advanced.
Thanks in advance!
EDIT: Since you want a dynamic report, I recommend you look up a technique called "Virtual List" for rendering the data.
Here's an example script that iterates over a found set of records and builds the virtual list data in a variable (it doesn't show how to render it though):
# Field names and delimiter
Set Variable [ $delim ; Value: Char(9) // tab character ]
# Set these dynamically with a script parameter
Set Variable [ $fields ; Value: List ( "Contacts::nameFirst" ; "Contacts::nameCompany" ; "Contacts::nameLast" ) ]
Set Variable [ $fieldCount ; Value: ValueCount ( $fields ) ]
Go to Layout [ “Contacts” (Contacts) ; Animation: None ]
Show All Records
Go to Record/Request/Page [ First ]
# Loop over all the records and append a row in the $data variable for each
Set Variable [ $data ; Value: "" ]
Loop
    # Get the delimited field values
    Set Variable [ $i ; Value: 0 ]
    Set Variable [ $row ; Value: "" ]
    Loop
        Exit Loop If [ Let ( $i = $i + 1 ; $i > $fieldCount ) ]
        Set Variable [ $value ; Value: GetField ( GetValue ( $fields ; $i ) ) ]
        Insert Calculated Result [ Target: $row ; If ( $i > 1 ; $delim ) & $value ]
    End Loop
    # Append the new row of data to the list variable
    Insert Calculated Result [ Target: $data ; If ( Get ( RecordNumber ) > 1 ; ¶ ) & $row ]
    Go to Record/Request/Page [ Next ; Exit after last: On ]
End Loop
# Save to a global variable to show in a virtual list layout
Set Variable [ $$DATA ; Value: $data ]
Exit Script [ Text Result: ]
Please note this code is just one of many possible formats the virtual list can take. A lot of people, myself included, prefer to use JSON objects or arrays for each row of the list, since that automatically handles field values containing carriage returns. This is sort of the old-fashioned way. Kevin Frank at FileMaker Hacks has some good recent articles about virtual list techniques if you're interested.
PS, another great technique for rendering table data dynamically is to collect the data in a JSON array and render it in a webviewer with https://datatables.net/
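As an illustration, the $row step above could be built as a JSON object instead (a sketch, assuming FileMaker 16+ for the built-in JSON functions):

Set Variable [ $row ; Value: JSONSetElement ( "{}" ;
    [ "nameFirst" ; Contacts::nameFirst ; JSONString ] ;
    [ "nameCompany" ; Contacts::nameCompany ; JSONString ] ;
    [ "nameLast" ; Contacts::nameLast ; JSONString ]
) ]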
I did something like this for the oncology department of UM in 1980 or so using 4th Dimension and a new plug-in that could create a web browser, with all the functions a doctor might want, in one line of code. The data was placed inside a variable as it was sent/returned, and 4D could use a variable in the report to display the data.
FileMaker does not have this ability built in as 4D did, so you will have to do it yourself. JSON is the most likely tool that I am familiar with. YouTube has many videos on JSON.
You have two classes of variables for your report: column headers and the column data to display. Fortunately, FileMaker layouts are quite good and very easy to design. Just make a typical report and replace the static header text and field names with variables, e.g. $ColumnName, populated from your JSON.
Create a JSON calculated field in the database; in that calculated field set the JSON variable, and this can be used for all of the columns.
This is the essence of the idea, with the final result to be determined by you. What you are asking for is not easy and would require serious work by a skilled JSON scripter.

SQLite: isolating the file extension from a path

I need to isolate the file extension from a path in SQLite. I've read the post here (SQLite: How to select part of string?), which gets 99% there.
However, the solution:
select distinct replace(column_name, rtrim(column_name, replace(column_name, '.', '' ) ), '') from table_name;
fails if a file has no extension (i.e. no '.' in the filename), for which it should return an empty string. Is there any way to trap this please?
Note that the filename in this context is the bit after the final '\' - it shouldn't be searching for '.'s in the full path, as it does at the moment.
I think it should be possible to do it using further nested rtrims and replaces. Thanks.

Yes, you can do it like this:
1) Create a scalar function called "extension" in QtScript in SQLiteStudio
2) The code is as follows:
if ( arguments[0].substring(arguments[0].lastIndexOf('\u005C')).lastIndexOf('.') == -1 )
{
    // No '.' after the final backslash ('\u005C'): the file has no extension
    return ("");
}
else
{
    return arguments[0].substring(arguments[0].lastIndexOf('.'));
}
3) Then, in the SQL query editor you can use
select distinct extension(PATH) from DATA
... to itemise the distinct file extensions from the column called PATH in the table called DATA.
Note that the PATH field must contain a backslash ('\') in this implementation - i.e. it must be a full path.
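If a custom function is not an option, the same rtrim/replace idea can be nested in plain SQL (a sketch, assuming PATH always contains a '\'; instr needs SQLite 3.7.15+):

select distinct
    case
        when instr(filename, '.') = 0 then ''
        else replace(filename, rtrim(filename, replace(filename, '.', '')), '')
    end as extension
from (
    select replace(PATH, rtrim(PATH, replace(PATH, '\', '')), '') as filename
    from DATA
);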

pattern matching and delete all the lines except the last occurrence

I have a txt file with 100+ lines. I want to search for a pattern and delete all the matching lines except the last occurrence.
Here are the lines from the txt file.
My pattern searches are "string1=", "string2=", "string3=", "string4=" and "string5=".
string1=hi
string2=hello
string3=welcome
string3=welcome1
string3=
string4=hi
string5=hello
I want to go through each line, keep the empty "string3=" in the file, and remove "string3=welcome" and "string3=welcome1".
Please help me.
For a single pattern, you can start with something like this:
grep "string3" input | tail -1
#!/usr/bin/perl
my %h;
while (<STDIN>) {
    # A limit of 2 keeps any '=' that appears inside the value
    my ($k, $v) = split /=/, $_, 2;
    $h{$k} = $v;    # later occurrences overwrite earlier ones
}
foreach my $k ( sort keys %h ) {
    print "$k=$h{$k}";
}
The Perl script here will take your list on stdin and produce the output you describe. This assumes you want the keys (string*) in sorted order.
If you only want the lines that start with string1-5, you can put a match at the beginning of the while loop, like so:
next if ! /^string[1-5]=/;
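With the sample lines saved as input.txt and the script as keep_last.pl (both names hypothetical), it would run as:

perl keep_last.pl < input.txt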

Function to create the array by reading the file

I am creating scripts that will store the contents of a pipe-delimited file. Each column is stored in a separate array. I then read the information from the arrays and process it. There are 20 pipe-delimited files, and I need to write 20 scripts. The processing that happens in each script after the information is stored in the arrays is different. The number of columns in each pipe-delimited file is different (but in no case more than 9 columns). I need to do this activity of storing the information in the arrays at the beginning of each script. The way I am doing it at present is given below. I would like help understanding how I can write a function to do this.
cat > example_file.txt <<End-of-message
some text first row|other text first row|some other text first row
some text nth row|other text nth row|some other text nth row
End-of-message
# Note that example_file.txt will be available; I have created it inside the script just to show the format of the file
OIFS=$IFS
IFS='|'
i=0
while read -r first second third ignore
do
    first_arr[$i]=$first
    second_arr[$i]=$second
    third_arr[$i]=$third
    (( i=i+1 ))
done < example_file.txt
IFS=$OIFS
Here is a sort-of minimal change to your script that should get you further...
...
...
while read -r first second third ignore
do
    arr0[$i]=$first
    arr1[$i]=$second
    arr2[$i]=$third
    (( i=i+1 ))
done < example_file.txt
IFS=$OIFS
proc0 () {
    for j in "$@"; do
        echo proc0 : "$j"
    done
}

proc1 () {
    echo proc1
}

proc2 () {
    echo proc2
}

for i in 0 1 2; do
    t=arr$i'[@]'
    proc$i "${!t}"
done
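Going further, since the question asks for a function: here is a minimal sketch that loads a pipe-delimited file into caller-named arrays (assumes bash 4.3+ for namerefs; load_file and the array names are hypothetical):

load_file () {
    local file=$1
    local -n _c1=$2 _c2=$3 _c3=$4    # namerefs to the caller's arrays
    local i=0 f1 f2 f3 ignore
    while IFS='|' read -r f1 f2 f3 ignore
    do
        _c1[i]=$f1
        _c2[i]=$f2
        _c3[i]=$f3
        (( i=i+1 ))
    done < "$file"
}

load_file example_file.txt first_arr second_arr third_arr
echo "${second_arr[0]}"    # -> other text first row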
