I am very new to NetLogo and have been using it to do basic network analysis. I have created a social network made up of 5 different turtle breeds. In order to continue my analysis in another program, I need to create an edge list (a two-column list of all the connected nodes in the network). So the first column would have the breed and who number (e.g. Actor 1) and the second column would list one of Actor 1's contacts
(e.g. [Actor1, Actor2] [Actor1, Director5] [Actor1, Producer1] ...)
The output needs to be a txt or csv file so that I can import it easily into Excel.
I've tried:
to make-edgelist
  file-open "Test.txt"
  ask links [ file-show both-ends ]
  ask turtles [ file-show who ]
  file-close
end
The problem is that 'both-ends' only reports the who number, not the breed. I can get the breed by using ask turtles [file-show who], but this appends the identification to the end of the edge list, which means a lot of manipulation to get things in the correct format. Does anyone have any suggestions about how to build the edge list with the breeds + who numbers? I feel like I'm probably missing something simple, but I am new to NetLogo. Thanks!
The csv extension makes this a one-liner. Assuming you have extensions [ csv ] at the top of your code, you can just do:
csv:to-file "test.csv" [ [ (word breed " " who) ] of both-ends ] of links
If you need column titles, you can add them using fput, e.g.:
csv:to-file "test.csv"
fput ["source" "target"]
[ [ (word breed " " who) ] of both-ends ] of links
Note, however, that both-ends is an agentset that will always be accessed in random order, so "source" and "target" are not very meaningful in that case.
If you have directed links and if the direction is important, you can preserve it with this slightly more complicated version:
csv:to-file "test.csv"
fput ["source" "target"]
[ map [ t -> [ (word breed " " who) ] of t ] (list end1 end2) ] of links
I'm assuming that whatever program you're using for your analysis can organize and rearrange your pairs as needed, so as long as all pairs are recorded your network should build fine regardless of order. Your approach using ask links and both-ends is a good one, and works with a bit of tweaking (mainly just using the breed primitive to have the turtles include their breed in the output). Here's one example:
to pairs-out
  file-open "test.csv"
  file-type (word "End_1, " "End_2,\n")
  ask links [
    ask both-ends [
      file-type (word breed " " who ",")
    ]
    file-type "\n"
  ]
  file-close
end
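With the same hypothetical actor/director breeds, that writes a file along these lines (note the trailing comma left by the last file-type in each row, and that both-ends reports the two turtles in random order):
End_1, End_2,
actors 0,directors 2,
actors 3,actors 7,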
I have a query written out where one of the lines is as follows:
[individualNode IN listOfNodes | [(individualNode)-[:CONNECTED_WITH]->(otherNode) | {node:otherNode, similarity:individualNode['similarity']}]] AS connectionMap
listOfNodes is a list of maps.
An example of one of the maps in the list is:
{
  "similarity": 0.25,
  "node": {
    "identity": 12345,
    "labels": [
      "Label1",
      "Label2"
    ],
    "properties": {
      yada..yada..
    }
  }
}
The issue here is that, since individualNode is a map, the statement (individualNode)-[:CONNECTED_WITH]->(otherNode) will fail.
So my question is: how do I access the node to use in the MATCH statement, but still retain the map so I can grab the similarity value?
Disclaimer: I know node is a special word in Cypher; I only used it here so you know what I am talking about in the map. That's not how it is in my actual query.
I also changed the names of things because I cannot reveal the actual information in the map.
I have tried to write it as (individualNode.node)-[:CONNECTED_WITH]->(otherNode) or (individualNode['node'])-[:CONNECTED_WITH]->(otherNode) but both throw errors too.
Do you want to use it later?
You can access maps with their keys, e.g. map.node.
If you want to use that in a pattern, you have to alias it with an identifier, e.g. WITH map.node AS startNode MATCH (startNode)-->(...).
If you have a list of node maps like in your case, you can either walk through it in a pattern comprehension again, like you already did, or you can use UNWIND to turn the list into rows:
UNWIND listOfNodesMaps AS map
WITH map.node AS startNode, map.similarity AS similarity
MATCH (startNode)-->(...)
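Putting that together with the relationship type from your query, a minimal sketch (assuming map.node holds an actual node rather than just its map representation, so it can be used in a pattern):
UNWIND listOfNodes AS entry
WITH entry.node AS startNode, entry.similarity AS similarity
MATCH (startNode)-[:CONNECTED_WITH]->(otherNode)
RETURN otherNode, similarity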
I am currently struggling quite seriously to extract (important) data from a tricky JSON file in R.
Well, technically it's a file extracted from a MongoDB database, but that doesn't change the problem at all when it comes to the structure issue.
First of all, here is the nasty JSON with its structure (viewed in an online JSON editor). It's a short version (4 objects), but enough to see the problems with lists in columns:
[Image: structure of the JSON file]
I scanned most of the StackOverflow topics explaining how to deal with nested lists from JSON but couldn't manage to solve the problem yet. For simplification, I will leave out the Mongo query and use the absolute path of my JSON file instead. It has a structure similar to the JSON I linked above.
The code I used is extremely simple; I took it from a StackOverflow topic about a similar issue:
require(tidyverse)
require(jsonlite)
# Simple import with jsonlite
data <- fromJSON("C:/mypersonnalpath/apps.json", flatten = TRUE) # flatten = TRUE is supposed to flatten list columns
data <- data[, c("app.id", "app.hosts", "app.fluxList", "app.interventions", "app.microservices")] # for simplification, I select the annoying columns plus app.id
This little code gives me something like this:
[Image: the resulting data frame, with nested lists in the cells]
It's like having tables in cells, sometimes with one observation, sometimes with three, sometimes 2 columns, sometimes 3... It's tricky to deal with a complex JSON structure at my level; I am far from being a beast with R...
I did try the unnest() function from the tidyverse, but it doesn't work and throws this error message (example with the column app.hosts to keep it as simple as possible):
> unnest(data, app.hosts)
Error: Each column must either be a list of vectors or a list of data frames
[app.hosts]
Writing a loop to convert all these NULL or list() values into lists of vectors didn't help, of course; either the problem is deeper, or my loop is bad (definitely possible too), or both (highest probability). But something's wrong somewhere and I can't put my finger on what it is.
If I could flatten this stupid column app.hosts, that would already be a good step ahead...
I don't even know if it's possible to flatten this kind of column. I'm kinda confused now :/
Thank you in advance for everything. I will add more info if necessary. I hope one of you guys will have a good idea about this; the JSON file should work in R.
Joe
Edit:
Dealing with these lists is quite tricky, and I don't really know what the best option would be to gather all the info clearly for an upcoming analysis (network graphs and so on). But I already know it will increase the size of the dataset by repeating lines with only one change.
Anyway, in my mind, this would give something like this:
For example, for the first row in the image I linked (app.id = 45):
app.id | app.hosts.host.hostname1 | app.hosts.host.env1 | app.hosts.host.notes1 | app.hosts.host.ip1 | ...
45     | "ATOME"                  | ""                  | "YYY Actionnaraire"   | NA                 | ...
And repeat these columns for the second element in the list, the third if necessary, the fourth...
The app.hosts list in the JSON looks like this, with 3 app.hosts.host entries that I will have to transform into columns. Same for the three other columns that contain lists and that are visible in the picture I posted:
"hosts": [
{
"host": {
"hostname": "ATOME",
"env": "",
"notes": "YYY Actionnaraire"
}
},
{
"host": {
"hostname": "SQL-BUR",
"env": "Prod",
"notes": "CHL;Server SQL"
}
},
{
"host": {
"hostname": "",
"ip": ""
}
}
]
It's a little bit annoying :[
Or, if someone knows how to extract the info for an analysis with graphs and other goodies in R, I won't have to flatten this stuff...
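For what it's worth, a minimal sketch of one way to build that repeated-lines format with purrr, assuming the top-level objects have the field names shown in the hosts snippet above:
library(jsonlite)
library(purrr)
library(dplyr)

# Parse without simplification so we get plain nested lists
apps <- fromJSON("C:/mypersonnalpath/apps.json", simplifyVector = FALSE)

# One row per (app, host) pair; field names assumed from the snippet above
hosts_long <- map_dfr(apps, function(a) {
  map_dfr(a$app$hosts, function(h) {
    tibble(
      app.id   = a$app$id,
      hostname = h$host$hostname %||% NA_character_,
      env      = h$host$env      %||% NA_character_,
      notes    = h$host$notes    %||% NA_character_,
      ip       = h$host$ip       %||% NA_character_
    )
  })
})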
I am a fairly novice, self-taught programmer using Scilab. I have .csv files that I want to read. They contain mixed text and numerical values and have variable numbers of columns and rows. The part of the file I am interested in has a fixed number of columns but not rows. I can skip the first part using the header argument, but there are also cells at the bottom that I do not need. An example of what it could look like:
DATA,1,0,3,3960.4,3236,3373,-132
DATA,1,0,4,4544.5,3530,3588,-76
RANDOM TEXT,0
INFO,1,0,#+BHO0 _:WRF&-11,S%00-0-03-1
INFO,2,1,#*BHO0 _8WRF&-11,NAS%00-0-15-1
I am only interested in the lines that start with DATA. If I try to run csvRead without removing the lines below, I get this error:
Warning: Inconsistency found in the columns. At line 4993, found 2 columns
while the previous had 8.
I currently have a program that will read the file and manipulate it as required, but I have to go into each file and delete the bottom rows. Is there a way to get around this?
My current program looks something like this:
D = uigetfile([".csv"], "path", "Choose a file name", %t);
filename = fullfile(D);
sub = ["DATA" "0"];
// Import data
data = csvRead(filename, ',', [], 'string', sub, [], [], 34);
edit(filename)
// Determine the number of rows
data_size = size(data);
limit = data_size(1);
Any ideas?
It is not possible to specify that csvRead should ignore lines with fewer columns, or to substitute a default set of values or anything like that (which would be nice).
A workaround in your case could be to parse only the lines starting with DATA. This can be accomplished with regular expressions.
The regexpcomments argument of csvRead makes it possible to ignore lines in the csv file that match a certain regular expression. On top of that, it is also possible to write a regular expression that matches all strings that do not contain a certain pattern:
/^(?:(?!PATTERN).)*$/    // matches strings not containing PATTERN
Applied to your case, all lines not containing DATA are treated as comments and are thus ignored.
In code, that means something like the following:
filename = fullfile('data.csv');
sub = ["DATA" "0"];
// Import data
number_of_header_lines = 1;
read_only_lines_starting_with = 'DATA';
regexp_magic = '/^(?:(?!' + read_only_lines_starting_with + ').)*$/';
data = csvRead(filename, ',', [], 'string', sub, regexp_magic, [], number_of_header_lines);
disp(data)
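Since csvRead returns everything as strings here, you will probably still want the numeric part as numbers afterwards. A minimal sketch (the column range 2:8 is assumed from the sample DATA lines above):
// Hypothetical follow-up: columns 2 to 8 of the DATA lines are numeric
numbers = strtod(data(:, 2:8));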
I am hoping to perform a series of edits to a large text file composed almost entirely of single letters, separated by spaces. The file is about 300 rows by about 400,000 columns, and about 250 MB.
My goal is to transform this table using a series of steps, for eventual processing with another language (R, probably). I don't have much experience working with big data files, but Perl has been suggested to me as the best way to go about this. Please let me know if there is a better way :).
So, I am hoping to write a Perl script that does the following:
Open the file, and edit or write to a new file the following:
- remove columns 2-6
- merge/concatenate pairs of columns, starting with column 2 (so, merge columns 2-3, 4-5, etc.)
- replace each character pair according to a sequential conditional algorithm running across each row (example pseudocode: if character 1 of cell = character 2 of cell = a, then cell = 1; else if character 1 of cell = character 2 of cell = b, then cell = 2; etc.), such that, except for the first column, the table is a numerical matrix
- remove every nth column, or keep every nth column and remove all others
I am just starting to learn Perl, so I was wondering whether these operations are possible in Perl, whether Perl would be the best way to do them, and if there are any suggestions for syntax on these operations in the context of reading/writing to a file.
I'll start:
use strict;
use warnings;

while (<>) {
    chomp;
    my @cols = split ' ';    # split on whitespace
    splice(@cols, 1, 5);     # remove columns 2-6
    my @transformed = ($cols[0]);
    # merge pairs of columns, starting with the (new) second column
    for (my $i = 1; $i < @cols; $i += 2) {
        push @transformed, $cols[$i] . ($cols[$i + 1] // '');
    }
    # other transforms as required
    print join(' ', @transformed), "\n";
}
That should get you on your way.
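For the pair-to-number replacement step, here is a minimal sketch that could run as a second pass over the merged output (the letter pairs and their numeric codes are hypothetical, following the pseudocode in the question):
use strict;
use warnings;

# Hypothetical mapping per the pseudocode: 'aa' -> 1, 'bb' -> 2, etc.
my %code = (aa => 1, bb => 2, cc => 3, dd => 4);

while (<>) {
    chomp;
    my @cells = split ' ';
    # leave the first column alone; recode the merged letter pairs
    for my $i (1 .. $#cells) {
        $cells[$i] = $code{ $cells[$i] } // $cells[$i];
    }
    print join(' ', @cells), "\n";
}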
You need to post some sample input and expected output, or we're just guessing what you want, but maybe this will be a start:
awk '{
  printf "%s ", $1
  for (i = 7; i <= NF; i += 2) {
    printf "%s%s ", $i, $(i+1)
  }
  print ""
}' file
Setting:
I have (simple) .csv and .dat files created from laboratory devices and other programs storing information on measurements or calculations. I have found solutions for this in other languages, but not for R.
Problem:
Using R, I am trying to extract values to quickly display results without opening the created files. I have two typical settings here:
a) I need to read a priori unknown values after known keywords
b) I need to read lines after known keywords or lines
I can't make functions such as scan() and grep() work.
c) Finally, I would like to loop over dozens of files in a folder and get a summary (to make the picture complete: I will manage this part).
I would appreciate any form of help.
OK, it works for the key value (although it is perhaps not very nice):
variable <- scan("file.csv", what = character(), sep = "")
returns a character vector of everything.
variable[grep("keyword", variable) + 2] # +2 because the actual value is stored two places ahead
returns the sought values as characters.
as.numeric(gsub(",", ".", variable))
For completeness: the data had to be converted to numeric, and the comma-vs-period decimal separator problem needed to be solved.
In one line:
data <- as.numeric(gsub(",", ".", variable[grep("Ks_Boden", variable) + 2]))
Perseverance is not too bad of an asset ;-)
The rest isn't finished yet; I will post once it is.
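For setting (b), reading lines after a known keyword, a minimal sketch (the keyword "RESULTS" and the count of five lines are hypothetical; adapt them to your files):
lines <- readLines("file.csv")
start <- grep("RESULTS", lines)[1]       # first line containing the keyword
block <- lines[(start + 1):(start + 5)]  # the five lines after it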