Link array inside CSV for Neo4j - graph

I have a file with 3 columns, one of which contains an "array" delimited by, say, ",". I need to link the items inside the array to form something like a linked list, which is then linked to the other 2 columns.
For example:
Column 1 (Text): A
Column 2 (Array of text): B1, B2, B3, B4
Column 3 (Text): C
I need something like A->B1->B2->B3->B4->C to be visualised in Neo4j.
I need help in forming the "LOAD CSV..." query. Appreciate any help offered!

You can use split() to extract each element of the array column:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM
'file:///directory/file.csv' AS line
WITH split(line.columnName, ',') AS arrayColumn
Now you can refer to each element of arrayColumn by index, e.g.
arrayColumn[0], arrayColumn[1]
and then create nodes and relationships from them. Merge each node under its own variable first, then merge the relationship between them:
MERGE (a:LabelName {name: arrayColumn[0]})
MERGE (b:LabelName {name: arrayColumn[1]})
MERGE (a)-[:relations]->(b)
Hope this helps ...
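To make the linked-list idea from the question concrete, here is a minimal Python sketch of how one CSV row expands into the chain A->B1->B2->B3->B4->C. It is only an illustration of the pairing logic, not part of the Cypher answer above. The column names col1, col2, col3 and the helper chain_pairs are hypothetical.

```python
import csv
import io


def chain_pairs(first, array_text, last, delimiter=","):
    """Return the consecutive (source, target) pairs of the chain
    first -> array items -> last. Hypothetical helper for illustration."""
    # Split the array column and trim the spaces left around each item.
    middle = [item.strip() for item in array_text.split(delimiter) if item.strip()]
    chain = [first.strip(), *middle, last.strip()]
    # Pair every element with its successor: (A, B1), (B1, B2), ...
    return list(zip(chain, chain[1:]))


# One sample row shaped like the example in the question (column names assumed).
sample = io.StringIO('col1,col2,col3\nA,"B1, B2, B3, B4",C\n')
for row in csv.DictReader(sample):
    pairs = chain_pairs(row["col1"], row["col2"], row["col3"])
    print(pairs)
```

Each pair corresponds to one relationship between two nodes, so the same consecutive pairing is what a query would need to produce for every row.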

Related

Reading XML files in R, obtain values of different elements, and linking them by the text of attribute's node

I'm trying to import and process various XML files using R. Each XML file can contain different variables from various individuals. I would like to identify the values linked with each individual. The output should be a dataframe/table where each row is an individual and each column a variable contained in the XML.
For example, I have the following XML file:
<DatosE xmlns:ns0="tmp" xmlns:ns1="aux">
<ns0:DatosE>
<ns0:Cap>
<ns0:Code>1000</ns0:Code>
<ns0:Year>2022</ns0:Year>
</ns0:Cap>
<ns1:DataBody>
<ns1:RealData>
<ns1:IndividualData identity="1" name="AAA">
<ns1:DataA>
<ns1:Label1>2300.32</ns1:Label1>
<ns1:Label2>5600.90</ns1:Label2>
<ns1:Label3>87</ns1:Label3>
</ns1:DataA>
<ns1:DataB>
<ns1:DataB2>
<ns1:Label4>4500.34</ns1:Label4>
<ns1:Label5>23.20</ns1:Label5>
<ns1:Label6>10000.50</ns1:Label6>
</ns1:DataB2>
</ns1:DataB>
</ns1:IndividualData>
<ns1:IndividualData identity="2" name="BBB">
<ns1:DataA>
<ns1:Label1>4560.24</ns1:Label1>
<ns1:Label2>896.30</ns1:Label2>
<ns1:Label3>790.3</ns1:Label3>
</ns1:DataA>
<ns1:DataB>
<ns1:DataB2>
<ns1:Label4>2004.78</ns1:Label4>
<ns1:Label7>890</ns1:Label7>
<ns1:Label8></ns1:Label8>
</ns1:DataB2>
</ns1:DataB>
</ns1:IndividualData>
</ns1:RealData>
</ns1:DataBody>
</ns0:DatosE>
The output I would like to obtain is something similar to this:
Identify  Name  Label1   Label2   Label5  Label6    Label7  Label8
1         AAA   2300.32  5600.90  23.20   10000.50  NA      NA
2         BBB   4560.24  896.30   NA      NA        890     0
I want to read the values of the different elements in the XML nodes and link each value to the individual it belongs to. Each individual is identified by the attributes (identity and name) of the "ns1:IndividualData" node.
I've tried the 'xmlToDataFrame' function (XML package) with XPath syntax, but I don't know how to obtain the values of the identity and name attributes. I can read the values of the nodes I want, but not in a way that lets me link the different data together.
I've tried the following function:
xmlToDataFrame(nodes = getNodeSet(xmlParse("xmlGGG.xml"), "//ns1:DataA |
//ns1:DataB2", namespaces = xml_ns(read_xml("xmlGGG.xml"))))
I have also looked into the "xml2" package, but without success.
Does anyone know how I can read the values of the different nodes/elements of my XML and link them all using the attributes that indicate which individual they belong to?
Thank you.

Loop through a comma separated text and fix value with variables or get a first column, second column etc. to define variables with a value

I want to loop through a comma-separated text file.
For example:
mytext <- "3,24,25,276,2,87678,20-07-2022,1,5"
I would like to loop through mytext like this:
for (i in 1:length(mytext)) {
  print(mytext[[i]])
}
I need it to display:
3
24
25
276
2
87678
20-07-2022
1
5
Actually, I need to set every value as an individual variable, like:
variable1:3
variable2:24
variable3:25
variable4:276
variable5:2
variable6:87678
variable7:20-07-2022
variable8:1
variable9:5
(My project retrieves data from a text file and then runs database validations in R before entering records into the database.)
Could anyone help me out? Thanks in advance.
Split your string:
strsplit(mytext, ",")[[1]]
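For comparison, the same split-then-name idea can be sketched in Python. This is an illustration under assumptions rather than part of the R answer: it treats mytext as a single comma-separated string, and the variableN names are taken from the question. It stores the values in a dictionary instead of creating nine separate variables, which is usually easier to validate in a loop.

```python
# The comma-separated line from the question, held as one string.
mytext = "3,24,25,276,2,87678,20-07-2022,1,5"

# Split on the delimiter to get the individual values (all still strings).
values = mytext.split(",")

# Name each value variable1, variable2, ... by position, starting at 1.
variables = {f"variable{i}": value for i, value in enumerate(values, start=1)}

for name, value in variables.items():
    print(f"{name}:{value}")
```

Every value stays a string after splitting, so numeric or date checks would need an explicit conversion before validation.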

SQLite - Storing multiple values in same column

In SQLite, how would you store information like this:
id name groups
1 xyz one,two
2 abc one
3 lmn two,three
The groups column may hold multiple entries. How can we store data like that?
The main requirement is that new values can be appended to the existing ones.
I'm not sure I understand it correctly, but why not store it as a delimited string? Something like string1;string2;string3, or use a comma instead of a semicolon as in your example.
Just fetch the row, append the data followed by your delimiter and update the record. When you need the individual entries, just split the string using your delimiter.
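The fetch, append, update and split steps described above could look roughly like this with Python's built-in sqlite3 module. This is a sketch under assumptions: the table name items, the helpers append_group and get_groups, and the in-memory database are all hypothetical, and the column name is quoted as "groups" to keep it unambiguous as an identifier.

```python
import sqlite3

DELIMITER = ","


def append_group(conn, row_id, new_group):
    """Fetch the row, append new_group after the delimiter, update the record."""
    (current,) = conn.execute(
        'SELECT "groups" FROM items WHERE id = ?', (row_id,)
    ).fetchone()
    updated = f"{current}{DELIMITER}{new_group}" if current else new_group
    conn.execute('UPDATE items SET "groups" = ? WHERE id = ?', (updated, row_id))


def get_groups(conn, row_id):
    """Return the individual entries by splitting the stored string."""
    (current,) = conn.execute(
        'SELECT "groups" FROM items WHERE id = ?', (row_id,)
    ).fetchone()
    return current.split(DELIMITER) if current else []


# In-memory database seeded with the rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, "groups" TEXT)')
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(1, "xyz", "one,two"), (2, "abc", "one"), (3, "lmn", "two,three")],
)

append_group(conn, 2, "three")
print(get_groups(conn, 2))
```

This only works if no value ever contains the delimiter itself.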

data frame accessing specific rows and col from csv file in R programming

I have a CSV file containing the iPhone device roadmap: version number, name of model, release of model, price, etc. I have done the following:
I imported the data set into RStudio as the variable iphonedetail with the following command: iphonedetail <- read.csv("iphodedata.csv")
Then I changed the attribute "name of model" to character using: iphonedetail$nameofmodel <- as.character(iphonedetail$nameofmodel)
Now I need to access the first 5 model names and store them in a vector.
I tried this: iphonesubset <- data.frame(iphonedetail$nameofmodel)
Then I typed iphonesubset on the console, but it gave 0 columns and 0 rows.
Could someone tell me whether the first 2 steps above are correct, and suggest how to fix the 3rd step?
If you want to extract the first five rows (not necessarily unique):
iphonedf1to5 <- df[1:5,]
That means that you get the first 5 rows and all columns. Then if you want to get the unique first five elements it should be like:
iphonedf1to5 <- unique(df[1:5,])
Edit:
Here df is the data frame read from the CSV, iphonedetail in your case.

How to create a table in R from a csv file?

I have a csv file and am unsure how to get R to interpret it as a table because all the title info is in one cell and all the data relating to the titles is in a separate cell. So all the info I need is in 2 cells but it actually needs to be split up.
The cell A3 has a value called 'Team', which corresponds to the part of cell A4 that says 'Visitor'. Each part after that corresponds to the bit below it. Sorry, I don't know how to describe it, but ultimately it would look like this …
Looks like the field separator in your data is a ;
read.csv has a parameter sep to change the field separator and another parameter header to tell it there is an initial line containing the column names. Use read.csv like this:
data <- read.csv(file="/mydir/myfile.csv", sep=";", header=TRUE)
To test you can print out the first 5 lines of the data table with:
head(data,5)
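The effect of setting the field separator can also be sketched with Python's standard csv module. This is an illustration only: the sample content below is made up (the original file is not shown in the question), and the column names Team and Points are assumptions.

```python
import csv
import io

# Made-up semicolon-separated content standing in for the real file.
raw = "Team;Points\nVisitor;3\nHome;1\n"

# delimiter=";" splits each line into fields; the first line supplies the
# column names, much like header=TRUE does for read.csv.
reader = csv.DictReader(io.StringIO(raw), delimiter=";")
rows = list(reader)

# Show at most the first 5 rows, similar to head(data, 5).
for row in rows[:5]:
    print(row)
```

Without the delimiter argument, each line would be read as a single field, which matches the "everything in one cell" symptom described in the question.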

Resources