How to understand the codes in a HERE API flow request result

I have a HERE-API flow request like this:
https://traffic.api.here.com/traffic/6.3/flow.json?app_id=blablablabla&app_code=blablablabla&bbox=-6.616762,%20106.814743;-6.617337,%20106.815086&criticality=minor
With result like this:
{"RWS":[{"RW":[{"FIS":[{"FI":[{"TMC":{"PC":2622,"DE":"Jalan Durian Raya","QD":"+","LE":0.43409},"SHP":[],"CF":[{"TY":"TR","SP":26.5,"SU":26.5,"FF":47.0,"JF":3.7249,"CN":0.78}]},{"TMC":{"PC":2621,"DE":"Jalan Raya Tajur","QD":"+","LE":0.65005},"SHP":[],"CF":[{"TY":"TR","SP":20.32,"SU":20.32,"FF":37.9,"JF":3.84588,"CN":0.78}]}]}],"mid":"b454e664-bc01-4ff7-948e-38ea40554fcd|","LI":"C23-02620","DE":"Jalan Raya Pajajaran","PBT":"2018-10-18T09:31:13Z"},{"FIS":[{"FI":[{"TMC":{"PC":2621,"DE":"Jalan Raya Tajur","QD":"-","LE":0.06995},"SHP":[],"CF":[{"TY":"TR","SP":16.01,"SU":16.01,"FF":25.5,"JF":2.54451,"CN":0.81}]},{"TMC":{"PC":2622,"DE":"Jalan Durian Raya","QD":"-","LE":0.57022},"SHP":[],"CF":[{"TY":"TR","SP":24.1,"SU":24.1,"FF":35.3,"JF":2.32419,"CN":0.81}]},{"TMC":{"PC":2623,"DE":"Jalan Pajajaran Indah Raya","QD":"-","LE":0.43682},"SHP":[],"CF":[{"TY":"TR","SP":28.01,"SU":28.01,"FF":40.1,"JF":2.26799,"CN":0.84}]}]}],"mid":"68880e46-f33d-42a0-836c-f22b42765a6a|","LI":"C23+02620","DE":"Jalan Raya Pajajaran","PBT":"2018-10-18T09:31:09Z"},{"FIS":[{"FI":[{"TMC":{"PC":2698,"DE":"Jalan Raya Pajajaran","QD":"-","LE":1.15256},"SHP":[],"CF":[{"TY":"TR","SP":21.78,"SU":21.78,"FF":35.9,"JF":3.09603,"CN":0.78}]}]}],"mid":"24eb0f4b-43d8-4048-974b-3ad21e6d4075|","LI":"C23+02695","DE":"Jalan Raya Tajur","PBT":"2018-10-18T09:31:13Z"},{"FIS":[{"FI":[{"TMC":{"PC":2698,"DE":"Jalan Raya Pajajaran","QD":"+","LE":0.05673},"SHP":[],"CF":[{"TY":"TR","SP":8.67,"SU":8.67,"FF":27.6,"JF":8.13035,"CN":0.75}]},{"TMC":{"PC":2697,"DE":"Jalan Dahlia","QD":"+","LE":1.07164},"SHP":[],"CF":[{"TY":"TR","SP":15.91,"SU":15.91,"FF":38.7,"JF":6.97436,"CN":0.71}]}]}],"mid":"04e46740-d1c1-4d3b-b362-c8385f7ba83c|","LI":"C23-02695","DE":"Jalan Raya Tajur","PBT":"2018-10-18T09:31:26Z"},{"FIS":[{"FI":[{"TMC":{"PC":8478,"DE":"Jalan Parung Banteng","QD":"+","LE":0.03168},"SHP":[],"CF":[{"TY":"TR","SP":13.99,"SU":13.99,"FF":17.0,"JF":0.0,"CN":0.7}]},{"TMC":{"PC":8477,"DE":"Jalan Durian 
Raya","QD":"+","LE":0.0742},"SHP":[],"CF":[{"TY":"TR","SP":13.0,"SU":13.0,"FF":21.0,"JF":2.38095,"CN":0.7}]}]}],"mid":"ec286d55-3452-458b-bcce-249ef46b8b17|","LI":"C23-08476","DE":"Jalan Cempedak Raya","PBT":"2018-10-18T09:31:41Z"},{"FIS":[{"FI":[{"TMC":{"PC":8477,"DE":"Jalan Durian Raya","QD":"-","LE":0.03512},"SHP":[],"CF":[{"TY":"TR","SP":12.0,"SU":12.0,"FF":21.0,"JF":2.85714,"CN":0.7}]},{"TMC":{"PC":8478,"DE":"Jalan Parung Banteng","QD":"-","LE":0.07076},"SHP":[],"CF":[{"TY":"TR","SP":8.01,"SU":8.01,"FF":17.5,"JF":3.71428,"CN":0.7}]}]}],"mid":"316164f0-3b40-4c1f-8368-cee4433b6637|","LI":"C23+08476","DE":"Jalan Cempedak Raya","PBT":"2018-10-18T09:31:41Z"},{"FIS":[{"FI":[{"TMC":{"PC":8577,"DE":"Jalan Raya Pajajaran","QD":"+","LE":0.07311},"SHP":[],"CF":[{"TY":"TR","SP":12.0,"SU":12.0,"FF":19.0,"JF":2.10526,"CN":0.7}]},{"TMC":{"PC":8576,"DE":"Jalan Cempedak Raya","QD":"+","LE":0.56327},"SHP":[],"CF":[{"TY":"TR","SP":13.0,"SU":13.0,"FF":23.1,"JF":3.07359,"CN":0.7}]}]}],"mid":"357f85f1-209e-4718-864c-0ecea6dfe68a|","LI":"C23-08575","DE":"Jalan Durian Raya","PBT":"2018-10-18T09:31:41Z"},{"FIS":[{"FI":[{"TMC":{"PC":8576,"DE":"Jalan Cempedak Raya","QD":"-","LE":0.04608},"SHP":[],"CF":[{"TY":"TR","SP":17.0,"SU":17.0,"FF":20.0,"JF":0.0,"CN":0.7}]},{"TMC":{"PC":8577,"DE":"Jalan Raya Pajajaran","QD":"-","LE":0.5903},"SHP":[],"CF":[{"TY":"TR","SP":13.99,"SU":13.99,"FF":22.8,"JF":2.54385,"CN":0.7}]}]}],"mid":"3183f70b-ae99-43dc-9275-9ff66bf08c30|","LI":"C23+08575","DE":"Jalan Durian Raya","PBT":"2018-10-18T09:31:41Z"},{"FIS":[{"FI":[{"TMC":{"PC":8914,"DE":"Jalan Parung Banteng","QD":"-","LE":0.02799},"SHP":[],"CF":[{"TY":"TR","SP":11.0,"SU":11.0,"FF":14.0,"JF":0.0,"CN":0.7}]},{"TMC":{"PC":8915,"DE":"Jalan Pajajaran Indah Raya/Jalan Tunjung Biru","QD":"-","LE":0.92454},"SHP":[],"CF":[{"TY":"TR","SP":12.98,"SU":12.98,"FF":16.1,"JF":0.07209,"CN":0.73}]}]}],"mid":"92a561d1-1857-477a-a5e7-8930a8e65fa3|","LI":"C23+08913","DE":"Jalan Pajajaran Indah 
V","PBT":"2018-10-18T09:31:41Z"},{"FIS":[{"FI":[{"TMC":{"PC":8914,"DE":"Jalan Parung Banteng","QD":"+","LE":0.95254},"SHP":[],"CF":[{"TY":"TR","SP":14.38,"SU":14.38,"FF":16.4,"JF":0.0,"CN":0.71}]}]}],"mid":"94e9f95c-c2f3-4385-940c-030263a955f2|","LI":"C23-08913","DE":"Jalan Pajajaran Indah V","PBT":"2018-10-18T09:31:41Z"},{"FIS":[{"FI":[{"TMC":{"PC":8953,"DE":"Jalan Cempedak Raya/Jalan Pajajaran Indah V","QD":"+","LE":1.1729},"SHP":[],"CF":[{"TY":"TR","SP":13.99,"SU":13.99,"FF":18.4,"JF":0.76086,"CN":0.7}]}]}],"mid":"28d7ef32-9bf4-49f2-8289-0ed52c0de94a|","LI":"C23-08952","DE":"Jalan Parung Banteng","PBT":"2018-10-18T09:31:41Z"},{"FIS":[{"FI":[{"TMC":{"PC":8953,"DE":"Jalan Cempedak Raya/Jalan Pajajaran Indah V","QD":"-","LE":0.03956},"SHP":[],"CF":[{"TY":"TR","SP":13.99,"SU":13.99,"FF":16.2,"JF":0.0,"CN":0.7}]},{"TMC":{"PC":8954,"DE":"Bogor 1","QD":"-","LE":1.13334},"SHP":[],"CF":[{"TY":"TR","SP":13.0,"SU":13.0,"FF":18.8,"JF":1.48936,"CN":0.7}]}]}],"mid":"45f7aa3b-0cb6-4ddc-9c5a-4149b5f9648b|","LI":"C23+08952","DE":"Jalan Parung Banteng","PBT":"2018-10-18T09:31:41Z"},{"FIS":[{"FI":[{"TMC":{"PC":24124,"DE":"Jalan Siliwangi","QD":"+","LE":0.03148},"SHP":[],"CF":[{"TY":"TR","SP":13.67,"SU":13.67,"FF":19.0,"JF":1.22073,"CN":0.72}]},{"TMC":{"PC":24123,"DE":"Jalan Saleh Danasasnita","QD":"+","LE":0.97711},"SHP":[],"CF":[{"TY":"TR","SP":21.86,"SU":21.86,"FF":29.1,"JF":1.45539,"CN":0.82}]}]}],"mid":"47235aea-f54b-46df-9f15-03f0c913e52a|","LI":"C23-24122","DE":"Jalan Lawang Gintung","PBT":"2018-10-18T09:31:02Z"},{"FIS":[{"FI":[{"TMC":{"PC":24127,"DE":"Jalan Pahlawan","QD":"+","LE":0.5877},"SHP":[],"CF":[{"TY":"TR","SP":11.82,"SU":11.82,"FF":27.2,"JF":5.46778,"CN":0.79}]},{"TMC":{"PC":24126,"DE":"Jalan Saleh Danasasnita","QD":"+","LE":1.03899},"SHP":[],"CF":[{"TY":"TR","SP":21.78,"SU":21.78,"FF":29.0,"JF":1.4555,"CN":0.82}]}]}],"mid":"8dfb675e-cd62-4e0e-8fa1-913febbf3dd1|","LI":"C23-24125","DE":"Jalan Batu 
Tulis","PBT":"2018-10-18T09:31:13Z"},{"FIS":[{"FI":[{"TMC":{"PC":24124,"DE":"Jalan Siliwangi","QD":"-","LE":1.81806},"SHP":[],"CF":[{"TY":"TR","SP":16.12,"SU":16.12,"FF":27.8,"JF":3.1217,"CN":0.77}]}]}],"mid":"6fdc5622-6392-478d-aedc-356009d885f3|","LI":"C23+24122","DE":"Jalan Lawang Gintung","PBT":"2018-10-18T09:31:19Z"},{"FIS":[{"FI":[{"TMC":{"PC":24154,"DE":"Jalan Sukasan I","QD":"-","LE":0.24046},"SHP":[],"CF":[{"TY":"TR","SP":14.77,"SU":14.77,"FF":27.0,"JF":3.42004,"CN":0.84}]},{"TMC":{"PC":24155,"DE":"Jalan Lawang Gintung","QD":"-","LE":0.38831},"SHP":[],"CF":[{"TY":"TR","SP":10.68,"SU":10.68,"FF":28.3,"JF":7.10283,"CN":0.76}]},{"TMC":{"PC":24156,"DE":"Jalan Raya Tajur/Jalan Raya Pajajaran","QD":"-","LE":0.27849},"SHP":[],"CF":[{"TY":"TR","SP":9.65,"SU":9.65,"FF":25.5,"JF":6.76531,"CN":0.75}]}]}],"mid":"710c2348-f63f-48c1-a91b-0a3178df92be|","LI":"C23+24151","DE":"Jalan Siliwangi","PBT":"2018-10-18T09:31:01Z"},{"FIS":[{"FI":[{"TMC":{"PC":24156,"DE":"Jalan Raya Tajur/Jalan Raya Pajajaran","QD":"+","LE":0.04761},"SHP":[],"CF":[{"TY":"TR","SP":9.99,"SU":9.99,"FF":21.0,"JF":3.81175,"CN":0.8}]},{"TMC":{"PC":24155,"DE":"Jalan Lawang Gintung","QD":"+","LE":0.24403},"SHP":[],"CF":[{"TY":"TR","SP":14.35,"SU":14.35,"FF":26.3,"JF":3.40337,"CN":0.79}]},{"TMC":{"PC":24154,"DE":"Jalan Sukasan I","QD":"+","LE":1.54185},"SHP":[],"CF":[{"TY":"TR","SP":20.61,"SU":20.61,"FF":28.0,"JF":1.56765,"CN":0.8}]}]}],"mid":"568b5b32-49cc-440f-a8b6-703268da32c1|","LI":"C23-24151","DE":"Jalan Siliwangi","PBT":"2018-10-18T09:31:06Z"},{"FIS":[{"FI":[{"TMC":{"PC":27746,"DE":"Jalan Pajajaran/Jalan Raya Pajajaran","QD":"+","LE":0.0485},"SHP":[],"CF":[{"TY":"TR","SP":11.0,"SU":11.0,"FF":28.0,"JF":6.66666,"CN":0.7}]},{"TMC":{"PC":27745,"DE":"Jalan Siliwangi","QD":"+","LE":0.38998},"SHP":[],"CF":[{"TY":"TR","SP":8.99,"SU":8.99,"FF":28.4,"JF":8.13203,"CN":0.78}]}]}],"mid":"7288833c-cca1-43d1-8e46-684d359cf6b0|","LI":"C23-27744","DE":"Jalan Sukasan 
I","PBT":"2018-10-18T09:31:41Z"},{"FIS":[{"FI":[{"TMC":{"PC":27745,"DE":"Jalan Siliwangi","QD":"-","LE":0.04268},"SHP":[],"CF":[{"TY":"TR","SP":24.68,"SU":24.68,"FF":17.0,"JF":0.0,"CN":0.73}]},{"TMC":{"PC":27746,"DE":"Jalan Pajajaran/Jalan Raya Pajajaran","QD":"-","LE":0.3958},"SHP":[],"CF":[{"TY":"TR","SP":21.43,"SU":21.43,"FF":25.3,"JF":0.34253,"CN":0.72}]}]}],"mid":"b0f4bfc0-9ed7-4391-86e9-12f0aa6b1963|","LI":"C23+27744","DE":"Jalan Sukasan I","PBT":"2018-10-18T09:31:41Z"}],"TY":"TMC","MAP_VERSION":"201804","EBU_COUNTRY_CODE":"C","EXTENDED_COUNTRY_CODE":"F2","TABLE_ID":"23","UNITS":"metric"}],"MAP_VERSION":"","CREATED_TIMESTAMP":"2018-10-18T09:31:41.000+0000","VERSION":"3.2.2","UNITS":"metric"}
The result contains codes that I don't understand.
Is there documentation that explains them?
Thanks.

The HERE Traffic API documentation lists the response elements and their meanings: documentation.developer.here.com/pdf/traffic_hlp/6.0.85.0/Traffic%20API%20v6.0.85.0%20Developer's%20Guide.pdf. You can also download the guide from https://developer.here.com/documentation/versions.
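As a rough illustration of how the nested response can be read, here is a minimal Python sketch that flattens one roadway from the result above into a row with readable names. The field meanings in the comments are my reading of the Traffic API v6 guide and should be treated as assumptions to confirm against the documentation; the function name flatten_flow and the output key names are made up for this example.

```python
import json

# One roadway cut from the response in the question (first flow item only).
SAMPLE = """
{"RWS":[{"RW":[{"FIS":[{"FI":[{"TMC":{"PC":2622,"DE":"Jalan Durian Raya","QD":"+","LE":0.43409},
"SHP":[],"CF":[{"TY":"TR","SP":26.5,"SU":26.5,"FF":47.0,"JF":3.7249,"CN":0.78}]}]}],
"LI":"C23-02620","DE":"Jalan Raya Pajajaran","PBT":"2018-10-18T09:31:13Z"}],"UNITS":"metric"}]}
"""

def flatten_flow(response):
    """Return one dict per current-flow (CF) entry in the response."""
    rows = []
    for group in response.get("RWS", []):          # RWS: list of roadway groups
        for roadway in group.get("RW", []):        # RW: one roadway
            for items in roadway.get("FIS", []):   # FIS/FI: flow items on it
                for item in items.get("FI", []):
                    tmc = item.get("TMC", {})      # TMC location of the item
                    for flow in item.get("CF", []):  # CF: current flow
                        rows.append({
                            "road": roadway.get("DE"),       # DE: description
                            "linear_id": roadway.get("LI"),  # LI: linear TMC id
                            "timestamp": roadway.get("PBT"), # PBT: base timestamp
                            "point": tmc.get("DE"),
                            "point_code": tmc.get("PC"),     # PC: point location code
                            "direction": tmc.get("QD"),      # QD: queuing direction
                            "length": tmc.get("LE"),         # LE: segment length
                            "speed_capped": flow.get("SP"),  # SP: capped at speed limit
                            "speed_uncapped": flow.get("SU"),# SU: not capped
                            "free_flow": flow.get("FF"),     # FF: free-flow speed
                            "jam_factor": flow.get("JF"),    # JF: 0 (free) to 10 (blocked)
                            "confidence": flow.get("CN"),    # CN: 0 to 1
                        })
    return rows

for row in flatten_flow(json.loads(SAMPLE)):
    print(row)
```

With "UNITS":"metric", the speeds are presumably in km/h and the length in km, but that too is worth checking in the guide.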

Related

How to split elements inside <p> tag while web scraping

I am trying to scrape a URL, but the output is not in the desired format. I need just the branch name and address. How do I split this information out of the p tag?
import re
import requests
from bs4 import BeautifulSoup
page = requests.get(url)
Branch_list=[]
soup = BeautifulSoup(page.content, 'html.parser')
for i in soup.find_all('div',class_="col-md-9 text-left"):
    Branch=i.find_all('p') if i.find_all('p') else ''
    for k in Branch:
        k=re.sub(r'<(.*?)>','', str(k))
        Branch_list.append(k)
Try this:
import re
import requests
from bs4 import BeautifulSoup
page = requests.get("https://www.bukopin.co.id/page/jaringankantor")
soup = BeautifulSoup(page.text, 'html.parser').find_all('div', class_="col-md-9 text-left")
paragraphs = [re.sub(r"Tel.+", "", p.find("p").getText(strip=True)) for p in soup]
for paragraph in paragraphs:
    print(paragraph)
Output:
KCP Rasuna SaidGd. Kementerian Koperasi & UKM, Lt. 1. Jl. HR. Rasuna Said Kav. 3 - 5, Jakarta Selatan 12940
KCP Plaza AsiaJl. Jend. Sudirman Kav. 59 No. 77 Lt. GF No. GF - D Blok A Senayan, Kebayoran Baru, Jakarta Selatan
KCP Bulog IIGedung Diklat Bulog II Jl. Kuningan Timur Blok M2 No.5 Jakarta Selatan 12950
KCP Pondok Indah Plaza VPlaza V Pondok Indah Kav.A11 Jl. Marga Guna Raya - Pondok Indah Jakarta Selatan
KCP Kebayoran LamaJl. Raya Kebayoran Lama No.10 Jakarta Selatan 12220
KCP Kebayoran BaruJl. RS. Fatmawati No.7 Blok A Kebayoran Baru Jakarta Selatan12140
KCP MelawaiJl. Melawai Raya Kebayoran Baru No. 66 Jakarta Selatan 12160
KK PLN Lenteng AgungJl. Raya Tanjung Barat No. 55 Jakarta Selatan 12610
and so on...
EDIT: To get this into a pandas dataframe try this:
import re
import requests
import pandas as pd
from bs4 import BeautifulSoup
page = requests.get("https://www.bukopin.co.id/page/jaringankantor")
soup = BeautifulSoup(page.text, 'html.parser').find_all('div', class_="col-md-9 text-left")
data = []
for div in soup:
    branch = div.find("strong").getText()
    address = div.find("p").getText(strip=True)
    data.append([branch, re.sub(r"Telp.+", "", address[len(branch):])])
print(pd.DataFrame(data, columns=["Branch", "Address"]))
Output:
   Branch                    Address
0  KCP Rasuna Said           Gd. Kementerian Koperasi & UKM, Lt. 1. Jl. HR....
1  KCP Plaza Asia            Jl. Jend. Sudirman Kav. 59 No. 77 Lt. GF No. G...
2  KCP Bulog II              Gedung Diklat Bulog II Jl. Kuningan Timur Blok...
3  KCP Pondok Indah Plaza V  Plaza V Pondok Indah Kav.A11 Jl. Marga Guna Ra...
4  KCP Kebayoran Lama        Jl. Raya Kebayoran Lama No.10 Jakarta Selatan ...
5  KCP Kebayoran Baru        Jl. RS. Fatmawati No.7 Blok A Kebayoran Baru J...
...
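The splitting step in the loop above can be tried without any network access. This is a small sketch of the same idea: cut the branch name off the front of the paragraph text, then drop everything from "Telp" onward. The helper name split_branch_address is made up, and the phone number in the sample is a placeholder, since the raw paragraph text is not shown in the answer.

```python
import re

def split_branch_address(paragraph_text, branch):
    """Split glued 'branch + address + phone' text into (branch, address)."""
    # The <strong> text (branch) is a prefix of the <p> text, so slice it off.
    if paragraph_text.startswith(branch):
        rest = paragraph_text[len(branch):]
    else:
        rest = paragraph_text
    # Drop the phone part; "Telp" is the marker used in the answer's regex.
    address = re.sub(r"Telp.+", "", rest).strip()
    return branch, address

# Address taken from the output above; the "Telp." tail is a placeholder.
sample = ("KCP Bulog IIGedung Diklat Bulog II Jl. Kuningan Timur "
          "Blok M2 No.5 Jakarta Selatan 12950Telp. 000")
print(split_branch_address(sample, "KCP Bulog II"))
```

Slicing by the length of the branch name avoids guessing where the name ends, which matters here because the name and address are glued together with no separator.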

Web scraping relevant information from soup file

I am trying to scrape this particular url to obtain information on branch/atm name and location address.
url="https://www.bankmayapada.com/en/contactus/location-information"
However, the soup file I get is pretty confusing and I am unable to figure out how to extract the required information.
The information I need is Branch/Atm Name and its corresponding address. Right now, I am just figuring out the structure of the soup file.
import re
import requests
from bs4 import BeautifulSoup
page = requests.get(url)
soup = BeautifulSoup(page.text, 'html.parser')
print(soup.prettify())
You can get that table's data with a single POST request. Fun fact, no payload required!
Here's how:
import requests
from bs4 import BeautifulSoup
page = requests.post("https://myapps.bankmayapada.com/frontend/IN/lokasi.aspx").text
rows = BeautifulSoup(page, "html.parser").find_all("tr", {"class": "dxgvDataRow"})
branch_location_data = []
for row in rows:
    province, area, location = row.find_all("td")
    branch_location_data.append(
        [
            province.getText(strip=True),  # province column
            area.getText(strip=True),  # area column
            location.find("b").getText(strip=True),  # Branch name
            " ".join(
                d.getText() for d in location.find_all("div")  # branch address
                if not d.getText().startswith(("Tel", "Fax"))  # skipping Phone & Fax info
            ),
        ]
    )
for branch in branch_location_data:
    print(branch)
Output:
['DKI JAKARTA', 'Jakarta Barat', 'Kantor Capem Citra Garden 2', 'Rukan Citra Niaga Blok A-7 Jl. Utan Jati - Kalideres Jakarta - DKI Jakarta']
['DKI JAKARTA', 'Jakarta Barat', 'Kantor Capem Puri Indah', 'Jl. Puri Indah Raya Blok I No. 2 Jakarta 11610 - DKI Jakarta']
['DKI JAKARTA', 'Jakarta Barat', 'Kantor Capem Pasar Pagi Asemka', 'Jl. Pasar Pagi No. 84 Jakarta - DKI Jakarta']
['DKI JAKARTA', 'Jakarta Barat', 'Kantor Capem Tanjung Duren', 'Jl. Tanjung Duren No. 91 B Jakarta 11470 - DKI Jakarta']
['DKI JAKARTA', 'Jakarta Barat', 'Kantor Capem Meruya', 'Jl. Meruya Ilir Raya No. 82 G Jakarta - DKI Jakarta']
['DKI JAKARTA', 'Jakarta Barat', 'Kantor Capem Jembatan Lima', 'Jl. KH Moch. Mansyur No. 24 A Jakarta - DKI Jakarta']
and so on...

Maximum length of a vector in R is only 349? [duplicate]

I would like to use a very long vector in R, but it seems that when I create a vector, the maximum length/number of values in the vector is 349. See below for code for vector1 and vector2. I can create vector1 with 349 values without a problem, but vector2, which contains 350 values, throws + on the next line, as if I forgot to use a closing " or ).
I read the post "Max Length for a Vector in R", but it describes something completely different from what I am seeing.
Am I missing something? Can anyone help, please?
Code
vector1 <- c("value1", "value2", "value3", "value4", "value5", "value6", "value7", "value8", "value9", "value10", "value11", "value12", "value13", "value14", "value15", "value16", "value17", "value18", "value19", "value20", "value21", "value22", "value23", "value24", "value25", "value26", "value27", "value28", "value29", "value30", "value31", "value32", "value33", "value34", "value35", "value36", "value37", "value38", "value39", "value40", "value41", "value42", "value43", "value44", "value45", "value46", "value47", "value48", "value49", "value50", "value51", "value52", "value53", "value54", "value55", "value56", "value57", "value58", "value59", "value60", "value61", "value62", "value63", "value64", "value65", "value66", "value67", "value68", "value69", "value70", "value71", "value72", "value73", "value74", "value75", "value76", "value77", "value78", "value79", "value80", "value81", "value82", "value83", "value84", "value85", "value86", "value87", "value88", "value89", "value90", "value91", "value92", "value93", "value94", "value95", "value96", "value97", "value98", "value99", "value100", "value101", "value102", "value103", "value104", "value105", "value106", "value107", "value108", "value109", "value110", "value111", "value112", "value113", "value114", "value115", "value116", "value117", "value118", "value119", "value120", "value121", "value122", "value123", "value124", "value125", "value126", "value127", "value128", "value129", "value130", "value131", "value132", "value133", "value134", "value135", "value136", "value137", "value138", "value139", "value140", "value141", "value142", "value143", "value144", "value145", "value146", "value147", "value148", "value149", "value150", "value151", "value152", "value153", "value154", "value155", "value156", "value157", "value158", "value159", "value160", "value161", "value162", "value163", "value164", "value165", "value166", "value167", "value168", "value169", "value170", "value171", "value172", "value173", "value174", 
"value175", "value176", "value177", "value178", "value179", "value180", "value181", "value182", "value183", "value184", "value185", "value186", "value187", "value188", "value189", "value190", "value191", "value192", "value193", "value194", "value195", "value196", "value197", "value198", "value199", "value200", "value201", "value202", "value203", "value204", "value205", "value206", "value207", "value208", "value209", "value210", "value211", "value212", "value213", "value214", "value215", "value216", "value217", "value218", "value219", "value220", "value221", "value222", "value223", "value224", "value225", "value226", "value227", "value228", "value229", "value230", "value231", "value232", "value233", "value234", "value235", "value236", "value237", "value238", "value239", "value240", "value241", "value242", "value243", "value244", "value245", "value246", "value247", "value248", "value249", "value250", "value251", "value252", "value253", "value254", "value255", "value256", "value257", "value258", "value259", "value260", "value261", "value262", "value263", "value264", "value265", "value266", "value267", "value268", "value269", "value270", "value271", "value272", "value273", "value274", "value275", "value276", "value277", "value278", "value279", "value280", "value281", "value282", "value283", "value284", "value285", "value286", "value287", "value288", "value289", "value290", "value291", "value292", "value293", "value294", "value295", "value296", "value297", "value298", "value299", "value300", "value301", "value302", "value303", "value304", "value305", "value306", "value307", "value308", "value309", "value310", "value311", "value312", "value313", "value314", "value315", "value316", "value317", "value318", "value319", "value320", "value321", "value322", "value323", "value324", "value325", "value326", "value327", "value328", "value329", "value330", "value331", "value332", "value333", "value334", "value335", "value336", "value337", "value338", "value339", "value340", 
"value341", "value342", "value343", "value344", "value345", "value346", "value347", "value348", "value349")
vector2 <- c("value1", "value2", "value3", "value4", "value5", "value6", "value7", "value8", "value9", "value10", "value11", "value12", "value13", "value14", "value15", "value16", "value17", "value18", "value19", "value20", "value21", "value22", "value23", "value24", "value25", "value26", "value27", "value28", "value29", "value30", "value31", "value32", "value33", "value34", "value35", "value36", "value37", "value38", "value39", "value40", "value41", "value42", "value43", "value44", "value45", "value46", "value47", "value48", "value49", "value50", "value51", "value52", "value53", "value54", "value55", "value56", "value57", "value58", "value59", "value60", "value61", "value62", "value63", "value64", "value65", "value66", "value67", "value68", "value69", "value70", "value71", "value72", "value73", "value74", "value75", "value76", "value77", "value78", "value79", "value80", "value81", "value82", "value83", "value84", "value85", "value86", "value87", "value88", "value89", "value90", "value91", "value92", "value93", "value94", "value95", "value96", "value97", "value98", "value99", "value100", "value101", "value102", "value103", "value104", "value105", "value106", "value107", "value108", "value109", "value110", "value111", "value112", "value113", "value114", "value115", "value116", "value117", "value118", "value119", "value120", "value121", "value122", "value123", "value124", "value125", "value126", "value127", "value128", "value129", "value130", "value131", "value132", "value133", "value134", "value135", "value136", "value137", "value138", "value139", "value140", "value141", "value142", "value143", "value144", "value145", "value146", "value147", "value148", "value149", "value150", "value151", "value152", "value153", "value154", "value155", "value156", "value157", "value158", "value159", "value160", "value161", "value162", "value163", "value164", "value165", "value166", "value167", "value168", "value169", "value170", "value171", "value172", "value173", "value174", 
"value175", "value176", "value177", "value178", "value179", "value180", "value181", "value182", "value183", "value184", "value185", "value186", "value187", "value188", "value189", "value190", "value191", "value192", "value193", "value194", "value195", "value196", "value197", "value198", "value199", "value200", "value201", "value202", "value203", "value204", "value205", "value206", "value207", "value208", "value209", "value210", "value211", "value212", "value213", "value214", "value215", "value216", "value217", "value218", "value219", "value220", "value221", "value222", "value223", "value224", "value225", "value226", "value227", "value228", "value229", "value230", "value231", "value232", "value233", "value234", "value235", "value236", "value237", "value238", "value239", "value240", "value241", "value242", "value243", "value244", "value245", "value246", "value247", "value248", "value249", "value250", "value251", "value252", "value253", "value254", "value255", "value256", "value257", "value258", "value259", "value260", "value261", "value262", "value263", "value264", "value265", "value266", "value267", "value268", "value269", "value270", "value271", "value272", "value273", "value274", "value275", "value276", "value277", "value278", "value279", "value280", "value281", "value282", "value283", "value284", "value285", "value286", "value287", "value288", "value289", "value290", "value291", "value292", "value293", "value294", "value295", "value296", "value297", "value298", "value299", "value300", "value301", "value302", "value303", "value304", "value305", "value306", "value307", "value308", "value309", "value310", "value311", "value312", "value313", "value314", "value315", "value316", "value317", "value318", "value319", "value320", "value321", "value322", "value323", "value324", "value325", "value326", "value327", "value328", "value329", "value330", "value331", "value332", "value333", "value334", "value335", "value336", "value337", "value338", "value339", "value340", 
"value341", "value342", "value343", "value344", "value345", "value346", "value347", "value348", "value349", "value350")
Command lines entered at the console are limited to about 4095 bytes (not characters).
Source: R Documentation
You can try it yourself: if you insert a line break, it works:
vector2 <- c("value1", "value2", "value3", "value4", "value5", "value6", "value7", "value8", "value9", "value10", "value11", "value12", "value13", "value14", "value15", "value16", "value17", "value18", "value19", "value20", "value21", "value22", "value23", "value24", "value25", "value26", "value27", "value28", "value29", "value30", "value31", "value32", "value33", "value34", "value35", "value36", "value37", "value38", "value39", "value40", "value41", "value42", "value43", "value44", "value45", "value46", "value47", "value48", "value49", "value50", "value51", "value52", "value53", "value54", "value55", "value56", "value57", "value58", "value59", "value60", "value61", "value62", "value63", "value64", "value65", "value66", "value67", "value68", "value69", "value70", "value71", "value72", "value73", "value74", "value75", "value76", "value77", "value78", "value79", "value80", "value81", "value82", "value83", "value84", "value85", "value86", "value87", "value88", "value89", "value90", "value91", "value92", "value93", "value94", "value95", "value96", "value97", "value98", "value99", "value100", "value101", "value102", "value103", "value104", "value105", "value106", "value107", "value108", "value109", "value110", "value111", "value112", "value113", "value114", "value115", "value116", "value117", "value118", "value119", "value120", "value121", "value122", "value123", "value124", "value125", "value126", "value127", "value128", "value129", "value130", "value131", "value132", "value133", "value134", "value135", "value136", "value137", "value138", "value139", "value140", "value141", "value142", "value143", "value144", "value145", "value146", "value147", "value148", "value149", "value150", "value151", "value152", "value153", "value154", "value155", "value156", "value157", "value158", "value159", "value160", "value161", "value162", "value163", "value164", "value165", "value166", "value167", "value168", "value169", "value170", "value171", "value172", "value173", "value174", 
"value175", "value176", "value177", "value178", "value179", "value180", "value181", "value182", "value183", "value184", "value185", "value186", "value187", "value188", "value189", "value190", "value191", "value192", "value193", "value194", "value195", "value196", "value197", "value198", "value199", "value200", "value201", "value202", "value203", "value204", "value205", "value206", "value207", "value208", "value209", "value210", "value211", "value212", "value213", "value214", "value215", "value216", "value217", "value218", "value219", "value220", "value221", "value222", "value223", "value224", "value225", "value226", "value227", "value228", "value229", "value230", "value231", "value232", "value233", "value234", "value235", "value236", "value237", "value238", "value239", "value240", "value241", "value242", "value243", "value244", "value245", "value246", "value247", "value248", "value249", "value250", "value251", "value252", "value253", "value254", "value255", "value256", "value257", "value258", "value259", "value260", "value261", "value262", "value263", "value264", "value265", "value266", "value267", "value268", "value269", "value270", "value271", "value272", "value273", "value274", "value275", "value276", "value277", "value278", "value279", "value280", "value281", "value282", "value283", "value284", "value285", "value286", "value287", "value288", "value289", "value290", "value291", "value292", "value293", "value294", "value295", "value296", "value297", "value298", "value299", "value300", "value301", "value302", "value303", "value304", "value305", "value306", "value307", "value308", "value309", "value310", "value311", "value312", "value313", "value314", "value315", "value316", "value317", "value318", "value319", "value320", "value321", "value322", "value323", "value324", "value325", "value326", "value327", "value328", "value329", "value330", "value331", "value332", "value333", "value334", "value335", "value336", "value337", "value338", "value339", "value340", 
"value341", "value342", "value343", "value344", "value345", "value346", "value347", "value348", "value349",
"value350")
In any case, it is good practice to avoid long lines so the code stays readable. Stick to lines of 80 or 120 characters, e.g.:
vector2 <- c("value1", "value2", "value3", "value4", "value5", "value6", "value7",
             "value8", "value9", "value10", "value11", "value12", "value13",
             "value14", "value15", "value16", "value17", "value18", "value19",
             .
             .
             .
             "value344", "value345", "value346", "value347", "value348", "value349",
             "value350")
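The 349 versus 350 boundary follows directly from the byte limit. As a quick check, this Python sketch rebuilds the two one-line assignments from the question and measures them in bytes; the helper name r_vector_line is made up for the example, and 4095 is the approximate limit quoted above.

```python
LIMIT = 4095  # approximate console line limit in bytes, as quoted above

def r_vector_line(name, n):
    """Build the single-line R assignment name <- c("value1", ..., "valueN")."""
    values = ", ".join('"value%d"' % i for i in range(1, n + 1))
    return "%s <- c(%s)" % (name, values)

for name, n in (("vector1", 349), ("vector2", 350)):
    size = len(r_vector_line(name, n).encode("utf-8"))
    print(name, size, "fits" if size <= LIMIT else "too long")
```

The first line comes to 4092 bytes and the second to 4104, so adding the 350th value is exactly what pushes the command past the limit.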

Extracting a dataframe or table from text in R

This is a challenging question because the data varies a lot between elements. Let's start with the example:
example <- list(c("Birth Centenary of K.S.Stanislavsky.Series:Birth CentenariesCatalog codes:Mi:SU 2710, Sn:SU 2695, Yt:SU 2626, Sg:SU 2797, AFA:SU 2698Variants:Click to see variantsThemes:Actors | Anniversaries and Jubilees | Famous People | MenIssued on:1963-01-15Size:30 x 42 mmColors:Blackish grey greenFormat:StampEmission:CommemorativePerforation:line 12½Printing:RecessPaper:hard thick whiteWatermark:UnwmkFace value:4 Russian kopekPrint run:2,000,000Score:29%\tAccuracy: Very HighBuy Now:2 sale offers from US$ 0.16",
"Birth Centenary of A.S.Serafimovich.Series:Birth CentenariesCatalog codes:Mi:SU 2711, Sn:SU 2696, Yt:SU 2627, Sg:SU 2800, AFA:SU 2699Themes:Anniversaries and Jubilees | Authors | Famous People | Literary People (Poets and Writers) | Literature | MenIssued on:1963-01-19Size:28 x 40 mmFormat:StampEmission:CommemorativePerforation:frame 11½Printing:PhotogravurePaper:ordinaryFace value:4 Russian kopekPrint run:2,500,000Score:26%\tAccuracy: Very HighBuy Now:3 sale offers from US$ 0.11",
"Children in nurserySeries:Soviet ChildrenCatalog codes:Mi:SU 2712, Sn:SU 2697, Yt:SU 2629, Sg:SU 2806, AFA:SU 2700Themes:ChildrenIssued on:1963-01-31Size:42 x 28 mmColors:MulticolorFormat:StampEmission:CommemorativePerforation:comb 11½Printing:PhotogravureFace value:4 Russian kopekPrint run:3,000,000Score:27%\tAccuracy: Very HighDescription:Designer: A. Shmidshtein. Paper: ordinary.Buy Now:2 sale offers from US$ 0.08",
"Children with nurseSeries:Soviet ChildrenCatalog codes:Mi:SU 2713, Sn:SU 2698, Yt:SU 2628, Sg:SU 2807, AFA:SU 2701Themes:ChildrenIssued on:1963-01-31Size:42 x 28 mmFormat:StampEmission:CommemorativePerforation:comb 11½Printing:PhotogravureFace value:4 Russian kopekPrint run:3,000,000Score:25%\tAccuracy: Very HighDescription:Designer: A. Shmidshtein. Paper: ordinary.Buy Now:3 sale offers from US$ 0.08",
"Pioneer campSeries:Soviet ChildrenCatalog codes:Mi:SU 2714, Sn:SU 2699, Yt:SU 2630, Sg:SU 2808, AFA:SU 2702Themes:ChildrenIssued on:1963-01-31Size:42 x 28 mmFormat:StampEmission:CommemorativePerforation:comb 11½Printing:PhotogravureFace value:4 Russian kopekPrint run:3,000,000Score:22%\tAccuracy: Very HighDescription:Designer: A. Shmidshtein. Paper: ordinary.Buy Now:4 sale offers from US$ 0.11",
"Soviet Children.Series:Soviet ChildrenCatalog codes:Mi:SU 2715, Sn:SU 2700, Yt:SU 2631, Sg:SU 2809, AFA:SU 2703Themes:ChildrenIssued on:1963-01-31Size:40 x 28 mmFormat:StampEmission:CommemorativePerforation:comb 11½Printing:PhotogravurePaper:ordinaryFace value:4 Russian kopekPrint run:3,000,000Score:25%\tAccuracy: Very HighBuy Now:2 sale offers from US$ 0.08",
"Dymkov's and Zagorsk toysSeries:Decorative ArtsCatalog codes:Mi:SU 2716, Sn:SU 2701, Yt:SU 2632, Sg:SU 2810, AFA:SU 2704Themes:Art | ToysIssued on:1963-01-31Size:30 x 42 mmFormat:StampEmission:CommemorativePerforation:comb 12 x 12½Printing:Offset lithographyFace value:4 Russian kopekPrint run:3,000,000Score:22%\tAccuracy: Very HighDescription:Designer: E. Komarov. Paper: ordinary.Buy Now:2 sale offers from US$ 0.11",
"Oposhnya potterySeries:Decorative ArtsCatalog codes:Mi:SU 2717, Sn:SU 2702, Yt:SU 2633, Sg:SU 2811, AFA:SU 2705Themes:ArtIssued on:1963-01-31Size:30 x 42 mmFormat:StampEmission:CommemorativePerforation:comb 12 x 12½Printing:Offset lithographyFace value:6 Russian kopekPrint run:3,000,000Score:24%\tAccuracy: Very HighDescription:Designer: E. Komarov. Paper: ordinary.Buy Now:3 sale offers from US$ 0.08",
"Embossing booksSeries:Decorative ArtsCatalog codes:Mi:SU 2718, Sn:SU 2703, Yt:SU 2634, Sg:SU 2812, AFA:SU 2706Themes:Art | BooksIssued on:1963-01-31Size:30 x 42 mmFormat:StampEmission:CommemorativePerforation:comb 12 x 12½Printing:Offset lithographyFace value:10 Russian kopekPrint run:3,000,000Score:27%\tAccuracy: Very HighDescription:Designer: E. Komarov. Paper: ordinary.Buy Now:2 sale offers from US$ 0.44",
"Decorative Arts.Series:Decorative ArtsCatalog codes:Mi:SU 2719, Sn:SU 2704, Yt:SU 2635, Sg:SU 2813, AFA:SU 2707Themes:ArtIssued on:1963-01-31Size:30 x 42 mmFormat:StampEmission:CommemorativePerforation:comb 12 x 12½Printing:Offset lithographyPaper:ordinaryFace value:12 Russian kopekPrint run:3,000,000Score:26%\tAccuracy: Very HighBuy Now:3 sale offers from US$ 0.16"
), NULL, NULL, NULL)
As you can see, it is a list of 4 objects, of which only the first holds data. You can turn it into a vector with unlist() if you prefer.
Each element comes from a table with its title, like this one:
I would like to recover the same table or data frame from the text. I noticed several things about how the information is structured:
Words are run together with no separator; a capital letter marks where one row's value ends and the next row's label begins.
Some variables (like Catalog codes and Themes) contain several elements.
Some rows are missing from some elements. In the image above, the row Variants appears in that element but not in the others.
I tried some functions from the tidyverse, but this problem is beyond my abilities.
It seems like your data comes from web scraping. I'd suggest checking out rvest::html_table() to try to get better-formatted results. Otherwise it's going to be very messy (i.e. regex).
Very, very messy example code:
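untangle <- function(element) {
  # Everything before the first label is the title
  Title <- sub("^(.*?)Series:.*", "\\1", element, perl = TRUE)
  # Text between "Series:" and "Catalog codes:"
  Series <- sub(".*Series:(.*?)Catalog codes:.*", "\\1", element, perl = TRUE)
  # Lazy match, so the capture stops at whichever of "Variants:" or "Themes:"
  # comes first instead of swallowing the Variants row
  CatalogCodes <- sub(".*?Catalog codes:(.*?)(?:Variants|Themes):.*", "\\1", element, perl = TRUE)
  return(data.frame(Title, Series, CatalogCodes, stringsAsFactors = FALSE))
}
for (e in unlist(example)) {
  print(untangle(e))
}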
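The same idea extends to all rows if the labels are known in advance. The following Python sketch splits one element on a fixed list of labels; the label list is read off the example data and is an assumption about what the site can emit, and the function name untangle simply mirrors the R helper. One heuristic to note: a label followed by a space and a lowercase letter is ignored, so that "Paper: ordinary" inside a Description is not mistaken for the Paper row.

```python
import re

# Labels seen in the example data; other records may use more (assumption).
LABELS = ["Series", "Catalog codes", "Variants", "Themes", "Issued on", "Size",
          "Colors", "Format", "Emission", "Perforation", "Printing", "Paper",
          "Watermark", "Face value", "Print run", "Score", "Accuracy",
          "Description", "Buy Now"]

# "Label:" not followed by " " + lowercase (heuristic, see the note above).
LABEL_RE = re.compile("(%s):(?! [a-z])" % "|".join(re.escape(l) for l in LABELS))

def untangle(text):
    """Return a dict with the title and one entry per label found in text."""
    # With one capture group, split() yields [title, label, value, label, ...].
    parts = LABEL_RE.split(text)
    record = {"Title": parts[0].strip()}
    for label, value in zip(parts[1::2], parts[2::2]):
        record[label] = value.strip()
    return record

# Third element of the example list from the question.
sample = ("Children in nurserySeries:Soviet ChildrenCatalog codes:Mi:SU 2712, "
          "Sn:SU 2697, Yt:SU 2629, Sg:SU 2806, AFA:SU 2700Themes:ChildrenIssued on:"
          "1963-01-31Size:42 x 28 mmColors:MulticolorFormat:StampEmission:"
          "CommemorativePerforation:comb 11½Printing:PhotogravureFace value:"
          "4 Russian kopekPrint run:3,000,000Score:27%\tAccuracy: Very High"
          "Description:Designer: A. Shmidshtein. Paper: ordinary."
          "Buy Now:2 sale offers from US$ 0.08")

for key, value in untangle(sample).items():
    print(key, "=", value)
```

Rows that are absent from an element (such as Variants here) are simply missing from the dict, so a list of such dicts can be turned into a table with empty cells where a row does not exist.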

Persistent error in caret: "wrong model type for regression"

I am trying to do leave-one-out cross-validation (LOOCV). I am following all the instructions, but I don't really understand what I am doing, and I am getting an error. Maybe my dataset is too small; I include it here:
clay oc ph_h2o avg_N2O sum_tmax
31.54643 2.654043 6.725000 5.8397204 1644.0
31.54643 2.654043 6.725000 8.9456498 1626.0
31.54643 2.654043 6.725000 36.6636187 1846.5
31.54643 2.654043 6.725000 27.9717408 1651.5
31.54643 2.654043 6.725000 13.7662532 1433.5
31.54643 2.654043 6.725000 28.4065759 1597.5
31.54643 2.654043 6.725000 9.7437375 1585.5
20.15455 1.371111 6.090909 2.8604854 1644.0
20.15455 1.371111 6.090909 11.4821949 1626.0
20.15455 1.371111 6.090909 20.1477475 1846.5
20.15455 1.371111 6.090909 3.9438700 1651.5
20.15455 1.371111 6.090909 4.8634605 1597.5
30.14316 3.224697 7.221811 10.2540652 802.5
30.14316 3.224697 7.221811 17.7039395 841.0
30.14316 3.224697 7.221811 19.3734159 983.5
30.14316 3.224697 7.221811 17.2422255 781.0
30.14316 3.224697 7.221811 17.9839534 412.5
18.06667 1.852857 5.911111 4.1653732 1012.5
18.06667 1.852857 5.911111 4.5732676 1201.0
18.06667 1.852857 5.911111 8.1417138 1003.5
8.11250 0.886250 6.650000 0.4631667 818.0
8.11250 0.886250 6.650000 2.1779861 397.5
8.11250 0.886250 6.650000 1.6355573 641.5
8.11250 0.886250 6.650000 2.8754931 259.5
22.47405 1.816556 5.684229 4.5025055 1324.0
22.47405 1.816556 5.684229 3.6881473 1634.5
22.47405 1.816556 5.684229 4.7470418 1370.0
22.47405 1.816556 5.684229 8.2378739 1559.5
The code I run on this data is:
train_control<-trainControl(method="LOOCV")
control<-train(avg_N2O ~., data=slim, trControl=train_control, method="nb")
All classes should be numeric, and they are.
I've used linear regression to look at the relationship of these variables to avg_N2O, but it has been suggested that I use LOOCV. I would like to have a predictive model in the end and this is my training set.
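A likely cause, offered as an assumption since no answer is recorded here: method="nb" in caret is naive Bayes, which only handles classification, while avg_N2O is numeric, so a regression method such as "lm" would match the outcome. To show what LOOCV itself does, independent of caret, here is a minimal Python sketch using a one-predictor least-squares line on the first seven sum_tmax and avg_N2O values from the table; the function names are made up for the example.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def loocv_rmse(xs, ys):
    """Hold out each point in turn, fit on the rest, and score the held-out point."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        errors.append(ys[i] - (intercept + slope * xs[i]))
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# First seven rows of the table: sum_tmax as predictor, avg_N2O as outcome.
sum_tmax = [1644.0, 1626.0, 1846.5, 1651.5, 1433.5, 1597.5, 1585.5]
avg_n2o = [5.8397204, 8.9456498, 36.6636187, 27.9717408,
           13.7662532, 28.4065759, 9.7437375]

print("LOOCV RMSE:", loocv_rmse(sum_tmax, avg_n2o))
```

Each of the n fits leaves out exactly one observation, which is why LOOCV is affordable for a dataset of this size even though it refits the model once per row.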
