Reading VertNet .txt files - r

I am a beginner R user, so I apologize in advance. I am trying to read a tab-delimited .txt data file from VertNet.org into my workspace. I loaded it with read.delim, which has worked for me before, but this time it does not identify all of the data columns present in the file.
Sciurus_carolinensis_total <- read.delim("Sciurus_carolinensis_total.txt")
Below is the code that selects the columns I want in my final data frame.
Sciurus_carolinensis_total <- select(Sciurus_carolinensis_total, c(genus, specificepithet, sex, year, month, day, countrycode, stateprovince, county, decimallatitude, decimallongitude, lengthinmm, lengthtype))
Below are the top few rows of the .txt file. I wasn't sure whether this is the appropriate way to provide this information. Let me know if another format would be better.
type modified license rightsholder accessrights bibliographiccitation references...
PhysicalObject CC0 California Academy of Sciences. CAS Mammalogy (MAM). Recor...
PhysicalObject 2009-06-05 15:46:10.0 CC0 California Academy of Sciences. CAS M...
PhysicalObject 2019-08-25 10:42:36.0 http://creativecommons.org/publicdomain/zer...
PhysicalObject 2019-09-18 16:45:02.0 http://creativecommons.org/publicdomain/zer...
PhysicalObject 2018-02-28 https://creativecommons.org/publicdomain/zero/1.0/ ht...
The data above are truncated (each line is well over 1,400 characters) and contain embedded tabs, which the HTML here does not show. Here is the untruncated dput output for that text:
c("type\tmodified\tlicense\trightsholder\taccessrights\tbibliographiccitation\treferences\tinstitutionid\tcollectionid\tdatasetid\tinstitutioncode\tcollectioncode\tdatasetname\tbasisofrecord\tinformationwithheld\tdatageneralizations\tdynamicproperties\toccurrenceid\tcatalognumber\trecordnumber\trecordedby\tindividualcount\tsex\tlifestage\treproductivecondition\tbehavior\testablishmentmeans\toccurrencestatus\tpreparations\tdisposition\tassociatedmedia\tassociatedreferences\tassociatedsequences\tassociatedtaxa\tothercatalognumbers\toccurrenceremarks\torganismid\torganismname\torganismscope\tassociatedoccurrences\tassociatedorganisms\tpreviousidentifications\torganismremarks\tmaterialsampleid\teventid\tfieldnumber\teventdate\teventtime\tstartdayofyear\tenddayofyear\tyear\tmonth\tday\tverbatimeventdate\thabitat\tsamplingprotocol\tsamplingeffort\tfieldnotes\teventremarks\tlocationid\thighergeographyid\thighergeography\tcontinent\twaterbody\tislandgroup\tisland\tcountry\tcountrycode\tstateprovince\tcounty\tmunicipality\tlocality\tverbatimlocality\tminimumelevationinmeters\tmaximumelevationinmeters\tverbatimelevation\tminimumdepthinmeters\tmaximumdepthinmeters\tverbatimdepth\tminimumdistanceabovesurfaceinmeters\tmaximumdistanceabovesurfaceinmeters\tlocationaccordingto\tlocationremarks\tdecimallatitude\tdecimallongitude\tgeodeticdatum\tcoordinateuncertaintyinmeters\tcoordinateprecision\tverbatimcoordinates\tverbatimlatitude\tverbatimlongitude\tverbatimcoordinatesystem\tverbatimsrs\tfootprintwkt\tfootprintsrs\tgeoreferencedby\tgeoreferenceddate\tgeoreferenceprotocol\tgeoreferencesources\tgeoreferenceverificationstatus\tgeoreferenceremarks\tgeologicalcontextid\tearliesteonorlowesteonothem\tlatesteonorhighesteonothem\tearliesteraorlowesterathem\tlatesteraorhighesterathem\tearliestperiodorlowestsystem\tlatestperiodorhighestsystem\tearliestepochorlowestseries\tlatestepochorhighestseries\tearliestageorloweststage\tlatestageorhigheststage\tlowestbiostratigraphiczone\thighestbiostr
atigraphiczone\tlithostratigraphicterms\tgroup\tformation\tmember\tbed\tidentificationid\tidentificationqualifier\ttypestatus\tidentifiedby\tdateidentified\tidentificationreferences\tidentificationverificationstatus\tidentificationremarks\tscientificnameid\tnamepublishedinid\tscientificname\tacceptednameusage\toriginalnameusage\tnamepublishedin\tnamepublishedinyear\thigherclassification\tkingdom\tphylum\tclass\torder\tfamily\tgenus\tsubgenus\tspecificepithet\tinfraspecificepithet\ttaxonrank\tverbatimtaxonrank\tscientificnameauthorship\tvernacularname\tnomenclaturalcode\ttaxonomicstatus\ttaxonremarks\tlengthinmm\tlengthtype\tlengthunitsinferred\tmassing\tmassunitsinferred\tunderivedlifestage\tunderivedsex\tdataset_url\tdataset_citation\tgbifdatasetid\tgbifpublisherid\tdataset_contact_email\tdataset_contact\tdataset_pubdate\tlastindexed\tmigrator_version\thasmedia\thastissue\twascaptive\tisfossil\tisarch\tvntype\thaslength", "PhysicalObject\t\tCC0\t\t\tCalifornia Academy of Sciences. CAS Mammalogy (MAM). Record ID: urn:catalog:CAS:MAM:24943. Source: http://ipt.calacademy.org:8080/ipt/resource.do?r=mam (source published on 2019-07-23)\thttp://portal.vertnet.org/o/cas/mam?id=urn-catalog-cas-mam-24943\t\t\t\tCAS\tMAM\t\tPreservedSpecimen\t\t\t\turn:catalog:CAS:MAM:24943\t24943\tCBC3\tC. B. Clark\t\tfemale\tadult\t\t\t\t\tSN\t\t\t\t\t\tSFSU\tMeasurements: 17 1/2-7.9-2 1/4-0.75 in; no wt. Skin only.\t\t\t\t\t\t\t\t\t\t\t1968-10-31\t\t305\t305\t1968\t10\t31\t31 Oct 1968\t\t\t\t\t\t\t\tNorth America; USA; California; San Francisco Co.\tNorth America\t\t\t\tUSA\t\tCalifornia\tSan Francisco Co.\t\tNear North Lake, Golden Gate Park, San Francisco.\t\t\t\t\t\t\t\t\t\t\t\t37.7700200000\t-122.5014300000\tNAD27\t241\t\t\t37.7700228000\t-122.5014321000\t\t\t\t\tKristina Yamamoto\t2002-12-01\tMaNIS georeferencing guidelines\tTerrain Navigator 5.03 USGS 1:24,000\tunverified\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\tDouglas J. 
Long\t2000-05-26\t\t\t\t\t\tSciurus carolinensis pennsylvanicus\t\t\t\t\tAnimalia; Chordata; Mammalia; Rodentia; Sciuridae\tAnimalia\tChordata\tMammalia\tRodentia\tSciuridae\tSciurus\t\tcarolinensis\tpennsylvanicus\t\t\t\t\t\t\t\t17\ttotal length\t1\t0.75\t1\tadult\tF\t\tCalifornia Academy of Sciences. CAS Mammalogy (MAM). Source: http://ipt.calacademy.org:8080/ipt/resource.do?r=mam (source published on 2019-07-23)\t6ce7290f-47f6-4046-8356-371f5b6749df\t66522820-055c-11d8-b84e-b8a03c50a862\tmflannery#calacademy.org\tMaureen Flannery\t2019-07-23\t2019-08-04\tno migrator\t0\t0\t0\t0\t0\tspecimen\t1", "PhysicalObject\t2009-06-05 15:46:10.0\tCC0\t\t\tCalifornia Academy of Sciences. CAS Mammalogy (MAM). Record ID: urn:catalog:CAS:MAM:28420. Source: http://ipt.calacademy.org:8080/ipt/resource.do?r=mam (source published on 2019-07-23)\thttp://portal.vertnet.org/o/cas/mam?id=urn-catalog-cas-mam-28420\t\t\t\tCAS\tMAM\t\tPreservedSpecimen\t\t\tWeight=482.9 g.; Length=455.0 mm.\turn:catalog:CAS:MAM:28420\t28420\t\tUnknown\t\tfemale\tAdult\texternal characters\t\t\t\tTISSUE; SN\t\t\t\t\t\t\tWeight taken on 5 October 1995. Tail: 196.0 mm., HF: 60.0 mm., ear: 26.0 mm.\t\t\t\t\t\t\t\t\t\t\t1993-08-24\t\t236\t236\t1993\t8\t24\t24 August 1993\t\t\t\t\t\t\t\tNorth America; USA; California; San Mateo Co.\tNorth America\t\t\t\tUSA\t\tCalifornia\tSan Mateo Co.\t\tRedwood City\t\t\t\t\t\t\t\t\t\t\t\t37.4698300000\t-122.2227100000\tNAD27\t9469\t\t\t37.4698329000\t-122.2227144000\t\t\t\t\tJulian A. Kapoor\t2002-12-22\tMaNIS georeferencing guidelines\tTerrain Navigator 5.03 USGS 1:24,000\tunverified\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\tSciurus carolinensis\t\t\t\t\tAnimalia; Chordata; Mammalia; Rodentia; Sciuridae\tAnimalia\tChordata\tMammalia\tRodentia\tSciuridae\tSciurus\t\tcarolinensis\t\t\t\t\t\t\t\t\t455.0\ttotal length\t0\t482.9\t0\tAdult\tF\t\tCalifornia Academy of Sciences. CAS Mammalogy (MAM). 
Source: http://ipt.calacademy.org:8080/ipt/resource.do?r=mam (source published on 2019-07-23)\t6ce7290f-47f6-4046-8356-371f5b6749df\t66522820-055c-11d8-b84e-b8a03c50a862\tmflannery#calacademy.org\tMaureen Flannery\t2019-07-23\t2019-08-04\tno migrator\t0\t1\t0\t0\t0\tspecimen\t1", "PhysicalObject\t2019-08-25 10:42:36.0\thttp://creativecommons.org/publicdomain/zero/1.0/\t\thttp://vertnet.org/resources/norms.html\tChicago Academy of Sciences. CHAS Mammalogy Collection (Arctos). Record ID: http://arctos.database.museum/guid/CHAS:Mamm:2019.1.74?seid=4307578. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=chas_mammals (source published on 2019-07-07)\thttp://arctos.database.museum/guid/CHAS:Mamm:2019.1.74\tCHAS\t113\t\tCHAS\tMammal specimens\t\tPreservedSpecimen\tmask part attribute location\t\tverbatim preservation date=2017-12-10 ; sex=male ; total length=504 mm; tail length=219 mm; weight=630 g; fat deposition=subcutaneous fat: heavy\thttp://arctos.database.museum/guid/CHAS:Mamm:2019.1.74?seid=4307578\tCHAS:Mamm:2019.1.74\t\tCollector(s): Steve Sullivan; Preparator(s): Yuqing Wang\t1\tmale\t\t\t\t\t\tskull; skin, study\t\t\t\t\t\tpreparator number=YW-02\t\thttp://arctos.database.museum/guid/CHAS:Mamm:2019.1.74\t\t\t\t\t<i>Sciurus carolinensis</i> (accepted ID) identified by Yuqing Wang on 2017-12-10; method: student Remark: Eastern gray squirrel.\t\t\t\t\t2014-11-04\t\t308\t308\t2014\t11\t04\tNov 4 2014\t\t\t\t\t\t\t\tNorth America, United States, Illinois, Cook County\tNorth America\t\t\t\tUnited States\t\tIllinois\tCook County\t\tPeggy Notebaert Nature Museum, 2430 North Cannon Drive, Chicago\tPNNM, Chicago, Cook, IL\t\t\t\t\t\t\t\t\tSteve Sullivan\t\t41.926469\t-87.634817\tnot recorded\t131\t\t\t\t\t\t\t\t\tSteve Sullivan\t2014-11-04\tGeoLocate\tGeoLocate\tunverified\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\tA\t\tYuqing Wang\t2017-12-10\t\tstudent\tEastern gray squirrel.\t\t\tSciurus carolinensis\t\t\t\t\tAnimalia; Chordata; Mammalia; Rodentia; Sciuridae; 
Sciurinae;\tAnimalia\tChordata\tMammalia\tRodentia\tSciuridae\tSciurus\t\tcarolinensis\t\tspecies\t\t\t\tICZN\t\t\t504\ttotal length\t0\t630\t0\t\tmale\t\tChicago Academy of Sciences. CHAS Mammalogy Collection (Arctos). Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=chas_mammals (source published on 2019-07-07)\t4837d6b0-19fe-4fe4-9ddf-b62dd17a060e\tf2489500-dbab-4fbc-95ed-19eead127483\tdroberts#naturemuseum.org\tDawn Roberts\t2019-07-07\t2019-09-21\tno migrator\t0\t0\t0\t0\t0\tspecimen\t1", "PhysicalObject\t2019-09-18 16:45:02.0\thttp://creativecommons.org/publicdomain/zero/1.0/\t\thttp://vertnet.org/resources/norms.html\tChicago Academy of Sciences. CHAS Mammalogy Collection (Arctos). Record ID: http://arctos.database.museum/guid/CHAS:Mamm:3720?seid=4235951. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=chas_mammals (source published on 2019-07-07)\thttp://arctos.database.museum/guid/CHAS:Mamm:3720\tCHAS\t113\t\tCHAS\tMammal specimens\t\tPreservedSpecimen\tmask part attribute location\t\tverbatim collector=E.V. Komarek ; sex=male ; unformatted measurements=TL: 435, T: 202, HF: 60, Et.: 363g ; total length=435 mm; tail length=202 mm; hind foot with claw=60 mm; weight=363 g\thttp://arctos.database.museum/guid/CHAS:Mamm:3720?seid=4235951\tCHAS:Mamm:3720\t\tCollector(s): Edwin V. Komarek\t1\tmale\t\t\t\t\t\tskull; skin, study\t\t\t\t\t\tUUID=149e86ee-2118-410f-8248-a0d859201335; secondary identifier=Scc-24\t\"INTERNAL NOTES: collection date listed as January 7, 1936 in Mammal Catalog Book (taped spine), but as \"January 1, 1936\" in 2nd Mammal Catalog Book, needs to be verified [A.King]. DATA HISTORY: Inventory catalogued/verified by Collections staff (2008-2010 inventory). Record last updated in Excel (prior to Arctos migration) by Dawn R. Roberts (2013-11-30). 
Date listed as entered in original FileMaker database: 1988-07-06.\"\thttp://arctos.database.museum/guid/CHAS:Mamm:3720\t\t\t\t\t<i>Sciurus carolinensis</i> (accepted ID) identified by unknown; method: legacy Remark: Eastern Gray Squirrel.<br><i>Sciurus carolinensis carolinensis</i> identified by unknown; method: legacy\t\t\t\t\t1936-01-01\t\t1\t1\t1936\t01\t01\t[transcribed directly into formatted date fields]\t\t\t\t\t\t\t\tNorth America, United States, Georgia, Charlton County\tNorth America\t\t\t\tUnited States\t\tGeorgia\tCharlton County\t\tChase Prairie, Okefenokee Swamp\tChase Prairie, Okefenokee Swamp, Georgia\t\t\t\t\t\t\t\t\tEdwin V. Komarek\tGeoreferenced by John Keating on 16 June 2015 [achinn 20 December 2018].\t30.816062\t-82.226234\tWorld Geodetic System 1984\t2435\t\t30.816062/-82.226234\t\t\tdecimal degrees\t\t\t\tEdwin V. Komarek\t1936-01-01\tGeoLocate\tGeoLocate\tunverified\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\tA\t\tunknown\t\t\tlegacy\tEastern Gray Squirrel.\t\t\tSciurus carolinensis\t\t\t\t\tAnimalia; Chordata; Mammalia; Rodentia; Sciuridae; Sciurinae;\tAnimalia\tChordata\tMammalia\tRodentia\tSciuridae\tSciurus\t\tcarolinensis\t\tspecies\t\t\t\tICZN\t\t\t435\ttotal length\t0\t363\t0\t\tmale\t\tChicago Academy of Sciences. CHAS Mammalogy Collection (Arctos). Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=chas_mammals (source published on 2019-07-07)\t4837d6b0-19fe-4fe4-9ddf-b62dd17a060e\tf2489500-dbab-4fbc-95ed-19eead127483\tdroberts#naturemuseum.org\tDawn Roberts\t2019-07-07\t2019-09-21\tno migrator\t0\t0\t0\t0\t0\tspecimen\t1", "PhysicalObject\t2018-02-28\thttps://creativecommons.org/publicdomain/zero/1.0/\t\thttp://vertnet.org/resources/norms.html\tCornell University Museum of Vertebrates. CUMV Mammal Collection. Record ID: 67dd2632-537e-11e6-9649-a4a3446a4726. 
Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)\thttp://portal.vertnet.org/o/cumv/mamm?id=67dd2632-537e-11e6-9649-a4a3446a4726\thttp://grbio.org/cool/i64g-wjcr\thttp://grbio.org/cool/67hr-z96t\t\tCUMV\tMamm\t\tPreservedSpecimen\t\t\t{\"hind foot length with claw in mm\":\"62\",\"stomach contents\":\"Masticated nut meats\",\"tail length in mm\":\"200\",\"total length in mm\":\"480\",\"weight\":\"515.3\",\"ear length from notch\":\"31\",\" left gonad length in mm\":\"26\",\" left gonad width in mm\":\"15\",\" right gonad length in mm\":\"26\",\" right gonad width in mm\":\"15\",\"weight unit\":\"g\" }\t67dd2632-537e-11e6-9649-a4a3446a4726\t16608\t\tDon Schoffler\t\tmale\t\tTestes scrotal\t\t\tpresent\tskeleton\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t1993-03-25\t\t84\t84\t1993\t3\t25\t1993-03-25\t\t\t\t\tcollecting method: shot\t\t\tNorth America | United States | New York | | | | |\tNorth America\t\t\t\tUnited States\tUS\tNew York\t\t\tSchuyler/Tompkins County, Cauyta Township, ~4.5 km ESE of Cayuta, Arnot Forest\tNorth America | United States | New York | Schuyler/Tompkins County, Cauyta Township, ~4.5 km ESE of Cayuta, Arnot Forest\t\t\t\t\t\t\t\t\t\t\t42.276659\t-76.655259\tWGS84\t3225\t\t\t\t\t\t\t\t\tDBCreator\t\t\t\trequires verification\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\tSciurus carolinensis\t\t\t\t\tAnimalia | Chordata | Mammalia | Rodentia | Sciuridae | Sciurus\tAnimalia\tChordata\tMammalia\tRodentia\tSciuridae\tSciurus\t\tcarolinensis\t\tspecies\t\t\t\tICZN\t\t\t480\ttotal length\t0\t515.3\t1\t\tmale\t\tCornell University Museum of Vertebrates. CUMV Mammal Collection. 
Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)\t35720b3e-aded-4b83-b4f1-967f1d457d6a\tcf9ceb80-9f3d-11da-b791-b8a03c50a862\tcbd63#cornell.edu\tCasey Dillman\t2018-07-02\t2018-07-03\t2018-01-08\t0\t0\t0\t0\t0\tspecimen\t1", "PhysicalObject\t2018-02-28\thttps://creativecommons.org/publicdomain/zero/1.0/\t\thttp://vertnet.org/resources/norms.html\tCornell University Museum of Vertebrates. CUMV Mammal Collection. Record ID: 6806d370-537e-11e6-9649-a4a3446a4726. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)\thttp://portal.vertnet.org/o/cumv/mamm?id=6806d370-537e-11e6-9649-a4a3446a4726\thttp://grbio.org/cool/i64g-wjcr\thttp://grbio.org/cool/67hr-z96t\t\tCUMV\tMamm\t\tPreservedSpecimen\t\t\t{\"hind foot length with claw in mm\":\"59\",\"tail length in mm\":\"163\",\"total length in mm\":\"440\",\"weight\":\"583.5\",\"ear length from notch\":\"26\",\"weight unit\":\"g\" }\t6806d370-537e-11e6-9649-a4a3446a4726\t21304\t\tEdward S. Thomas\t\tmale\tadult\tTestes scrotal 22 x 12 mm\t\t\tpresent\tstudy skin - 1; skeleton - 1; tissue (frozen) - 1\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t2010-04-06\t\t96\t96\t2010\t4\t6\t2010-04-06\t\t\t\t\t\t\t\tNorth America | United States | New York | Tompkins County | | | |\tNorth America\t\t\t\tUnited States\tUS\tNew York\tTompkins\t\tDryden Township, Ellis Hollow\tNorth America | United States | New York | Tompkins County | Dryden Township, Ellis Hollow\t\t\t\t\t\t\t\t\t\t\t42.42766\t-76.38849\tWGS84\t2078\t\t\t\t\t\t\t\t\tDBCreator\t2007-06-13\t\t\trequires verification\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\tSciurus carolinensis\t\t\t\t\tAnimalia | Chordata | Mammalia | Rodentia | Sciuridae | Sciurus\tAnimalia\tChordata\tMammalia\tRodentia\tSciuridae\tSciurus\t\tcarolinensis\t\tspecies\t\t\t\tICZN\t\t\t440\ttotal length\t0\t583.5\t1\tadult\tmale\t\tCornell University Museum of Vertebrates. CUMV Mammal Collection. 
Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)\t35720b3e-aded-4b83-b4f1-967f1d457d6a\tcf9ceb80-9f3d-11da-b791-b8a03c50a862\tcbd63#cornell.edu\tCasey Dillman\t2018-07-02\t2018-07-03\t2018-01-08\t0\t1\t0\t0\t0\tspecimen\t1", "PhysicalObject\t2018-02-28\thttps://creativecommons.org/publicdomain/zero/1.0/\t\thttp://vertnet.org/resources/norms.html\tCornell University Museum of Vertebrates. CUMV Mammal Collection. Record ID: 6810ae67-537e-11e6-9649-a4a3446a4726. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)\thttp://portal.vertnet.org/o/cumv/mamm?id=6810ae67-537e-11e6-9649-a4a3446a4726\thttp://grbio.org/cool/i64g-wjcr\thttp://grbio.org/cool/67hr-z96t\t\tCUMV\tMamm\t\tPreservedSpecimen\t\t\t{\"hind foot length with claw in mm\":\"64\",\"tail length in mm\":\"230\",\"total length in mm\":\"505\",\"weight\":\"580\",\"weight unit\":\"g\" }\t6810ae67-537e-11e6-9649-a4a3446a4726\t1832\t\tWilliam J. Hamilton Jr.\t\tmale\tadult\tTestes enlarged & descended\t\t\tpresent\tstudy skin - 1; skull - 1\t\t\t\t\t\t\tRec'd from W.J. Hamilton, Jr.\t\t\t\t\t\t\t\t\t\t\t1938-10-05\t\t278\t278\t1938\t10\t5\t1938-10-05\t\t\t\t\tcollecting method: killed by car\t\t\tNorth America | United States | New York | Tompkins County | | | |\tNorth America\t\t\t\tUnited States\tUS\tNew York\tTompkins\t\tNewfield\tNorth America | United States | New York | Tompkins County | Newfield\t\t\t\t\t\t\t\t\t\t\t42.362018\t-76.590778\tnot recorded (forced WGS84)\t3036\t\t\t\t\t\t\t\t\t\t\t\t\trequires verification\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\tSciurus carolinensis\t\t\t\t\tAnimalia | Chordata | Mammalia | Rodentia | Sciuridae | Sciurus\tAnimalia\tChordata\tMammalia\tRodentia\tSciuridae\tSciurus\t\tcarolinensis\t\tspecies\t\t\t\tICZN\t\t\t505\ttotal length\t0\t580\t1\tadult\tmale\t\tCornell University Museum of Vertebrates. CUMV Mammal Collection. 
Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)\t35720b3e-aded-4b83-b4f1-967f1d457d6a\tcf9ceb80-9f3d-11da-b791-b8a03c50a862\tcbd63#cornell.edu\tCasey Dillman\t2018-07-02\t2018-07-03\t2018-01-08\t0\t0\t0\t0\t0\tspecimen\t1", "PhysicalObject\t2018-03-20\thttps://creativecommons.org/publicdomain/zero/1.0/\t\thttp://vertnet.org/resources/norms.html\tCornell University Museum of Vertebrates. CUMV Mammal Collection. Record ID: 681424b0-537e-11e6-9649-a4a3446a4726. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)\thttp://portal.vertnet.org/o/cumv/mamm?id=681424b0-537e-11e6-9649-a4a3446a4726\thttp://grbio.org/cool/i64g-wjcr\thttp://grbio.org/cool/67hr-z96t\t\tCUMV\tMamm\t\tPreservedSpecimen\t\t\t{\"hind foot length with claw in mm\":\"71\",\"tail length in mm\":\"208\",\"total length in mm\":\"476\",\"weight\":\"455\",\"weight unit\":\"g\" }\t681424b0-537e-11e6-9649-a4a3446a4726\t3842\tWJHJ 2460\tWilliam J. Hamilton Jr.\t\tmale\tadult\tTestes small, not descended.\t\t\tpresent\tstudy skin - 1; skull - 1\t\t\t\t\t\t\tSkull broken but saved. Fleas preserved.; Received from W.J. 
Hamilton, Jr.\t\t\t\t\t\t\t\t\t\t\t1947-01-28\t\t28\t28\t1947\t1\t28\t1947-01-28\t\t\t\t\t\t\t\tNorth America | United States | New York | Tompkins County | | | |\tNorth America\t\t\t\tUnited States\tUS\tNew York\tTompkins\t\tIthaca Township, Cayuga Heights, Highland Road\tNorth America | United States | New York | Tompkins County | Ithaca Township, Cayuga Heights, Highland Road\t\t\t\t\t\t\t\t\t\t\t42.465699\t-76.490741\tWGS84\t1488\t\t\t\t\t\t\t\t\tDBCreator\t2009-08-26\t\t\trequires verification\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\tSciurus carolinensis leucotis\t\t\t\t\tAnimalia | Chordata | Mammalia | Rodentia | Sciuridae | Sciurus\tAnimalia\tChordata\tMammalia\tRodentia\tSciuridae\tSciurus\t\tcarolinensis\tleucotis\tsubspecies\t\t\t\tICZN\t\t\t476\ttotal length\t0\t455\t1\tadult\tmale\t\tCornell University Museum of Vertebrates. CUMV Mammal Collection. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)\t35720b3e-aded-4b83-b4f1-967f1d457d6a\tcf9ceb80-9f3d-11da-b791-b8a03c50a862\tcbd63#cornell.edu\tCasey Dillman\t2018-07-02\t2018-07-03\t2018-01-08\t0\t0\t0\t0\t0\tspecimen\t1", "PhysicalObject\t2018-02-28\thttps://creativecommons.org/publicdomain/zero/1.0/\t\thttp://vertnet.org/resources/norms.html\tCornell University Museum of Vertebrates. CUMV Mammal Collection. Record ID: 68340671-537e-11e6-9649-a4a3446a4726. 
Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)\thttp://portal.vertnet.org/o/cumv/mamm?id=68340671-537e-11e6-9649-a4a3446a4726\thttp://grbio.org/cool/i64g-wjcr\thttp://grbio.org/cool/67hr-z96t\t\tCUMV\tMamm\t\tPreservedSpecimen\t\t\t{\"hind foot length with claw in mm\":\"63\",\"tail length in mm\":\"210\",\"total length in mm\":\"405\",\"weight\":\"557\",\"ear length from notch\":\"18\",\" left gonad length in mm\":\"25\",\" left gonad width in mm\":\"15\",\" right gonad length in mm\":\"32\",\" right gonad width in mm\":\"15\",\"weight unit\":\"g\" }\t68340671-537e-11e6-9649-a4a3446a4726\t17452\t\tDon Schoffler\t\tmale\t\t\t\t\tpresent\tskeleton\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t1993-03-25\t\t84\t84\t1993\t3\t25\t1993-03-25\t\t\t\t\tcollecting method: shot\t\t\tNorth America | United States | New York | | | | |\tNorth America\t\t\t\tUnited States\tUS\tNew York\t\t\tSchuyler/Tompkins County, Cauyta Township, ~4.5 km ESE of Cayuta, Arnot Forest\tNorth America | United States | New York | Schuyler/Tompkins County, Cauyta Township, ~4.5 km ESE of Cayuta, Arnot Forest\t\t\t\t\t\t\t\t\t\t\t42.276659\t-76.655259\tWGS84\t3225\t\t\t\t\t\t\t\t\tDBCreator\t\t\t\trequires verification\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\tSciurus carolinensis\t\t\t\t\tAnimalia | Chordata | Mammalia | Rodentia | Sciuridae | Sciurus\tAnimalia\tChordata\tMammalia\tRodentia\tSciuridae\tSciurus\t\tcarolinensis\t\tspecies\t\t\t\tICZN\t\t\t405\ttotal length\t0\t557\t1\t\tmale\t\tCornell University Museum of Vertebrates. CUMV Mammal Collection. 
Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)\t35720b3e-aded-4b83-b4f1-967f1d457d6a\tcf9ceb80-9f3d-11da-b791-b8a03c50a862\tcbd63#cornell.edu\tCasey Dillman\t2018-07-02\t2018-07-03\t2018-01-08\t0\t0\t0\t0\t0\tspecimen\t1", "PhysicalObject\t2018-02-28\thttps://creativecommons.org/publicdomain/zero/1.0/\t\thttp://vertnet.org/resources/norms.html\tCornell University Museum of Vertebrates. CUMV Mammal Collection. Record ID: 6834ab99-537e-11e6-9649-a4a3446a4726. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)\thttp://portal.vertnet.org/o/cumv/mamm?id=6834ab99-537e-11e6-9649-a4a3446a4726\thttp://grbio.org/cool/i64g-wjcr\thttp://grbio.org/cool/67hr-z96t\t\tCUMV\tMamm\t\tPreservedSpecimen\t\t\t{\"hind foot length with claw in mm\":\"62\",\"tail length in mm\":\"220\",\"total length in mm\":\"465\",\"weight\":\"536\",\"ear length from notch\":\"32\",\"weight unit\":\"g\" }\t6834ab99-537e-11e6-9649-a4a3446a4726\t18586\t\tDora E. 
Worbs\t\tfemale\t\t\t\t\tpresent\tstudy skin - 1; skeleton - 1\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t1995-11-02\t\t306\t306\t1995\t11\t2\t1995-11-02\t\t\t\t\tcollecting method: killed by cat\t\t\tNorth America | United States | New York | Tompkins County | | | |\tNorth America\t\t\t\tUnited States\tUS\tNew York\tTompkins\t\tCaroline Township, White Church & Ridgeway Roads, ~4.3 km S Brooktondale\tNorth America | United States | New York | Tompkins County | Caroline Township, White Church & Ridgeway Roads, ~4.3 km S Brooktondale\t\t\t\t\t\t\t\t\t\t\t42.3443\t-76.3857\tWGS84\t112\t\t\t\t\t\t\t\t\tDBCreator\t2009-01-26\t\t\trequires verification\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\tSciurus carolinensis\t\t\t\t\tAnimalia | Chordata | Mammalia | Rodentia | Sciuridae | Sciurus\tAnimalia\tChordata\tMammalia\tRodentia\tSciuridae\tSciurus\t\tcarolinensis\t\tspecies\t\t\t\tICZN\t\t\t465\ttotal length\t0\t536\t1\t\tfemale\t\tCornell University Museum of Vertebrates. CUMV Mammal Collection. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)\t35720b3e-aded-4b83-b4f1-967f1d457d6a\tcf9ceb80-9f3d-11da-b791-b8a03c50a862\tcbd63#cornell.edu\tCasey Dillman\t2018-07-02\t2018-07-03\t2018-01-08\t0\t0\t0\t0\t0\tspecimen\t1", "PhysicalObject\t2018-02-28\thttps://creativecommons.org/publicdomain/zero/1.0/\t\thttp://vertnet.org/resources/norms.html\tCornell University Museum of Vertebrates. CUMV Mammal Collection. Record ID: 68384b01-537e-11e6-9649-a4a3446a4726. 
Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)\thttp://portal.vertnet.org/o/cumv/mamm?id=68384b01-537e-11e6-9649-a4a3446a4726\thttp://grbio.org/cool/i64g-wjcr\thttp://grbio.org/cool/67hr-z96t\t\tCUMV\tMamm\t\tPreservedSpecimen\t\t\t{\"hind foot length with claw in mm\":\"70\",\"stomach contents\":\"Empty\",\"tail length in mm\":\"233\",\"total length in mm\":\"475\",\"weight\":\"512\",\"ear length from notch\":\"31\",\"weight unit\":\"g\" }\t68384b01-537e-11e6-9649-a4a3446a4726\t17444\t\tunknown\t\tfemale\t\tShrunken, abdominal\t\t\tpresent\tskeleton\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t1994-03-23\t\t82\t82\t1994\t3\t23\t1994-03-23\t\t\t\t\tcollecting method: shot\t\t\tNorth America | United States | New York | | | | |\tNorth America\t\t\t\tUnited States\tUS\tNew York\t\t\tSchuyler/Tompkins County, Cauyta Township, ~4.5 km ESE of Cayuta, Arnot Forest\tNorth America | United States | New York | Schuyler/Tompkins County, Cauyta Township, ~4.5 km ESE of Cayuta, Arnot Forest\t\t\t\t\t\t\t\t\t\t\t42.276659\t-76.655259\tWGS84\t3225\t\t\t\t\t\t\t\t\tDBCreator\t\t\t\trequires verification\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\tSciurus carolinensis\t\t\t\t\tAnimalia | Chordata | Mammalia | Rodentia | Sciuridae | Sciurus\tAnimalia\tChordata\tMammalia\tRodentia\tSciuridae\tSciurus\t\tcarolinensis\t\tspecies\t\t\t\tICZN\t\t\t475\ttotal length\t0\t512\t1\t\tfemale\t\tCornell University Museum of Vertebrates. CUMV Mammal Collection. Source: http://ipt.vertnet.org:8080/ipt/resource.do?r=cumv_mamm (source published on 2018-07-02)\t35720b3e-aded-4b83-b4f1-967f1d457d6a\tcf9ceb80-9f3d-11da-b791-b8a03c50a862\tcbd63#cornell.edu\tCasey Dillman\t2018-07-02\t2018-07-03\t2018-01-08\t0\t0\t0\t0\t0\tspecimen\t1")
Below is the code I used to load the .txt file and create the 13 column data frame.
# Install the packages once if needed:
# install.packages(c("tidyverse", "lubridate", "cowplot"))
library(tidyverse)
library(lubridate)
library(cowplot)
# Set the working directory to the folder containing the file first (setwd() needs a path argument).
# Read the tab-separated file, then keep only the 13 columns of interest.
Sciurus_carolinensis_total <- readr::read_tsv("Sciurus_carolinensis_total.txt") %>%
  select(genus, specificepithet, sex, year, month, day, countrycode, stateprovince,
         county, decimallatitude, decimallongitude, lengthinmm, lengthtype)
Thank you for the help. The solution was far simpler than I thought.
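The thread does not say why read.delim missed columns. One common cause with free-text fields like the ones in this file is that read.delim treats a double quote as a string delimiter by default (quote = "\""), so an unbalanced quote can swallow the tabs that follow it. That explanation is an assumption here, not something confirmed above. The sketch below uses a small made-up sample (the three lines in sample_lines are hypothetical, not taken from the VertNet file) to show how count.fields can check the number of fields per line and how quote = "" makes base R treat quotes as ordinary characters.

```r
# Hypothetical sample: a header plus two records. The first record has an
# unbalanced double quote, as free-text remarks or measurement fields often do.
sample_lines <- c(
  "catalognumber\tremarks\tlengthinmm",
  "1\t6\" tail\t455",
  "2\tskin only\t480"
)

# Count tab-separated fields on each line with quoting disabled.
con <- textConnection(sample_lines)
fields_per_line <- count.fields(con, sep = "\t", quote = "")
close(con)

# Read the same text with quoting disabled, so a stray quote cannot merge fields.
squirrels <- read.delim(text = sample_lines, quote = "", stringsAsFactors = FALSE)
```

If every line reports the same field count with quote = "" but read.delim with its defaults returns fewer rows or columns, quoting is a likely culprit. On the real file the same arguments would be passed along with the file name in place of text.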

Related

Names list and store element in different categories in R

dataframe tidy2:
I'd like to convert the parts of the data frame that satisfy specific conditions into a list. For example, for all communities that contain "BAYSHIRE", I want to store their facility number, name and licensee as the first element of a list and name that element "BAYSHIRE". For communities that contain "EMERITUS" or "BROOKDALE", I want to store their facility number, name and licensee as the second element of the list.
key = list("BAYSHIRE", c("EMERITUS", "BROOKDALE"), "AGEMARK")
big_organization <- list()
for (i in 1:length(key)){
org = tidy2 %>%
filter(str_detect(Licensee, key[[i]]) | str_detect(Facility_name, key[[i]] )) %>%
select(Facility_number, Facility_name, Licensee)
big_organization = append(big_organization, org)
}
big_organization
Output for above code:
ck.imgur.com/88Ya3.png
Expected output:
$BAYSHIRE
Facility_number Facility_name Licensee
374604274 VISTA DEL LAGO MEMORY CARE DEL DIOS CARE,LLC;BAYSHIRE, LLC
374604267 CLOISTERS OF THE VALLEY, LLC DEL RIO CARE, LLC; BAYSHIRE, LLC
374603778 HERITAGE HILLS HAWKES O-SIDE 1 LLC; BAYSHIRE LLC
$EMERITUS, BROOKDALE
Facility_number Facility_name Licensee
336413087 BROOKDALE MURRIETA BLC CHANCELLOR-MURRIETA LH LLC
197606301 BROOKDALE MONROVIA BLC GABLES-MONROVIA LP
197606676 BROOKDALE GARDENS OF TARZANA BLC GARDENS-TARZANA, LP
496803339 BROOKDALE PAULIN CREEK BLC LODGE AT PAULIN INC GP, BLC LODGE AT PAULIN LP
306002955 BROOKDALE NOHL RANCH BLC NOHL RANCH LLC
198204758 BROOKDALE OCEAN HOUSE BLC OCEAN HOUSE LP
374601046 BROOKDALE PLACE OF SAN MARCOS BLC-BROOKDALE PLACE OF SAN MARCOS LP
306003639 BROOKDALE BREA BREA BREA LLC; EMERITUS CORPORATION
347003712 BROOKDALE SYLVAN RANCH BREA CITRUS HEIGHTS LLC; EMERITUS CORPORATION
197606945 BROOKDALE CENTRAL WHITTIER BREA WHITTIER LLC; EMERITUS CORPORATION
.
.
.
.
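One way to get a named list of data frames like the expected output is sketched below in base R. It is not from the thread. The rows in tidy2 are hypothetical sample data loosely adapted from the expected output (the split between Facility_name and Licensee is a guess, and the last row is made up). Each key is collapsed into a single regular expression with "|" so that a key with several names matches any of them, and each filtered data frame is kept as one list element instead of being appended column by column.

```r
# Hypothetical sample data standing in for tidy2.
tidy2 <- data.frame(
  Facility_number = c(374604274, 336413087, 306003639, 100000001),
  Facility_name = c("VISTA DEL LAGO MEMORY CARE", "BROOKDALE MURRIETA",
                    "BROOKDALE BREA", "SOME OTHER HOME"),
  Licensee = c("DEL DIOS CARE,LLC;BAYSHIRE, LLC", "BLC CHANCELLOR-MURRIETA LH LLC",
               "BREA BREA LLC; EMERITUS CORPORATION", "INDEPENDENT OWNER LLC"),
  stringsAsFactors = FALSE
)

key <- list("BAYSHIRE", c("EMERITUS", "BROOKDALE"), "AGEMARK")

# Build one data frame per key; lapply keeps each result as a single list element.
big_organization <- lapply(key, function(k) {
  pattern <- paste(k, collapse = "|")  # e.g. "EMERITUS|BROOKDALE"
  hit <- grepl(pattern, tidy2$Licensee) | grepl(pattern, tidy2$Facility_name)
  tidy2[hit, c("Facility_number", "Facility_name", "Licensee")]
})

# Name each element after its key, joining multi-name keys with ", ".
names(big_organization) <- vapply(key, paste, character(1), collapse = ", ")
```

With this layout, big_organization$BAYSHIRE returns the BAYSHIRE rows and big_organization[["EMERITUS, BROOKDALE"]] returns the rows matching either name. A key with no matches yields a data frame with zero rows.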

JOINing databases with SQLite

I have 4 tables relating to the America's Cup.
SELECT * FROM teams
>
Code | Country | TeamName
ITA | Italy | Luna Rossa Prada Pirelli Team
NZ | New Zealand | Emirates Team New Zealand
UK | United Kingdom | INEOS Team UK
USA | United States of America | NYYC American Magic
4 rows
SELECT * FROM races
>
Race | Tournament | Date | Racedate
RR1R1 | RR | 15-Jan | 18642
RR1R2 | RR | 15-Jan | 18642
RR1R3 | RR | 16-Jan | 18643
RR2R1 | RR | 16-Jan | 18643
RR2R2 | RR | 17-Jan | 18644
RR2R3 | RR | 17-Jan | 18644
RR3R1 | RR | 23-Jan | 18650
RR3R2 | RR | 23-Jan | 18650
RR3R3 | RR | 23-Jan | 18650
SFR1 | SF | 29-Jan | 18656
1-10 of 31 rows
SELECT * FROM tournaments
>
Tournament | Event | TournamentName
RR | Prada Cup | Round Robin
SF | Prada Cup | Semi-Final
F | Prada Cup | Final
AC | America's Cup | Americas Cup
4 rows
SELECT *
FROM results
>
Race | Code | Result
FR1 | ITA | Win
FR1 | UK | Loss
FR2 | UK | Loss
FR2 | ITA | Win
FR3 | UK | Loss
FR3 | ITA | Win
FR4 | ITA | Win
FR4 | UK | Loss
FR5 | ITA | Win
FR5 | UK | Loss
1-10 of 62 rows
I'm trying to write an SQL query that outputs the number of races each team won in each tournament. The output table should include the full name of the Event, the Tournament and the full name of each team. My query at the moment looks like this:
SELECT TeamName, Result, Event, tournaments.Tournament
FROM teams LEFT JOIN results
ON teams.Code = results.Code
LEFT JOIN races
ON results.Race = races.Race
LEFT JOIN tournaments
ON races.Tournament = tournaments.Tournament
WHERE Result = 'Win'
ORDER BY tournaments.Tournament
which outputs:
TeamName | Result | Event | Tournament
Emirates Team New Zealand | Win | America's Cup | AC
Emirates Team New Zealand | Win | America's Cup | AC
Luna Rossa Prada Pirelli Team | Win | America's Cup | AC
Luna Rossa Prada Pirelli Team | Win | America's Cup | AC
Emirates Team New Zealand | Win | America's Cup | AC
Luna Rossa Prada Pirelli Team | Win | America's Cup | AC
Emirates Team New Zealand | Win | America's Cup | AC
Emirates Team New Zealand | Win | America's Cup | AC
Emirates Team New Zealand | Win | America's Cup | AC
Emirates Team New Zealand | Win | America's Cup | AC
When I try to COUNT(Result) AS NumberOfWins, I get:
TeamName | Result | NumberOfWins | Event | Tournament
Luna Rossa Prada Pirelli Team | Win | 31 | Prada Cup | F
1 row
Why does adding the count count only Luna Rossa's wins? How can I change the query to fix it?
Why does adding the count count only Luna Rossa's wins?
COUNT() is an aggregate function and produces one result per group.
As you have no GROUP BY clause, the entire result set is a single group, hence the single row. The 31 is the total number of wins across all teams, not Luna Rossa's wins.
The reason you got Luna Rossa and Tournament F alongside that count is the way SQLite treats non-aggregate columns in such a query. As per the SQLite SELECT documentation:
If the SELECT statement is an aggregate query without a GROUP BY clause, then each aggregate expression in the result-set is evaluated once across the entire dataset. Each non-aggregate expression in the result-set is evaluated once for an arbitrarily selected row of the dataset. The same arbitrarily selected row is used for each non-aggregate expression. Or, if the dataset contains zero rows, then each non-aggregate expression is evaluated against a row consisting entirely of NULL values.
How can I change the query to fix it?
So you need a GROUP BY clause to create the groups that the count() function works on.
You probably want GROUP BY Tournament,TeamName
e.g.
-- Count wins per team within each tournament
SELECT TeamName, Result, Event, tournaments.Tournament, COUNT(*) AS NumberOfWins
FROM teams LEFT JOIN results
ON teams.Code = results.Code
LEFT JOIN races
ON results.Race = races.Race
LEFT JOIN tournaments
ON races.Tournament = tournaments.Tournament
WHERE Result = 'Win'
-- Qualify Tournament: both races and tournaments have a column with that name
GROUP BY tournaments.Tournament, TeamName
ORDER BY tournaments.Tournament
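
To see the difference between one implicit group and explicit groups in isolation, here is a minimal sketch using the sqlite3 command-line shell. It assumes sqlite3 is installed; the helper name wins_per_team and the six-row results table are made up for illustration and are not the asker's data.

```shell
# Hypothetical demo: a tiny made-up results table, not the asker's data.
# Requires the sqlite3 command-line shell.
wins_per_team() {
    # Build the table in a throwaway in-memory database, then run the
    # query passed as the first argument.
    sqlite3 :memory: "
        CREATE TABLE results (Race TEXT, Code TEXT, Result TEXT);
        INSERT INTO results VALUES
            ('FR1', 'ITA', 'Win'), ('FR1', 'UK', 'Loss'),
            ('FR2', 'ITA', 'Win'), ('FR2', 'UK', 'Loss'),
            ('FR3', 'UK', 'Win'),  ('FR3', 'ITA', 'Loss');
        $1
    "
}

# No GROUP BY: all winning rows form a single group, so one row comes back.
wins_per_team "SELECT COUNT(*) FROM results WHERE Result = 'Win';"

# GROUP BY Code: one group, and therefore one count, per team.
wins_per_team "SELECT Code, COUNT(*) FROM results WHERE Result = 'Win' GROUP BY Code ORDER BY Code;"
```

The first query returns a single count covering every win in the table, while the second returns one row per team code. The same reasoning applies to the joined query above once it groups by tournament and team name.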

extract data from XML files - R

I'm new to extracting data from XML files. I'm trying to process the following XML file using the R XML package. The information I want is in the attribute values.
I have run into two difficulties:
Some attribute values exist in one node but not in another. For example, "DRP" has the information for the second individual but not for the first.
Some attributes have multiple values for an individual, and I don't know how to link them to that individual. For example, "EmpHs" has multiple records for an individual (identified by indvlPK).
Ideally, I want the output data to have a structure similar to the following:
lastNm  firstNm  indvlPK  fromDt   orgNm                                   hasCustComp
GIGAX   JEFFREY  2783477  03/2004  GATEWAY FINANCIAL ADVISORS, INC
GIGAX   JEFFREY  2783477  03/2004  GFA INC
GIGAX   JEFFREY  2783477  01/2007  UNITED FIRST
HINSON  BRIAN    2783737  07/1996  LINCOLN FINANCIAL ADVISORS CORPORATION  Y
HINSON  BRIAN    2783737  07/1996  FIRST FINANCIAL GROUP                   Y
Is there any way I can parse the data correctly? Thanks!
The code I tried, which did not give me what I want:
library(XML)
doc <- "Test.xml"
ind <- xmlParse(doc)
xmltop <- xmlRoot(ind)
# Attribute values are selected with "@" in XPath
temp1 <- data.frame(unlist(getNodeSet(xmltop, "//Info/@lastNm")))
temp2 <- data.frame(unlist(getNodeSet(xmltop, "//Info/@firstNm")))
temp3 <- data.frame(unlist(getNodeSet(xmltop, "//Info/@indvlPK")))
temp4 <- data.frame(unlist(getNodeSet(xmltop, "//EmpHs/@fromDt")))
temp5 <- data.frame(unlist(getNodeSet(xmltop, "//DRP/@hasCustComp")))
The data is here:
<?xml version="1.0" encoding="ISO-8859-1"?>
<IAPDIndividualReport GenOn="2021-03-29">
<Indvls>
<Indvl>
<Info lastNm="GIGAX" firstNm="JEFFREY" midNm="W" indvlPK="2783477" actvAGReg="Y" link="https://adviserinfo.sec.gov/individual/summary/2783477"/>
<OthrNms/>
<CrntEmps>
<CrntEmp orgNm="CAMBRIDGE INVESTMENT RESEARCH ADVISORS, INC." orgPK="134139" str1="1776 PLEASANT PLAIN RD." city="FAIRFIELD" state="IA" cntry="United States" postlCd="52556-8757">
<CrntRgstns>
<CrntRgstn regAuth="MO" regCat="RA" st="APPROVED" stDt="2010-09-09"/>
</CrntRgstns>
<BrnchOfLocs>
<BrnchOfLoc city="O&apos;FALLON" state="MO" cntry="United States"/>
</BrnchOfLocs>
</CrntEmp>
</CrntEmps>
<Exms>
<Exm exmCd="S63" exmNm="Uniform Securities Agent State Law Examination" exmDt="1996-08-20"/>
<Exm exmCd="S65" exmNm="Uniform Investment Adviser Law Examination" exmDt="1999-12-21"/>
</Exms>
<Dsgntns/>
<PrevRgstns>
<PrevRgstn orgNm="WOODBURY FINANCIAL SERVICES, INC." orgPK="421" regBeginDt="2009-01-05" regEndDt="2009-12-03">
<BrnchOfLocs>
<BrnchOfLoc city="OFALLON" state="MO"/>
<BrnchOfLoc city="OFALLON" state="MO"/>
<BrnchOfLoc city="DUBLIN" state="CA"/>
</BrnchOfLocs>
</PrevRgstn>
<PrevRgstn orgNm="FSC SECURITIES CORPORATION" orgPK="7461" regBeginDt="2004-10-29" regEndDt="2008-12-01">
<BrnchOfLocs>
<BrnchOfLoc city="O&apos;FALLON" state="MO"/>
<BrnchOfLoc city="ST. PETERS" state="MO"/>
</BrnchOfLocs>
</PrevRgstn>
<PrevRgstn orgNm="GATEWAY FINANCIAL ADVISORS, INC." orgPK="115025" regBeginDt="2004-11-11" regEndDt="2006-10-11">
<BrnchOfLocs>
<BrnchOfLoc city="ST. PETERS" state="MO"/>
</BrnchOfLocs>
</PrevRgstn>
</PrevRgstns>
<EmpHss>
<EmpHs fromDt="03/2004" orgNm="GATEWAY FINANCIAL ADVISORS, INC" city="OFALLON" state="MO"/>
<EmpHs fromDt="03/2004" orgNm="GFA INC" city="OFALLON" state="MO"/>
<EmpHs fromDt="01/2007" orgNm="UNITED FIRST" city="OFALLON" state="MO"/>
<EmpHs fromDt="09/2010" orgNm="CAMBRIDGE INVESTMENT RESEARCH ADVISORS, INC" city="FAIRFIELD" state="IA"/>
<EmpHs fromDt="09/2010" orgNm="CAMBRIDGE INVESTMENT RESEARCH, INC" city="FAIRFIELD" state="IA"/>
</EmpHss>
<OthrBuss>
<OthrBus desc="1)STONEBRIDGE WEALTH MANAGEMENT GROUP, 728 HAWK RUN DR, O&apos;FALLON, MO, 3/2008 AS INDEPENDENT INSURANCE AGENT FOR VARIOUS INDEPENDENT INSURANCE COMPANIES. INV REL - 40/MO - 20/TRADING. 2)UNITED FIRST FINANCIAL MORTGAGE SOFTWARE SALES. START 6/1/07, 10 HOURS PER MONTH, 5 DURING TRADING HOURS. NO OWNERSHIP INTEREST. 3)MORTGAGE STOP INC., 728 HAWK RUN DR., OFALLON, MO 63368. LOAN OFFICER PROCESSING LOAN APPS FOR CLIENTS. START 6/1/2002, 25 HOURS PER MONTH, 10 DURING TRADING HOURS. NO OWNERSHIP. 4)CIRA, 1776 PLEASANT PLAIN RD, FAIRFIELD, IA, AS ADVISORY REP OF A RIA. INV REL - 40 HR/WK - 40/TRADING. SEE EMPLOYMENT HISTORY FOR START DATE. 5) THE MORTGAGE SHOP, 355 MID RIVERS MALL DRIVE, STE E, ST. PETERS, MO 63376. MORTGAGE ORIGINATOR SINCE 01/01/99. NOT INVESTMENT RELATED. WORKS 60 HOURS PER MONTH, 20 OF WHICH ARE DURING TRADING HOURS. 6.365 PROPERTIES LLC, O&apos;FALLON, MO, 8/2018 AS OWNER OF LLC THAT BUYS, SELLS, &amp; HOLDS REAL ESTATE. NIR - 20/MO - 0/TRADING. 7. BEST OFFER HOMES, LLC, 728 HAWK RUN DRIVE, O&apos;FALLON, MO, REAL ESTATE SALES/MORTGAGE ORIGINATION/ ACCOUNTING/FINANCIAL ACTIVITIES, 06/16/20, NIR, 20/MO- 0/TRADING 8. GIGAX WEALTH MANAGEMENT, 728 HAWK RUN DRIVE, OFALLON, MO, INDEPENDENT INSURANCE AGENT FOR VARIOUS INDEPENDENT INSURANCE COMPANIES,11/23/20, INV REL, 10 HR/WK- 10 TRADING HR."/>
</OthrBuss>
<DRPs/>
</Indvl>
<Indvl>
<Info lastNm="HINSON" firstNm="BRIAN" midNm="TROY" indvlPK="2783737" actvAGReg="Y" link="https://adviserinfo.sec.gov/individual/summary/2783737"/>
<OthrNms/>
<CrntEmps>
<CrntEmp orgNm="BRIDGEWORTH WEALTH MANAGEMENT" orgPK="164100" str1="101 25TH STREET NORTH" city="BIRMINGHAM" state="AL" cntry="United States" postlCd="35203">
<CrntRgstns>
<CrntRgstn regAuth="AL" regCat="RA" st="APPROVED" stDt="2015-05-12"/>
<CrntRgstn regAuth="TX" regCat="RA" st="APPROVED_RES" stDt="2015-05-01"/>
</CrntRgstns>
<BrnchOfLocs>
<BrnchOfLoc str1="400 MERIDIAN STREET" str2="SUITE 200" city="HUNTSVILLE" state="AL" cntry="United States" postlCd="35801"/>
<BrnchOfLoc str1="101 25TH STREET NORTH" city="BIRMINGHAM" state="AL" cntry="United States" postlCd="35203"/>
</BrnchOfLocs>
</CrntEmp>
</CrntEmps>
<Exms>
<Exm exmCd="S63" exmNm="Uniform Securities Agent State Law Examination" exmDt="1996-10-11"/>
</Exms>
<Dsgntns>
<Dsgntn dsgntnNm="Certified Financial Planner"/>
<Dsgntn dsgntnNm="Chartered Financial Consultant"/>
<Dsgntn dsgntnNm="Personal Financial Specialist"/>
</Dsgntns>
<PrevRgstns>
<PrevRgstn orgNm="LINCOLN FINANCIAL ADVISORS CORPORATION" orgPK="3978" regBeginDt="2000-04-25" regEndDt="2015-05-11">
<BrnchOfLocs>
<BrnchOfLoc city="HUNTSVILLE" state="AL"/>
<BrnchOfLoc city="HUNTSVILLE" state="AL"/>
</BrnchOfLocs>
</PrevRgstn>
</PrevRgstns>
<EmpHss>
<EmpHs fromDt="04/2015" orgNm="BRIDGEWORTH, LLC" city="HUNTSVILLE" state="AL"/>
<EmpHs fromDt="07/1996" toDt="04/2015" orgNm="LINCOLN FINANCIAL ADVISORS CORPORATION" city="HUNTSVILLE" state="AL"/>
<EmpHs fromDt="07/1996" toDt="04/2015" orgNm="FIRST FINANCIAL GROUP" city="BIRMINGHAM" state="AL"/>
<EmpHs fromDt="04/2015" orgNm="LPL FINANCIAL LLC" city="HUNTSVILLE" state="AL"/>
</EmpHss>
<OthrBuss>
<OthrBus desc="1) 04/30/2015: BRIDGEWORTH FINANCIAL, LLC - DBA FOR LPL BUSINESS (ENTITY FOR LPL BUSINESS) - INV REL - AT REPORTED BUSINESS LOCATIONS - START 01/01/2015 - 1% OF TIME SPENT 2) 04/30/2015: BRIDGEWORTH, LLC - INV REL - AT REPORTED BUSINESS LOCATION(S) - REGISTERED INVESTMENT ADVISOR HYBRID - START 01/2015 - 99% OF TIME SPENT. 3) 5/11/2015: NO BUSINESS NAME - INVESTMENT RELATED - AT REPORTED BUSINESS LOCATION(S) - NON-VARIABLE INSURANCE - STARTED 4/1/2015 - TIME SPENT 1% - LINES OF INSURANCE INCLUDE TERM, WHOLE, UNIVERSAL, LTC, DISABILITY. 4) 6/2/2017 - Bridgeworth Financial - Investment Related - At Reported Business Location(s) - DBA for LPL Business (entity for LPL business) - Started 04/30/2015 - 5 Hours Per Month/3 Hours During Securities Trading. 5) 5/8/2018 - Foster Properties Ltd - Not Investment Related - Home Based - Other-Family Business - Started 12/22/1997 - 1 Hours Per Month/0 Hours During Securities Trading - Handle the majority of business matters for this family business."/>
</OthrBuss>
<DRPs>
<DRP hasRegAction="N" hasCriminal="N" hasBankrupt="N" hasCivilJudc="N" hasBond="N" hasJudgment="N" hasInvstgn="N" hasCustComp="Y" hasTermination="N"/>
</DRPs>
</Indvl>
</Indvls>
</IAPDIndividualReport>

shQuote does not work! What to do?

I would like to convert the following list into one in which every name is enclosed in double quotes (" ").
I tried shQuote, gsub(" ", "", ), and the methods from "Creating a comma separated vector", but with no success so far.
George Ezra, Faith No More, Above & Beyond, Paloma Faith, Gavin James, DJ’s Waxfiend, Jebroer, Adje, Pop Evil, Jick munro & the amazing laserbeams, Robbie Williams, Avicii, The Script, Anouk, Kensington, Eagles of Death Metal, Dotan, The Wombats, Selah Sue, Shappard, John Coffey, Magic!, Joost van Bellen, East Camoran Folkcore, Foo Fighters, Pharrel Williams, Sam Smith, One Republic, Rise Agianst, De Jeugd van Tegenwoordig, Counting Crows, Fiddler’s Green, Thyphoon, Kovacs, Kitty, Daisy & Lewis, Oscar and the Wolf, Nick Mulvey, Urbanus, Willie Wartaal, Doppelgang, Ewert and the two dragons, Pierce Brothers,Kovacs, The Kendolls, Stringcaster, Sunday Sun, Toy Dolls, A$AP Rocky, Ride, Eskmo, Temples, The Pop Group, Blank Mass, Cairo Liberation Front, Daniel Norgren, Follakzoid, Ghost Culture, John Coffey, Kevin Morby, Kuenta I Tambu, Marmozets, Mourn, Patten, Sue The Night, The Coathangers, Tora, Vessels, The Libertines, Noel Gallagher’s High Flying Brids, Noel Gallagher, AltJ, Altj, Royal Blood, Sohn, The Jesus & Mary Chain, The Tallest Man On Earth, Black Mountain, Chet Faker, Death Cab For Cutie, Ear Sweatshirt, Evian Christ, Frist Aid Kit, Future Islands, Jonny Greenwood, Mew, Of Monsters And Men, The Vaccines, Ariel Pink, Alvvays, Wolf Alice, Weval, BADBADNOTGOOD, Bass Drum Of Death, Yak, Daniel Romand, Dan Dercon, Eagulls, Gengahr, Fickle friends, Steve Gunn, Liima, Hookworms, Kate Tempest, Kiasmds, Strand of Oaks, Little May , Matthew E. White, Metz, Off!, St. Paul, St. 
Paul & The Broken Bones, Pissed Jeans, Pretty Vicious, Reigning Sound, Outfit, Sunset Sons, Waxahatchee, Daniel Wilson, Yung Lean, Kindess, Hinds,Damien Rice, The War On Drugs, Iggy Pop, FKA Twigs, Patti Smith And Her Band Perform Horses, Flying Lotus, Fat Freddy’s Drop, Damian Jr Gong Marley, Alabama Shakes, The Gaslamp Killer, Max Richter, Motorpsycho, Goat, Songhoy Blues, Andrew Brid, Glass Animals, King Gizzard & The Lizard Wizard, Misun, JD MCPherson, Happyness, Dolomite Minor, Meridian Brothers, Death From Above 1979, Blaudzun, Oscar And The Wolf, Clark, Ghost Poet, Omar Souleyman, Rhye, Bejamin Booker, Orkesta Mendoza, Ganz,The Chemical Brothers, Patrick Watson, Bleachers, The War on Drugs, The Antlers, Hot Chip, Rico & Sticks, Awolnation
A simple strsplit() will do.
my.bands <- "George Ezra, Faith No More, Above & Beyond, Paloma Faith, Gavin James, DJ’s Waxfiend, Jebroer, Adje, Pop Evil, Jick munro & the amazing laserbeams, Robbie Williams, Avicii, The Script, Anouk, Kensington, Eagles of Death Metal, Dotan, The Wombats, Selah Sue, Shappard, John Coffey, Magic!, Joost van Bellen, East Camoran Folkcore, Foo Fighters, Pharrel Williams, Sam Smith, One Republic, Rise Agianst, De Jeugd van Tegenwoordig, Counting Crows, Fiddler’s Green, Thyphoon, Kovacs, Kitty, Daisy & Lewis, Oscar and the Wolf, Nick Mulvey, Urbanus, Willie Wartaal, Doppelgang, Ewert and the two dragons, Pierce Brothers,Kovacs, The Kendolls, Stringcaster, Sunday Sun, Toy Dolls, A$AP Rocky, Ride, Eskmo, Temples, The Pop Group, Blank Mass, Cairo Liberation Front, Daniel Norgren, Follakzoid, Ghost Culture, John Coffey, Kevin Morby, Kuenta I Tambu, Marmozets, Mourn, Patten, Sue The Night, The Coathangers, Tora, Vessels, The Libertines, Noel Gallagher’s High Flying Brids, Noel Gallagher, AltJ, Altj, Royal Blood, Sohn, The Jesus & Mary Chain, The Tallest Man On Earth, Black Mountain, Chet Faker, Death Cab For Cutie, Ear Sweatshirt, Evian Christ, Frist Aid Kit, Future Islands, Jonny Greenwood, Mew, Of Monsters And Men, The Vaccines, Ariel Pink, Alvvays, Wolf Alice, Weval, BADBADNOTGOOD, Bass Drum Of Death, Yak, Daniel Romand, Dan Dercon, Eagulls, Gengahr, Fickle friends, Steve Gunn, Liima, Hookworms, Kate Tempest, Kiasmds, Strand of Oaks, Little May , Matthew E. White, Metz, Off!, St. Paul, St. 
Paul & The Broken Bones, Pissed Jeans, Pretty Vicious, Reigning Sound, Outfit, Sunset Sons, Waxahatchee, Daniel Wilson, Yung Lean, Kindess, Hinds,Damien Rice, The War On Drugs, Iggy Pop, FKA Twigs, Patti Smith And Her Band Perform Horses, Flying Lotus, Fat Freddy’s Drop, Damian Jr Gong Marley, Alabama Shakes, The Gaslamp Killer, Max Richter, Motorpsycho, Goat, Songhoy Blues, Andrew Brid, Glass Animals, King Gizzard & The Lizard Wizard, Misun, JD MCPherson, Happyness, Dolomite Minor, Meridian Brothers, Death From Above 1979, Blaudzun, Oscar And The Wolf, Clark, Ghost Poet, Omar Souleyman, Rhye, Bejamin Booker, Orkesta Mendoza, Ganz,The Chemical Brothers, Patrick Watson, Bleachers, The War on Drugs, The Antlers, Hot Chip, Rico & Sticks, Awolnation"
my.bands.vector <- strsplit(my.bands, ', ')[[1]] ## you could probably stop here, but you asked for a list, which means something specific in R
my.bands.list <- as.list(my.bands.vector)
> str(my.bands.list)
List of 159
$ : chr "George Ezra"
$ : chr "Faith No More"
$ : chr "Above & Beyond"
$ : chr "Paloma Faith"
$ : chr "Gavin James"
[list output truncated]
And if you want to convert back to a single string with each name wrapped in single quotes:
paste(shQuote(my.bands.list, type = "sh"), collapse = ', ')
[1] "'George Ezra', 'Faith No More', 'Above & Beyond', 'Paloma Faith', 'Gavin James', 'DJ’s Waxfiend', 'Jebroer', 'Adje', 'Pop Evil', 'Jick munro & the amazing laserbeams', 'Robbie Williams', 'Avicii', 'The Script', 'Anouk', 'Kensington', 'Eagles of Death Metal', 'Dotan', 'The Wombats', 'Selah Sue', 'Shappard', 'John Coffey', 'Magic!', 'Joost van Bellen', 'East Camoran Folkcore', 'Foo Fighters', 'Pharrel Williams', 'Sam Smith', 'One Republic', 'Rise Agianst', 'De Jeugd van Tegenwoordig', 'Counting Crows', 'Fiddler’s Green', 'Thyphoon', 'Kovacs', 'Kitty', 'Daisy & Lewis', 'Oscar and the Wolf', 'Nick Mulvey', 'Urbanus', 'Willie Wartaal', 'Doppelgang', 'Ewert and the two dragons', 'Pierce Brothers,Kovacs', 'The Kendolls', 'Stringcaster', 'Sunday Sun', 'Toy Dolls', 'A$AP Rocky', 'Ride', 'Eskmo', 'Temples', 'The Pop Group', 'Blank Mass', 'Cairo Liberation Front', 'Daniel Norgren', 'Follakzoid', 'Ghost Culture', 'John Coffey', 'Kevin Morby', 'Kuenta I Tambu', 'Marmozets', 'Mourn', 'Patten', 'Sue The Night', 'The Coathangers', 'Tora', 'Vessels', 'The Libertines', 'Noel Gallagher’s High Flying Brids', 'Noel Gallagher', 'AltJ', 'Altj', 'Royal Blood', 'Sohn', 'The Jesus & Mary Chain', 'The Tallest Man On Earth', 'Black Mountain', 'Chet Faker', 'Death Cab For Cutie', 'Ear Sweatshirt', 'Evian Christ', 'Frist Aid Kit', 'Future Islands', 'Jonny Greenwood', 'Mew', 'Of Monsters And Men', 'The Vaccines', 'Ariel Pink', 'Alvvays', 'Wolf Alice', 'Weval', 'BADBADNOTGOOD', 'Bass Drum Of Death', 'Yak', 'Daniel Romand', 'Dan Dercon', 'Eagulls', 'Gengahr', 'Fickle friends', 'Steve Gunn', 'Liima', 'Hookworms', 'Kate Tempest', 'Kiasmds', 'Strand of Oaks', 'Little May ', 'Matthew E. White', 'Metz', 'Off!', 'St. Paul', 'St. 
Paul & The Broken Bones', 'Pissed Jeans', 'Pretty Vicious', 'Reigning Sound', 'Outfit', 'Sunset Sons', 'Waxahatchee', 'Daniel Wilson', 'Yung Lean', 'Kindess', 'Hinds,Damien Rice', 'The War On Drugs', 'Iggy Pop', 'FKA Twigs', 'Patti Smith And Her Band Perform Horses', 'Flying Lotus', 'Fat Freddy’s Drop', 'Damian Jr Gong Marley', 'Alabama Shakes', 'The Gaslamp Killer', 'Max Richter', 'Motorpsycho', 'Goat', 'Songhoy Blues', 'Andrew Brid', 'Glass Animals', 'King Gizzard & The Lizard Wizard', 'Misun', 'JD MCPherson', 'Happyness', 'Dolomite Minor', 'Meridian Brothers', 'Death From Above 1979', 'Blaudzun', 'Oscar And The Wolf', 'Clark', 'Ghost Poet', 'Omar Souleyman', 'Rhye', 'Bejamin Booker', 'Orkesta Mendoza', 'Ganz,The Chemical Brothers', 'Patrick Watson', 'Bleachers', 'The War on Drugs', 'The Antlers', 'Hot Chip', 'Rico & Sticks', 'Awolnation'"
Here's the double-quote version. Note that the double quotes appear escaped (\") when the string is printed.
paste(shQuote(my.bands.list, type = "cmd"), collapse = ', ')
[1] "\"George Ezra\", \"Faith No More\", \"Above & Beyond\", \"Paloma Faith\", \"Gavin James\", \"DJ’s Waxfiend\", \"Jebroer\", \"Adje\", \"Pop Evil\", \"Jick munro & the amazing laserbeams\", \"Robbie Williams\", \"Avicii\", \"The Script\", \"Anouk\", \"Kensington\", \"Eagles of Death Metal\", \"Dotan\", \"The Wombats\", \"Selah Sue\", \"Shappard\", \"John Coffey\", \"Magic!\", \"Joost van Bellen\", \"East Camoran Folkcore\", \"Foo Fighters\", \"Pharrel Williams\", \"Sam Smith\", \"One Republic\", \"Rise Agianst\", \"De Jeugd van Tegenwoordig\", \"Counting Crows\", \"Fiddler’s Green\", \"Thyphoon\", \"Kovacs\", \"Kitty\", \"Daisy & Lewis\", \"Oscar and the Wolf\", \"Nick Mulvey\", \"Urbanus\", \"Willie Wartaal\", \"Doppelgang\", \"Ewert and the two dragons\", \"Pierce Brothers,Kovacs\", \"The Kendolls\", \"Stringcaster\", \"Sunday Sun\", \"Toy Dolls\", \"A$AP Rocky\", \"Ride\", \"Eskmo\", \"Temples\", \"The Pop Group\", \"Blank Mass\", \"Cairo Liberation Front\", \"Daniel Norgren\", \"Follakzoid\", \"Ghost Culture\", \"John Coffey\", \"Kevin Morby\", \"Kuenta I Tambu\", \"Marmozets\", \"Mourn\", \"Patten\", \"Sue The Night\", \"The Coathangers\", \"Tora\", \"Vessels\", \"The Libertines\", \"Noel Gallagher’s High Flying Brids\", \"Noel Gallagher\", \"AltJ\", \"Altj\", \"Royal Blood\", \"Sohn\", \"The Jesus & Mary Chain\", \"The Tallest Man On Earth\", \"Black Mountain\", \"Chet Faker\", \"Death Cab For Cutie\", \"Ear Sweatshirt\", \"Evian Christ\", \"Frist Aid Kit\", \"Future Islands\", \"Jonny Greenwood\", \"Mew\", \"Of Monsters And Men\", \"The Vaccines\", \"Ariel Pink\", \"Alvvays\", \"Wolf Alice\", \"Weval\", \"BADBADNOTGOOD\", \"Bass Drum Of Death\", \"Yak\", \"Daniel Romand\", \"Dan Dercon\", \"Eagulls\", \"Gengahr\", \"Fickle friends\", \"Steve Gunn\", \"Liima\", \"Hookworms\", \"Kate Tempest\", \"Kiasmds\", \"Strand of Oaks\", \"Little May \", \"Matthew E. White\", \"Metz\", \"Off!\", \"St. Paul\", \"St. 
Paul & The Broken Bones\", \"Pissed Jeans\", \"Pretty Vicious\", \"Reigning Sound\", \"Outfit\", \"Sunset Sons\", \"Waxahatchee\", \"Daniel Wilson\", \"Yung Lean\", \"Kindess\", \"Hinds,Damien Rice\", \"The War On Drugs\", \"Iggy Pop\", \"FKA Twigs\", \"Patti Smith And Her Band Perform Horses\", \"Flying Lotus\", \"Fat Freddy’s Drop\", \"Damian Jr Gong Marley\", \"Alabama Shakes\", \"The Gaslamp Killer\", \"Max Richter\", \"Motorpsycho\", \"Goat\", \"Songhoy Blues\", \"Andrew Brid\", \"Glass Animals\", \"King Gizzard & The Lizard Wizard\", \"Misun\", \"JD MCPherson\", \"Happyness\", \"Dolomite Minor\", \"Meridian Brothers\", \"Death From Above 1979\", \"Blaudzun\", \"Oscar And The Wolf\", \"Clark\", \"Ghost Poet\", \"Omar Souleyman\", \"Rhye\", \"Bejamin Booker\", \"Orkesta Mendoza\", \"Ganz,The Chemical Brothers\", \"Patrick Watson\", \"Bleachers\", \"The War on Drugs\", \"The Antlers\", \"Hot Chip\", \"Rico & Sticks\", \"Awolnation\""

Merging two files horizontally and formatting

I have two files as follows:
File_1
Austin
Los Angeles
York
San Ramon
File_2
Texas
California
New York
California
I want to merge them horizontally as follows:
Austin       Texas
Los Angeles  California
York         New York
San Ramon    California
I am able to merge them horizontally with the paste command, but the formatting goes haywire:
Austin  Texas
Los Angeles     California
York    New York
San Ramon       California
I realize that paste is working as it is supposed to, but can someone point me in the right direction to get the formatting right?
Thanks.
paste joins the lines with a tab when 'merging' the files, so you may have to post-process its output and replace the tab with space padding:
paste File_1 File_2 | awk 'BEGIN { FS = "\t" } ; {printf("%-20s%s\n",$1,$2) }'
result:
Austin              Texas
Los Angeles         California
York                New York
San Ramon           California
First, check the number of characters in the longest line of the first file. Then pad each line from the first file to at least that width, for example with printf, and finish by combining it with the second file using paste.
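
As a minimal sketch of that idea, assuming both files have the same number of lines and contain no tabs: the helper below measures the longest line of the first file with awk and then uses that width as the printf padding. The name merge_padded and the two-space gap between columns are choices made here for illustration, not part of any standard tool.

```shell
# Hypothetical helper: merge two files side by side, padding the first
# column to the width of its longest line plus a two-space gap.
merge_padded() {
    left=$1
    right=$2
    # Width of the longest line in the left-hand file (0 if it is empty).
    width=$(awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }' "$left")
    # paste joins the files with a tab; awk splits on that tab and
    # re-emits each pair with the first field left-justified to $width.
    paste "$left" "$right" |
        awk -F '\t' -v w="$width" 'BEGIN { fmt = "%-" w "s  %s\n" } { printf fmt, $1, $2 }'
}

# Usage: merge_padded File_1 File_2
```

Because the width is computed from the data, this avoids hard-coding a field width that might be too narrow for a longer city name.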
If you have an idea about the field width, you could do something like this:
IFS_BAK="$IFS"
IFS=$'\t'
paste File_1 File_2 \
| while read -r city state; do
    printf "%-15s %-15s\n" "$city" "$state"
done
IFS="$IFS_BAK"
Or this shorter version:
paste File_1 File_2 | while IFS=$'\t' read -r city state; do
    printf "%-15s %-15s\n" "$city" "$state"
done
Or use the column tool from bsdmainutils:
paste File_1 File_2 | column -s $'\t' -t

Resources