Extract textbook names and journal articles from various syllabi - r

I am trying to extract textbook names and journal articles from syllabi collected from various courses, using R. My basic assumption is that most of these will be in some kind of citation format (e.g. APA, MLA). While I can try to write regexes to extract this information, I was wondering whether anyone has done this before, or whether an R package exists that I could use to extract this information from differently formatted text.
Below are two examples of the syllabi that I am working with. In Sample 1, the book names are not in a citation format, but in Sample 2 they are. Both samples have been truncated to meet Stack Overflow character limits.
SAMPLE 1:
"ABC State University ARTS 3366 Intermediate Digital Photography Fall 2015 JCM 4127 T/TH 2­4:30 pm Lecturer: John Smith Office Hours: T/TH prior to and after class Email: ​johnsmith#abcstate.edu Alternate email: johnsmith#gmail.com Prerequisites​: ARTS 3364 ­ Introduction to Digital Photography Course Description & Objectives: This course is designed to expand and build on the skills and knowledge acquired in Introduction to Digital Photography. This course builds on the skills and knowledge acquired in Introduction to Digital Photography. Specifically, we will use the history, critical analysis, and production of photography books to: (1) explore the complexities of the medium in social, political, and aesthetic contexts; (2) develop more advanced and conceptually driven photography work; (3) work toward a greater understanding of how photography books function as self­contained art, cultural, and political objects; (4) learn how to choose subject matter and continually explore, experiment, and refine our work. The final outcomes ofthe class will be the creation of an on­demand book and an accompanying folio of fine prints. We will use digital cameras, inkjet printers, Adobe Photoshop, Lightroom, and Macintosh computers in this course. Through lectures, discussions and readings, we will explore and discuss historical trends in traditional (analog) photography, as well as emerging practices in contemporary digital imaging. This will serve as a foundation to help determine the approach, subject matter, and style of the work created for class. In addition to refining these skills, students will also address the practical and theoretical roles of digital imagery. The course objective will be to focus on technical, aesthetic, and conceptual growth of a student’s endeavors in the digital medium. 
This course requires the completion of: all assignments (on time), participation in all group critiques and completion of a Twelve to Fifteen image final portfolio of prints or equivalent, and three projects throughout the semester. Requirements: Coursework: This course requires the completion of: all assignments (on time), participation in all group critiques and completion of a 12­15 image final portfolio of prints or equivalent, the creation of a book printed with an on demand printing service, as well as making new photographs consistently throughout the entire semester. Suggested (not required)Books: Adobe Photoshop Lightroom 5 Book, The: The Complete Guide for Photographers By Martin Evening Published Jun 30, 2013 by Adobe Press The Photographer’s Playbook 307 Assignments and Ideas Edited by Jason Fulford and Gregory Halpern Published by Aperture On Being a Photographer: A Practical Guide ​ ​by David Hurn and Bill Jay Local Stores:"
SAMPLE 2:"Physical Education Activity ProgramHealth & Fitness Strength TrainingKINE 198-837Instructor: JANE DOE Office: PEAP 230Office Hours: By appointmentPhone: (000) 000-0000E-Mail: jdoe#xyz.edu A. Activity Instructor: Jane DoeOffice: PEAP 250Office Hours: By appointmentClass Time: Thursday 2:20 pmPhone: (000) 000-0000Email: jdoe#xyz1.edu Class Meeting Site: PEAP 117B. Activity Instructor: Jane Doe Phone:Office: PEAP 239Email: jdoe#xyz1.eduOffice Hours: Thursday 10:00 am – 12:00 pmClass Time: Thursday 2:20 pmClass Meeting Site: PEAP 118C. Activity Instructor: John doe Office: PEAP 250/Doe 213KOffice Hours: Tuesday 1:00-2:00 pmClass Time: Thursday 2:20 pmPhone:Email: johndoe#xyz.eduClass Meeting Site: PEAP 120Attire: Proper clothes and shoes designed specifically for strength training on activitydays.Required Materials:Bounds, L., Agnor, D., Darnell,G., & Brekken Shea, K. (2012). Health & Fitness: AGuide to a Healthy Lifestyle (5th edition). Dubuque, IA: Kendall/Hunt Publishing Co.ISBN 978-1-4652-0712-8Cissik, J. (2001). The Basics of Strength Training (3rd Edition). McGraw-Hill,Primus Custom PublishingCourse Description:Health and Fitness is intended for the student who is seeking knowledge and practicalapplication of wellness choices to their life. The course consists of two components,lecture and activity. Students will meet face-to-face one day per week for the activityportion of the class and work approximately the equivalent of one day per week onlinewith lecture materials. The lecture portion will cover current health issues includingmental and physical health, nutrition, human sexuality, communicable and noncommunicable diseases, use and abuse of drugs, and safety. 
The activity portion willconsist of 14 class days and cover basic knowledge and techniques of strength trainingand improving the individual’s fitness through the utilization of this knowledge.Course Rationale:Research indicates that daily health/fitness related behaviors enhance learning anddetermine the quality and longevity of our life."
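As a starting point for the regex route, here is a minimal sketch in base R. It assumes APA-style entries of the form "Author, A. (Year). Title." and takes everything between "(Year)." and the next period as the title. The helper name extract_apa_titles and the shortened test string (adapted from Sample 2) are made up for illustration. It will not catch free-form listings like the ones in Sample 1, and titles that contain a period will be cut short.

```r
# Hypothetical helper: pull the title out of "(Year). Title." fragments.
extract_apa_titles <- function(text) {
  # A four-digit year in parentheses, a period, then everything up to the next period.
  pattern <- "\\((?:19|20)[0-9]{2}\\)\\.[[:space:]]*[^.]+\\."
  hits <- regmatches(text, gregexpr(pattern, text, perl = TRUE))[[1]]
  # Drop the leading "(Year). " and the trailing period.
  titles <- sub("^\\([0-9]{4}\\)\\.[[:space:]]*", "", hits)
  trimws(sub("\\.$", "", titles))
}

# Shortened, cleaned-up stand-in for the "Required Materials" part of Sample 2.
syllabus <- "Required Materials:Bounds, L., & Shea, K. (2012). Health & Fitness: A Guide to a Healthy Lifestyle (5th edition). Dubuque, IA: Kendall/Hunt. Cissik, J. (2001). The Basics of Strength Training (3rd Edition). McGraw-Hill."

extract_apa_titles(syllabus)
# "Health & Fitness: A Guide to a Healthy Lifestyle (5th edition)"
# "The Basics of Strength Training (3rd Edition)"
```

Anchoring on the "(Year)." marker rather than on author names keeps the pattern short, at the cost of missing MLA-style entries, where the year comes at the end.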


Transform a large list into a tibble with one column containing all elements [duplicate]

I'm sorry for this question because it seems quite obvious, but I can't come up with a solution myself. I have a large list of 130 elements, each a list of 10 single character strings.
I want to combine this into a tibble with one column containing all the strings.
If I try do.call(dplyr::bind_rows, y) on my list, I get an error: Error: Argument 1 must have names
For more insight into the list, here is the console output for the first sublist, obtained by calling dput(bribe.test[1]):
dput(bribe.test[1])
list(list("\r\n Supercharge your R/C vehicle and also this systems will boost horsepower and performance of any RC nitro engines, visit us to get online xtm racing, xtm racing rail, xtm racing engine, xtm xt2 engine, and xtm nitro engine. Visit # https://rbinnovations.com/collections/super-chargers/xtm-racing\r\n ",
"\r\n The Powermatic 2+ or Powermatic 2 Plus Electric Cigarette Rolling Machine uses an electric spoon-fed cigarette injector that will make king size or 100's cigarettes in a few seconds and you can buy it online with us at Hard Working Products. Visit https://hardworkingproducts.com/powermatic-2\r\n ",
"\r\n Hello sir, My uncle just coming india yesterday night at ahmedabad airport from New Zealand. And i gave him 2 iphone , iphone 8 plus and iphone 11 pro.. and they called by custom department. The officer told him that they are not allowed with these phone. They force him to pay 42,000/- custom duty for these phone. He just arrived that's why they haven't got money at that time. But his son gave him 600 nzd for his expenses. And these bloody corrupt office force him to pay 600$. They felt helpless at that time and gave 600$ with the passport.My uncle dont know his name. You can check cameras if you want, he was at counter around 1:00 o'clock at night. It is bloody bad experience with them. I'm going to tell my friends and all the relatives which are here to not go india ever..\nI'm felling helpless to come my home country. If you can then take strict actions against these bloody corrupt officers who are cheating with our nation. Please take strict action. Hope you can save our nation from this corrupt officers\nSingapor airlines \nSQ530 arrived at 21:50 evening on 6/1/20\nThank you\r\n ",
"\r\n Date of the incident: 29th December 2019\nTime of incident: Around 8 PM in the evening\nPlace of incident: ECR road, Pondicherry to Tamil Nadu check post.\nWhile driving back from Pondicherry to our stay near ECR road, we (4 people in the car) took 8 beer cans of 500 ml each. At the checkpost (just 100mtrs before our lodge) police stopped us, started checking the vehicle. We voluntarily declared the beer quantity and handed over to them.\nThey asked us to pay Rs 4200 and go else, they will create a case on us and arrest us, seize the vehicle. Since we took the vehicle from self drive agency, we really wanted come out of this. We apologise to them as we weren't aware of the border lines between the states. Requested them to dispose the beers and let us go. My 5 year old daughter was crying seeing the officers are not allowing me to leave. Nothing was fruitful and we literally beg them to leave us. Language was a big barrier as we don't know tamil and none of the officers understand English/hindi properly. Somehow a communication happened and I had to show them the account balance online as I didn't have that much cash with me. Finally, the officer agreed to leave us with a cost of Rs 500 and 4 beer cans.\nWe noticed at the same time, 4 college students from chennai were also got caught with a bag full of Liquors. The officer was very casual to them and also denied money from them even though they offered him 200 rupees. They may be from families where the indian law does not get applied easily. I understand that.\nWe can't speak tamil or pondi language.. Is this what you are angry on us? Is this what you discriminate us? Don't you ruin the future of your own students in the name of partiality??\r\n ",
"\r\n Dear Sir,\nThis is not the first time I am facing this issue with Rohit Gas Agency. I tried to bring it to the notice of Indane. Its of no use. Rohit Gas Agency provides worst service. We do not have option. To Deliver the Cylinder, the Delivery boy demands Rs 50 everytime. This is a common issue. If not paid he shouts badly on road and moves out. Rohit Gas Agency is always unreachable. These bugs working in the Gas Agency are eating up the money paid by Gas Subscribers. \nMany a times, the cylinder is not delivered to home. We are forced to collect the Cylinder paying additional bribe of Rs 50 near Godown. If not paid, we need to lift the cylinder and carry the same back till the car parking and drive back home. \nThe Gas Delivery - Rohit Gas Agency is unfit to manage the delivery business. Please look into the complaints and reviews on google atleast. \nRegards\nPrashanth .P\r\n ",
"\r\n I paid bribe today to a police officer who came for passport verification of my mother. Even after providing all supporting documents and required information, officer asked to pay 500Rs for Chai Pani. When I asked to reduce the amount, officer said that it is decided by higher officials of police. \nI feel very bad after paying, this practice is so common in UP. Please take necessary actions against this to prevent civilians from such corrupt people. \nOfficer Name - Indrapal Singh\nThana - New Agra Police Station\nDate - 6th Jan 2020\nPlace - Agra\r\n ",
"\r\n I have asked to pay bribe to avoid huge penalty for putting tent sheet on car windows. Police asked me to pay 1100 rs fine or pay bribe instead of that. Since I don't had that much money and I was in urgency, I paid bribe to escape from the situation. This was happened at corporation circle church opposite to church at 12 30 PM. \r\n ",
"\r\n Help desk officer prashant who are trapping people to make work done by giving bribes to higher officials at malakpet rto malakpet Hyderabad \r\n ",
"\r\n Get free shipping when you buy the Revolution the great american electric cigarette machine, within the continental US from https://hardworkingproducts.com/revolution-electric-cigarette-machine-made-in-america and also you will get this machine at best market price in USA.\r\n ",
"\r\n I Would like to Inform you that a lot of corruption is going on in the DC Office Bangalore Urban Dept. I am not paid bribe directly there is lot more agents have to collect the money and some one has do the deel not direct deel with D C Officer. Brib agents collecting the money and send it to direct DC officer house. The Officer have a one more home office in Kumarakrupa road bangalore. the deeling files as going their for officer signature. One agent is doing his job in that office his name called Mahendre Kumar (Shift car No.KA 04 MK 282) Please do the action for this. Govt officers also been included in this deels and they get commission also.\nNames Sadanada Swamy , Basavaraju, G N Shivamurthy. \r\n "))
You could use unlist with tibble
df_tib <- tibble::tibble(col = unlist(bribe.test))
Or data.frame
df1 <- data.frame(col = unlist(bribe.test), stringsAsFactors = FALSE)
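To see why unlist is enough here, a small sketch on a made-up nested list (toy_list is hypothetical and stands in for bribe.test). unlist flattens recursively by default, so the inner lists collapse into a single character vector:

```r
# toy_list is a made-up stand-in for bribe.test: a list of lists of strings.
toy_list <- list(list("a", "b"), list("c", "d"))

# unlist() is recursive by default, so nested lists are flattened in order.
flat <- unlist(toy_list)

# One column, one row per string.
df_toy <- data.frame(col = flat, stringsAsFactors = FALSE)
nrow(df_toy)  # 4
```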

API or scrape Google Jobs

I want to record the job posting information from this search. Is anyone aware of an API, or can you confirm whether it's possible to scrape it with Python's Beautiful Soup? (I'm familiar with scraping; I just can't see how to get at this website.)
Disclosure: I work at SerpApi.
You can use google-search-results package to get data from Google Jobs listings. Check a demo at Repl.it.
from serpapi import GoogleSearch

params = {
    "engine": "google_jobs",
    "q": "sustainability jobs in mi",
    "google_domain": "google.com",
    "api_key": "API_KEY"
}

client = GoogleSearch(params)
data = client.get_dict()

print("Job results")
for job_result in data['jobs_results']:
    print(f"""Title: {job_result['title']}
Company name: {job_result['company_name']}
Description: {job_result['description']}
""")

print("Filters")
for chip in data['chips']:
    print(f"Type: {chip['type']}\n")
    print("Options")
    for option in chip['options']:
        print(option['text'])
Response
{
"jobs_results":[
{
"title":"Sustainability Analyst",
"company_name":"Amcor",
"location":"Ann Arbor, MI",
"via":"via LinkedIn",
"description":"Amcor Limited Job Posting\n\nRole: Sustainability Analyst\n\nLocation: TBD, ideally in the US (Ann Arbor, MI)\n\nAbout Amcor\n\nAmcor (ASX: AMC;\n\nAmcor is proud of its recent pledge to design all of our packaging to be recyclable or reusable by 2025. The job holder will play a very important and exciting role in Amcor’s journey to deliver this important commitment.\n\nPosition Overview\n\nRead more about Amcor’s sustainability commitment:\n\nThe Sustainability function plays a key role in positioning Amcor as THE leading packaging company for the environment delivering on Amcor’s sustainability strategy, the 2025 pledge and as a supplier of choice for responsible packaging.\n\nThe Sustainability Analyst is responsible for analyzing, reporting, and coordinating selected global Sustainability activities with direction from the VP Sustainability.\n\nEssential Responsibilities And Duties\n• Track legislative activity, analyze for risk and opportunity, help to prioritize actions\n• Assist with drafting... 
positions, coordinate Amcor activity and governance around advocacy (mostly in industry group participation)\n• Assists with internal reporting and communications, including preparing decks for internal meetings\n• Partnership administration, tracking projects and payments, and liaising with corporate finance on dept budget\n• Manage compliance statements, including anti-slavery statements, conflict minerals etc.\n• Coordinates the International Costal Cleanup, as needed with other partners\n• Other similar duties as required to support the corporate sustainability program\n\nQualifications\n• Education: Master's Degree or equivalent in related field preferred\n• Three to five years of experience\n• Strong analytical skills, including ability to interpret and graphically display environmental performance data\n• Excellent written and verbal communications skills\n• Excellent working knowledge of Microsoft Office\n• Demonstrated professional work characteristics including high initiative, dependability, and ability to manage confidential information\n• Must be well organized and comfortable interfacing with all levels of management\nAmcor Leadership Framework Competencies\n• Drive for Results\n• Influencing Others\n• Customer Focus\n• Learning on the Fly\n• Interpersonal Savvy\n• Organizational Awareness\n• Priority Setting\n• Organizing\n• Functional / Technical Skills\n• Strong Computer Skills\n\nRelationships\n• Amcor Leadership\n• Direct Reports\n• External Vendors\n• Government agencies\n• Global partners/ Nonprofit organizations\n• Industry organizations\nExpected Travel: 10% Travel\n\nThe information contained herein is not intended to be an all-inclusive list of the duties and responsibilities of the job, nor are they intended to be an all-inclusive list of the skills and abilities required to do the job.\n\n#North America",
"extensions":[
"Over 1 month ago",
"Full-time"
]
},
{
"title":"Environmental Jobs in Michigan,USA",
"company_name":"freelancejobopenings.com",
"location":"Michigan",
"via":"via Freelance Job Openings",
"description":"Environmental Jobs in Michigan,USA\n\nSummer Camp Instructor\n\nenvironmental learning center at barr lake state park with a satellite office in fort collins and fieldwork outposts in environmental science, leadership, and or outdoor adventure programs for diverse audiences in formal and non formal outdoor and classroom environmental studies, biological sciences, natural resource management, or related field, with a focus in ornithology.\n\n strong summer, birding, camp, education, colorado, outdoors, teaching\n\nwebsite: barefoot student summer camp\n\nSITE LEAD\n\nenvironmental changes, and sudden work schedule changes.\n• tech savvy: frito lay is an industry leader site: fritolay the site lead is accountable for ensuring the building is operating at top performance to deliver the zone sops strategy and ensures a safe working environment. the role requires cross functional understanding in order to drive operations success.\n\nwe are open 24 hours a day, which means\n\nField Service ... Chromatography Spectrometry Instruments - Grand Rapids, MI\n\nenvironmental testing, and forensic toxicology looking to hire field service engineer to support lcms and gcms platforms. travel to client labs to perform calibrations, diagnose problems with equipment field service chromatography spectrometry instruments grand rapids, mi\n\nleader in liquid chromatography mass spectrometry and gas chromatography mass spectrometry, supporting clinical research, drug discovery, food and environmental testing, and forensic toxicology looking to hire field service engineer to support\n\nUTA Test Engineer\n\nenvironmental demands may be referenced in an attempt to municate the manner in which this position traditionally is performed. 
about capgemini:\n\na global leader in consulting, technology services and digital transformation, capgemini is at the forefront of innovation to address the entire breadth of clients’ opportunities in the evolving world of cloud, digital and platforms. building on its strong 50 year heritage and deep industry specific expertise, capgemini enables organizations to realize\n\nIndustrial Water/Wastewater Design Engineer\n\nenvironmental, civil, or chemical\n• 4+ years of industrial water wastewater system environmental, civil or chemical\n• water wastewater treatment design experience in variety industrial markets\n• experience with biological and physical chemical treatment design build experience\n\nwhat we offer engineering water wastewater\n\nbusiness line design and consulting services group (dcs)\n\ncountry",
"extensions":[
"13 hours ago",
"Full-time"
]
}
]
}
If you want more information, check out SerpApi documentation.

How to read a list of values into a data table in a sandbox?

I have a list of data. It's all a single column, each row is a comment from a post asking for book recommendations. Here's an example, containing the first 2 entries:
"My recommendations from books I read this year:<p>Bad Blood : Man, this book really does read like a Hollywood movie screenplay. The rise and fall of Theranos, documented through interviews with hundreds of ex-employees by the very author who came up with the first expose of Theranos. Truly shows the flaws in the "fake it before you make it" mindset and how we glorify "geniuses".<p>Shoe Dog : Biography of the founder of Nike. Really liked how it's not just a book glorifying the story of Nike, but tells the tale of how much effort, balance and even pure luck went into making the company the household name it is today.<p>Master Algorithm : It's a book about the different fields of Machine learning (from Bayesian to Genetic evolution algos) and talks about the pros and cons of each and how these can play together to create a "master algorithm" for learning. It's a good primer for people entering the field and while it's not a DIY, it shows the scope of the problem of learning as a whole.<p>Three Body Problem: Finally, after years of people telling me to read this (on HN and off), I read the trilogy (Remembrance of Earth's Past), and I must say, the series does live up to the hype. Not only is it fast paced and deeply philosophical, but it's presented in a format very accessible to casual readers as well (unlike many hard sci-fi books which seem to revel in complexity). If I had to describe this series in a single line, it's "What would happen if China was the country that made first contact with an alien race?"","A selection:<p>Sapiens (Yuval Noah Harari, 2014 [English]) - A bit late to the party on this one. Mostly enjoyed it, especially the early ancient history stuff, but I felt it got a bit contrived in the middle - like the author was forcing it. Overall a good read though.<p>How to Invent Everything (Ryan North, 2018) - First book I've pre-ordered in a long time. A look at the history of civilization and technology through a comedic lens. 
Pretty funny and enjoyable.<p>The Rise of Theodore Roosevelt (Edmund Morris, 1979) - Randomly happened across this book while browsing a used bookstore for some stuff to read on a summer vacation. Loved it. It's big, but reads pretty quick for a biography. I've been a fan of TR since I first really learned about him in High School and I would recommend this for anyone interested in TR/The West/Americana.<p>Jaws (Peter Benchley, 1974) - Quite a bit darker than the movie.<p>Sharp Objects (Gillian Flynn, 2006) - I enjoyed Gone Girl (book and film) so I wanted to read this before the HBO series. To be honest...not my cup of tea. It was <i>okay</i>.<p>The Art of Racing in the Rain (Garth Stein, 2008) - Made me cry on an airplane. Thankfully my coworkers were on a different flight."
(Note that the comments are separated by ",".)
I'm trying to load this list into a data table in an R sandbox (rapporter.net). But because of browser security, I can't load a local file (fread, read.table).
How can I read raw data into a data table in R?
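One possible approach, sketched under two assumptions: that the sandbox lets you paste the data into the script as a string, and that double quotes inside a comment are escaped or absent (the example above contains unescaped inner quotes, which would need cleaning first). scan() accepts a text argument, so no file is needed; read.table and read.csv take the same argument. The raw string below is a made-up stand-in for the real data.

```r
# Made-up stand-in for the pasted data: quoted comments separated by commas.
raw <- '"First comment, with a comma","Second comment<p>with markup"'

# Read from the string instead of a file; only double quotes delimit fields,
# so apostrophes inside a comment are treated as ordinary characters.
comments <- scan(text = raw, what = character(), sep = ",",
                 quote = "\"", quiet = TRUE)

# One column, one row per comment.
df_comments <- data.frame(comment = comments, stringsAsFactors = FALSE)
nrow(df_comments)  # 2
```

If the data.table package is available in the sandbox, data.table::as.data.table(df_comments) converts the result.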

WYSIWYG Editor of Advanced Custom Fields is not giving output with proper HTML tags

I have WordPress version 4.8.
Plugin: Advanced Custom Fields, version 4.4.11
I have created a custom field: Details
Field type: WYSIWYG Editor
Source code data:
<p>From The Architect - The Split House negotiates a complex range of conditions typical of emerging coastal developments. New houses for ‘downsizers’ in a suburban mode, paved driveways and letterboxes prevail, vying for the expansive views to Port Philip Bay, and backed by the relatively wild, coastal woodland of Mt Martha Public Park. The Split House provides a range of spatial relationships to its site and the broader territory that carefully balances the owners’ desire for privacy and engagement with their surrounds. The house comprises two relatively simple volumes linked by a splayed stair that also acts as a seating area for people to gather, listen to music, sit in the sun. Occupying separate levels that follow the natural contours of the site, the two pavilions provide a separation between the upper, main living/master bedroom zone and rumpus room/guest bedrooms below. Through the curation of windows and doors a range of direct and indirect connections to the landscape provide multiple opportunities for occupation throughout the year. Smaller elements, such as integrated seating, stairs and study nooks provide spaces for quiet contemplation, juxtaposed with larger communal areas for family and friends to come together.</p>
<p>The award-winning house was designed for a couple who have an eye on retirement, with two adult children who moved out of the family home, the idea was to create a retreat that would bring the family together on weekends and over the summer months.</p>
<p>MUD OFFICE helped to finish off a beautifully executed home by providing the landscape that flows around the house.</p>
When I check the output on the front end:
<p>From The Architect - The Split House negotiates a complex range of conditions typical of emerging coastal developments. New houses for ‘downsizers’ in a suburban mode, paved driveways and letterboxes prevail, vying for the expansive views to Port Philip Bay, and backed by the relatively wild, coastal woodland of Mt Martha Public Park. The Split House provides a range of spatial relationships to its site and the broader territory that carefully balances the owners’ desire for privacy and engagement with their surrounds. The house comprises two relatively simple volumes linked by a splayed stair that also acts as a seating area for people to gather, listen to music, sit in the sun. Occupying separate levels that follow the natural contours of the site, the two pavilions provide a separation between the upper, main living/master bedroom zone and rumpus room/guest bedrooms below. Through the curation of windows and doors a range of direct and indirect connections to the landscape provide multiple opportunities for occupation throughout the year. Smaller elements, such as integrated seating, stairs and study nooks provide spaces for quiet contemplation, juxtaposed with larger communal areas for family and friends to come together.The award-winning house was designed for a couple who have an eye on retirement, with two adult children who moved out of the family home, the idea was to create a retreat that would bring the family together on weekends and over the summer months.
MUD OFFICE
helped to finish off a beautifully executed home by providing the landscape that flows around the house.
</p>
It is not giving the output as written in the WYSIWYG Editor of Advanced Custom Fields. Can anybody help me find the issue?
You will get the proper output if you pass the value through apply_filters with the the_content filter:
<?php echo apply_filters('the_content', get_post_meta( $post->ID, 'custom_field_name', true )); ?>

Bypass Style Formatting when Parsing RSS Feed in R

I am trying to scrape and parse the following RSS feed http://www.nestle.com/_handlers/rss.ashx?q=068f9d6282034061936dbe150c72d197. I have no problem extracting the basic items that I need (e.g., title, description, pubDate) using the following code:
library(RCurl)
library(XML)
xml.url <- "http://www.nestle.com/_handlers/rss.ashx?q=068f9d6282034061936dbe150c72d197"
script <- getURL(xml.url)
doc <- xmlParse(script)
titles <- xpathSApply(doc,'//item/title',xmlValue)
descriptions <- xpathSApply(doc,'//item/description',xmlValue)
pubdates <- xpathSApply(doc,'//item/pubDate',xmlValue)
My problem is that the output for the "description" item includes not only the actual text but also a lot of HTML markup. For example, the first element is:
descriptions[1]
[1] "<p><iframe height=\"322\" src=\"https://www.youtube-nocookie.com/embed/fhESDXnlMa0?rel=0\" frameBorder=\"0\" width=\"572\"></iframe><br />\n<br />\n<p><em>Nescafé</em> is partnering with Facebook to launch an immersive video, pioneering new technology just released for the platform.</p>\n<p>\nThe <em>Nescafé</em> <a class=\"externalLink\" title=\"Opens in a new window: Nescafé on Facebook\" href=\"https://www.facebook.com/Nescafe/videos/vb.203900255471/10156233581755472/?type=2&theater\" target=\"_blank\">‘Good Morning World’ video</a> stars people in kitchens across the world, performing the hit song ‘Don’t Worry’ using spoons, cups, forks and a jar of coffee. Uniquely, viewers can rotate their smartphones through 360˚ to explore the video, the first time this has been possible on Facebook.</p>\n<p>\n“We know young coffee lovers pick up their phone at the start of every day looking to be entertained by real experiences. The 360˚ video allows us to be engaging in an innovative way,” said Carsten Fredholm, Senior Vice President of Nestlé’s Beverage Strategic Business Unit.\n</p>\n<p><em>Nescafé</em> recently teamed up with Google to offer the first virtual reality coffee experience through the <em>Nescafé 360˚</em> app. It also became the first global brand to move its website onto Tumblr, to strengthen connections with younger fans by allowing them to create and share content.</p>\n<p>The Nestlé brand is one of only six globally to partner Facebook for the launch of this technology.</p></p>"
I can think of a regex approach to strip the unwanted markup. However, is there a way to access the plain-text content of the "description" item directly through XPath?
Any help with this issue is very much appreciated. Thank you.
You can do:
descriptions <- sapply(descriptions, function(x) {
xmlValue(xmlRoot(htmlParse(x)))
}, USE.NAMES=FALSE)
which gives (via cat(stringr::str_wrap(descriptions[[1]], 70))):
In a move that will provide young Europeans increased access to
jobs and training opportunities, Nestlé and the Alliance for YOUth
have joined the European Pact for Youth as founding members. Seven
million people in Europe under the age of 25 are still inactive -
neither in employment, education or training. The European Pact for
Youth, created by European CSR business network CSR Europe and the
European Commission, aims to work together with businesses, youth
organisations, education providers and other stakeholders to reduce
skills gaps and increase youth employability. As part of the Pact, the
Alliance for YOUth will focus on setting up âdual learningâ schemes
across Europe, combining formal education with apprenticeships and on-
the-job training to help match skills with jobs on the market. The
Alliance for YOUth is a group of almost 200 companies mobilised by
Nestlé to help young people in Europe find work. It has pledged to
create 100,000 employability opportunities by 2017 and has already met
half of this target in its first year. Luis Cantarell, Executive Vice
President for Nestlé and co-initiator of the European Pact for Youth,
said: âPromoting a cultural shift to dual learning schemes based on
business-education collaboration is at the heart of Nestléâs youth
employment initiative since its start in 2013. The European Pact for
Youth will help to build a skilled workforce and will tackle youth
unemployment.â Learn more about the European Pact for Youth and read
their press release.
There are \n characters at various points in the resultant text (in almost all the descriptions) but you can gsub those away.
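A minimal sketch of that clean-up step, using a made-up string (desc) in place of a real description. It replaces each run of line breaks, together with any surrounding whitespace, with a single space:

```r
# desc is a made-up stand-in for one element of descriptions.
desc <- "First paragraph.\n\nSecond paragraph.\n"

# Collapse line breaks (and whitespace around them) into one space,
# then trim whatever is left at the ends.
clean <- trimws(gsub("[[:space:]]*[\r\n]+[[:space:]]*", " ", desc))
clean  # "First paragraph. Second paragraph."
```

gsub is vectorised, so the same call works on the whole descriptions vector at once.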
