Extract sentences with specific word/patterns in it

I'm trying to extract sentences containing the word "privacy" or "Privacy". The sentences are in text stored in my dataframe. The text is saved as a list of multiple character strings, because I'm working with a bunch of different files. I couldn't get it to work with grep, but made it work using gsub. The problem now is that it only extracts the first sentence of the text and doesn't include the next ones.
This is the code I'm using at the moment:
csv_edgar$privacy_1A <- gsub(".*?([^\\.]*(privacy|Privacy)[^\\.]*).*","\\1", csv_edgar$item_1A, ignore.case=TRUE)
Text:
The Company employs information technology systems to support its
business, including ongoing phased implementation of an ERP system as
part of business transformation on a worldwide basis over the next
several years. Security breaches and other disruptions to the
Company’s information technology infrastructure could interfere with
the Company’s operations, compromise information belonging to the
Company and its customers, suppliers, and employees, exposing the
Company to liability which could adversely impact the Company’s
business and reputation. In the ordinary course of business, the
Company relies on information technology networks and systems, some of
which are managed by third parties, to process, transmit and store
electronic information, and to manage or support a variety of business
processes and activities. Additionally, the Company collects and
stores certain data, including proprietary business information, and
may have access to confidential or personal information in certain of
our businesses that is subject to privacy and security laws,
regulations and customer-imposed controls. Despite our cybersecurity
measures (including employee and third-party training, monitoring of
networks and systems, and maintenance of backup and protective
systems) which are continuously reviewed and upgraded, the Company’s
information technology networks and infrastructure may still be
vulnerable to damage, disruptions or shutdowns due to attack by
hackers or breaches, employee error or malfeasance, power outages,
computer viruses, telecommunication or utility failures, systems
failures, service providers including cloud services, natural
disasters or other catastrophic events. It is possible for such
vulnerabilities to remain undetected for an extended period, up to and
including several years. While we have experienced, and expect to
continue to experience, these types of threats to the Company’s
information technology networks and infrastructure, none of them to
date has had a material impact to the Company. There may be other
challenges and risks as the Company upgrades and standardizes its ERP
system on a worldwide basis. Any such events could result in legal
claims or proceedings, liability or penalties under privacy laws,
disruption in operations, and damage to the Company’s reputation,
which could adversely affect the Company’s business. Although the
Company maintains insurance coverage for various cybersecurity risks,
there can be no guarantee that all costs or losses incurred will be
fully insured.

You could use str_extract_all with an alternation:
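library(stringr)
# One sentence: a capital letter, then non-period text around the word "privacy", up to the closing period
regex <- "[A-Z][^.]+\\b(?:Privacy|privacy)\\b[^.]+\\."
sentences <- str_extract_all(input, regex)[[1]]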
[1] "Additionally, the Company collects and stores certain data, including proprietary business information, and may have access to confidential or personal information in certain of our businesses that is subject to privacy and security laws, regulations and customer-imposed controls."
[2] "Any such events could result in legal claims or proceedings, liability or penalties under privacy laws, disruption in operations, and damage to the Company’s reputation, which could adversely affect the Company’s business."
In the snippet above, input is the sample text you provided in the question.
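For readers working outside R, the same idea can be sketched in Python with the standard re module. This is only an illustration under assumptions: the sample text below is a shortened, hypothetical stand-in for one element of csv_edgar$item_1A, and a "sentence" is approximated as a run of characters that ends at the next period, so abbreviations such as "U.S." would split a sentence early.

```python
import re

# Hypothetical, shortened stand-in for one text entry from the question
text = (
    "The Company relies on information technology networks. "
    "Certain data is subject to privacy and security laws. "
    "Any such events could result in penalties under Privacy laws, "
    "and damage to the Company's reputation. "
    "There can be no guarantee that all costs will be insured."
)

# A sentence is approximated as non-period characters followed by one period;
# IGNORECASE covers both "privacy" and "Privacy"
SENTENCE_WITH_PRIVACY = re.compile(r"[^.]*\bprivacy\b[^.]*\.", re.IGNORECASE)


def extract_privacy_sentences(text):
    """Return every period-terminated sentence that contains the word 'privacy'."""
    return [match.strip() for match in SENTENCE_WITH_PRIVACY.findall(text)]


privacy_sentences = extract_privacy_sentences(text)
for sentence in privacy_sentences:
    print(sentence)
```

Because findall returns all non-overlapping matches, every matching sentence is collected rather than only the first one, which is the behavior the gsub approach in the question was missing.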

Suggesting awk command:
awk '/[pP]rivacy/{print}' RS="." input.txt
Result from provided sample
Additionally, the Company collects and stores certain data, including proprietary business information, and may have access to confidential or personal information in certain of our businesses that is subject to privacy and security laws, regulations and customer-imposed controls
Any such events could result in legal claims or proceedings, liability or penalties under privacy laws, disruption in operations, and damage to the Company’s reputation, which could adversely affect the Company’s business

Related

When a G-Suite form is embedded on external website, does any form data get stored on the host site?

This question comes up because of very specific HIPAA requirements. A Covered Entity (CE), e.g., a doctor, can't use a cloud storage provider (CSP) unless they have a Business Associate Agreement (BAA) with the CSP, even if the data are encrypted and the CSP has no access. I'm not a security expert, but most web hosts' security would IMO satisfy HIPAA, IF there were a BAA.
There's a conduit exception for video, ISPs, and other electronic equivalents of USPS that do not store electronic Protected Health Information (e-PHI).
I don't know why, but the web hosts who will sign a BAA charge $100-300/mo for very basic hosting that other sites offer for $5-15/mo. I think they're preying on CE ignorance with the perception that there's lots of money sloshing around, which is true for radiology, but not for primary care.
G-Suite will execute a BAA, which makes G-Suite a reasonably-priced solution for gathering Protected Health Information (PHI) patient input, while keeping the CE compliant with HIPAA.
It's worth noting that "HIPAA compliance" is ONLY a property of CEs and Electronic Medical Records, not other software or sites. Any other product or service claiming "HIPAA compliance" is misrepresenting itself.
I find Google Sites not as user-friendly as most web hosts. There's less hand-holding for doing things like installing WP add-ins, or adding SSL certificates. Or maybe Google just does a terrible job of explaining how to actually DO something with a site hosted there. In any case, it seems easier to run a website on a web host that's set up to manage software and WP plug-ins for amateurs.
I'm willing to be educated on this. (24 hours later: I did a lot of self-education; see the answer below.)
The basic HIPAA privacy requirements are rather simple:
CEs can use PHI to treat and carry out essential functions, but must
not share it with anyone not entitled to it.
The basic HIPAA security requirements are also simple:
Make a security risk analysis.
Implement reasonable security measures and
Document why various measures were taken or not.
Some elements are required, others must simply be addressed, evaluated and documented.
For example, 2FA is "addressable" as is data encryption, but making an analysis, having physical security and employee training are required.
So my question is whether a G-Suite form embedded in a website on another web host stores any data on that web host, or whether it all goes back to G-Suite, e.g., G-Drive, where it's secure and covered by a BAA.

The problem when you know very little about a topic is that you don't know what to ask. I know a bunch about HIPAA, not much about HTML. I did a lot more research, and there are at least two answers.
The short answer is, NO, the embedded frame is an iframe HTTPS linked to G-Suite.
The form in the iframe is a window into docs.google.com, so data never gets off docs.google.com, where it's covered by G-Suite's BAA. The host site is in effect a conduit.
<iframe src="https://docs.google.com/forms......…</iframe>
Note the https.
Embedding the form does not create a HIPAA violation.
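One way to see where an embedded frame actually points is to read the src attribute of each iframe on the host page. The following is a minimal Python sketch using only the standard library; the page markup and the form path are hypothetical placeholders, and the check only reports the scheme and host the browser would load the frame from. It is not a compliance determination.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class IframeSrcCollector(HTMLParser):
    """Collects the src attribute of every <iframe> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def iframe_targets(page_html):
    """Return (scheme, hostname) for each iframe found in the page."""
    collector = IframeSrcCollector()
    collector.feed(page_html)
    return [(urlparse(src).scheme, urlparse(src).hostname) for src in collector.sources]


# Hypothetical host page; the form path is a placeholder, not a real form
page = (
    "<html><body><h1>Patient intake</h1>"
    '<iframe src="https://docs.google.com/forms/EXAMPLE"></iframe>'
    "</body></html>"
)

targets = iframe_targets(page)
print(targets)
```

If every tuple shows https and docs.google.com, the form fields are loaded from and submitted to Google rather than the host site, which matches the "conduit" description above.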
The second answer is that G-Suite has its own content management system and website builder, which requires very little technical skill. Thus there's no need to install WordPress or anything else; you just drag and drop to create a site. All the back-end stuff is done for you. Duh. And they execute a BAA, all for $6 a month. So G-Suite is much simpler, in fact so simple that only a child can do it. Their help pages leave much to be desired.
Bottom line: for small covered entities, G Suite is a very economical website solution that doesn't create a HIPAA violation. Wish I knew this yesterday!
FYI: HIPAA compliant Cloud Services

Can Firebase be PIPPA and HIPAA compliant?

Has anyone been able to build HIPAA- and PIPPA-compliant apps using Firebase, specifically for real-time and chat messaging?
Some rules require data to reside in Canada.

The Seven Fundamental Elements of an Effective HIPAA Compliance Program represent the bare-bones requirements that HIPAA compliant apps must have in place in order to address HIPAA privacy and security standards. The Seven Elements include:
Implementing written policies, procedures and standards of conduct
Designating a compliance officer and compliance committee
Conducting effective training and education
Developing effective lines of communication
Conducting internal monitoring and auditing
Enforcing standards through well-publicized disciplinary guidelines
Responding promptly to detected offenses and undertaking corrective action

Android In-App Purchase transfer reconciliation

We have recently been tasked with adding In-App Purchases to our Android mobile application. We have completed this and have started testing small purchases internally in our production environment prior to release.
We're successfully performing the purchase and storing the result in our back-end database. We have a service that contacts the Google API daily to query about the PaymentState for the transaction.
Today our first test purchase changed to 1 (Payment Received).
We have not yet received the money transfer in our bank account, but it's probably on the way.
Our question is, once the PaymentState has changed to 1, how can we reconcile this with our bank account?
Our finance department doesn't like the sound of just trusting that we got the money. We want to ensure that each payment is accounted for in cash.
How are others accomplishing this?
Thanks

Does your bank offer any interface for a computer to interact with your bank account?
In Germany, almost all banks support either the Home Banking Computer Interface (HBCI) or its successor, the Financial Transaction Services (FinTS), to connect arbitrary computer programs with the bank account, and they provide pretty much all the services available on their web-based online banking sites via those interfaces as well.
With such an interface you could then check the transactions on your bank account programmatically and simply check if the transaction reference provided by Google has already arrived on your bank account.
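The matching step described above could look roughly like the following Python sketch. Everything in it is hypothetical: the reference strings, the amounts, and the shape of the bank transaction records (a "purpose" text and an "amount") are assumptions for illustration, not the format of any real bank interface or of Google's reports. It also assumes each payment arrives as its own bank transaction carrying the reference; if transfers arrive aggregated, the matching key would need to change.

```python
from decimal import Decimal


def reconcile(expected, bank_transactions):
    """Split expected payments into matched and missing, by reference and amount.

    expected: dict mapping a transaction reference to the expected amount.
    bank_transactions: list of dicts with a free-text "purpose" and an "amount".
    """
    matched, missing = [], []
    for reference, amount in expected.items():
        found = any(
            reference in tx["purpose"] and tx["amount"] == amount
            for tx in bank_transactions
        )
        (matched if found else missing).append(reference)
    return matched, missing


# Hypothetical payments whose state changed to 1 (Payment Received)
expected = {"REF-001": Decimal("9.99"), "REF-002": Decimal("4.99")}

# Hypothetical transactions fetched from the bank interface
bank_transactions = [{"purpose": "Payout REF-001", "amount": Decimal("9.99")}]

matched, missing = reconcile(expected, bank_transactions)
print("matched:", matched)
print("missing:", missing)
```

Amounts are held as Decimal rather than float so that equality checks on currency values are exact. References left in the missing list are the ones a finance department would follow up on.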
Without knowing where you're based and what your bank is / what interface they provide, it's hard to provide more details. (There are multi-national/somewhat universal electronic interface standards for how banks communicate with one another, but these are usually not open to the customers and most likely don't provide the required data about one account's individual transactions)

r shiny - is uploaded data safe and secure?

I'm building a shiny app where users upload transaction data to get access to an analytics dashboard. Can I assure these people that their data is secure from sniffers/hackers and will be removed from the shiny server when their session expires? How does this actually work in Shiny? (Note that I'll be hosting my app on shinyapps.io)

This has less to do with Shiny than with whatever server you're storing the data on, how you're using encryption/hashing, and the software/app security methods you've used to protect against specific vulnerabilities.
Having said that, here's the (rather minimal, IMHO) security statement for shinyapps.io:
shinyapps.io is secure-by-design. Each Shiny application runs in its
own protected environment and access is always SSL encrypted. Standard
and Professional plans offer user authentication, preventing anonymous
visitors from being able to access your applications.
I would say that the burden will heavily fall on you to use good encryption and data storage practices.
There are many official and unofficial guidelines you can look to for guidance on data storage. One that big companies, particularly companies going public, must follow is Sarbanes-Oxley.
From grtcorp.com:
The Sarbanes-Oxley Act (SOX Act) was passed by Congress and signed
into law in 2002 in response to major cases of financial fraud, of
which the rise and collapse of Enron is the best known. The overall
focus of the measure is on financial reporting responsibilities, and
ensuring that financial audits are genuinely independent.
However, SOX also includes provisions that relate to the security and
preservation of financial data. And the standards set out for its
implementation "recognized that senior management can't just certify
controls ON the system, these controls also have to control the way
financial information is generated, accessed, collected, stored,
processed, transmitted, and used through the system."
Senior management is thus held ultimately responsible for financial
data security, including putting in place appropriate controls and
procedures to ensure this data security. The good news is that
powerful tools, including data discovery and Data Masking, are
available to meet these standards.
I would also encourage you to familiarize yourself with OWASP's list of the top 10 major web app vulnerabilities:
https://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project

Deploying app with Crashlytics to Apple Appstore - do I need a privacy policy?

I am about to submit an app to the Apple AppStore built in Swift that uses Crashlytics to capture crash information. As users of Crashlytics know, some information about usage, duration, crashes, etc. is captured and stored on the Crashlytics servers. My application does not ask for, store or attempt to capture any user data.
My question is about the privacy policy for my application. Since I don't capture any user data, I want to state that in my privacy policy but I'm not sure that's factual since I am using Crashlytics. Any feedback on people that have used Crashlytics in their app and have an actual privacy policy?
Thanks
--Vinny

Quick answer: yes, you need that privacy policy. There are ways to get it done fast, too.
Longer answer:
Third parties (here Crashlytics)
When dealing with a third party service like this, often a quick look into their legal documents will help (for Crashlytics in this case as described in your question).
(...) At all times during the term of this Agreement, Developer shall
maintain a privacy policy (a) that is readily accessible to users from
its website or within its online service (as applicable), (b) that
fully and accurately discloses to its users what information is
collected about its users and (c) that states that such information is
disclosed to and processed by third party providers like Crashlytics
in the manner contemplated by the Services, including, without
limitation, disclosure of the use of technology to track users’
activity and otherwise collect information from users. (...)
And
Developer shall at all times comply with all applicable laws, rules
and regulations relating to data collection, privacy and security,
including, without limitation, the Children’s Online Privacy
Protection Act (“COPPA”). Crashlytics may, at its sole discretion from
time to time during the Term of this Agreement, audit Developer Data
to verify compliance.
Crashlytics is actually being unusually vocal about this topic.
The App Store
At the time of writing (and since iOS 8) Apple requires privacy policies for 5 categories:
Kids Category, HomeKit, HealthKit, Apple Pay, and Keyboard Extensions. They also require privacy policies for user registrations (more). I can't tell whether any of the above applies to your app. Apple still says in their App Store Review Guidelines that you need to be compliant with all applicable laws. This brings us to the third and most important reason.
Privacy related regulations
All of the above exists only because of global privacy regulations; these companies would most likely not care otherwise. As soon as you work with user data, you are in most cases under an obligation to disclose that fact. This covers personal data such as names and addresses, as well as the tracking of user behaviour. It's been written at length why analytics services need privacy policies. All of it becomes more important as soon as you share data and use third party services for it. In most cases, disclosure or some kind of consent is the condition for its compliant usage.
If you are interested in reading more about the matter in the context of mobile apps I'd suggest any of these documents:
ICO UK
Ireland
USA/California
Canada
Australia
Hope this helps.
(For proper disclosure: I do some work for iubenda, a tool that helps create privacy policies for apps and websites)

Vinny, I think it's not mandatory (I've seen apps using Crashlytics without a privacy policy), but it's recommended for the sake of transparency in your communications with your users.
Crashlytics already has a privacy policy so you can just use that policy and add a statement informing that you are not collecting any sensitive information from the user, such as email or phone number.
