User ownership of personal information [closed] - privacy

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
At the moment it seems that most webapps store their user-data centrally.
I would like to see a movement towards giving the user total access and ownership of their own personal information and data; ultimately allowing the user to choose where their data is stored.
As an example - with an application like Facebook, the user's profile data could exist on any device that they own (e.g. their mobile phone) ... Facebook would then request the data from the user, and make use of it.
Does anyone see this idea becoming a reality? Is it a ridiculous idea?
CLARIFICATION:
The information would at least need to be cache-able. The motivation behind the idea was to give the user more control over their own data - the user is self-publishing an authoritative version of what they are happy for the world to see.
I'm imagining a future which is largely dictated by choices which are made now. Perhaps physical location of the data isn't actually important - and is more a symbolic gesture... but I think that decoupling the relationship between our information and the companies that make use of it could be a positive thing.
But perhaps, the details do need a bit more work ;)

What about performance? Imagine you want to search for data that is spread across hundreds of mobile phones or private distributed systems.

What you're describing is similar to a combination of OpenID Attribute Exchange, Portable Contacts and OpenSocial: having one repository of user data that every other provider would feed off. It's nice for a user, but I would not go so far as to tie it to a specific device. Rather, a federated identity that you control from one vendor's website/application.

I am with you on this one.
And I think the key technology might be RDF. Since vocabularies such as FOAF are already used in these social applications, it is a small step from Facebook storing your RDF graph to you storing it yourself, and saying: this is me, these are my friends, or anything else you might want someone to know.
This approach might be generalised to other personal information you might need an authorised party to know, like health records.
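As a rough illustration, here is a minimal sketch of the kind of self-published FOAF graph being described; Python and the rdflib package, as well as the example names and URIs, are my assumptions rather than anything from the answer.

# Minimal sketch of a self-published FOAF profile (assumes Python + rdflib; names/URIs are made up).
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import FOAF, RDF

g = Graph()
me = URIRef("https://example.org/people/alice#me")

# "This is me, these are my friends" expressed as RDF triples.
g.add((me, RDF.type, FOAF.Person))
g.add((me, FOAF.name, Literal("Alice Example")))
g.add((me, FOAF.knows, URIRef("https://example.org/people/bob#me")))

# Serialise to Turtle; this is the document you could host yourself
# and let a service read from, instead of it holding the master copy.
print(g.serialize(format="turtle"))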

There are quite a few conceptual problems with what you are suggesting.
Firstly, every time you reconnected to the system, you would need to upload your personal information back into it so that it could interact with you. This adds quite an overhead to the sign-in/handshake/auth with the remote system.
Secondly, a lot of online systems (particularly online communities) rely on you leaving an online profile of yourself so that other users can interact with you (via your profile) when you yourself are offline. This data would have to be kept somewhere central.
At the very least, the online system would need a very basic profile to represent you, so that you could log in and authenticate against it... which sounds like a contradiction to what you are suggesting.

Performance would suffer should the user have physical possession of the data; e.g., thumb drive, local drive. However, if a "padded cell" solution were possible where the user has complete rights to a vault that the application could reach quickly, then there might be a possibility.
This really isn't a technology problem, but rather one of corporate policy. Facebook could easily craft a policy that states that your records are yours, just like a bank should. They just don't. For that matter, many other institutions that are supposed to guard our personal information - our property, if I can evoke John Locke - fail miserably. If they reviewed their practices for violations of policy and were honest, you could trust them. Unfortunately this just doesn't happen.
The IRS, Homeland Security and other agencies will always require that an institution yield access to assets. In the current climate I can't see how it would be allowed for individuals to remain in physical possession of electronic records that a bank or institution would use online.
Don't misinterpret me - I think your idea is a good one to pursue, but it's more of a corporate policy issue than a technical one.

You need to clarify what you mean by ownership. Are you trying to ensure that the data is only stored on your own devices? As others have pointed out, this will make building social networks impossible. You would disappear from Facebook when you weren't connected to it, for example.
Or are you trying to ensure that a single authoritative copy exists and that services defer to it? This might be more possible, and would require essentially synching the master copy on your cell phone with the server when possible.
Or are you trying to ensure that you can edit/delete your account at any time? Most sites already work like this.

The user still wouldn't be sure they 'own' their data, simply because they'd have to upload it every time they connect, and the company it's being sent to could still do whatever it wants with it. It could just not display your profile when you're not online, but still keep a copy of it somewhere.

Total access, ownership, and choice of location for personal information and data is an interesting goal, but your example illustrates some fundamental architecture issues.
For example, Facebook is effectively a publishing mechanism. Anything you put on a public profile has essentially left the realm of information that you can reasonably expect to keep private. As a result, let's assume that public forums are outside the scope of your idea.
Within the realm of things that you can expect to keep private, I'm a big fan of encryption combined with physical and network security balanced against the need for performance. You use the mobile phone as an example. In that case, you almost certainly have at least three problems:
What encryption is used on the phone? Any?
Physical security risk is quite high - have you ever had an expensive portable electronic device stolen? There seems to be quite the stolen phone market out there....
The phone becomes a network hotspot - every service that needs your information would need to make an individual connection to your phone before it could satisfy a request. Your phone needs to be on, you need to have a sufficiently fat data pipeline, etc.
If you flip your idea around, however, it becomes clear that any organization that does require persistent storage of your sensitive private information (aka SPI) should meet some fundamental (and audit-able) requirements:
Demonstrated need to persist the information: many web services already ask "should I remember you?" or "do you want to create an account?" I think the default answer should always be "NO" unless I say otherwise explicitly.
No resale or sharing of SPI. If I didn't tell my bank or my bookstore that they can share my demographic information, they shouldn't be able to. Admittedly, my phone number and address are in the book, so I can't expect that I'll stay off of every mailing list but this would at least make things less convenient for the telemarketers.
Encryption all the time. My SPI should never be stored in the clear (see the sketch at the end of this answer).
Physical security all the time. My SPI should never be on a laptop drive.
Given all of the above, it would be possible for you to partially achieve the goal of controlling the dissemination of your SPI. It wouldn't be perfect. The moment you type anything in, there is immediately a non-zero risk that someone somewhere has somehow figured out to monitor or capture it. Even so, you would have some control of where your information goes, some belief that it would only go where you tell it to go and that the probability of it being stolen is somewhat reduced.
Admittedly, that's a lot of weasel words in a row....
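To make the "encryption all the time" point concrete, here is a minimal sketch of encrypting a piece of SPI before it is persisted; it assumes Python and the cryptography package's Fernet recipe, neither of which the answer mentions, and the record shown is a made-up example.

# Minimal sketch: encrypt SPI before it ever touches disk (assumes Python + cryptography).
from cryptography.fernet import Fernet

# In practice the key would live in a key-management system, not next to the data.
key = Fernet.generate_key()
f = Fernet(key)

spi = b"alice@example.org, DOB 1980-01-01"   # hypothetical example record
token = f.encrypt(spi)                       # this ciphertext is what gets stored

# Only a holder of the key can recover the plaintext.
assert f.decrypt(token) == spi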

We are currently developing a platform to allow people to exercise the right to access their personal data (habeas data) against any holder of such data.
Rather than following the approach you suggest, we actually pursue a different strategy: we take snapshots of the personal data as it is in the database of the "data holder" whenever the individual wants to access her data.
Our objective is to give people freedom in the management of their own personal data, allowing them to share it with others based on their previous consent.
I would like to further discuss with you should you be interested.

Please read Architecture Astronauts.

Related

When a G-Suite form is embedded on an external website, does any form data get stored on the host site?

This question comes up because of very specific HIPAA requirements. A Covered Entity (CE), e.g., a doctor, can't use a cloud storage provider (CSP) unless they have a Business Associate Agreement (BAA) with the CSP, even if the data are encrypted and the CSP has no access. I'm not a security expert, but most web hosts' security would IMO satisfy HIPAA, IF there were a BAA.
There's a conduit exception for video, ISPs, and other electronic equivalents of USPS that do not store electronic Protected Health Information (e-PHI.)
I don't know why, but the web hosts who will sign a BAA charge $100-300/mo for very basic hosting that other sites charge $5-15/mo for. I think they're preying on CE ignorance and the perception that there's lots of money sloshing around - true for radiology, but not for primary care.
G-Suite will execute a BAA, which makes G-Suite a reasonably-priced solution for gathering Protected Health Information (PHI) patient input, while keeping the CE compliant with HIPAA.
It's worth noting that "HIPAA compliance" is ONLY a property of CEs and Electronic Medical Records, not other software or sites. Any other product or service claiming "HIPAA compliance" is misrepresenting itself.
I find Google Sites not as user-friendly as most web hosts. There's less hand-holding for doing things like installing WP add-ins, or adding SSL certificates. Or maybe Google just does a terrible job of explaining how to actually DO something with a site hosted there. In any case, it seems easier to run a website on a web host that's set up to manage software and WP plug-ins for amateurs.
I'm willing to be educated on this. (24 hours later--I did a lot of self-education-see answer below.)
The basic HIPAA privacy requirements are rather simple:
CEs can use PHI to treat and carry out essential functions, but must not share it with anyone not entitled to it.
The basic HIPAA security requirements are also simple:
Make a security risk analysis.
Implement reasonable security measures and
Document why various measures were taken or not.
Some elements are required, others must simply be addressed, evaluated and documented.
For example, 2FA is "addressable" as is data encryption, but making an analysis, having physical security and employee training are required.
So my question is whether a G-Suite form embedded in a website on another web host stores any data on that web host, or does it all go back to G-Suite, eg G-Drive, where it's secure and covered by a BAA?
The problem when you know very little about a topic is, you don't know what to ask. I know a bunch about HIPAA, not much about HTML. I did a lot more research, and there are at least two answers.
The short answer is: NO, the embedded frame is an iframe linked to G-Suite over HTTPS.
The form in the iframe is a window into docs.google.com, so data never gets off docs.google.com, where it's covered by G-Suite's BAA. The host site is in effect a conduit.
<iframe src="https://docs.google.com/forms/..."></iframe>
Note https
Embedding the form does not create a HIPAA violation.
The second answer is, G-Suite has its own content management system and website builder, which requires very little technical skill. Thus there's no need to install Wordpress or anything else; you just drag and drop to create a site. All the back-end stuff is done for you. Duh. And they execute a BAA, all for $6 a month. So G-Suite is much simpler, in fact so simple that even a child can do it. Their help pages leave much to be desired.
Bottom line--for small covered entities, G Suite is a very economical website solution that doesn't create a HIPAA violation. Wish I knew this yesterday!
FYI: HIPAA compliant Cloud Services

Where do APIs get their information from

After some time working with RESTful APIs, I would like to know a bit more about their internal functionality.
I would like a simple explanation of how APIs get access to the data that they provide as responses to our requests.
There are APIs, for example weather APIs or sports APIs, that are capable of providing responses with very recent data (such as sports results); I am wondering where or how they get that updated info almost as soon as it is available.
I have seen here on SO questions with answers pointing to API design tutorials, but not to this particular topic.
An API is usually simply a facade (or an interface if you prefer) to some information resource. The idea behind it is to "hide" any complexity from the user, to unify several services to a single access point or even to keep the details about the implementation of the actual service a secret.
This being said you probably understand now that there can't be one definitive answer to the question "where do APIs get their info from?". But some common answers are:
other APIs
some proprietary/in-house developed service/database
etc.
For sports APIs - probably they are being provided by some sports media, which has the results as soon as they get out, so they just enter them in their DB and immediately they become available through their API.
For weather forecasts - again as with the sports API they are probably provided by a company dealing with weather forecasts.
If it's easier for you, you can think of the "read-only" APIs as RSS feeds, in a way.
I hope this clears things up a bit for you.
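To make the facade idea concrete, here is a minimal sketch of a read-only endpoint that simply relays whatever sits in the provider's own store; Python and Flask are my assumptions here, not anything the answer specifies, and the data is made up.

# Minimal sketch of an API as a facade over an in-house store (assumes Python + Flask).
from flask import Flask, jsonify

app = Flask(__name__)

# Stand-in for the provider's own database, which is where the "recent data" lives;
# a sports media company would update this the moment a result comes in.
latest_scores = {"match_42": {"home": 2, "away": 1}}

@app.route("/scores/<match_id>")
def scores(match_id):
    # The caller never sees how the data got here - database, another API, a feed.
    return jsonify(latest_scores.get(match_id, {}))

if __name__ == "__main__":
    app.run()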
You could have a look at Stack Share to see what companies use for databases and whatnot. But there isn't a universal answer, every company uses whatever works for them.
This usually means that the company has its own database in which the data is stored. But they might also get their data from another company.
But a 'database' is not just SQL, maybe they use unstructured data or any of the other options to store data.
That's where the "whatever works" comes from. The company chooses a solution they go with which best fits their needs.

Can we store sensitive client information with the admins without them (the admins) identifying it?

I am trying to design a pairing application for my university this Valentine's. How is it supposed to work, you ask? The clients will submit preferences to the server, and after some days, if any two clients have the same preferences, they will be notified - not in any other case. A fool-proof framework needs to be designed for this purpose.
What I want is to assure my clients that even though they will be submitting their favourite responses to me via my website, I will still not be able to see those, because if I could, this application would have privacy issues.
I am trying to match the user preferences with each other. They will obviously be encrypted, and there is no way I can match any two unless I decrypt them at some point locally on my server - and given that the RSA encryption mechanism has very little probability of collision between hashed values, I definitely cannot match them otherwise :). The constraint here, then, is: never ever decrypt the client preferences locally on the admin's machine/server.
One approach currently on my mind is to introduce a salt while encrypting, which would stay safe in the hands of the client, but decryption still needs to be done at some point to match these hashes. Can there be some alternative approach for this type of design? I think I might be missing something.
Turn things around. Design a mailbox-like system and use pseudonyms. Instead of getting an email once a match has been found, make people request it. That way you can leave the preferences unencrypted; just the actual user has to be hidden from the public. Start with an initial population of fake users to hide your early adopters and you will be done.
I.e.: In order to post preferences, I'll leave a public key to contact me. The server searches matches and posts encrypted messages to a public site. Everyone can see these messages (or not, if you design properly) but I am the only one that can read them.
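A minimal sketch of that mailbox idea, assuming Python and the cryptography package (my choice; the answer names no particular stack): the client keeps the private key, the server only ever sees the public key and the pseudonymous preferences, and match notifications are posted encrypted so that only the intended user can read them.

# Minimal sketch of the "encrypted mailbox" notification (assumes Python + cryptography).
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# Client side: generate a key pair, publish only the public key with the pseudonym.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key_pem = private_key.public_key().public_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PublicFormat.SubjectPublicKeyInfo,
)

# Server side: once a match is found, encrypt the notice with the user's public key
# and post it publicly; the server never needs the private key.
public_key = serialization.load_pem_public_key(public_key_pem)
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
notice = public_key.encrypt(b"You matched with pseudonym 'rose42'", oaep)

# Client side: only the holder of the private key can read the posted message.
print(private_key.decrypt(notice, oaep))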

Security for Exposing Internal Web-based application to the World

We have an internal CRM system which is currently a website that can only be accessed inside our intranet. The boss now wants it exposed to the outside world so that people can use it from home and on the road. My concern is security, given that we will be exposing our customer base to the outside world. I have implemented 3 layers of security as follows:
User Name and Strong password combination to login
SSL on all data being pushed across the line
Once the user is logged in and authenticated, the server passes them a token which must be used in all communication with the server from then on.
Basically I'm a bit of a newb when it comes to web security. Can anyone give me advice on whether I am missing anything, or whether something should be changed?
There's a whole world of stuff you should consider, and it'll be really hard to quickly answer this - so I'll point you at a range of resources that should help you out / get you started.
First, I'll plug http://security.stackexchange.com, for any specific questions you have - they could be a great help.
Now, on to more immediate things you should check:
Are your systems behind a firewall? I'd recommend at least your DB is placed on a server that is not directly available to the outside world.
Explore and run a range of (free) security tools against your site to try and find any problems. e.g.:
https://asafaweb.com
http://sectools.org/
Read up on common exploits (e.g. SQL injection) and make sure you are guarding against them:
https://www.owasp.org/index.php/Top_10_2010-Main
https://www.owasp.org/index.php/Category:Vulnerability
How is your token being passed around, and what happens to it if another user gets hold of it (e.g. after it being cached on another machine)?
Make sure you have a decent password protection policy (decent complexity, protects against brute force attacks by locking accounts after 3 attempts); see the sketch after this answer for one way to handle tokens and password storage.
If this is a massive concern for you (consider the risk to your business in a worst case scenario) consider getting an expert in, or someone to run a security test against your systems?
Or, as mrunion excellently points out in the comments above (+1), have you considered other more secure ways of opening this up, so that you don't need to publish this on the web?
Hope that gets you started.
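On the token and password points above, here is a minimal sketch of generating an unguessable session token and storing passwords as salted PBKDF2 hashes instead of plaintext; Python is an assumption on my part, since the question does not say what the CRM is built with.

# Minimal sketch: random session tokens and salted password hashes (Python stdlib only).
import hashlib
import hmac
import os
import secrets

def new_session_token():
    # 32 random bytes, URL-safe; send it only over SSL and expire it server-side.
    return secrets.token_urlsafe(32)

def hash_password(password, salt=None):
    # Store salt + digest, never the plaintext password.
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return salt, digest

def verify_password(password, salt, digest):
    # Constant-time comparison to avoid timing side channels.
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return hmac.compare_digest(candidate, digest)

token = new_session_token()
salt, digest = hash_password("correct horse battery staple")
assert verify_password("correct horse battery staple", salt, digest)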

Detecting uploading using HTTP and copying using USB device

I have been asked by my (paranoid!) boss to do two things:
1. Detect when a user uploads files to the net using HTTP. So, for example, how can I detect if a user uploads files to a free webserver somewhere and can hence steal company data?
2. Detect that a user is copying files to a USB device, and what the names of these files are. Also, if they copy a zip file, log the contents of the zip file, in case someone just zips up some company files and takes them like that.
Firstly, is number 1 possible? And for number 2, can I detect the file names that are copied?
Secondly, any links to software that does this?
Note that I am the network admin and everyone who I will monitor has local admin rights on their computer and we do not want to further restrict users access.
Thanks a lot
"Note that I am the network admin and
everyone who I will monitor has local
admin rights on their computer and we
do not want to further restrict users
access."
You can have liberty or security, but not both. The number of paths to get data out of an unlocked box are too many to enumerate. Someone zipped up the files and put them on a thumb drive? What if they used tar or shar or pasted them into a Word document, or printed them to a PDF file and sent it out via e-mail steganographically embedded in pornography?
Yeah, a former coworker was stupid enough to send a huge set of huge, logged e-mails to his future employer a couple of days prior to leaving, but you can't count on people being quite that stupid.
What your boss wants isn't possible given a moderately motivated thief and not wanting to "further restrict" access.
Given freely available cryptographically secure tools like OpenSSH (ssh, scp) are usable by almost anyone, what he's asking for is not possible.
I agree with the others: Websense, a DLP solution, a proxy, or network monitoring can help you identify and stop activities not permitted by your policies. That said, the technology should be backed by an information security policy and an awareness program, so you have two fronts to build up: first, people must be warned by the information security policy and kept constantly informed by the awareness program; then, second, if someone breaks the policy, the technology has to do its work and warn you.
There's basically no way to prevent a malicious employee from stealing and exporting data, short of strip searches when entering and leaving the building and no outside network access whatsoever.
Your boss should be more concerned with accidental data leakage (ie, mistyped email address or mistaken reply alls) and breach containment. The series of technologies dedicated to the former are called Data Leakage Prevention. I'm not hip to all their jive, but I bet more than a few companies would be willing to promise you the world if you showed interest.
The latter is mostly done by closely following the "least privilege" mindset. A guy from sales should not be able to use CVS to check out the source code to the product, and a developer shouldn't be able to access the payroll database. Always only grant the minimum amount of access required to someone in order for them to do their job.
Short answer: No. Not unless you're willing to "further restrict access".
The access restriction for http uploads would be a filtering internet proxy. Make everyone go through Websense or something, and you have a log of everything they did online.
For the USB devices, no. Your option there, and how companies with security needs of that magnitude deal with that issue, is to tightly lock down the clients and disable USB key use. (as well as CD burners, floppy drives if you still have those, etc) Again, that's going to require intrusive software, something like Landesk, + removing local admin so users can't take the software off.
