The Cloudera documentation says that Hadoop does not support on-disk encryption. Would it be possible to use hardware-encrypted hard drives with Hadoop?
eCryptfs can be used to do per-file encryption on each individual Hadoop node. It's rather tedious to set up, but it certainly can be done.
Gazzang offers a turnkey commercial solution built on top of eCryptfs to secure "big data" through encryption, and partners with several of the Hadoop and NoSQL vendors.
Gazzang's cloud-based Encryption Platform for Big Data helps
organizations transparently encrypt data stored in the cloud or on
premises, using advanced key management and process-based access control
lists, and helping meet security and compliance requirements.
Full disclosure: I am one of the authors and current maintainers of eCryptfs. I am also Gazzang's Chief Architect and a lead developer.
If you have mounted a file system on the drive then Hadoop can use the drive. HDFS stores its data in the normal OS file system. Hadoop will not know whether the drive is encrypted or not and it will not care.
Hadoop doesn't directly support encryption, though a compression codec can be used for encryption/decryption. Here are more details about encryption and HDFS.
Regarding hardware-based encryption, I think Hadoop should be able to work with it. As Spike mentioned, HDFS is like any other Java application and stores its data in the normal OS file systems. FYI, MapR uses Direct I/O for better HDFS performance.
See also Intel's Rhino. Not open source yet...
https://github.com/intel-hadoop/project-rhino/
https://hadoop.intel.com/pdfs/IntelEncryptionforHadoopSolutionBrief.pdf
I need to SFTP a file from a server to a mainframe. When the file is received on the mainframe, it should be in the form of a TAPE dataset. Is that possible?
If what you mean by "in the form of a TAPE dataset" is that the transferred data should go directly to tape rather than DASD, then it may be possible.
If the mainframe is running Dovetailed Technologies Co:Z SFTP server, that product provides mechanisms for allocating mainframe files in a detailed, mainframe, and shop-specific manner.
The z/OS-provided sftp is based on the IBM OpenSSH implementation. As such, it does not support MVS datasets as of z/OS 2.4. This assumes that by TAPE dataset you are referring to a traditional PS format.
OpenSSH's sftp does not have built-in support for MVS™ data sets. However, there are alternate (indirect) ways to access MVS data sets within sftp.
The above quote from IBM's website can be accessed here
As @cschneid indicated, other products and offerings can provide additional capability, but it is not provided with the z/OS base operating system.
I am looking for JCL scripts/procedures on the mainframe that can facilitate file transfer from a Unix server to the mainframe. I am required to use FTPS for the outbound jobs (pull the file from the UNIX server to the mainframe host).
Rather than JCL, just do it in a shell script. Here is a good site on using such commands:
https://blog.eduonix.com/shell-scripting/how-to-automate-ftp-transfers-in-linux-shell-scripting/
Once you have that working in the shell script in USS, you should be able to call the shell script from a JCL so you can execute it on a scheduled batch job if you need it.
Kenny's suggestion is fairly reasonable. IBM's documentation on how to write JCL for FTP(S)-related tasks is available in their "z/OS Communications Server: IP User's Guide and Commands" publication, IBM Publication No. SC27-3662. The current revision appears to be SC27-3662-30, but later revisions are possible. You can easily find this publication online, and make sure you don't skip the section beginning with the title "Submitting FTP requests in batch." Make sure you set the security options correctly (of course).
Please note that you're asking about FTPS, i.e. TLS encryption applied to either or both (preferably both) of the FTP channels (control and data). SFTP is another file transfer protocol based on SSH that z/OS also supports.
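To make the "both channels" point concrete, here is a minimal Python sketch using the standard library's ftplib. It is only an illustration of explicit FTPS from a client's point of view, not the JCL or z/OS FTP client setup discussed above. The function name fetch_over_ftps, its parameters, and the ftp_factory hook (there only so the function can be exercised without a real server) are hypothetical names made up for this example.

```python
from ftplib import FTP_TLS


def fetch_over_ftps(host, user, password, remote_path, local_path,
                    ftp_factory=FTP_TLS):
    """Download one file over explicit FTPS with both channels encrypted.

    ftp_factory is a hypothetical hook for this sketch; by default the
    real ftplib.FTP_TLS class is used.
    """
    with ftp_factory(host, timeout=30) as ftps:
        # FTP_TLS.login() secures the control channel with TLS before
        # the user name and password are sent.
        ftps.login(user, password)
        # PROT P: also protect the data channel. Without this call the
        # file contents would travel in clear text.
        ftps.prot_p()
        with open(local_path, "wb") as out:
            # Binary retrieval: the bytes are written exactly as sent.
            ftps.retrbinary("RETR " + remote_path, out.write)
```

A call would look like fetch_over_ftps("unix.example.com", "user", "secret", "/outbound/file.dat", "file.dat"), where the host, credentials, and paths are placeholders. Certificate verification is left at the library defaults here; a real job would normally pass a properly configured TLS context.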
Another possible approach that you'll fairly often find available on z/OS installations is to use IBM MQ Advanced for z/OS's Managed File Transfer (MFT) feature to retrieve the file(s) using FTPS. As the name suggests, this'll be managed and have at least some error handling capabilities.
Yet another possible approach if you prefer HTTPS protocol is to use the z/OS Client Web Enablement Toolkit's HTTPS protocol enabler to fetch the file. That's a built-in, standard feature in all currently supported z/OS releases, and you can use it from a relatively simple REXX script for example. Details are available here (z/OS 2.3 variant of the documentation):
https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.3.0/com.ibm.zos.v2r3.ieac100/ieac1-cwe-http.htm
We have an IBM System z host sitting in our cellar. Now the issue is that I have no clue about mainframes! (It's not USS, by the way.)
The problem: how can I transfer a file from the host system to a Windows machine?
Usually on UNIX systems I would just install an SSH daemon and connect to it with a program called WinSCP. After that I would transfer the file in binary mode so that nothing gets converted (UltraEdit and other editors can handle this).
With the host system it seems to be a bit more difficult, as the original format from IBM is EBCDIC and I have no idea whether there is a state-of-the-art SFTP server program for the host. Could anybody be so kind as to enlighten me? From my experience with IT so far, there must be a state-of-the-art SFTP connection to that system. I appreciate any help/hints/solutions.
Thank you,
O.S
If the mainframe "sitting in [your] cellar" is running z/OS then it has Unix System Services installed. You can't have z/OS without it.
There is an SFTP package available (for free) for z/OS.
You can check for Unix System Services by firing up a 3270 emulator, going to ISPF option 3.17, putting a forward slash (/) in the Pathname field and pressing the mainframe Enter key. Another way would be to key OMVS at a TSO READY prompt, which will start up a 3270-based Unix shell.
It is possible that USS is simply not available to you, even though it is present in any supported release of z/OS. There could be concerns about supporting something outside a particular group.
Or, depending on what OS you have running on your System z, it's possible you don't have z/OS. You could have z/VM, you could have zLinux, you could have TPF. However, if you're running zLinux, you have linux, which has sftp installed, and which uses ASCII, not EBCDIC.
As cschneid says, however, if you have z/OS, you have USS. TCP/IP, among other things, won't run without it. Also note that z/OS TCP/IP has an FTP server, so you can connect that way if the FTP server is set up. If security is an issue, FTPS is supported, although it's painful to set up. With the native FTP server, you can convert from EBCDIC to ASCII when you're doing the transfer. There's also an NFS server available. And SMB as well, I believe.
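To illustrate what the EBCDIC-to-ASCII conversion amounts to if you transfer in binary and convert afterwards on the PC side, here is a small Python sketch. It assumes the data is plain text in code page 037 (one common EBCDIC code page); the actual code page on a given system may differ and should be checked. The helper names are made up for this example.

```python
# Assumption for this sketch: the host text is encoded in cp037, one of
# several EBCDIC code pages that Python ships codecs for.
EBCDIC_CODEC = "cp037"


def ebcdic_to_text(raw, codec=EBCDIC_CODEC):
    """Decode bytes transferred in binary mode from the host."""
    return raw.decode(codec)


def text_to_ebcdic(text, codec=EBCDIC_CODEC):
    """Encode text the way the host would store it."""
    return text.encode(codec)


sample = text_to_ebcdic("HELLO 123")
print(sample.hex())                            # c8c5d3d3d640f1f2f3
print(sample == "HELLO 123".encode("ascii"))   # False: different bytes
print(ebcdic_to_text(sample))                  # HELLO 123
```

This only works for character data. Records containing packed decimal or binary fields cannot be converted with a simple code page translation, which is part of why an export step on the mainframe is often needed.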
And there's an FTP client available as well, so you could FTP from z/OS to your system, if you wanted to.
Maybe a better thing to do would be to explain what you're trying to do with the data, and what the data is, in general. You can edit files directly on the mainframe, using either TSO, ISPF, or OMVS editors. There are a lot of data types that the mainframe supports that you're not going to be able to handle on a non-z system unless you go through an export process. I'm not really clear on whether you want to convert the file to ASCII when you transfer it or not.
While the others are correct that all recent releases of z/OS have USS built-in, there's quite a bit of setup work that needs to be done in order for individual users to have access to USS capabilities like SFTP. Out of the box, you get USS "minimal mode" that just has enough of USS to support the TCP/IP stack and so forth. USS "full function mode" requires setup:
HFS filesystems need to be allocated
Your security package needs to manage UIDs/GIDs for your users
etc etc etc
Still, with these details and with nothing more than the software you're entitled to as part of your z/OS license, you can certainly run SFTP and all the other UNIX style network services you're used to.
A good place to start is the UNIX Services Planning guide: http://publibz.boulder.ibm.com/epubs/pdf/bpxzb2c0.pdf
I am currently working on a Hadoop project that requires data encryption (because the data will be stored in S3). While I primarily expect to access the data through Hive, it would be nice to be able to access it via Pig and any other MapReduce methods.
I know Hadoop has built-in support for compression codecs like gzip, snappy, etc... Is there any support for encryption codecs as well (specifically, GPG)? Has anyone written a GPG SerDe (or anything similar) that is publicly available?
Last I knew, Hadoop had no internal support for encryption whatsoever. It seems like you could overload the CompressionCodec with your GPG code, à la http://www.mail-archive.com/common-user@hadoop.apache.org/msg06229.html
Happy Hacking & let us know if you find a solution!
As a follow-up of this question: sqlite-over-a-network-share
Suppose I put the SQLite DB on a network share but never access it concurrently from different machines. The SQLite DB is stored on a share only so that a cluster of failover computers can take over where one machine left off.
Are there any inherent problems with that approach?
Interested in knowing your experiences (After 5 years). Per Eric Grange's helpful hint:
"SQLite uses POSIX advisory locks to implement locking on Unix"... "POSIX advisory locking is known to be buggy or even unimplemented on many NFS implementations" ... "Your best defense is to not use SQLite for files on a network filesystem."
Having said that, if your NFS server is rock-solid (e.g., NetApp) and your clients are rock-solid (i.e., probably not Linux; see for instance http://nfsworld.blogspot.co.at/2006/10/review-of-why-nfs-sucks-paper-from.html), the approach may be workable.
POSIX advisory locking over NFS is also implementation-dependent. From the File locking Wikipedia article: "On Linux prior to 2.6.12, flock calls on NFS files would act only locally. Kernel 2.6.12 and above implement flock calls on NFS files using POSIX byte-range locks. These locks will be visible to other NFS clients that implement fcntl-style POSIX locks, but invisible to those that do not." If there's doubt, you can use nfstrace to determine what your OS is trying to do.
What happens if node A has begun a transaction, locked the table-file, then crashed? Will node B see the advisory lock and refuse to write to the file?
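For reference, the "refuse to write" behaviour while the lock is held can be observed with a small Python sketch using the standard sqlite3 module. This runs two connections against a file on a local filesystem, so it says nothing about how a particular NFS client and server handle the lock, and it does not simulate a crash. On a local filesystem the operating system drops a process's POSIX locks when the process exits; over NFS that depends on the lock implementation, as discussed above. The function and variable names are made up for this example.

```python
import os
import sqlite3
import tempfile


def second_writer_is_blocked(db_path):
    """Return True if a second connection cannot start a write transaction
    while the first one holds SQLite's write lock."""
    # isolation_level=None: transactions are controlled explicitly.
    # timeout=0: report a locked database at once instead of waiting.
    node_a = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    node_b = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    try:
        node_a.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
        node_a.execute("BEGIN IMMEDIATE")       # "node A" takes the write lock
        node_a.execute("INSERT INTO t VALUES (1)")
        try:
            node_b.execute("BEGIN IMMEDIATE")   # "node B" asks for the same lock
        except sqlite3.OperationalError:
            return True                         # "database is locked"
        node_b.execute("ROLLBACK")
        return False
    finally:
        # Closing node_a rolls back its uncommitted transaction.
        node_a.close()
        node_b.close()


db_file = os.path.join(tempfile.mkdtemp(), "demo.db")
print(second_writer_is_blocked(db_file))
```

Running the same check with db_file pointing at the network share, from two different machines, is one way to see whether the locks are actually visible across clients in a given setup.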