Using NDMP instead of CIFS mounting - mount

I have a weird but interesting use-case. I use CIFS to mount shares from a File Server (NetApp, EMC etc) to an application server (win/linux server where my application runs). My application needs to process each of the files from the shares that I mount via CIFS. My application also needs access to the metadata of these files such as Name, Size, ACLs etc.
I would like to see if I can achieve the same via NDMP. I have some very basic questions regarding this use-case. It would be great if you guys can help me out here.
Is this even something which is achievable?
Can I transfer only shares that are interesting to me instead of the entire volume?

NDMP is essentially an application protocol to control backup/restore operations. The protocol is supple enough to do interesting things like data migration or tape cloning as well.
However, it is not a file access protocol, so "mounting" anything via NDMP isn't possible unless an NDMP server vendor writes an NDMP extension to do so, which would be rather silly given that there are specialized protocols that do just that.
Hope this helps.

NDMP is designed for data management (backup, recovery, etc.) and not as a file access protocol such as CIFS. If your application is a backup application then yes, you can use NDMP to control and copy subsets of the data from your filer to the application server. Note that the format of the data will be what the filer provides (via NDMP) and not in your control. Hope that helps!

Related

FTPS for transferring file from unix to mainframe

I am looking for JCL scripts/procedures on the mainframe which can facilitate file transfer from a Unix server to the mainframe. I am required to do FTPS for the outbound jobs (pull the file from the UNIX server to the mainframe host).
Rather than a JCL, just do it in a shell script. Here is a good site on using such commands:
https://blog.eduonix.com/shell-scripting/how-to-automate-ftp-transfers-in-linux-shell-scripting/
Once you have that working in the shell script in USS, you should be able to call the shell script from a JCL so you can execute it on a scheduled batch job if you need it.
Kenny's suggestion is fairly reasonable. IBM's documentation on how to write JCL for FTP(S)-related tasks is available in their "z/OS Communications Server: IP User's Guide and Commands" publication, IBM Publication No. SC27-3662. The current revision appears to be SC27-3662-30, but later revisions are possible. You can easily find this publication online, and make sure you don't skip the section beginning with the title "Submitting FTP requests in batch." Make sure you set the security options correctly (of course).
Please note that you're asking about FTPS, i.e. TLS encryption applied to either or both (preferably both) of the FTP channels (control and data). SFTP is another file transfer protocol based on SSH that z/OS also supports.
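To make the "both channels" point concrete, here is a minimal sketch of an FTPS pull using Python's standard ftplib, assuming a Python interpreter is available wherever the script runs. The host, credentials and file names in the usage comment are placeholders, not anything from the question. FTP_TLS secures the control channel at login, and prot_p() then switches the data channel to TLS as well.

```python
from ftplib import FTP_TLS
from pathlib import Path


def pull_file(ftps, remote_name, local_path):
    """Download remote_name over an already logged-in FTP_TLS session.

    Returns the number of bytes written to local_path.
    """
    # The control channel is encrypted by login(); this encrypts the data channel too.
    ftps.prot_p()
    with open(local_path, "wb") as out:
        ftps.retrbinary(f"RETR {remote_name}", out.write)
    return Path(local_path).stat().st_size


# Usage (hypothetical host, credentials and paths):
#
#   with FTP_TLS("unix.example.com") as ftps:
#       ftps.login("user", "password")
#       print(pull_file(ftps, "outbound/report.dat", "report.dat"))
```

A script like this could be wrapped in a scheduled batch job in the same way as the shell script suggested above.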
Another possible approach that you'll fairly often find available on z/OS installations is to use IBM MQ Advanced for z/OS's Managed File Transfer (MFT) feature to retrieve the file(s) using FTPS. As the name suggests, this'll be managed and have at least some error handling capabilities.
Yet another possible approach if you prefer HTTPS protocol is to use the z/OS Client Web Enablement Toolkit's HTTPS protocol enabler to fetch the file. That's a built-in, standard feature in all currently supported z/OS releases, and you can use it from a relatively simple REXX script for example. Details are available here (z/OS 2.3 variant of the documentation):
https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.3.0/com.ibm.zos.v2r3.ieac100/ieac1-cwe-http.htm

Assets Management in a clustered environment

I have a content management system running on a web server, that among others allows the user to upload assets like images, files, etc to the server.
The problem I have is that there will be 2 servers running behind a load balancer and I am trying to find an efficient way to handle the assets management.
The questions I have are:
Will the assets be uploaded to one server every time? Or is there a chance that the images/files will end up on server1 or server2 depending on the load?
How do I serve the images if I don't know which server they end up on? Will I have to keep the directories of these assets (images/files) synchronized between the two servers?
Thanks,
Synchronization is a tough problem to crack. You can do ad-hoc synchronization using CouchDB but that requires good knowledge of the low-level issues. Therefore you need to choose a write master.
DRBD
You could look at DRBD :D Use one server as the write-master and the other as the slave. Then you can serve content from both. This approach is amazing for database pairs.
Note: separating your code and URLs for write-master and serve-only will be annoying
CouchDB
You could use CouchDB but I think that might be overkill. It is meant for LARGE amounts of data and high levels of fault tolerance.
NFS
You could export the asset directory on the write-master as an NFS share and mount it on the other computer. But in this case it wouldn't be load-balanced in all cases -- i.e., only if the files are cached by the slave. You could use a third computer as an NFS server -- this would allow you to scale to more web servers.
A central NFS server might just be your best solution, as you can do without a write-master because every front-end server can perform writes. This is the approach I would use unless I am thinking of going past the petabyte range :P

Efficient reliable incremental HTTP multi-file (or whole directory) upload software

Imagine you have a web site that you want to send a lot of data. Say 40 files totaling the equivalent of 2 hours of upload bandwidth. You expect to have 3 connection losses along the way (think: mobile data connection, WLAN vs. microwave). You can't be bothered to retry again and again. This should be automated. Interruptions should not cause more data loss than necessary. Retrying a complete file is a waste of time and bandwidth.
So here is the question: Is there a software package or framework that
synchronizes a local directory (contents) to the server via HTTP,
is multi-platform (Win XP/Vista/7, MacOS X, Linux),
can be delivered as one self-contained executable,
recovers partially uploaded files after interrupted network connections or client restarts,
can be generated on a server to include authentication tokens and upload target,
can be made super simple to use
or what would be a good way to build one?
Options I have found until now:
Neat packaging of rsync. This requires an rsync (server) instance on the server side that is aware of a privilege system.
A custom Flash program. As I understand, Flash 10 is able to read a local file as a bytearray (indicated here) and is obviously able to speak HTTP to the originating server. Seen in question 1344131 ("Upload image thumbnail to server, without uploading whole image").
A custom native application for each platform.
Thanks for any hints!
Related work:
HTML5 will allow multiple files to be uploaded or at least selected for upload "at once". See here for example. This is agnostic to the local files and does not feature recovery of a failed upload.
Efficient way to implement a client multiple file upload service basically asks for SWFUpload or YUIUpload (Flash-based multi-file uploaders, otherwise "stupid")
A comment in question 997253 suggests JUpload - I think using a Java applet will at least require the user to grant additional rights so it can access local files
GearsUploader seems great but requires Google Gears - that is going away soon
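For the "build one" route, the recovery requirement can be reduced to a small loop: ask the server how much of the file it already holds, continue from there, and repeat after every dropped connection. The sketch below assumes a hypothetical server API with two operations, get_offset() and send_chunk(offset, data); neither name comes from any of the tools listed above, and the real HTTP calls behind them are left out.

```python
import io


def resume_upload(source, get_offset, send_chunk, chunk_size=64 * 1024, max_retries=5):
    """Upload a seekable binary file object, resuming after connection losses.

    get_offset()             -- hypothetical: bytes the server already stored
    send_chunk(offset, data) -- hypothetical: append data at offset; raises
                                ConnectionError when the network drops
    Returns the total number of bytes stored on the server.
    """
    retries = 0
    while True:
        offset = get_offset()      # where the server wants us to continue
        source.seek(offset)
        try:
            while True:
                data = source.read(chunk_size)
                if not data:
                    return offset  # everything has been sent
                send_chunk(offset, data)
                offset += len(data)
                retries = 0        # progress was made, reset the retry budget
        except ConnectionError:
            retries += 1
            if retries > max_retries:
                raise              # give up after repeated failures without progress
```

At most one chunk is re-sent after an interruption, so the chunk size is the knob that trades request overhead against wasted bandwidth. Because the offset comes from the server, the same loop also covers client restarts.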

What's the best solution for file storage for a load-balanced ASP.NET app?

We have an ASP.NET file delivery app (internal users upload, external users download) and I'm wondering what the best approach is for distributing files so we don't have a single point of failure by only storing the app's files on one server. We distribute the app's load across multiple front end web servers, meaning for file storage we can't simply store a file locally on the web server.
Our current setup has us pointing at a share on a primary database/file server. Throughout the day we robocopy the contents of the share on the primary server over to the failover. This scenario ensures we have a secondary machine with fairly current data on it, but we want to get to the point where we can fail over from the primary to the failover and back again without data loss or errors in the front end app. Right now it's a fairly manual process.
Possible solutions include:
Robocopy. Simple, but it doesn't easily allow you to fail over and back again without multiple jobs running all the time (copying data back and forth)
Store the file in a BLOB in SQL Server 2005. I think this could be a performance issue, especially with large files.
Use the FILESTREAM type in SQL Server 2008. We mirror our database so this would seem to be promising. Anyone have any experience with this?
Microsoft's Distributed File System. Seems like overkill from what I've read since we only have 2 servers to manage.
So how do you normally solve this problem and what is the best solution?
Consider a cloud solution like AWS S3. It's pay for what you use, scalable and has high availability.
You need a SAN with RAID. They build these machines for uptime.
This is really an IT question...
When there are a variety of different application types sharing information via the medium of a central database, storing file content directly into the database would generally be a good idea. But it seems you only have one type in your system design - a web application. If it is just the web servers that ever need to access the files, and no other application interfacing with the database, storage in the file system rather than the database is still a better approach in general. Of course it really depends on the intricate requirements of your system.
If you do not perceive DFS as a viable approach, you may wish to consider Failover clustering of your file server tier, whereby your files are stored in an external shared storage (not an expensive SAN, which I believe is overkill for your case since DFS is already out of your reach) connected between Active and Passive file servers. If the active file server goes down, the passive may take over and continue read/writes to the shared storage. Windows 2008 clustering disk driver has been improved over Windows 2003 for this scenario (as per article), which indicates the requirement of a storage solution supporting SCSI-3 (PR) commands.
I agree with Omar Al Zabir on high availability web sites:
Do: Use Storage Area Network (SAN)
Why: Performance, scalability, reliability and extensibility. SAN is the ultimate storage solution. SAN is a giant box running hundreds of disks inside it. It has many disk controllers, many data channels, many cache memories. You have ultimate flexibility on RAID configuration, adding as many disks you like in a RAID, sharing disks in multiple RAID configurations and so on. SAN has faster disk controllers, more parallel processing power and more disk cache memory than regular controllers that you put inside a server. So, you get better disk throughput when you use SAN over local disks. You can increase and decrease volumes on-the-fly, while your app is running and using the volume. SAN can automatically mirror disks and upon disk failure, it automatically brings up the mirrors disks and reconfigures the RAID.
Full article is at CodeProject.
Because I don't personally have the budget for a SAN right now, I rely on option 1 (ROBOCOPY) from your post. But the files that I'm saving are not unique and can be recreated automatically if they die for some reason, so absolute fault-tolerance is not necessary in my case.
I suppose it depends on the type of download volume that you would be seeing. I am storing files in a SQL Server 2005 Image column with great success. We don't see heavy demand for these files, so performance is really not that big of an issue in our particular situation.
One of the benefits of storing the files in the database is that it makes disaster recovery a breeze. It also becomes much easier to manage file permissions as we can manage that on the database.
Windows Server has a File Replication Service that I would not recommend. We have used that for some time and it has caused a lot of headaches.
DFS is probably the easiest solution to set up, although depending on the reliability of your network it can become un-synchronized at times, which requires you to break the link and re-sync, which is quite painful to be honest.
Given the above, I would be inclined to use a SQL Server storage solution, as this reduces the complexity of your system rather than increasing it.
Do some tests to see if performance will be an issue first.
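A test like that can be very small. The sketch below shows the shape of one, using Python's built-in sqlite3 purely as a stand-in so that it runs anywhere; it is not SQL Server, and the numbers it prints say nothing about SQL Server 2005 or FILESTREAM. The table and column names are made up for the example. To get meaningful figures, the same round trip would have to be pointed at the real database with realistic file sizes.

```python
import sqlite3
import time


def blob_round_trip(conn, name, payload):
    """Store payload as a BLOB, read it back, and return (data, seconds)."""
    start = time.perf_counter()
    conn.execute("INSERT INTO files (name, content) VALUES (?, ?)", (name, payload))
    conn.commit()
    row = conn.execute("SELECT content FROM files WHERE name = ?", (name,)).fetchone()
    return bytes(row[0]), time.perf_counter() - start


# Stand-in database and hypothetical schema; swap in the real server to measure it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT PRIMARY KEY, content BLOB)")
for size in (1_000, 1_000_000, 10_000_000):
    data, seconds = blob_round_trip(conn, f"file-{size}", b"x" * size)
    print(f"{size:>10} bytes: {seconds:.4f} s")
```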

Managing authorized_keys on a large number of hosts

What is the easiest way to manage the authorized_keys file for OpenSSH across a large number of hosts? If I need to add or revoke a key for an account on, say, 10 hosts, I must log in and add the public key manually, or through a clumsy shell script, which is time consuming.
Ideally there would be a central database linking keys to accounts#machines with some sort of grouping support (i.e., add this key to username X on all servers in the web category). There's a fork of SSH with LDAP support, but I'd rather use the mainline SSH packages.
I'd check out the Monkeysphere project. It uses OpenPGP's web of trust concepts to manage ssh's authorized_keys and known_hosts files, without requiring changes to the ssh client or server.
I use Puppet for lots of things, including this.
(using the ssh_authorized_key resource type)
I've always done this by maintaining a "master" tree of the different servers' keys, and using rsync to update the remote machines. This lets you edit things in one location, push the changes out efficiently, and keeps things "up to date" -- everyone edits the master files, no one edits the files on random hosts.
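The "master" tree does not have to be edited by hand either. As a rough sketch, the files in it can be generated from a small central table of keys, host groups and grants, which also gives the grouping the question asks for. The data model below (keys, groups, grants) and all names and key strings in it are made up for illustration.

```python
def authorized_keys_for(host, account, keys, groups, grants):
    """Return the authorized_keys text for one account on one host.

    keys:   {key_name: public key line}
    groups: {group_name: set of host names}
    grants: iterable of (key_name, account, group_name)
    """
    # A set removes duplicates when a key is granted through several groups.
    lines = sorted({
        keys[key_name]
        for key_name, acct, group in grants
        if acct == account and host in groups.get(group, ())
    })
    return "".join(line + "\n" for line in lines)


# Placeholder data: two keys, two host groups, two grants.
keys = {
    "alice": "ssh-ed25519 AAAAEXAMPLE1 alice@laptop",
    "bob": "ssh-ed25519 AAAAEXAMPLE2 bob@laptop",
}
groups = {"web": {"web1", "web2"}, "db": {"db1"}}
grants = [("alice", "deploy", "web"), ("bob", "deploy", "db")]

print(authorized_keys_for("web1", "deploy", keys, groups, grants), end="")
```

Revoking a key then means deleting its grant, regenerating the tree and letting rsync push the result.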
You may want to look at projects which are made for running commands across groups of machines, such as Func at https://fedorahosted.org/func or other server configuration management packages.
Have you considered using clusterssh (or similar) to automate the file transfer? Another option is one of the centralized configuration systems.
/Allan

Resources