Our company is using some software that ONLY accepts input from an "Imaging Device" i.e. a TWAIN device (e.g. scanner).
The problem is that we are receiving our files digitally, so using an actual scanner would require us to print, scan, and shred documents that we already have on the computer, but not in the software.
I was curious if anybody has any idea of how we might be able to work around this problem in the meantime. My first thought was to find some way to trick the program into thinking we're using a scanner, via some new 'imaging device' that would just read in the file, and spit it out to the software, but I don't even know where to begin with that.
We put in a feature request, seeing as how this problem should obviously be addressed in the software itself, but the company is notorious for lagging pretty hard when it comes to updates.
The system used by scanners is called TWAIN, so you'd be looking for some sort of virtual TWAIN driver.
A quick Google search will produce several hits; I don't have any experience with the software myself, so I can't advise any further.
Two such providers I found via Experts Exchange:
http://www.twaintools.de
http://www.scanpoint-usa.com
OK, months late... but in case you are interested, I have a TWAIN driver framework/toolkit that might let you build this fairly easily, depending on just what your scanning app expects, and how hard it is to read images from your digital documents. It's a Microsoft Visual C++ project. No charge but you'd need our permission to redistribute a driver based on it: GenDS
The TWAIN Working Group also has a sample/skeleton driver; I think it's straight C. It used to have some rather bad bugs (which is why I wrote mine ;-) but it might have gotten better.
Look for the "sample data source and application" on their download page.
And of course I have a 'commercial' version of GenDS that I use to write TWAIN drivers on contract.
Related
I have created an R script that:
Reads some data from a database,
Makes some transformations, and
Exports the modified table to a CSV file.
This code needs to run on a client's machine, but we need to "hide" the actual code from the user.
Are there any useful suggestions on how we can achieve that?
Up front
... it will be nearly impossible to deploy an R <something> to another computer in a way that prevents curious users from accessing the source code.
From a mailing list conversation in 2011, in response to "I would not like anyone to be able to read the code.",
R is an open source project, so providing ways for you to do this is not
one of our goals.
Duncan Murdoch https://stat.ethz.ch/pipermail/r-help/2011-July/282755.html
(Prof Murdoch was on the R Core Team and R Foundation for many years.)
Background
Several (many?) programming languages provide the ability to compile a script or program into an executable, the .exe you reference. For example, Python has tools like py2exe and PyInstaller. The tools range from merely packing the script into a zip-ball (perhaps obfuscating it) to actually creating an exe with the script tightly embedded. (This part could use some more citations/research.)
This is usually good enough for many people: it keeps the honest out. I say it that way because all you need to do is google phrases like decompile py2exe and you'll find tools, howtos, tutorials, etc., whose intent might honestly be to help somebody recover lost code. Regardless of the intentions, such packaging will only slow curious users down.
Unfortunately, there are no tools that do this easily for R.
There are tools with the intent of making it easy for non-R-users to use R-based tools. For instance, RInno and DesktopDeployR are two tools intended to create Windows (not Mac/Linux) installers that support R or R/Shiny tools. But the intent of tools like this is to facilitate the IT tasks involved with getting a user/client to install and maintain R on their computer, not to protect the code that it runs.
Constrain R.exe?
There have been questions (elsewhere?) asking whether one can modify the R interpreter itself so that it does not do everything it is intended to do. For instance, one could redefine base::print in such a way that functions' contents cannot be dumped, make debug not show the code it's about to execute, and perhaps take several other protective steps.
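To make the idea concrete, here is a minimal sketch of that kind of masking, done with plain S3 shadowing rather than an actual rebuild of the interpreter; the function name is invented for illustration.

```r
## Hypothetical sketch: shadow how functions are displayed so a casual user
## no longer sees their bodies. This is NOT a rebuild of R, just masking.
print.function <- function(x, ...) {
  cat("<function body withheld>\n")
  invisible(x)
}

secret_score <- function(x) x * 42   # stand-in for proprietary code
print(secret_score)                  # now prints "<function body withheld>"
```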
There are a few problems with this approach:
There is always another way to get at a function's contents. Even if you stop print.default and the debugger from doing this, there are other ways to get to the functions (body(.), for one); a short demonstration follows this list. How many of these rabbit holes do you feel you can accurately traverse and close them all, with no adverse effect on normal R code?
Even if you feel you can get to them all, are you encrypting the source .R files that contain your proprietary content? Okay, encrypting is good, except you need to decrypt the contents somehow. Many tools that have encrypted contents do so to thwart reverse-engineering, so they also embed (obfuscatedly, of course) the decryption key in the application itself. Just give it time, somebody will find and extract it.
You might think that you can download the key on start-up (not stored within the app), so that the code is decrypted in real-time. Sorry, network sniffers will get the key. Even if you retrieve it over https://, tools such as https://mitmproxy.org/ will render this step much less effective.
Let's say you have recompiled R to mask print and such, have a way to distribute source code encrypted, and are able to decrypt it in a way that does not easily reveal the key (for full decryption of the source code files). While it takes a dedicated user to wade through everything above to get to the source code, none of those steps may even be necessary: they may legally compel you to release your changes to the R interpreter itself (the ones you put in place to prevent printing function contents). This doesn't reveal your source code, but it will reveal many of your methods, which might be sufficient. (Or there's just the risk of legal costs.)
R is GPL, and that means that anything that links to it is also "tainted" with the GPL. This means that anything compiled with Rcpp, for instance, will also be constrained/liberated (your choice) by the GPL. This includes thoughts of using RInside: it is also GPL (>= 2).
To do it without touching the GPL, you'd need to write your own interpreter (likely from scratch) without code from the R project.
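As promised above, a short demonstration of why masking print() is not enough; it assumes the hypothetical secret_score() from the earlier sketch.

```r
## Even with print() masked, the definition is still trivially recoverable:
body(secret_score)                      # the expression: x * 42
formals(secret_score)                   # the argument list
cat(deparse(secret_score), sep = "\n")  # the complete source, reconstructed
```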
Alternatives
Ultimately, if you want to release R-based utilities/apps/functionality to clients, the only sure-fire way to allow them to use your code without seeing it is to ... control the computers on which R will run (and source code will reside). I'll add more links supporting this claim as I find them, but a small start:
https://stat.ethz.ch/pipermail/r-help/2011-July/282717.html
https://www.researchgate.net/post/How_to_make_invisible_the_R_code
Options include anything that keeps the R code and R interpreter completely under your control. Simple examples:
Shiny apps, self-hosted (or on shinyapps.io, if you trust their security); servers include Shiny Server (both free and commercial versions), RStudio Connect (commercial only), and ShinyProxy. (The list is not exhaustive.)
plumber is an API server, not a Shiny server. The intent is single HTTP(S) endpoint calls, possibly authenticated, supporting whatever HTTP supports (POST, GET, etc.); a minimal sketch follows this list. This can be served in various ways; see its hosting page for options.
Rserve. I know less about this, but from what I've experienced with it, I've not had as much luck integrating with enterprise systems (where, e.g., authentication and fine-control over authorization is important). This does allow near-raw access to R, so it might not be what you want (especially when the intent is to give to clients who may not be strong R users themselves).
OpenCPU should be discussed, but not as a viable candidate for "protect your code". It is very similar to rplumber in that it provides HTTP endpoints, but it supports endpoints for every exported function in every package installed in its R library. This includes the base package, so it is not at all difficult to get the source code of any function that you could get on the R console. I believe this is a design feature, even if it is perfectly at odds with your intent to protect your code.
Anything that can call R or Rscript. This might be PHP or mod_python or similar. Any web-page-serving language that can exec("/usr/bin/Rscript", ...) can take its output and turn it around to the calling agent; a sketch of such a wrapper script also follows this list. (It might also be possible, for example, for a PHP front end to call an OpenCPU endpoint that only permits connections from the PHP-serving host.)
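For example, here is roughly what a plumber-based setup could look like. It is only a sketch: the endpoint, file name, and "scoring" logic are invented for illustration, and in practice you would put authentication and TLS in front of it.

```r
## api.R -- hypothetical plumber API; only results ever leave the server
#* Score the supplied value
#* @param x a number
#* @get /score
function(x) {
  as.numeric(x) * 42   # stand-in for the proprietary transformation
}

## On the server (in a separate session/script), something like:
##   plumber::plumb("api.R")$run(host = "0.0.0.0", port = 8000)
## A client then calls http://yourserver:8000/score?x=3 and only sees the result.
```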
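And a sketch of the last option: a small Rscript wrapper that a PHP/Python front end could exec(). File names, arguments, and the "value" column are all hypothetical.

```r
#!/usr/bin/env Rscript
## export_table.R -- hypothetical wrapper; the caller passes file paths and
## never needs to see the R code that runs on the server.
args <- commandArgs(trailingOnly = TRUE)   # e.g. c("input.csv", "output.csv")
stopifnot(length(args) == 2)

dat <- read.csv(args[1])
dat$score <- dat$value * 42                # stand-in for the real transformation
write.csv(dat, args[2], row.names = FALSE)

## Called from the web tier with something like:
##   /usr/bin/Rscript export_table.R input.csv output.csv
```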
I'm trying to find a good combination of libraries for managing real-time communication (client/server) using Haxe (only Haxe, not OpenFL or another framework based on Haxe), targeting Flash (SWF) for the client, with no preference for the server except not to use Neko.
The goal is to make a simple chat and put a visual representation of each client in an area. Each client can move its own representation in this area, and the others see the movement.
I found some libraries to do this:
https://github.com/soywiz/haxe-ws
https://github.com/MattTuttle/hxnet
haxe-js-kit
But I'm not sure which one is the best way to go.
Do you have any suggestions/remarks/tips for choosing the better approach?
Disclaimer: I wrote the library that I am sharing here.
My somewhat new library mphx may be able to help you. It can manage 'rooms' of connections, allows client-to-server and server-to-client messaging in the form of events, and, best of all, is cross-platform. It also works on the web with WebSockets.
It was originally an extension of HxNet, but I wanted it to be easier to use. Connecting and sending a 'message' with data takes just a few lines.
I have a few examples in the GitHub repository, the simplest being the 'basic' example. One of your requirements is that it not rely on one of the big frameworks (OpenFL, etc.), and mphx doesn't. The basic example proves that and only runs in a terminal. That said, it can be used with HaxeFlixel; for that, see the other examples.
It sounds like your main goal is simple, graphical multiplayer. For that you can look at the 'movement' HaxeFlixel example.
Documentation is still a little slim, and the code is alpha, so it might change or break. That can probably be said for most of the libraries you listed, though. The best way to install it is like this:
haxelib git mphx https://github.com/5Mixer/mphx.git
That will not install the examples, though. To run them, either download the repository as a zip or just git clone it, and go into the examples folder.
Library: https://github.com/5Mixer/mphx
Old videos I made. A little outdated, most likely.
Video 1: https://www.youtube.com/watch?v=07J0wLXwH0g
Video 2: https://www.youtube.com/watch?v=MUx2CUtsnTU
Very often I get into projects that require transferring file data into tables, and I've almost always worked with ODI (Oracle Data Integrator) only.
I want to know what other ETL tools are available, how they differ from ODI, and what the restrictions are in each case (such as file size limits, column size restrictions, processing time, etc.).
I wish somebody could help.
If somebody can share personal experience on these tools, that would be welcome too. Thanks!
I'm working on the same type of projects as you.
Right now I'm working with IBM DataStage. It seems like a good and powerful tool, but it lacks good documentation and a strong community.
There's also Pentaho; I have no experience with it, but it seems pretty popular, and it's also open source.
I have a bunch of R scripts which I am running on a Windows machine and want to ensure that the code remains unread by those not intended to see it. On a Linux box, I could wrap the R code in a bash script #! and make an encrypted (and perhaps even a limited-life) executable shell script. What are my options to do something on similar lines under Windows?
My answer is a bit late, but I believe this is a good question. Unfortunately, I don't believe that there is a solution, or at least an easy one, at the present time.
The difficulty is common because, for most interpreted languages, including R, it is often possible to turn on logging and inspection of all commands being run. This can negate many tricks to obfuscate the code.
For those who prefer to think of code being open == good, one should know that a common reason to obfuscate the code is if one is consulting with a client that hires multiple vendors. It is not uncommon for a client to take scripts from vendor A and ask vendor B why it doesn't work with their system. (This may be done by a low-level IT flunkie, rather than someone responsible for the NDA contracts.) If A & B are competitors, A's code has just been handed to B. When scripts == serious programs, then serious code has been given away.
The ways I've seen this addressed are:
Make a call to a compiled language, and use the standard protections available there (a sketch follows this list).
Host the executable on a different server, and use calls to the server to execute the calculations. (In R, there are multiple server-side options.)
Use compiled (preprocessed / bytecode) code within the language.
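A minimal sketch of option 1 from the list above, with hypothetical names: the sensitive logic is compiled ahead of time into a shared library, and only the binary is shipped to the client.

```r
## Hypothetical: "secret_logic.so" was built elsewhere from C/C++/Fortran and
## registers a routine called "score_clients"; only the binary is distributed.
dyn.load("secret_logic.so")                  # "secret_logic.dll" on Windows

x <- c(1.5, 2.0, 3.25)
result <- .Call("score_clients", x)          # run the compiled routine on the data

dyn.unload("secret_logic.so")
```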
Option 2 is actually easier and better when the code may be widely distributed, not just for IP reasons. A major advantage is that it lets you upgrade the code without having to go through the pain of a site-wide release process. If new libraries are needed, no problem - update the server.
Option 3 is done in Matlab with .p files, and can be done with py2exe for Python on Windows. In R, the new bytecode compilation may be analogous, but I am not familiar enough with it to address any differences between .Rc files in the R context and .p files in the Matlab context. For more info on the compiler, see: http://www.inside-r.org/r-doc/compiler/compile
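For what it's worth, a sketch of what the R bytecode route looks like with the compiler package (file names are hypothetical). Treat it as obfuscation at best: compiled code can still be disassembled, and any source references kept alongside it may reveal the original code.

```r
## On your machine: compile a whole script to a bytecode file ...
library(compiler)
cmpfile("transform.R", "transform.Rc")   # hypothetical file names

## ... ship only transform.Rc; on the client it is loaded and run with:
loadcmp("transform.Rc")
```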
Hosting computations on the server is great for working with unsophisticated users, because it is easier to iterate quickly in response to bugs or feature requests. The IP protection is simply a benefit.
This is not a specifically R-oriented strategy. (And it's a bit unclear what your constraints or goals really are anyway.) If you want a cross-platform encryption method, you should look into the open-source program TrueCrypt. It supports creating encrypted files that can be mounted as volumes on any machine that supports the volume formatting method. I have tested this across the Mac-PC divide, since the Mac can read FAT volumes, but have no experience with how it might work across the Linux-PC chasm.
(Their TODO list for Windows includes: "Command line options for volume creation (already implemented in Linux and Mac OS X versions)". So I don't see any clear way to use this from within R without you running the program from the OS.)
I don't think this is possible because the R interpreter has to be able to decrypt and read the code in order to execute it which means that whoever is using that interpreter will also be able to decrypt and read the code.
I am by no means an expert, so I reserve the right to be 100% wrong about that statement.
I believe the best solution is to ensure value comes from the expertise and services provided by your company and its employees, not from keeping secrets.
Failing that, you could try separating the code into a client/server model. That way the client just sends data and receives results; they never have access to the code that runs on the server.
However, the scientist in me just said "that solution sucks and I would never trust results provided under such conditions".
I would like to develop a Network Inventory application that works on any operating system.
Reports on every possible resource attached to a network.
Reports all pertinent details of hardware and software.
That's (and I hate to use the phrase) my "end game".
However, I am running before I can crawl here.
I have no experience with this type of development, e.g. discovering a computer's hardware and software settings.
I've spent almost two weeks googling and come up short! :-(
So I am turning to you to ask these questions:
My first step is to find an existing open source project I can incorporate into my own code that extracts the fine-grained details I am after, e.g. EVERYTHING there is to know about the hardware and software on a single machine.
Does such a project exist, or do I have to develop that first?
Do I have to write all this in C?
I am guessing that getting this information about a computer is going to be easier than for printers, scanners, routers, etc., i.e. everything else you would find attached to a network.
Once I have access to a single computer's details, I then need to investigate how I can traverse an entire network of printers, scanners, routers, load balancers, switches, firewalls, workstations, servers, storage devices, laptops, monitors; the list goes on and on.
One problem I have is that I don't have a 1000-machine network to play on!
Is there any such resource available on the internet? (Is that a silly question?)
Anywho, if you don't ask, you won't find out!
One aspect I am really looking forward to is finding out how to traverse the entire network;
should I be using TCP/IP for this?
What's a good site, blog, user group, or book for TCP/IP development?
How do I go about getting through firewalls?
How many questions can I ask in one go? :-)
My previous question on this topic ended up with Python being championed as the language/script to develop this application in.
Having looked at a few Python examples, they all seemed to be related to Windows networks
and interrogating Windows Management Instrumentation (WMI). I had the feeling you can't rely on what's in WMI, and even if you can, that's no good for Unix networks.
Surely common code exists for extracting hardware and software details from a computer? Why can't I find it on the internet?
Please help?
There are no prizes, though :-(
Thanks in advance
I would like to apologise if I have broken forum rules or not tried hard enough on my own before asking for assistance.
I would just like to start moving forward with this, as it's one of the best projects I have been involved with.
I am inspired by the many different challenges involved, and if I manage to produce a useful application at the end of it, it would hopefully be extremely helpful to many people.
That's it.
Thanks in advance
DD
As a software vendor of a discovery solution, I can only say: respect for wanting to start a new one :-). Just in case you are interested in what it could look like: http://www.jdisc.com
Now to some of our experience:
Programming Language:
I wouldn't write it in C. Use Java or .NET. Those languages have great advantages when it comes to tracking down errors or problems. For instance, in Java (and I guess also in .NET), you can see the stack trace when something is failing. For some pieces of code (e.g. WMI access), you might need to use C++ or C (e.g. for access to native Microsoft APIs); use a native interface or a COM bridge from Java. In .NET, it should be even easier to access the Windows APIs.
Devices:
Well, network printers, routers, and switches are actually easier to discover. They usually expose their information via SNMP. SNMP is pretty easy to use and pretty robust. Getting information from Windows (or even Unix) systems is a bit trickier. Protocols can be blocked, misconfigured, messed up... We had cases where WMI was simply hanging when requesting data from a remote device.
Test Devices:
Since we are also a smaller company, we also do not have 1000 different devices to test with. But, there are some things that might help:
a) For SNMP devices, use an SNMP simulator. We use MIMIC 9.0 from Gambit Solutions and we are pretty happy with it. You can import SNMP walks from network devices and simulate the device as if it were in your network.
b) Secondly, use virtualization whenever possible. With VMware, you can install Windows, Linux, or even Solaris. We also use a project called GNS3 to emulate Cisco routers, firewalls, or Juniper routers.
c) The rest of the devices you can only test if you have a customer who helps you with testing and implementing new devices.
These are just some ideas to start with. But I have to tell you that it is not trivial, and it takes a lot of time...
Hope that you got some ideas to start with...
I don't know that it's open source, but we use Spiceworks (http://www.spiceworks.com) here as an IT management platform. You may get some use out of exploring that.