Finding duplicate source code - PL/SQL

I'm analyzing some legacy code: about 80,000 lines of old PL/SQL. At first glance there is quite a lot of duplication in the source, which needs to be removed. Instead of running diffs manually and looking at each file, there must be some tool or command-line kung fu out there to detect duplicate lines of source code.
My goal is to make an educated guess about the minimal size of a rewrite and about how much actual knowledge is captured in this program. I wrote a basic static code analyzer to count the control statements (IF, ELSE, FOR, etc.) and functions in each file.
But duplicated code still needs to be removed from my statistics.

Have you looked at Simian - Similarity Analyser? (Just checked and it's no longer free, but it is available for a period of 15 days for evaluation purposes.)
"Simian (Similarity Analyser) identifies duplication in Java, C#, C, C++, COBOL, Ruby, JSP, ASP, HTML, XML, Visual Basic, Groovy source code and even plain text files. In fact, simian can be used on any human readable files such as ini files, deployment descriptors, you name it."
I have used it in practice and it does work well.
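For a rough first estimate without any tool, a line-based check in the spirit of what similarity analysers do can be sketched in a few lines of Python. This is only a sketch under assumptions: the function names, the default window of 6 lines, and the normalization (collapse whitespace, ignore case) are illustrative choices, not a description of how Simian works internally.

```python
import hashlib
from collections import defaultdict


def normalize(line):
    """Collapse whitespace and case so trivially different lines compare equal."""
    return " ".join(line.split()).lower()


def find_duplicate_blocks(files, window=6):
    """Find blocks of `window` consecutive non-blank lines that occur more than once.

    `files` is a dict of {name: source text}.
    Returns {digest: [(name, first_line_number), ...]} for repeated blocks only.
    """
    seen = defaultdict(list)
    for name, text in files.items():
        # Keep (original line number, normalized text) for non-blank lines.
        lines = [(i, normalize(l)) for i, l in enumerate(text.splitlines(), 1)]
        lines = [(i, l) for i, l in lines if l]
        for start in range(len(lines) - window + 1):
            chunk = "\n".join(l for _, l in lines[start:start + window])
            digest = hashlib.sha1(chunk.encode("utf-8")).hexdigest()
            seen[digest].append((name, lines[start][0]))
    return {d: locs for d, locs in seen.items() if len(locs) > 1}
```

Overlapping windows inside one long duplicated region are reported separately, so the raw count overstates the number of distinct clones; merging adjacent hits would be the next step before using the result in any statistics.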

Sonar has duplication detection and claims to support PL/SQL, though I've never used it for that.

You would need to beg/borrow/steal/write a PL/SQL parser and compare the resulting abstract syntax trees. With the size of the code base you have, that might be worthwhile. There would be other uses for the parser once you're done.

How about this:
http://sourceforge.net/projects/sddforeclipse/
It is open source, and is said to be used by commercial software. It is an Eclipse plugin, by the way.

Related

Control over OpenAPI 3.0 package generation for jersey-jaxrs

I'm using openapi-generator for jersey-jaxrs (OpenAPI 3.0). I'd like to control the package where my code is being generated.
I'm setting the api-package, model-package, package-name, and invoker-package options, all to a xxx.yyy.zzz value.
My problem is that most of the code is generated under gen.xxx.yyy.zzz, and it's not discoverable by the part of the code generated under xxx.yyy.zzz. Implicitly, gen is prepended to the package name. I understand this is convenient in many cases, but not mine. Is there any generator option to avoid this?
I've learned a bit about the Mustache templates and they seem like a possible solution, but maybe a bit too much for my requirements.
Ultimately, I can move the code in gen to the other (non-gen) package manually, and it works, but this is quite inconvenient.
Finally, I found out that you can mark folders in IntelliJ IDEA as "generated sources root", which makes it discoverable to the rest of the project's code.
This doesn't solve my question, but it does solve the problem that originated the question.

Ada dependency graph

I need to create a dependency graph for a software suite that I am working on. In the past the company I work for has always done this manually, but I am guessing that there is a tool somewhere that will do what we need.
The software I am working with is Ada95, and has about 200 code modules/files, with about 40 packages. I need to create a map that will trace every output, individually, back to each input or constant that will have an impact on the output. Does anybody know of a tool that would accomplish this? Or even just partially accomplish it?
AdaCore's GPS (available from http://libre.adacore.com) comes with a command line tool named gnatinspect. You can use this tool to load all cross-reference information generated by the compiler (assuming you are compiling with GNAT). This creates a sqlite database (gnatinspect.db) which contains all information you need. gnatinspect itself provides a number of pre-made queries that might get you at least partially to where you want to go.
You could also look at ASIS as a way to run these kinds of queries directly on the code. I am told it is not so easy to use the first time around, though.
There is also an older tool provided with GNAT (gnatxref) which does something similar, although it is being superseded by gnatinspect.
Finally, you could look at gnat2xml as an alternative to ASIS if you are more comfortable parsing XML files.
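Since gnatinspect.db is an ordinary sqlite file, it can be explored from a script before committing to any particular query. The sketch below makes no assumption about the schema gnatinspect uses: it only lists whatever tables and columns are present, using SQLite's own catalogue. The function names are illustrative.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in a SQLite database file."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        con.close()
    return [name for (name,) in rows]


def table_columns(db_path, table):
    """Return the column names of one table, using PRAGMA table_info."""
    con = sqlite3.connect(db_path)
    try:
        # PRAGMA arguments cannot be bound as parameters, so quote the identifier.
        quoted = '"' + table.replace('"', '""') + '"'
        rows = con.execute(f"PRAGMA table_info({quoted})").fetchall()
    finally:
        con.close()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in rows]
```

Calling list_tables("gnatinspect.db") and then table_columns(...) on each result gives an overview of what cross-reference data is stored, which is a starting point for writing dependency queries by hand.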

Is it possible to include the os library in lua 4.0?

I'm stuck using version 4.0 of Lua, which does not seem to support the os library. Is there a way to include this library in my project?
Or is there another way to get the date and time calculation functionality it contains?
Preferably by using a *.lua file and not a *.c file since I don't have complete access to the code.
When I run the following line,
print(os.time{year=1970, month=1, day=1, hour=0})
I get an error stating:
attempt to index global 'os'(a nil value)
As @Colonel Thirty Two said, it's not possible to use the os library, so the time() function is not available to me.
Adding to the (totally correct) currently accepted answer (that if "os" access was not allowed to you, you're generally done), there's some very slight chance the Original Programmer may have provided you with some alternative facilities to do your thing (fingers crossed). In a perfect world, those would be described in some kind of a User's Manual for your scripting environment. But if the manual was lost to time (or never existed in the first place), you might possibly try your luck at exploring any preloaded libraries by digging through the result of the globals() Basic Function. (At least I hope that's how it was done in 4.0 too.) That is, if the Original Programmer didn't block globals() for you too...

How to obfuscate lua code?

I can't find anything on Google for some tool that encrypts/obfuscates my lua files, so I decided to ask here. Maybe some professional knows how to do it? (For free).
I have made a simple game in Lua and I don't want people to see the code, otherwise they can easily cheat. How can I turn the whole text inside the .lua file into just random letters and stuff?
I used to program in C# and I had a .NET obfuscator called SmartAssembly which works pretty well. When someone tried to check the code of my applications, it would just be a bunch of letters and numbers together with Chinese characters and stuff.
Does anyone know of a program that can do this for Lua as well? Just load the file to encrypt, click Encrypt or something, and bam! It works!?
For example this:
print('Hello world!')
would turn into something like
sdf9sd###&/sdfsdd9fd0f0fsf/&
Just precompile your files (chunks) and load binary chunks. luac allows you to strip debugging info. If that is not enough, define your own transformations on the compiled Lua, stripping names where possible. There's not really so much demand for Lua obfuscators though...
Also, you lose one of the main advantages of using an embedded scripting language: extensibility.
The simplest obfuscation option is to compile your Lua code as others suggested, however it has two major issues: (1) the strings are still likely to be easily visible in your compiled code, and (2) the compiled code for Lua interpreter is not portable, so if you target different architectures, you need to have different compiled chunks for them.
The first issue can be addressed by using a pre-processor that (for example) converts your strings to a sequence of numbers and then concatenates them back at run-time.
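As an illustration of such a pre-processor, here is a minimal sketch in Python that rewrites simple quoted literals in Lua source into string.char(...) calls, which rebuild the string at run time. It is a sketch under assumptions, not a complete tool: the function names are made up, and it deliberately ignores long strings ([[...]]), escape sequences, quotes inside comments, and calls written without parentheses such as print 'x'.

```python
import re

# Simple single- or double-quoted literals with no backslashes or newlines.
_SIMPLE_STRING = re.compile(r"'([^'\\\n]*)'" + r'|"([^"\\\n]*)"')


def encode_literal(text):
    """Return a Lua expression that rebuilds `text` from byte values at run time."""
    codes = ",".join(str(b) for b in text.encode("utf-8"))
    return "string.char(" + codes + ")"


def obfuscate_strings(lua_source):
    """Replace every simple quoted literal in `lua_source` with string.char(...)."""
    def repl(match):
        # Group 1 is the single-quoted body, group 2 the double-quoted one.
        body = match.group(1) if match.group(1) is not None else match.group(2)
        return encode_literal(body)
    return _SIMPLE_STRING.sub(repl, lua_source)
```

Run over the source before compiling it, this keeps readable strings out of the binary chunk, at the cost of some run-time work and larger bytecode. The byte values themselves are still in the chunk, so it only deters casual inspection.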
The second issue is not easily addressed without changes to the interpreter, but if you have a choice of interpreters, then LuaJIT generates portable bytecode that will run across all its platforms (running the same version of LuaJIT); note that LuaJIT bytecode is different from Lua bytecode, so it can't be run by a Lua interpreter.
A more complex option would be to encrypt the code (possibly before compiling it), but you need to weigh any additional mechanisms (and work on your part) against any possible inconvenience for your users and any loss you have from someone cracking the protection. I'd personally use something sufficiently simple to deter the majority of curious users, as you likely stand no chance against a dedicated hacker anyway.
You could use loadstring to get a chunk then string.dump and then apply some transformations like cycling the bytes, swapping segments, etc. Transformations must be reversible. Then save to a file.
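To make "reversible transformations" concrete, here is a small sketch of two of them (cycling the bytes and swapping segments) together with their inverses. It is written in Python purely for illustration and operates on the bytes that string.dump would have produced; the function names and the key value are assumptions. In practice the decoding side would have to live wherever the chunk is loaded.

```python
def rotate_bytes(data, key):
    """Add `key` to every byte, modulo 256. Use -key to undo."""
    return bytes((b + key) % 256 for b in data)


def swap_halves(data):
    """Move the first half of the data behind the second half."""
    mid = len(data) // 2
    return data[mid:] + data[:mid]


def unswap_halves(data):
    """Inverse of swap_halves, also correct for odd lengths."""
    k = len(data) - len(data) // 2  # length of the part that was moved to the front
    return data[k:] + data[:k]


def encode(chunk, key):
    """Apply both transformations to a dumped chunk."""
    return swap_halves(rotate_bytes(chunk, key))


def decode(blob, key):
    """Undo encode(): the steps are reversed in the opposite order."""
    return rotate_bytes(unswap_halves(blob), -key)
```

The only property that matters is that decode(encode(chunk, key), key) returns the original bytes. This is obfuscation rather than encryption: anyone who finds the decoding routine can reverse it.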
Note that anyone with access to your "encryptor" Lua module will know how to decrypt your file. If you write the encryption module in C/C++, anyone with access to its source will too, and anyone with access to its binary could simply require the module and unobfuscate the code. With an interpreted language this is quite difficult to do: you can raise the bar a bit via the above techniques, but raising it far enough to require a significant amount of work (the only real deterrent) is very difficult, AFAIK.
If you embed the Lua interpreter, then you can do this from C, which makes it significantly harder (assuming a Release build with all symbols stripped): a person would have to be comfortable stepping through assembly. But it only takes one capable person to crack the algorithm, and then they can make the code available to others.
You still interested in doing this? :)
I thought I'd add some example code, since the answers here were helpful but didn't get us all the way there. We wanted to save some Lua table information, and just not make it super easy for someone to inject their own code. Serialize your table, then use load(str) to turn it into a loadable Lua chunk, and save it with string.dump. With the 'true' parameter, debug information is stripped, and there's really not much there. Yes, you can see string keys, but it's much better than just saving the naked serialized Lua table.
function tftp.SaveToMSI( tbl, msiPath )
    assert(type(tbl) == "table")
    assert(type(msiPath) == "string")
    local localName = _GetFileNameFromPath( msiPath )
    local file,err = io.open(localName, "wb")
    assert(file, err)
    -- convert the table into a string
    local str = serializer.Serialize( tbl )
    -- create a lua chunk from the string. this allows some amount of
    -- obfuscation, because it looks like gobblygook in a text editor
    local chunk = string.dump(load(str), true)
    file:write(chunk)
    file:close()
    -- send from /usr to the MSI folder
    local sendResult = tftp.SendFile( localName, msiPath )
    -- remove from the /usr folder
    os.remove(localName)
    return sendResult
end
The output from one small table looks like this in Notepad++ :
LuaS У
Vx#w( # АKА└АJБ┴ JА #
& А &  name
Coulombmetervalue?С╘ ажў

Closure: --namespace Foo does not include Foo.Bar, and related issues

I have a rather big library with a significant set of APIs that I need to expose. In fact, I'd like to expose the whole thing. There is a lot of namespacing going on, like:
FooLibrary.Bar
FooLibrary.Qux.Rumps
FooLibrary.Qux.Scrooge
..
Basically, what I would like to do is make sure that the user can access that whole namespace. I have had a whole bunch of trouble with this, and I'm totally new to closure, so I thought I'd ask for some input.
First, I need closurebuilder.py to send the full list of files to the Closure compiler. This doesn't seem to be supported: --namespace Foo does not include Foo.Bar, and --input only allows a single file, not a directory. Nor can I simply send my list of files to the Closure compiler directly, because my code also requires things like "goog.asserts", so I do need the resolver.
In fact, the only solution I can see is having a FooLibrary.ExposeAPI JS file that #require's everything. Surely that can't be right?
This is my main issue.
However, later the Closure compiler, with ADVANCED_OPTIMIZATIONS on, will optimize all these names away. Now I can fix that by adding "@export" all over the place, which I am not happy about, but it should work. I suppose it would also be valid to use an extern here. Or I could simply disable advanced optimizations.
What I can't do, apparently, is say "export FooLibrary.*". Wouldn't that make sense?
Finally, for working in source mode, I need to do goog.require() for every namespace I am using. This is merely an inconvenience, though I mention it because it is sort of related to my trouble above. I would prefer to be able to do:
goog.requireRecursively('FooLibrary')
in order to pull all the child namespaces as well; thus, recreating with a single command the environment that I have when I am using the compiled version of my library.
I feel like I am possibly misunderstanding some things, or how Closure is supposed to be used. I'd be interested in looking at other Closure-based libraries to see how they solve this.
You are discovering that Closure-compiler is built more for the end consumer and not as much for the library author.
If you are exporting basically everything, then you would be better off with SIMPLE_OPTIMIZATIONS. I would still highly encourage you to maintain compatibility of your library with ADVANCED_OPTIMIZATIONS so that users can compile the library source with their project.
First, I need closurebuilder.py to send the full list of files to the closure compiler. ...
In fact, the only solution I can see is having a FooLibrary.ExposeAPI JS file that #require's everything. Surely that can't be right?
You would need to specify a --root for your source folder and specify the namespaces of the leaf nodes of your file dependency tree. You may have better luck with the now deprecated CalcDeps.py script. I still use it for some projects.
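If listing the namespaces by hand gets tedious, the list can be generated. The following Python sketch scans source text for goog.provide('...') calls and turns every namespace under a given prefix into a --namespace argument. It is only a sketch: the function names are made up, it passes all matching namespaces rather than only the leaf nodes (which is redundant but simpler), and it assumes closurebuilder.py accepts --namespace more than once.

```python
import re

# Matches goog.provide('Some.Namespace') with either quote style.
_PROVIDE = re.compile(r"""goog\.provide\(\s*['"]([^'"]+)['"]\s*\)""")


def provided_namespaces(source_text):
    """Return all namespaces declared with goog.provide in one file's text."""
    return _PROVIDE.findall(source_text)


def namespace_args(sources, prefix):
    """Build --namespace arguments for every provided namespace under `prefix`.

    `sources` is a dict of {file name: file contents}.
    """
    found = set()
    for text in sources.values():
        for ns in provided_namespaces(text):
            if ns == prefix or ns.startswith(prefix + "."):
                found.add(ns)
    args = []
    for ns in sorted(found):
        args.extend(["--namespace", ns])
    return args
```

The resulting list can be appended to the closurebuilder.py command line next to the --root option.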
What I can't do, apparently, is say "export FooLibrary.*". Wouldn't that make sense?
You can't do that because it only makes sense based on the final usage. You as the library writer wish to export everything, but perhaps a consumer of your library wishes to include the source (uncompiled) version and have more dead code elimination. Library authors are stuck in a kind of middle ground between SIMPLE and ADVANCED optimization levels.
What I have done for this case is maintain a separate exports file for my namespace that exports everything. When compiling a standalone version of my library for distribution, the exports file is included in the compilation. However I can still include the library source (without the exports) into a project and get full dead code elimination. The work/payoff balance of this though must be weighed against just using SIMPLE_OPTIMIZATIONS for the standalone library.
My GeolocationMarker library has an example of this strategy.
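A separate exports file of this kind can also be generated rather than maintained by hand. The Python sketch below emits one goog.exportSymbol call per public name. The function name and the header comment are illustrative, the names in the usage note are the placeholder namespaces from the question, and the exports file in the library mentioned above may well look different.

```python
def build_exports(names):
    """Return JavaScript source that exports each dotted name via goog.exportSymbol."""
    lines = ["// Generated exports file: include only in the standalone build."]
    # De-duplicate and sort so the output is stable between runs.
    for name in sorted(set(names)):
        lines.append("goog.exportSymbol('%s', %s);" % (name, name))
    return "\n".join(lines) + "\n"
```

For example, build_exports(["FooLibrary.Bar", "FooLibrary.Qux.Rumps"]) yields a file that is passed to the compiler only when building the standalone distribution; projects that compile the library source themselves leave it out and keep full dead code elimination.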

Resources