Custom AST in JavaCC? - abstract-syntax-tree

I've recently begun designing my first programming language, and being already fluent in Java, I've started to build the AST API for my language in Java. I plan on compiling to Java bytecode, and that is functional and working in the part of the AST that I've implemented so far. I tried a couple of different methods of parsing (which all failed) before stumbling upon parser generators, in particular JavaCC. I've done some basic research into JavaCC and EBNF, and was wondering if JavaCC could support a fully custom AST API (including constructor arguments and the like) while parsing my language. I wanted to ask this here before doing any deep research and watching/reading tutorials on JavaCC. From what I've seen, JavaCC can support ASTs, but I'm not sure what the constraints are. Moreover, I know JavaCC has its own AST API, but I'd like to stick to the one I've developed already, as it is working and matches my language well.

Absolutely. For example, you can write nonterminals like this:
CommandNode whileCommand() : {
    ExpressionNode e ;
    CommandNode doPart ;
} {
    <WHILE> e = expression() <DO>
    doPart = sequence() <ENDWHILE>
    { return new WhileCommand( e, doPart ) ; }
}
The builder pattern can be useful to isolate the parser from some of the details of the AST.
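To illustrate the builder idea independently of JavaCC, here is a minimal sketch in Python. Everything in it is hypothetical (the `Var` and `While` node classes, the `AstBuilder` methods, and the toy grammar `while <name> do <name> endwhile`); the point is only that the parser calls the builder and never constructs AST nodes directly, so constructor signatures can change without touching the grammar actions.

```python
from dataclasses import dataclass

# Hypothetical AST node classes standing in for a custom AST API.
@dataclass
class Var:
    name: str

@dataclass
class While:
    cond: object
    body: object

class AstBuilder:
    """The only place that knows how AST nodes are constructed."""
    def variable(self, name):
        return Var(name)

    def while_loop(self, cond, body):
        return While(cond, body)

def expect(tokens, word):
    """Consume one token and check that it is the expected keyword."""
    tok = next(tokens)
    if tok != word:
        raise SyntaxError(f"expected {word!r}, got {tok!r}")

def parse_while(words, builder):
    """Parse 'while <name> do <name> endwhile' using only the builder."""
    tokens = iter(words)
    expect(tokens, "while")
    cond = builder.variable(next(tokens))
    expect(tokens, "do")
    body = builder.variable(next(tokens))
    expect(tokens, "endwhile")
    return builder.while_loop(cond, body)

tree = parse_while("while x do y endwhile".split(), AstBuilder())
```

In a JavaCC grammar the same role would be played by a builder object that the semantic actions call instead of writing `new WhileCommand(...)` inline.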

Related

Are Child Packages safe in Ada?

I am very new to the Ada language and, during my learning process, I have been seeing a lot of examples which use this Ada feature.
It seems to me that it could be useful only for unit testing, in order to be able to test the private types and methods of the parent package, but I don't see any advantage in coding this way; it seems to break encapsulation.
Is it good practice to use them apart from unit testing?
Child packages can be seen as an extension for their parent.
It can be used, for example, to provide functionalities that are not directly linked to your base package.
The typical example is the Input-Output package.
Imagine the following package:
package Temperature is
   type Kelvin_Temp is private;
   type Celsius_Temp is private;
   function build (temp : Positive) return Kelvin_Temp;
   function build (temp : Integer) return Celsius_Temp;
   function to_celsius (temp : in Kelvin_Temp) return Celsius_Temp;
   function to_kelvin (temp : in Celsius_Temp) return Kelvin_Temp;
private
   -- Full views of the private types: constrained floating-point types.
   type Kelvin_Temp is new Float range 0.0 .. 10_000.0;
   type Celsius_Temp is new Float range -273.0 .. 10_000.0;
end Temperature;
This package provides basic operations directly linked to the types defined.
What if you want to extend it to provide I/O in text format?
You can decide to put operations inside the Temperature package but if you want to add other types of I/O such as database I/O, you will have a lot of functions that are not directly linked to your types inside the same file.
You can define
package Temperature.Text_IO is
   procedure Put(temp : Celsius_Temp);
   procedure Put(temp : Kelvin_Temp);
end Temperature.Text_IO;
and
package Temperature.Database_IO is
   procedure insert (Temp : in Celsius_Temp);
   procedure insert (Temp : in Kelvin_Temp);
end Temperature.Database_IO;
That's exactly what is done with IO in the standard library.
From an encapsulation point of view, your private types will remain private outside of the package hierarchy so you don't break encapsulation.
You can take a look at this presentation given at FOSDEM 2018, discussing (private) packages and enforcing safety:
https://archive.fosdem.org/2018/schedule/event/ada_safety/
Child packages exist to allow programming by extension, but as implemented they also provide a way to bypass the information hiding that packages are supposed to enforce.
While it can be convenient to design a hierarchy of packages sharing some hidden information, programming by extension is generally a poor idea that emphasizes ease of writing over ease of reading, and coding over software engineering. You might be interested in the article "Breaking the Ada Privacy Act", available here.

Languages supporting complete reflection

Only recently, I discovered that both Java and C# do not support reflection of local variables. For example, you cannot retrieve the names of local variables at runtime.
Although clearly this is an optimisation that makes sense, I'm curious as to whether any current languages support full and complete reflection of all declarations and constructs.
EDIT: I will qualify my "names of local variables" example a bit further.
In C#, you can output the names of parameters to methods using reflection:
foreach(ParameterInfo pi in typeof(AClass).GetMethods()[0].GetParameters())
    Trace.WriteLine(pi.Name);
You don't need to know the names of the parameters (or even of the method) - it's all contained in the reflection information. In a fully-reflective language, you would be able to do:
foreach(LocalVariableInfo lvi in typeof(AClass).GetMethods()[0].GetLocals())
    Trace.WriteLine(lvi.Name);
The applications may be limited (many applications of reflection are), but nevertheless, I would expect a reflection-complete language to support such a construct.
EDIT: Since two people have now effectively said "there's no point in reflecting local variable names", here's a basic example of why it's useful:
void someMethod()
{
    SomeObject x = SomeMethodCall();
    // do lots of stuff with x
    // sometime later...
    if (!x.StateIsValid)
        throw new SomeException(String.Format("{0} is not valid.", nameof(x)));
}
Sure, I could just hardcode "x" in the string, but correct refactoring support makes that a big no-no. nameof(x) or the ability to reflect all names is a nice feature that is currently missing.
Your introductory statement about the names of local variables drew my interest.
This code will actually retrieve the name of the local var inside the lambda expression:
static void Main(string[] args)
{
    int a = 5;
    Expression<Func<int>> expr = (() => a);
    Console.WriteLine(expr.Compile().Invoke());
    Expression ex = expr;
    LambdaExpression lex = ex as LambdaExpression;
    MemberExpression mex = lex.Body as MemberExpression;
    Console.WriteLine(mex.Member.Name);
}
Also have a look at this answer mentioning LocalVariableInfo.
Yes, there are languages where this is (at least kind of) possible. I would say that reflection in both Smalltalk and Python is pretty "complete" for any reasonable definition.
That said, getting the name of a local variable is pretty pointless - by definition to get the name of that variable, you must know its name. I wouldn't consider the lack of an operation to perform that exact task a lacuna in the reflection facility.
Your second example does not "determine the name of a local variable", it retrieves the name of all local variables, which is a different task. The equivalent code in Python would be:
for name in list(locals()): print(name)
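As a slightly fuller sketch of the same idea in Python 3: `locals()` lists the names bound in the current frame, and a function's code object exposes its local variable names from the outside through the standard `__code__.co_varnames` attribute, without calling the function. The function names `some_method` and `local_names` are made up for illustration.

```python
def some_method():
    x = 42
    label = "demo"
    # Names bound in the current frame, as seen from inside the function.
    return sorted(locals())

def local_names(func):
    """Names of a function's parameters and locals, read from its code object."""
    return func.__code__.co_varnames

inside = some_method()               # names visible from within the frame
outside = local_names(some_method)   # names read without running the body
```

The second form is the closer analogue of the hypothetical `GetLocals()` call in the question, since it inspects the declaration rather than a running frame.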
eh, in order to access a local var you have to be within the stackframe/context/whatever where the local var is valid. Since it is only valid at that point in time, does it matter if it is called 't1' or 'myLittlePony'?

How Can I make async operations of my own with WinRT using IAsyncOperation interface?

I am developing a Metro application and I want to create some async operations that my own classes would implement.
I have only found examples that consume async WinRT operations (e.g. CreateFileAsync). I cannot find any instance where someone creates an async method and consumes it.
Now you can do it. Look at this:
http://blogs.msdn.com/b/nativeconcurrency/archive/2011/10/27/try-it-now-use-ppl-to-produce-windows-8-asynchronous-operations.aspx
http://code.msdn.microsoft.com/Windows-8-Asynchronous-08009a0d
WinRT Async Production using C++
Use create_async in C++:
IAsyncOperationWithProgress<IBuffer^, unsigned int>^ RandomAccessStream::ReadAsync(IBuffer^ buffer, unsigned int count, InputStreamOptions options)
{
    if (buffer == nullptr)
        throw ref new InvalidArgumentException;
    auto taskProvider = [=](progress_reporter<unsigned int> progress, cancellation_token token)
    {
        return ReadBytesAsync(buffer, count, token, progress, options);
    };
    return create_async(taskProvider);
}
Use AsyncInfo.Run in .NET:
public IAsyncOperation<IInfo> Async()
{
    return AsyncInfo.Run(_ =>
        Task.Run<AType>(async () =>
        {
            return await DoAsync();
        })
    );
}
I posted the same question in Microsoft forums and they gave me two replies. The first was:
Hi Claudio,
In the Developer Preview there isn't an easy way to create your own
async operations. We are aware of this shortcoming and are trying to
solve it for the next public release. In the meanwhile, you could
design your API as async and we will provide guidance on how to
convert sync to async.
Thanks
Raman Sharma, Visual C++
When I asked for the hard way to do this, someone else, who is responsible for PPL, told me:
We’re planning to do a refresh of the sample pack we released a few
weeks ago and add a few samples on creation of async operations. I
expect that it will happen in a couple of weeks or so. If you keep an
eye on our blog at http://blogs.msdn.com/b/nativeconcurrency, you’ll
be the first to know.
As to how hard it is... The general-purpose solution that we’re
contemplating is about 1000 lines of C++ code making copious use of
template metaprogramming. Most of it will be in the header file so you
can explore it yourself. While a less general solution can be less
complex, you will still need to implement a base class, do the state
management, error handling etc. At this moment I can’t go into more
detail, but I will say that you will love how easy it is to author
async operations with PPL – so hang in there!
Artur Laksberg PPL team
So there was no solution at that time. Thank you all.
Yes, see Ben Kuhn's //BUILD/ talk: http://channel9.msdn.com/events/BUILD/BUILD2011/PLAT-203T He shows how to build an asynchronous API.
At the current time, there is no good solution for high-level (C++/CX) classes. However, if you use the low-level C++ interfaces, you can use the WRL::AsyncBase class to help build your async interfaces.
Here is documentation about the AsyncBase class.
It is confusing, but there is a difference between WinRT C++ code and WRL. You can use WRL to code to the ABI layer directly. WRL does not use exceptions, but loves templates. The recommended coding style for WinRT is not the same as for WRL.
I am not sure if everyone can do this, but using WRL you generally need to implement a class that inherits:
class CreateAsyncOp: public RuntimeClass<IAsyncOperation<result_runtime_class*>,AsyncBase<IAsyncCompletedHandler<result_runtime_class*>>>
{
...
Then you can use
hr = MakeAndInitialize<CreateAsyncOp, IAsyncOperation<type_foo*>>(...);
C++ WinRT is now the best way to implement WinRT async methods. This uses co_await and co_return, new C++ language features (in the process of standardization). Read the docs on this page.

Reflection API for Scala

Does anyone know the status of a fully-featured reflection API for Scala?
I know that you can use Java's reflection API to do simple things but this does not work well with Scala's language features. I found an interesting article describing an experimental Scala Mirroring API but as far as I know this is still experimental. I've also found mention of a ScalaSigParser but this seems to be pretty low level.
This is more of a curiosity than anything else as I am currently just playing around with Scala. I thought that the answer to this question might also be useful to others interested in Scala.
The "immutable replacement for the JavaBean style pattern" can be expressed with named parameters and optionally the @BeanProperty annotation:
import reflect._
case class A(@BeanProperty val x: String, @BeanProperty val y : Int)
A(x = "s", y = 3)
A(y = 3, x = "s")
Adding methods (more precisely: defining a new interface) only makes sense in a statically typed language if the client knows about the new methods and can compile against the interface. With structural typing, clients can define the methods they expect to be present in an object. The Scala compiler will transform the structural type into reflection code, which may fail at runtime.
type T = {def go(x : Int): Int }
def y(any : Any) = any.asInstanceOf[T].go(2)
class A{
    def go(x : Int) = x + 1
}
y(new A())
y(new {}) //this will fail
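To make the "reflection code which may fail at runtime" point concrete, here is a rough Python analogue, not Scala's actual generated code: the method is looked up by name when the call happens, so a missing method only shows up as a runtime error. The names `call_go` and `A` are illustrative.

```python
def call_go(obj):
    # Look the method up by name at call time, much like a reflective call.
    method = getattr(obj, "go")  # raises AttributeError if obj has no 'go'
    return method(2)

class A:
    def go(self, x):
        return x + 1

result = call_go(A())

try:
    call_go(object())  # no 'go' method: fails only when executed
    failed = False
except AttributeError:
    failed = True
```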
You can define new classes or traits with the interpreter on the fly. The Interpret method transforms Scala code to byte code.
You've already mentioned the ScalaSigParser which is not exactly easy to work with.
I think the rest of the features you would like are not there yet.

Must a Language that Implements Monads be Statically Typed?

I am learning functional programming style. In Don't Fear the Monads, Brian Beckman gave a brilliant introduction about Monad. He mentioned that Monad is about composition of functions so as to address complexity.
A Monad includes a unit function that transfers type T to an amplified type M(T); and a Bind function that, given function from T to M(U), transforms type M(T) to another type M(U). (U can be T, but is not necessarily).
In my understanding, the language implementing monad should be type-checked statically. Otherwise, type errors cannot be found during compilation and "Complexity" is not controlled. Is my understanding correct?
There are lots of implementations of monads in dynamically typed languages:
The Maybe Monad in Ruby
OO Monads and Ruby (site is down, but the article is available in the Internet Archive's Wayback Machine)
Monads in Ruby Part 1: Identity, Monads In Ruby Part 1.5: Identity, Monads in Ruby Part 2: Maybe (then again Maybe not)
Monads in Ruby
Monads on the Cheap I: The Maybe Monad in JavaScript, More Monads on the Cheap: Inlined fromMaybe
Monads in Ruby (with nice syntax!), List Monad in Ruby and Python
Haskell-style monad do-notation for Ruby
In general, the Church-Turing-Thesis tells us that everything that can be done in one language can also be done in every other language.
As you can probably tell from the selection of examples above, I am (mostly) a Ruby programmer. So, just as a joke, I took one of the examples above and re-implemented it in a language that I know absolutely nothing about, that is usually thought of as a not very powerful language, and that seems to be the only programming language on the planet for which I was not able to find a Monad tutorial. May I present to you … the Identity Monad in PHP:
<?php
class Identity {
    protected $val;
    public function __construct($val) { $this->val = $val; }
    public static function m_return($a) { return new Identity($a); }
    public static function m_bind($id_a, $f) { return $f($id_a->val); }
}
var_dump(Identity::m_bind(
    Identity::m_return(1), function ($x) {
        return Identity::m_return($x+1);
    }
));
?>
No static types, no generics, no closures necessary.
Now, if you actually want to statically check monads, then you need a static type system. But that is more or less a tautology: if you want to statically check types, you need a static type checker. Duh.
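For comparison, here is a minimal sketch of a Maybe monad in another dynamically typed language, Python. Nothing is checked statically; instead the monad laws are spot-checked at runtime on a few sample values, which is evidence rather than proof. The class layout and the helper functions `reciprocal` and `double` are assumptions made for this example.

```python
class Maybe:
    """Either a wrapped value (is_just=True) or nothing."""
    def __init__(self, value=None, is_just=False):
        self.value = value
        self.is_just = is_just

    @staticmethod
    def unit(value):
        # 'return' in Haskell terms: wrap a plain value.
        return Maybe(value, True)

    def bind(self, f):
        # Apply f to the wrapped value, or propagate nothing unchanged.
        return f(self.value) if self.is_just else self

    def __eq__(self, other):
        return (self.is_just, self.value) == (other.is_just, other.value)

NOTHING = Maybe()

def reciprocal(x):
    return NOTHING if x == 0 else Maybe.unit(1 / x)

def double(x):
    return Maybe.unit(2 * x)

m = Maybe.unit(4)
# Spot checks of the three monad laws on sample values.
left_identity = Maybe.unit(4).bind(reciprocal) == reciprocal(4)
right_identity = m.bind(Maybe.unit) == m
associativity = (m.bind(reciprocal).bind(double)
                 == m.bind(lambda x: reciprocal(x).bind(double)))
# A failing step short-circuits the rest of the chain.
short_circuit = Maybe.unit(0).bind(reciprocal).bind(double) == NOTHING
```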
With regards to your question:
In my understanding, the language implementing monad should be type-checked statically. Otherwise, type errors cannot be found during compilation and "Complexity" is not controlled. Is my understanding correct?
You are right, but this has nothing to do with monads. This is just about static type checking in general, and applies equally well to arrays, lists or even plain boring integers.
There is also a red herring here: if you look for example at monad implementations in C#, Java or C, they are much longer and much more complex than, say, the PHP example above. In particular, there's tons of types everywhere, so it certainly looks impressive. But the ugly truth is: C#'s, Java's and C's type systems aren't actually powerful enough to express the type of Monad. In particular, Monad is a higher-kinded type (it abstracts over a type constructor such as List or Maybe, not over an ordinary type), but C# and Java only support first-order parametric polymorphism (they call it "generics") and C doesn't support even that.
So, monads are in fact not statically type-checked in C#, Java and C. (That's for example the reason why the LINQ monad comprehensions are defined as a pattern and not as a type: because you simply cannot express the type in C#.) All the static type system does, is make the implementation much more complex, without actually helping. It requires a much more sophisticated type system such as Haskell's, to get actual type-safety for monads.
Note: what I wrote above only applies to the generic monad type itself, as @Porges points out. You can certainly express the type of any specific monad, like List or Maybe, but you cannot express the type of Monad itself. And this means that you cannot type-check the fact that "List IS-A Monad", and you cannot type-check generic operations that work on all instances of Monad.
(Note that checking that Monad also obeys the monad laws in addition to conforming to the monad type is probably too much even for Haskell's type system. You'd probably need dependent types and maybe even a full-blown automatic theorem prover for that.)
It's certainly not the case that a language implementing monads must be statically typed, as your question title asks. It may be a good idea, for the reasons you outline, but errors failing to be detected at compile time has never stopped anyone. Just look at how many people write PHP.
You need closures for the State monad. I looked it up, PHP has closures since 5.3. So that wouldn't be a problem anymore.
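To show why closures matter for the State monad, here is a small sketch in Python, where a stateful computation is represented as a function from a state to a `(value, new_state)` pair. `bind` has to return a new function that captures both `action` and `f`, which is exactly what a closure provides. The `tick` action and the counter state are hypothetical examples.

```python
def unit(value):
    """Wrap a value as a state action that leaves the state unchanged."""
    return lambda state: (value, state)

def bind(action, f):
    """Run action, pass its result to f, then run the action f returns."""
    def combined(state):
        value, next_state = action(state)
        return f(value)(next_state)
    return combined  # closure over 'action' and 'f'

def tick(state):
    """Example action: return the current counter and increment it."""
    return (state, state + 1)

# Take two ticks in sequence and return both observed values as a pair.
two_ticks = bind(tick, lambda a: bind(tick, lambda b: unit((a, b))))

result, final_state = two_ticks(10)
```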
No, in PHP it is not possible to implement monads. You need closures for that. Nevertheless, the concept of Maybe can still be useful when you simulate pattern matching with classes:
abstract class Maybe {
    abstract public function isJust();
    public function isNothing(){
        return !$this->isJust();
    }
}
class Just extends Maybe {
    protected $val = null;
    public function __construct($val){
        $this->val = $val;
    }
    public function isJust(){
        return true;
    }
    public function getVal(){
        return $this->val;
    }
}
class Nothing extends Maybe {
    protected $val = null;
    public function __construct(){
    }
    public function isJust(){
        return false;
    }
}
function just(){
    print "isJust";
}
function nothing(){
    print "nothing";
}
function MaybeFunc(Maybe $arg){
    if(get_class($arg) == 'Just'){
        print "Just";
    } else {
        print "Nothing";
    }
}
MaybeFunc(new Just(5));
MaybeFunc(new Nothing());
