How should I refactor my code in this case? - qt

I'm currently using Qt's QTextStream to read many different types (read: different extensions) of text files. Each "FileReader" class I create ends up with a similar pattern, where it needs to call readLine() like this:
// Get the line's first word as float where each word is delimited by a comma
fileData.readLine().split(",")[0].toFloat();
You can imagine I have tens of these lines in my program.
Further, it's possible that toFloat() could fail (e.g. the value read is not convertible to float), so I'm planning to modify the above line like this:
// Get the line's first word as float where each word is delimited by a comma
bool convertible = false;
const float value = fileData.readLine().split(",")[0].toFloat(&convertible);
if (!convertible) throw std::runtime_error("Error!");
Obviously, IMO, the least maintainable option would be to simply repeat the above code on every line where I use readLine(). This is definitely not the path I plan to choose. (I would welcome anyone who can show that doing so has benefits.)
I could think of a few ways to refactor this code.
1) Instead of directly using Qt's QTextStream class, create my own class that owns a QTextStream, and then create a method called readFirstTokenAsFloat(). Inside that method I would have the error checking shown above. Every "FileReader" class would then switch to this new class. The pro of this approach, IMO, is that it accomplishes what I want to do. The con, IMO, is that if I needed to do other things, or wanted to use QTextStream's other methods, I would violate the DRY principle (?) by duplicating those methods as one-liners that just call QTextStream.
2) OR I could just inherit from QTextStream. This way I would simply extend its functionality, and would also get all of QTextStream's functionality. But is inheritance a good idea in this case?
3) Any other thoughts? I'm sure someone has come across something like this. Is there a specific name for this pattern?

If you think you're using all of QTextStream's functionality, then inheritance is the way to go IMO. Inheritance is not a bad thing in itself; it should just be avoided in some cases. But if there's even one method in QTextStream which shouldn't be called at all, then this would probably lead to a weird design (perhaps using an interface would help in this case).
Now, if you're using a subset of the functionalities, then composition (approach number 1) is the way to go.
I would additionally suggest creating an interface with the "readFirstTokenAsFloat()" method and whichever other methods you want, and then implementing the interface (and using it in your "FileReaders"). This way, you have a less coupled design that is easier to change.
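A minimal sketch of that interface-plus-composition idea, under some loud assumptions: TokenReader and CsvTokenReader are hypothetical names, and std::istream stands in for QTextStream so the example compiles without Qt. With Qt you would hold a QTextStream and use readLine()/split()/toFloat(&ok) inside the method instead.

```cpp
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical interface: the FileReader classes depend only on this.
class TokenReader {
public:
    virtual ~TokenReader() = default;
    virtual float readFirstTokenAsFloat() = 0;
};

// Composition: wraps a stream instead of inheriting from it.
// std::istream is a stand-in for QTextStream (assumption, not the asker's code).
class CsvTokenReader : public TokenReader {
public:
    explicit CsvTokenReader(std::istream& in) : in_(in) {}

    // Reads one line and converts the text before the first comma to float.
    // Throws std::runtime_error if there is no line or the token is not a float.
    float readFirstTokenAsFloat() override {
        std::string line;
        if (!std::getline(in_, line))
            throw std::runtime_error("unexpected end of input");
        const std::string token = line.substr(0, line.find(','));
        std::size_t used = 0;
        float value = 0.0f;
        try {
            value = std::stof(token, &used);
        } catch (const std::exception&) {
            throw std::runtime_error("not a float: '" + token + "'");
        }
        if (used != token.size())  // trailing garbage such as "1.5abc"
            throw std::runtime_error("not a float: '" + token + "'");
        return value;
    }

private:
    std::istream& in_;
};
```

The error check now lives in one place, and each FileReader only sees the small interface, so swapping the underlying stream later should not touch the readers.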
If something wasn't clear or was controversial, feel free to post in the comments, so we can improve the answer =).

class FileReaderBase {
protected:
    QTextStream* fileData;                  // shared by all derived readers
    virtual float readFirstTokenAsFloat();  // error checking lives here
};
Your FileReaders derive from that class so that they can all use the same fileData. You can, for example, handle opening in the constructor and closing in the destructor.
IMHO this is much better than subclassing a framework class (favour composition over inheritance).

Related

Access Modifiers ... Why?

Ok so I was just thinking to myself why do programmers stress so much when it comes down to Access Modifiers within OOP.
Let's take this PHP code as an example:
class StackOverflow
{
private $web_address;
public function setWebAddress(){/*...*/}
}
Because web_address is private, it cannot be changed by $object->web_address = 'w.e.'. But the fact is that the variable will only ever change if your program actually executes $object->web_address = 'w.e.';
If I wanted a variable in my application not to be changed, I would simply write my application so that it contains no code that changes it, and therefore it would never be changed?
So my question is: what are the major rules and reasons for using private / protected / non-public entities?
Because (ideally), a class should have two parts:
an interface exposed to the rest of the world, a manifest of how others can talk to it. Example in a filehandle class: String read(int bytes). Of course this has to be public; (one/the) main purpose of our class is to provide this functionality.
internal state, which no one but the instance itself should (have to) care about. Example in a filehandle class: private String buffer. This can and should be hidden from the rest of the world: they have no business with it, it's an implementation detail.
This is even done in languages without access modifiers, e.g. Python, except that we don't force people to respect privacy (and remember, they can always use reflection anyway, so encapsulation can never be 100% enforced) but prefix private members with _ to indicate "you shouldn't touch this; if you want to mess with it, do so at your own risk".
Because you might not be the only developer in your project and the other developers might not know that they shouldn't change it. Or you might forget etc.
It makes it easy to spot (even the compiler can spot it) when you're doing something that someone has said would be a bad idea.
So my question is: What are the major rules and reasons in using private / protected / non-public entities
In Python, there are no access modifiers.
So the reasons are actually language-specific. You might want to update your question slightly to reflect this.
It's a fairly common question about Python. Many programmers from Java or C++ (or other) backgrounds like to think deeply about this. When they learn Python, there's really no deep thinking. The operating principle is: "We're all adults here."
It's not clear who, precisely, the access modifiers help. In Lakos' book, Large-Scale C++ Software Design, there's a long discussion of "protected", since the semantics of protected make subclasses and client interfaces a bit murky.
http://www.amazon.com/Large-Scale-Software-Design-John-Lakos/dp/0201633620
Access modifiers are a tool for a defensive programming strategy. You consciously protect your code against your own stupid errors (when you forget something after a while, didn't understand something correctly, or just haven't had enough coffee).
You keep yourself from accidentally executing $object->web_address = 'w.e.';. This might seem unnecessary at the moment, but it won't be unnecessary if
two months later you want to change something in the project (and have forgotten all about the fact that web_address should not be changed directly) or
your project has many thousand lines of code and you simply cannot remember which field you are "allowed" to set directly and which ones require a setter method.
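As a rough illustration of that defensive use, here is the same idea sketched in C++ rather than PHP. The class name Site and the "must start with http" rule are made up for the example; the point is only that every write is forced through one checked method.

```cpp
#include <stdexcept>
#include <string>

class Site {
public:
    // The only way to change the address; invalid values are rejected here.
    void setWebAddress(const std::string& address) {
        if (address.rfind("http", 0) != 0)  // true unless address starts with "http"
            throw std::invalid_argument("address must start with http");
        webAddress_ = address;
    }

    const std::string& webAddress() const { return webAddress_; }

private:
    // Writing site.webAddress_ = "..." outside the class is a compile error,
    // so nobody can bypass the check above by accident.
    std::string webAddress_;
};
```

If the field were public, the validation in the setter would be a convention that every caller has to remember; with private, the compiler remembers it for them.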
Just because a class has "something" doesn't mean it should expose that something. The class should implement its contract/interface/whatever you want to call it, but in doing so it could easily have all kinds of internal members/methods that don't need to be (and by all rights shouldn't be) known outside of that class.
Sure, you could write the rest of your application to just deal with it anyway, but that's not really considered good design.

API design: is "fault tolerance" a good thing?

I've consolidated many of the useful answers and came up with my own answer below
For example, I am writing an API Foo which needs explicit initialization and termination. (This should be language agnostic, but I'm using C++ here.)
class Foo
{
public:
static void InitLibrary(int someMagicInputRequiredAtRuntime);
static void TermLibrary(int someOtherInput);
};
Apparently, our library doesn't care about multi-threading, reentrancy or whatnot. Let's suppose our Init function should only be called once, calling it again with any other input would wreak havoc.
What's the best way to communicate this to my caller? I can think of two ways:
Inside InitLibrary, I assert on some static variable, which will blame my caller for init'ing twice.
Inside InitLibrary, I check some static variable and silently return if my lib has already been initialized.
Method #1 is obviously explicit, while method #2 is more user friendly. I am thinking that method #2 probably has the disadvantage that my caller wouldn't be aware of the fact that InitLibrary shouldn't be called twice.
What would be the pros/cons of each approach? Is there a cleverer way to sidestep all of this?
Edit
I know that the example here is very contrived. As #daemon pointed out, I should initialize myself and not bother the caller. Practically, however, there are places where I need more information to properly initialize myself (note the use of my variable name someMagicInputRequiredAtRuntime). This is not restricted to initialization/termination; it applies to other instances where the dilemma exists of whether I should choose to be quote-unquote "fault tolerant" or fail loudly.
I would definitely go for approach 1, along with an easy-to-understand exception and good documentation that explains why this fails. This will force the caller to be aware that this can happen, and the calling class can easily wrap the call in a try-catch statement if needed.
Failing silently, on the other hand, will lead your users to believe that the second call was successful (no error message, no exception) and thus they will expect that the new values are set. So when they try to do something else with Foo, they don't get the expected results. And it's darn near impossible to figure out why if they don't have access to your source code.
Serenity Prayer (modified for interfaces)
SA, grant me the assertions
to accept the things devs cannot change
the code to except the things they can,
and the conditionals to detect the difference
If the fault is in the environment, then you should try and make your code deal with it. If it is something that the developer can prevent by fixing their code, it should generate an exception.
A good approach would be to have a factory that creates an initialized library object (this would require you to wrap your library in a class). Multiple create-calls to the factory would create different objects. This way, the initialize-method would not be part of the public interface of the library, and the factory would manage initialization.
If there can be only one instance of the library active, make the factory check for existing instances. This would effectively make your library-object a singleton.
I would suggest that you should flag an exception if your routine cannot achieve the expected post-condition. If someone calls your init routine twice, and the system state after the second call will be the same as if it had just been called once, then it is probably not necessary to throw an exception. If the system state after the second call would not match the caller's expectation, then an exception should be thrown.
In general, I think it's more helpful to think in terms of state than in terms of action. To use an analogy, an attempt to open as "write new" a file that is already open should either fail or result in a close-erase-reopen. It should not simply perform a no-op, since the program will be expecting to be writing into an empty file whose creation time matches the current time. On the other hand, trying to close a file that's already closed should generally not be considered an error, because the desire is that the file be closed.
BTW, it's often helpful to have available a "Try" version of a method that might throw an exception. It would be nice, for example, to have a Control.TryBeginInvoke available for things like update routines (if a thread-safe control property changes, the property handler would like the control to be updated if it still exists, but won't really mind if the control gets disposed; it's a little irksome not being able to avoid a first-chance exception if a control gets closed when its property is being updated).
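One possible reading of the state-based rule above, sketched against the question's Foo. The static members initialized_ and input_ and the helper IsInitialized() are assumptions added for illustration; the question's class has only the two functions.

```cpp
#include <stdexcept>

class Foo {
public:
    static void InitLibrary(int someMagicInputRequiredAtRuntime) {
        if (!initialized_) {  // first call: really initialize
            input_ = someMagicInputRequiredAtRuntime;
            initialized_ = true;
            return;
        }
        // Repeat call: only complain if the resulting state would not be
        // what this caller expects.
        if (input_ != someMagicInputRequiredAtRuntime)
            throw std::logic_error("Foo already initialized with a different input");
    }

    // Terminating an already terminated library is not an error:
    // the desired end state (not initialized) is reached either way.
    static void TermLibrary(int /*someOtherInput*/) { initialized_ = false; }

    static bool IsInitialized() { return initialized_; }

private:
    static bool initialized_;
    static int input_;
};

bool Foo::initialized_ = false;
int Foo::input_ = 0;
```

Calling InitLibrary(42) twice is accepted here, while InitLibrary(42) followed by InitLibrary(11) throws, because the second caller would otherwise believe the library is running with 11.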
Have a private static counter variable in your class. If it is 0, then do the logic in Init and increment the counter; if it is more than 0, then simply increment the counter. In Term do the opposite: decrement until it is 0, then do the logic.
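A small sketch of the counter idea, with hypothetical names (CountedFoo, ActiveUsers, Input) and, like the question's library, no attempt at thread safety:

```cpp
class CountedFoo {
public:
    static void InitLibrary(int input) {
        if (count_ == 0) {
            input_ = input;  // the real initialization would happen here
        }
        ++count_;            // later calls only bump the counter
    }

    static void TermLibrary() {
        if (count_ == 0) return;  // more Term than Init calls: ignore
        if (--count_ == 0) {
            input_ = 0;           // the real teardown would happen here
        }
    }

    static bool IsInitialized() { return count_ > 0; }
    static int ActiveUsers() { return count_; }
    static int Input() { return input_; }

private:
    static int count_;
    static int input_;
};

int CountedFoo::count_ = 0;
int CountedFoo::input_ = 0;
```

Note that in this sketch a second InitLibrary call with a different input is silently ignored: the library keeps the value from the first call, which is exactly the kind of surprise the question worries about.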
Another way is to use a Singleton pattern, here is a sample in C++.
I guess one way to sidestep this dilemma is to satisfy both camps. Ruby has the -w warning switch, it is customary for gcc users to pass -Wall or even -Weffc++, and Perl has taint mode. By default, these "just work," but the more careful programmer can turn on these strict settings themselves.
One example against the "always complain about the slightest error" approach is HTML. Imagine how frustrated the world would be if all browsers barked at any CSS hack (such as drawing elements at negative coordinates).
After considering many excellent answers, I've come to this conclusion for myself: when someone sits down, my API should ideally "just work." Of course, for anyone to be involved in any domain, he needs to work one or two levels of abstraction lower than the problem he is trying to solve, which means my user must learn about my internals sooner or later. If he uses my API for long enough, he will begin to stretch the limits, and too much effort to "hide" or "encapsulate" the inner workings will only become a nuisance.
I guess fault tolerance is most of the time a good thing, it's just that it's difficult to get right when the API user is stretching corner cases. I could say the best of both worlds is to provide some kind of "strict mode" so that when things don't "just work," the user can easily dissect the problem.
Of course, doing this is a lot of extra work, so I may be just talking ideals here. Practically it all comes down to the specific case and the programmer's decision.
If your language doesn't allow this error to surface statically, chances are good the error will surface only at runtime. Depending on the use of your library, this means the error won't surface until much later in development. Possibly only when shipped (again, it depends on a lot).
If there's no danger in silently eating an error (which isn't a real error anyway, since you catch it before anything dangerous happens), then I'd say you should silently eat it. This makes it more user friendly.
If, however, someMagicInputRequiredAtRuntime varies from call to call, I'd raise the error whenever possible, or presumably the library will not function as expected ("I init'ed the lib with value 42, but it's behaving as if I init'ed it with 11!?").
If this Library is a static class, (a library type with no state), why not put the call to Init in the type initializer? If it is an instantiatable type, then put the call in the constructor, or in the factory method that handles instantiation.
Don't allow public access to the Init function at all.
I think your interface is a bit too technical. No programmer wants to learn what concepts you used while designing the API. Programmers want solutions for their actual problems and don't want to learn how to use an API. Nobody wants to init your API; that is something the API should handle in the background as far as possible. Find a good abstraction that shields the developer from as much low-level technical stuff as possible. That implies that the API should be fault tolerant.

Any way to automatically generate QSharedData-based structures?

Qt has built-in support for creating objects with integrated reference counting via QSharedData and QSharedDataPointer. It all works great, but for each such object I need to write a lot of code: a QSharedData-based implementation class with a constructor and copy constructor, and the object class itself with accessor methods for each field.
For simple structures with 5-10 fields this requires a lot of nearly identical code. Are there ways to automate the generation of such classes? Maybe some generator exists that takes a short description and automatically generates the implementation class and the object class with all accessors?
You usually don't have to implement copy ctor or operator= when using QSharedData/Pointer. The default impls copy/assign the QSharedData-derived member, which usually does the Right Thing (TM).
For the public class, you need to implement the ctor that creates the private object and, if the private class is declared in the implementation file rather than in the header (which is better), a dtor. The dtor does nothing; the only point is that it is not inline and is defined in the .cpp, after the private class's declaration.
For the private class, no method/ctor/dtor implementations are necessary.
For simple value-based classes, writing setters is of course tedious, but the same is true if you use plain private member variables. The overhead in LOC doesn't grow with the number of members.
And no, there is no standard generator solution that I know of, although writing a script or emacs macro etc. to do it is not that hard. It would probably make sense to add such a thing to a publicly available toolbox, or to QtCreator...
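To show how little code the pattern needs per field, here is a Qt-free analogue. It is only a sketch: std::shared_ptr plus a hand-written detach() stands in for QSharedDataPointer (which detaches automatically on non-const access), PersonData plays the role of the QSharedData subclass, and all names are invented. Unlike Qt's atomic reference count, the use_count() check here is not safe across threads.

```cpp
#include <memory>
#include <string>

// Private data class: no constructors or destructor written by hand.
struct PersonData {
    std::string name;
    int age = 0;
};

// Public value class. Copy constructor and operator= are compiler-generated;
// copies simply share the same PersonData until one of them writes.
class Person {
public:
    Person() : d_(std::make_shared<PersonData>()) {}

    const std::string& name() const { return d_->name; }
    int age() const { return d_->age; }

    // One setter per field is the only per-field cost.
    void setName(const std::string& n) { detach(); d_->name = n; }
    void setAge(int a) { detach(); d_->age = a; }

    bool sharesDataWith(const Person& other) const { return d_ == other.d_; }

private:
    // Clone the data before writing if another Person still points at it.
    void detach() {
        if (d_.use_count() > 1)
            d_ = std::make_shared<PersonData>(*d_);
    }

    std::shared_ptr<PersonData> d_;
};
```

Adding a field means one line in PersonData plus a getter and a setter, which matches the point above that the overhead is the accessors, not the sharing machinery.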
I don't think generators would exist for these things, but I suggest two things:
(ab)use existing shared containers (QVector, QList...)
read the documentation (with examples) on QSharedData, QSharedDataPointer, and QExplicitlySharedDataPointer
The two subclasses have simple examples that show how to implement the shared-ness it seems. I can't help you further though, because I've never had the need to create my own.
On second thought, why not make all data fields public, and use the QSharedData derivative as a struct-like class with reference counting? Maybe not nice on encapsulation, but if you're careful, nothing wrong should happen.

Why are getters prefixed with the word "get"?

Generally speaking, creating a fluid API is something that makes all programmers happy, both the creators who write the interface and the consumers who program against it. Looking beyond conventions, why is it that we prefix all our getters with the word "get"? Omitting it usually results in a more fluid, easy-to-read set of instructions, which ultimately leads to happiness (however small or passive). Consider this very simple example (pseudocode).
Conventional:
person = new Person("Joey")
person.getName().toLower().print()
Alternative:
person = new Person("Joey")
person.name().toLower().print()
Of course this only applies to languages where getters/setters are the norm, but it is not directed at any specific language. Were these conventions developed around technical limitations (disambiguation), or simply through the pursuit of a more explicit, intentional-feeling type of interface, or is this perhaps just a case of a trickle-down norm? What are your thoughts? And how would simple changes to these conventions impact your happiness / daily attitude towards your craft (however minimally)?
Thanks.
Because, in languages without Properties, name() is a function. Without some more information though, it's not necessarily specific about what it's doing (or what it's going to return).
Functions/Methods are also supposed to be Verbs because they are performing some action. name() obviously doesn't fit the bill because it tells you nothing about what action it is performing.
getName() lets you know without a doubt that the method is going to return a name.
In languages with Properties, the fact that something is a Property expresses the same meaning as having get or set attached to it. It merely makes things look a little neater.
The best answer I have ever heard for using the get/set prefixes is as such:
If you didn't use them, both the accessor and mutator (getter and setter) would have the same name; thus, they would be overloaded. Generally, you should only overload a method when each implementation of the method performs a similar function (but with different inputs).
In this case, you would have two methods with the same name that performed very different functions, and that could be confusing to users of the API.
I always appreciate consistent get/set prefixing when working with a new API and its documentation. The automatic grouping of getters and setters when all functions are listed in alphabetical order greatly helps to distinguish between simple data access and advanced functionality.
The same is true when using intellisense/auto completion within the IDE.
What about the case where a property is named after a verb?
object.action()
Does this get the type of action to be performed, or execute the action... Adding get/set/do removes the ambiguity which is always a good thing...
object.getAction()
object.setAction(action)
object.doAction()
In school we were taught to use get to distinguish methods from data structures. I never understood why the parens wouldn't be a tipoff. I'm of the personal opinion that overuse of get/set methods can be a horrendous time waster, and it's a phase I see a lot of object oriented programmers go through soon after they start.
I may not write much Objective-C, but since I learned it I've really come to love its conventions. The very thing you are asking about is addressed by the language.
Here's a Smalltalk answer which I like most. One has to know a few rules about Smalltalk BTW.
Fields are only accessible in the class in which they are defined. If you don't write "accessors", you won't be able to do anything with them.
The convention there is to have a variable (let's name it instVar1).
Then you write a method instVar1, which just returns instVar1, and a method instVar1:, which sets the value.
I like this convention much more than anything else. If you see a : somewhere you can bet it's some "setter" in one or the other way.
Custom.
Plus, in C++, if you return a reference, that provides potential information leakage into the class itself.
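A short sketch of what that leak looks like. The Person class and the nameRef() accessor are invented for the example:

```cpp
#include <string>
#include <utility>

class Person {
public:
    explicit Person(std::string name) : name_(std::move(name)) {}

    // Leaky accessor: hands out a mutable reference to the private field,
    // so callers can modify it without going through any setter.
    std::string& nameRef() { return name_; }

    // Safer accessor: a const reference only allows reading.
    const std::string& name() const { return name_; }

private:
    std::string name_;
};
```

With the leaky version, `p.nameRef() = "changed";` compiles and rewrites the private member from outside, whatever the accessor is called. Even the const reference still exposes that the class stores a std::string, which is a milder form of the same leak.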

Use-cases for reflection

Recently I was talking to a co-worker about C++ and lamented that there was no way to take a string with the name of a class field and extract the field with that name; in other words, it lacks reflection. He gave me a baffled look and asked when anyone would ever need to do such a thing.
Off the top of my head I didn't have a good answer for him, other than "hey, I need to do it right now". So I sat down and came up with a list of some of the things I've actually done with reflection in various languages. Unfortunately, most of my examples come from my web programming in Python, and I was hoping that the people here would have more examples. Here's the list I came up with:
Given a config file with lines like
x = "Hello World!"
y = 5.0
dynamically set the fields of some config object equal to the values in that file. (This was what I wished I could do in C++, but actually couldn't do.)
When sorting a list of objects, sort based on an arbitrary attribute given that attribute's name from a config file or web request.
When writing software that uses a network protocol, reflection lets you call methods based on string values from that protocol. For example, I wrote an IRC bot that would translate
!some_command arg1 arg2
into a method call actions.some_command(arg1, arg2) and print whatever that function returned back to the IRC channel.
When using Python's __getattr__ function (which is sort of like method_missing in Ruby/Smalltalk) I was working with a class with a whole lot of statistics, such as late_total. For every statistic, I wanted to be able to add _percent to get that statistic as a percentage of the total things I was counting (for example, stats.late_total_percent). Reflection made this very easy.
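For comparison, the IRC-style dispatch in the list above can be approximated in C++ without reflection by registering each command by hand in a name-to-function table. This is only a sketch; Args, Command and dispatch are hypothetical names, and the manual registration step is exactly the part that reflection would otherwise make unnecessary.

```cpp
#include <functional>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

using Args = std::vector<std::string>;
using Command = std::function<std::string(const Args&)>;
using CommandTable = std::unordered_map<std::string, Command>;

// Turns "!some_command arg1 arg2" into a call of table["some_command"]
// with {"arg1", "arg2"} and returns whatever the handler returns.
std::string dispatch(const CommandTable& table, const std::string& line) {
    if (line.empty() || line[0] != '!') return "";  // not a command

    std::istringstream words(line.substr(1));
    std::string name;
    words >> name;

    Args args;
    for (std::string arg; words >> arg;) args.push_back(arg);

    const auto it = table.find(name);
    if (it == table.end()) return "unknown command: " + name;
    return it->second(args);
}
```

Usage would look like `table["count"] = [](const Args& a) { return std::to_string(a.size()); };` followed by `dispatch(table, "!count arg1 arg2")`. Every new command needs its own registration line, whereas the Python bot picked up new methods automatically.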
So can anyone here give any examples from their own programming experiences of times when reflection has been helpful? The next time a co-worker asks me why I'd "ever want to do something like that" I'd like to be more prepared.
I can list the following uses for reflection:
Late binding
Security (introspect code for security reasons)
Code analysis
Dynamic typing (duck typing is not possible without reflection)
Metaprogramming
Some real-world usages of reflection from my personal experience:
Developed plugin system based on reflection
Used aspect-oriented programming model
Performed static code analysis
Used various Dependency Injection frameworks
...
Reflection is a good thing :)
I've used reflection to get current method information for exceptions, logging, etc.
string src = MethodInfo.GetCurrentMethod().ToString();
string msg = "Big Mistake";
Exception newEx = new Exception(msg, ex);
newEx.Source = src;
instead of
string src = "MyMethod";
string msg = "Big Mistake";
Exception newEx = new Exception(msg, ex);
newEx.Source = src;
It's just easier for copy/paste inheritance and code generation.
I'm in a situation now where I have a stream of XML coming in over the wire and I need to instantiate an Entity object that will populate itself from elements in the stream. It's easier to use reflection to figure out which Entity object can handle which XML element than to write a gigantic, maintenance-nightmare conditional statement. There's clearly a dependency between the XML schema and how I structure and name my objects, but I control both so it's not a big problem.
There are lots of times when you want to dynamically instantiate and work with objects whose type isn't known until runtime, for example with OR-mappers or in a plugin architecture. Mocking frameworks use it too, and it helps if you want to write a logging library that dynamically examines the type and properties of exceptions.
If I think a bit longer I can probably come up with more examples.
I find reflection very useful if the input data (like XML) has a complex structure which is easily mapped to object instances, or if I need some kind of "is a" relationship between the instances.
As reflection is relatively easy in Java, I sometimes use it for simple data (key-value maps) where I have a small fixed set of keys. On one hand it's simple to determine whether a key is valid (if the class has a setter setKey(String data)); on the other hand I can change the type of the (textual) input data and hide the transformation (e.g. a simple cast to int in getKey()), so the rest of the application can rely on correctly typed data.
If the type of some key-value pair changes for one object (e.g. from int to float), I only have to change it in the data object and its users, and don't have to remember to check the parser too. This might not be a sensible approach if performance is an issue...
Writing dispatchers. Twisted uses Python's reflective capabilities to dispatch XML-RPC and SOAP calls. RMI uses Java's reflection API for dispatch.
Command line parsing. Building up a config object based on the command line parameters that are passed in.
When writing unit tests, it can be helpful to use reflection, though mostly I've used this to bypass access modifiers (Java).
I've used reflection in C# when there was some internal or private method in the framework or a third party library that I wanted to access.
(Disclaimer: It's not necessarily a best-practice because private and internal methods may be changed in later versions. But it worked for what I needed.)
Well, in statically-typed languages, you'd want to use reflection any time you need to do something "dynamic". It comes in handy for tooling purposes (scanning the members of an object). In Java it's used in JMX and dynamic proxies quite a bit. And there are tons of one-off cases where it's really the only way to go (pretty much anytime you need to do something the compiler won't let you do).
I generally use reflection for debugging. Reflection can more easily and more accurately display the objects within the system than an assortment of print statements. In many languages that have first-class functions, you can even invoke the functions of the object without writing special code.
There is, however, a way to do what you want(ed). Use a hashtable. Store the fields keyed against the field name.
If you really wanted to, you could then create standard Get/Set functions, or create macros that do it on the fly. #define GetX() Get("X") sort of thing.
You could even implement your own imperfect reflection that way.
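A rough sketch of that hashtable idea applied to the config-file example from the question (x = "Hello World!", y = 5.0). Config, Setter, configSetters and setField are hypothetical names, and each field still has to be registered by hand, which is where this "imperfect reflection" differs from the real thing.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <unordered_map>

struct Config {
    std::string x;
    double y = 0.0;
};

// Field name -> setter that parses the textual value into the right type.
using Setter = std::function<void(Config&, const std::string&)>;

const std::unordered_map<std::string, Setter>& configSetters() {
    static const std::unordered_map<std::string, Setter> table = {
        {"x", [](Config& c, const std::string& v) { c.x = v; }},
        {"y", [](Config& c, const std::string& v) { c.y = std::stod(v); }},
    };
    return table;
}

// Sets a field by its name. Throws std::out_of_range for an unknown name,
// and std::invalid_argument (from std::stod) for an unparsable number.
void setField(Config& config, const std::string& name, const std::string& value) {
    configSetters().at(name)(config, value);
}
```

A config parser would then split each line at the "=" and call setField(config, key, value). Unknown keys fail loudly instead of being ignored.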
For the advanced user, if you can compile the code, it may be possible to enable debug output generation and use that to perform reflection.

Resources