FREE JAVA PROGRAMING BOOKS

Wednesday, July 23, 2008

Exception handling: dealing with errors

Ever since the beginning of programming languages, error handling has been one of the
most difficult issues. Because it’s so hard to design a good error-handling scheme, many
languages simply ignore the issue, passing the problem on to library designers who come up
with halfway measures that can work in many situations but can easily be circumvented,
generally by just ignoring them. A major problem with most error-handling schemes is that
they rely on programmer vigilance in following an agreed-upon convention that is not
enforced by the language. If the programmer is not vigilant, which is often the case when
they are in a hurry, these schemes can easily be forgotten.

[6] Note that this is true only for objects that are created on the heap, with new. However, the problem
described, and indeed any general programming problem, requires objects to be created on the heap.
[7] According to a technical reader for this book, one existing real-time Java implementation
(www.newmonics.com) has guarantees on garbage collector performance.
(Thinking in Java, Chapter 1: Introduction to Objects, www.BruceEckel.com)

Exception handling wires error handling directly into the programming language and
sometimes even the operating system. An exception is an object that is “thrown” from the
site of the error and can be “caught” by an appropriate exception handler designed to handle
that particular type of error. It’s as if exception handling is a different, parallel path of
execution that can be taken when things go wrong. And because it uses a separate execution
path, it doesn’t need to interfere with your normally-executing code. This makes that code
simpler to write since you aren’t constantly forced to check for errors. In addition, a thrown
exception is unlike an error value that’s returned from a function or a flag that’s set by a
function in order to indicate an error condition. These can be ignored. An exception cannot
be ignored, so it’s guaranteed to be dealt with at some point. Finally, exceptions provide a
way to reliably recover from a bad situation. Instead of just exiting you are often able to set
things right and restore the execution of a program, which produces much more robust
programs.
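As a rough illustration of the "separate path" idea, here is a minimal sketch. The class and method names (ExceptionSketch, parsePositive, parseOrDefault) are hypothetical and exist only for this example. The exception is thrown at the site of the error, caught by a handler elsewhere, and the program recovers with a fallback value instead of exiting.

```java
// Hypothetical example: all names here are illustrative only.
class ExceptionSketch {
    // Reports a problem by throwing an exception object from the site of the error.
    static int parsePositive(String text) {
        int value = Integer.parseInt(text); // throws NumberFormatException on bad input
        if (value <= 0) {
            throw new IllegalArgumentException("expected a positive number, got " + value);
        }
        return value;
    }

    // Handles the error on a separate path and recovers with a fallback value.
    static int parseOrDefault(String text, int fallback) {
        try {
            return parsePositive(text);
        } catch (IllegalArgumentException e) {
            // NumberFormatException is a subclass of IllegalArgumentException,
            // so both kinds of failure arrive in this handler.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("42", 1));   // prints 42
        System.out.println(parseOrDefault("oops", 1)); // prints 1
        System.out.println(parseOrDefault("-5", 1));   // prints 1
    }
}
```

Notice that parsePositive contains no error-return bookkeeping at all; the normal path and the error path are written separately.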
Java’s exception handling stands out among programming languages, because in Java,
exception-handling was wired in from the beginning and you’re forced to use it. If you don’t
write your code to properly handle exceptions, you’ll get a compile-time error message. This
guaranteed consistency makes error-handling much easier.
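The compile-time enforcement can be sketched as follows, with one caveat worth stating: the compiler enforces this for checked exceptions such as IOException, while subclasses of RuntimeException are not checked this way. CheckedSketch, firstChar and describe are hypothetical names for this example; StringReader and IOException are standard library classes.

```java
import java.io.IOException;
import java.io.StringReader;

// Hypothetical example: the class and method names are illustrative only.
class CheckedSketch {
    // The throws clause is part of the method's contract; the compiler rejects
    // any caller that neither catches IOException nor declares it.
    static int firstChar(StringReader in) throws IOException {
        return in.read();
    }

    static String describe(String text) {
        try {
            int c = firstChar(new StringReader(text));
            return c == -1 ? "empty" : String.valueOf((char) c);
        } catch (IOException e) {
            // Deleting this try/catch (without adding "throws IOException"
            // to describe) turns the call above into a compile-time error.
            return "unreadable";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("Java")); // prints J
        System.out.println(describe(""));     // prints empty
    }
}
```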
It’s worth noting that exception handling isn’t an object-oriented feature, although in
object-oriented languages the exception is normally represented with an object. Exception handling
existed before object-oriented languages.

Garbage collectors vs. efficiency and flexibility

If all this is such a good idea, why didn’t they do the same thing in C++? Well, of course
there’s a price you pay for all this programming convenience, and that price is run-time
overhead. As mentioned before, in C++ you can create objects on the stack, and in this case
they’re automatically cleaned up (but you don’t have the flexibility of creating as many as
you want at run-time). Creating objects on the stack is the most efficient way to allocate
storage for objects and to free that storage. Creating objects on the heap can be much more
expensive. Always inheriting from a base class and making all function calls polymorphic
also exacts a small toll. But the garbage collector is a particular problem because you never
quite know when it’s going to start up or how long it will take. This means that there’s an
inconsistency in the rate of execution of a Java program, so you can’t use it in certain
situations, such as when the rate of execution of a program is uniformly critical. (These are
generally called real-time programs, although not all real-time programming problems are
this stringent.)[7]
The designers of the C++ language, trying to woo C programmers (and most successfully, at
that), did not want to add any features to the language that would impact the speed or the
use of C++ in any situation where C might be used. This goal was realized, but at the price
of greater complexity when programming in C++. Java is simpler than C++, but the
tradeoff is in efficiency and sometimes applicability. For a significant portion of
programming problems, however, Java is often the superior choice.

Housekeeping dilemma: who should clean up?

Each object requires resources in order to exist, most notably memory. When an object is no
longer needed it must be cleaned up so that these resources are released for reuse. In simple
programming situations the question of how an object is cleaned up doesn’t seem too
challenging: you create the object, use it for as long as it’s needed, and then it should be
destroyed. It’s not too hard, however, to encounter cases in which the situation is more
complex.
Suppose, for example, you are designing a system to manage air traffic for an airport. (The
same model might also work for managing crates in a warehouse, or a video rental system,
or a kennel for boarding pets.) At first it seems simple: make a collection to hold airplanes,
then create a new airplane and place it in the collection for each airplane that enters the
air-traffic-control zone. For cleanup, simply delete the appropriate airplane object when a plane
leaves the zone.
But perhaps you have some other system to record data about the planes; perhaps data that
doesn’t require such immediate attention as the main controller function. Maybe it’s a
record of the flight plans of all the small planes that leave the airport. So you have a second
collection of small planes, and whenever you create a plane object you also put it in this
collection if it’s a small plane. Then some background process performs operations on the
objects in this collection during idle moments.
Now the problem is more difficult: how can you possibly know when to destroy the objects?
When you’re done with the object, some other part of the system might not be. This same
problem can arise in a number of other situations, and in programming systems (such as
C++) in which you must explicitly delete an object when you’re done with it, this can
become quite complex.[6]
With Java, the garbage collector is designed to take care of the problem of releasing the
memory (although this doesn’t include other aspects of cleaning up an object). The garbage
collector “knows” when an object is no longer in use, and it then automatically releases the
memory for that object. This, combined with the fact that all objects are inherited from the
single root class Object and that you can create objects only one way, on the heap, makes
the process of programming in Java much simpler than programming in C++. You have far
fewer decisions to make and hurdles to overcome.
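The airplane scenario can be sketched like this, in present-day Java syntax. AirTrafficSketch, Airplane, controlZone and smallPlaneLog are hypothetical names chosen for this example. The point is that leaving the zone only drops one handle; the object stays alive for as long as the second collection still refers to it, and nobody has to decide when to delete it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: all names here are illustrative only.
class AirTrafficSketch {
    static class Airplane {
        final String callSign;
        final boolean small;

        Airplane(String callSign, boolean small) {
            this.callSign = callSign;
            this.small = small;
        }
    }

    final List<Airplane> controlZone = new ArrayList<>();
    final List<Airplane> smallPlaneLog = new ArrayList<>();

    void enterZone(Airplane plane) {
        controlZone.add(plane);
        if (plane.small) {
            smallPlaneLog.add(plane); // a second handle to the same object
        }
    }

    void leaveZone(Airplane plane) {
        // Only the controller's handle is dropped; there is no explicit delete.
        controlZone.remove(plane);
    }

    public static void main(String[] args) {
        AirTrafficSketch tower = new AirTrafficSketch();
        Airplane cessna = new Airplane("N123", true);
        tower.enterZone(cessna);
        tower.leaveZone(cessna);
        System.out.println(tower.controlZone.size());            // prints 0
        System.out.println(tower.smallPlaneLog.get(0).callSign); // prints N123
    }
}
```

Once the background process also removes the plane from smallPlaneLog and no other handle remains, the garbage collector is free to reclaim the memory at some later time of its own choosing.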

Collection libraries and support for easy collection use

Because a collection is a tool that you’ll use frequently, it makes sense to have a library of
collections that are built in a reusable fashion, so you can take one off the shelf and plug it
into your program. Java provides such a library, although it is fairly limited in Java 1.0 and
1.1 (the Java 1.2 collections library, however, satisfies most needs).

Downcasting vs. templates/generics
To make these collections reusable, they contain the one universal type in Java that was
previously mentioned: Object. The singly-rooted hierarchy means that everything is an
Object, so a collection that holds Objects can hold anything. This makes it easy to reuse.
To use such a collection, you simply add object handles to it, and later ask for them back.
But, since the collection holds only Objects, when you add your object handle into the
collection it is upcast to Object, thus losing its identity. When you fetch it back, you get an
Object handle, and not a handle to the type that you put in. So how do you turn it back into
something that has the useful interface of the object that you put into the collection?
Here, the cast is used again, but this time you’re not casting up the inheritance hierarchy to
a more general type; you’re casting down the hierarchy to a more specific type. This manner of
casting is called downcasting. With upcasting, you know, for example, that a Circle is a type
of Shape so it’s safe to upcast, but you don’t know that an Object is necessarily a Circle or
a Shape so it’s hardly safe to downcast unless you know that’s what you’re dealing with.
It’s not completely dangerous, however, because if you downcast to the wrong thing you’ll
get a run-time error called an exception, which will be described shortly. When you fetch
object handles from a collection, though, you must have some way to remember exactly
what they are so you can perform a proper downcast.
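A small sketch of the round trip described above: the handle goes in as an Object, comes back as an Object, and must be downcast before its own interface is usable. DowncastSketch, Shape, Circle and firstAsCircle are hypothetical names for this example, and the collection is written as a List of Object in present-day Java syntax to mirror a collection that "holds only Objects". A wrong downcast surfaces as a ClassCastException at run time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: all names here are illustrative only.
class DowncastSketch {
    static class Shape {
        String name() { return "shape"; }
    }

    static class Circle extends Shape {
        @Override String name() { return "circle"; }
        double radius() { return 1.0; }
    }

    // Fetches element 0 and tries to downcast it, reporting the outcome as text.
    static String firstAsCircle(List<Object> objects) {
        Object fetched = objects.get(0);     // comes back as a plain Object
        try {
            Circle c = (Circle) fetched;     // downcast, checked at run time
            return c.name() + " r=" + c.radius();
        } catch (ClassCastException e) {
            return "not a Circle";
        }
    }

    public static void main(String[] args) {
        List<Object> objects = new ArrayList<>();
        objects.add(new Circle());           // upcast to Object on the way in
        System.out.println(firstAsCircle(objects)); // prints circle r=1.0
        objects.set(0, "just a String");
        System.out.println(firstAsCircle(objects)); // prints not a Circle
    }
}
```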
Downcasting and the run-time checks require extra time for the running program, and extra
effort from the programmer. Wouldn’t it make sense to somehow create the collection so
that it knows the types that it holds, eliminating the need for the downcast and possible
mistake? The solution is parameterized types, which are classes that the compiler can
automatically customize to work with particular types. For example, with a parameterized
collection, the compiler could customize that collection so that it would accept only Shapes
and fetch only Shapes.
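For comparison, here is what a Shapes-only collection might look like. This sketch uses the generics syntax that Java gained in version 5, well after the Java 1.2 era this text describes, so treat it as an illustration of the idea rather than something available in the versions discussed here. GenericSketch, Shape, Square and totalArea are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: all names here are illustrative only.
class GenericSketch {
    static class Shape {
        double area() { return 0.0; }
    }

    static class Square extends Shape {
        final double side;

        Square(double side) { this.side = side; }

        @Override double area() { return side * side; }
    }

    static double totalArea(List<Shape> shapes) {
        double total = 0.0;
        for (Shape s : shapes) {     // elements come back as Shape, no downcast needed
            total += s.area();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Shape> shapes = new ArrayList<>();
        shapes.add(new Square(2.0));
        shapes.add(new Square(3.0));
        // shapes.add("text");       // would be rejected by the compiler
        System.out.println(totalArea(shapes)); // prints 13.0
    }
}
```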
Parameterized types are an important part of C++, partly because C++ has no singly-rooted
hierarchy. In C++, the keyword that implements parameterized types is template.
As of Java 1.2, the language has no parameterized types, since it is possible for it to get by – however
awkwardly – using the singly-rooted hierarchy. At one point the word generic (the keyword
used by Ada for its templates) was on a list of keywords that were “reserved for future
implementation.” Some of these seem to have mysteriously slipped into a kind of
“keyword Bermuda Triangle” and it’s difficult to know what might eventually happen.

The singly-rooted hierarchy

One of the issues in OOP that has become especially prominent since the introduction of
C++ is whether all classes should ultimately be inherited from a single base class. In Java (as
with virtually all other OOP languages) the answer is “yes” and the name of this ultimate
base class is simply Object. It turns out that the benefits of the singly-rooted hierarchy are
many.
All objects in a singly-rooted hierarchy have an interface in common, so they are all
ultimately the same type. The alternative (provided by C++) is that you don’t know that
everything is the same fundamental type. From a backwards-compatibility standpoint this
fits the model of C better and can be thought of as less restrictive, but when you want to do
full-on object-oriented programming you must then build your own hierarchy to provide
the same convenience that’s built into other OOP languages. And in any new class library
you acquire, some other incompatible interface will be used. It requires effort (and possibly
multiple inheritance) to work the new interface into your design. Is the extra “flexibility” of
C++ worth it? If you need it – if you have a large investment in C – it’s quite valuable. If
you’re starting from scratch, other alternatives such as Java can often be more productive.
All objects in a singly-rooted hierarchy (such as Java provides) can be guaranteed to have
certain functionality. You know you can perform certain basic operations on every object in
your system. A singly-rooted hierarchy, along with creating all objects on the heap, greatly
simplifies argument passing (one of the more complex topics in C++).
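To make "certain basic operations on every object" concrete, here is a minimal sketch with hypothetical names (RootSketch, Airplane, countEqual). The method accepts any mix of objects and compares them with equals(), which every class inherits from Object.

```java
// Hypothetical example: all names here are illustrative only.
class RootSketch {
    static class Airplane { }   // defines no methods of its own

    // Works on any mix of objects, because equals() is inherited from Object.
    static int countEqual(Object[] items, Object target) {
        int count = 0;
        for (Object item : items) {
            if (item.equals(target)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Object[] mixed = { "a", Integer.valueOf(1), new Airplane(), "a" };
        System.out.println(countEqual(mixed, "a"));                // prints 2
        System.out.println(countEqual(mixed, Integer.valueOf(1))); // prints 1
        // Even Airplane answers toString() and hashCode(), inherited from Object.
        System.out.println(mixed[2].toString().isEmpty());         // prints false
    }
}
```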
A singly-rooted hierarchy makes it much easier to implement a garbage collector. The
necessary support can be installed in the base class, and the garbage collector can thus send
the appropriate messages to every object in the system. Without a singly-rooted hierarchy
and a system to manipulate an object via a handle, it is difficult to implement a garbage
collector.
Since run-time type information is guaranteed to be in all objects, you’ll never end up with
an object whose type you cannot determine. This is especially important with system level
operations, such as exception handling, and to allow greater flexibility in programming.
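Run-time type information can be queried directly. In this sketch (TypeInfoSketch, Shape, Circle and kind are hypothetical names), instanceof and getClass() answer the question "what are you really?" for any handle, even one typed as plain Object.

```java
// Hypothetical example: all names here are illustrative only.
class TypeInfoSketch {
    static class Shape { }

    static class Circle extends Shape { }

    // Determines the actual type behind an Object handle at run time.
    static String kind(Object o) {
        if (o instanceof Circle) {
            return "Circle";
        }
        if (o instanceof Shape) {
            return "Shape";
        }
        return o.getClass().getSimpleName(); // every object can report its class
    }

    public static void main(String[] args) {
        System.out.println(kind(new Circle())); // prints Circle
        System.out.println(kind(new Shape()));  // prints Shape
        System.out.println(kind("text"));       // prints String
    }
}
```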
You may wonder why, if it’s so beneficial, a singly-rooted hierarchy isn’t in C++. It’s the
old bugaboo of efficiency and control. A singly-rooted hierarchy puts constraints on your
program designs, and in particular it was perceived to put constraints on the use of existing
C code. These constraints cause problems only in certain situations, but for maximum
flexibility there is no requirement for a singly-rooted hierarchy in C++. In Java, which
started from scratch and has no backward-compatibility issues with any existing language,
it was a logical choice to use the singly-rooted hierarchy in common with most other
object-oriented programming languages.

Collections and iterators

If you don’t know how many objects you’re going to need to solve a particular problem, or
how long they will last, you also don’t know how to store those objects. How can you know
how much space to create for those objects? You can’t, since that information isn’t known
until run time.
The solution to most problems in object-oriented design seems flippant: you create another
type of object. The new type of object that solves this particular problem holds handles to
other objects. Of course, you can do the same thing with an array, which is available in most
languages. But there’s more. This new object, generally called a collection (also called a
container, but the AWT uses that term in a different sense so this book will use “collection”),
will expand itself whenever necessary to accommodate everything you place inside it. So you
don’t need to know how many objects you’re going to hold in a collection. Just create a
collection object and let it take care of the details.
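A brief sketch of a collection that grows on demand, written in present-day Java syntax with the standard ArrayList class. GrowSketch and words are hypothetical names. The number of words is unknown until the input arrives, and the code never states a capacity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: the class and method names are illustrative only.
class GrowSketch {
    // Collects however many words the line contains; the count is not known up front.
    static List<String> words(String line) {
        List<String> result = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                result.add(word);   // the list expands itself as needed
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(words("alpha beta  gamma").size()); // prints 3
        System.out.println(words("   ").size());               // prints 0
    }
}
```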
Fortunately, a good OOP language comes with a set of collections as part of the package. In
C++, it’s the Standard Template Library (STL). Object Pascal has collections in its Visual
Component Library (VCL). Smalltalk has a very complete set of collections. Java also has
collections in its standard library. In some libraries, a generic collection is considered good
enough for all needs, and in others (C++ in particular) the library has different types of
collections for different needs: a vector for consistent access to all elements, and a linked list
for consistent insertion at all elements, for example, so you can choose the particular type
that fits your needs. These may include sets, queues, hash tables, trees, stacks, etc.
All collections have some way to put things in and get things out. The way that you place
something into a collection is fairly obvious. There’s a function called “push” or “add” or a
similar name. Fetching things out of a collection is not always as apparent; if it’s an array-like
entity such as a vector, you might be able to use an indexing operator or function. But in
many situations this doesn’t make sense. Also, a single-selection function is restrictive. What
if you want to manipulate or compare a set of elements in the collection instead of just one?
The solution is an iterator, which is an object whose job is to select the elements within a
collection and present them to the user of the iterator. As a class, it also provides a level of
abstraction. This abstraction can be used to separate the details of the collection from the
code that’s accessing that collection. The collection, via the iterator, is abstracted to be simply
a sequence. The iterator allows you to traverse that sequence without worrying about the
underlying structure – that is, whether it’s a vector, a linked list, a stack or something else.
This gives you the flexibility to easily change the underlying data structure without
disturbing the code in your program. Java began (in version 1.0 and 1.1) with a standard
iterator, called Enumeration, for all of its collection classes. Java 1.2 has added a much
more complete collection library which contains an iterator called Iterator that does more
than the older Enumeration.
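Here is a minimal sketch of that abstraction using the Iterator interface, in present-day Java syntax. IteratorSketch and sum are hypothetical names; ArrayList, LinkedList and TreeSet are standard library collections. The sum method sees only a sequence and has no idea which structure sits underneath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical example: the class and method names are illustrative only.
class IteratorSketch {
    // Traverses any collection purely through its iterator.
    static int sum(Collection<Integer> numbers) {
        int total = 0;
        Iterator<Integer> it = numbers.iterator();
        while (it.hasNext()) {
            total += it.next();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3);
        System.out.println(sum(new ArrayList<>(values)));  // prints 6
        System.out.println(sum(new LinkedList<>(values))); // prints 6
        System.out.println(sum(new TreeSet<>(values)));    // prints 6
    }
}
```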
From the design standpoint, all you really want is a sequence that can be manipulated to
solve your problem. If a single type of sequence satisfied all of your needs, there’d be no
reason to have different kinds. There are two reasons that you need a choice of collections.
First, collections provide different types of interfaces and external behavior. A stack has a
different interface and behavior than that of a queue, which is different than that of a set or
a list. One of these might provide a more flexible solution to your problem than the other.
Second, different collections have different efficiencies for certain operations. The best
example is a vector and a list. Both are simple sequences that can have identical interfaces
and external behaviors. But certain operations can have radically different costs. Randomly
accessing elements in a vector is a constant-time operation; it takes the same amount of time
regardless of the element you select. However, in a linked list it is expensive to move through
the list to randomly select an element, and it takes longer to find an element if it is further
down the list. On the other hand, if you want to insert an element in the middle of a
sequence, it’s much cheaper in a list than in a vector. These and other operations have
different efficiencies depending upon the underlying structure of the sequence. In the design
phase, you might start with a list and, when tuning for performance, change to a vector.
Because of the abstraction via iterators, you can change from one to the other with minimal
impact on your code.
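The "start with one, switch to the other" idea might look like this in present-day Java syntax. SwapSketch and insertSorted are hypothetical names. The method is written against the List interface and a ListIterator, so the caller can hand it an ArrayList (array-backed, like the vector discussed above) or a LinkedList without changing a line of it. Only the cost of the insertion differs: the linked list splices in a node at the iterator's position, while the array-backed list shifts the elements after it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

// Hypothetical example: the class and method names are illustrative only.
class SwapSketch {
    // Inserts value before the first element greater than it, keeping the list sorted.
    static void insertSorted(List<Integer> sorted, int value) {
        ListIterator<Integer> it = sorted.listIterator();
        while (it.hasNext()) {
            if (it.next() > value) {
                it.previous();   // step back so add() lands before the larger element
                break;
            }
        }
        it.add(value);
    }

    public static void main(String[] args) {
        List<Integer> asArray = new ArrayList<>(Arrays.asList(1, 3, 5));
        List<Integer> asLinks = new LinkedList<>(Arrays.asList(1, 3, 5));
        insertSorted(asArray, 4);
        insertSorted(asLinks, 4);
        System.out.println(asArray); // prints [1, 3, 4, 5]
        System.out.println(asLinks); // prints [1, 3, 4, 5]
    }
}
```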
In the end, remember that a collection is only a storage cabinet to put objects in. If that
cabinet solves all of your needs, it doesn’t really matter how it is implemented (a basic
concept with most types of objects). If you’re working in a programming environment that
has built-in overhead due to other factors (running under Windows, for example, or the cost
of a garbage collector), then the cost difference between a vector and a linked list might not
matter. You might need only one type of sequence. You can even imagine the “perfect”
collection abstraction, which can automatically change its underlying implementation
according to the way it is used.

Object landscapes and lifetimes

Technically, OOP is just about abstract data typing, inheritance and polymorphism, but other
issues can be at least as important. The remainder of this section will cover these issues.
One of the most important factors is the way objects are created and destroyed. Where is the
data for an object and how is the lifetime of the object controlled? There are different
philosophies at work here. C++ takes the approach that control of efficiency is the most
important issue, so it gives the programmer a choice. For maximum run-time speed, the
storage and lifetime can be determined while the program is being written, by placing the
objects on the stack (these are sometimes called automatic or scoped variables) or in the static
storage area. This places a priority on the speed of storage allocation and release, and control
of these can be very valuable in some situations. However, you sacrifice flexibility because
you must know the exact quantity, lifetime and type of objects while you’re writing the
program. If you are trying to solve a more general problem such as computer-aided design,
warehouse management or air-traffic control, this is too restrictive.
The second approach is to create objects dynamically in a pool of memory called the heap. In
this approach you don’t know until run time how many objects you need, what their
lifetime is or what their exact type is. Those are determined at the spur of the moment while
the program is running. If you need a new object, you simply make it on the heap at the
point that you need it. Because the storage is managed dynamically, at run time, the amount
of time required to allocate storage on the heap is significantly longer than the time to create
storage on the stack. (Creating storage on the stack is often a single assembly instruction to
move the stack pointer down, and another to move it back up.) The dynamic approach
makes the generally logical assumption that objects tend to be complicated, so the extra
overhead of finding storage and releasing that storage will not have an important impact on
the creation of an object. In addition, the greater flexibility is essential to solve the general
programming problem.
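In Java, which takes this second approach, "spur of the moment" creation looks roughly like the sketch below. HeapSketch, Shape, Circle, Square and build are hypothetical names, and the one-letter spec format is invented for this example. Neither the number of objects nor their exact types is known until the input string arrives at run time.

```java
// Hypothetical example: all names and the spec format are illustrative only.
class HeapSketch {
    static class Shape {
        String name() { return "shape"; }
    }

    static class Circle extends Shape {
        @Override String name() { return "circle"; }
    }

    static class Square extends Shape {
        @Override String name() { return "square"; }
    }

    // Builds one object per character: 'c' gives a Circle, 's' a Square,
    // anything else a plain Shape.
    static Shape[] build(String spec) {
        Shape[] shapes = new Shape[spec.length()]; // quantity fixed only now, at run time
        for (int i = 0; i < spec.length(); i++) {
            switch (spec.charAt(i)) {
                case 'c': shapes[i] = new Circle(); break;
                case 's': shapes[i] = new Square(); break;
                default:  shapes[i] = new Shape();  break;
            }
        }
        return shapes;
    }

    public static void main(String[] args) {
        String spec = args.length > 0 ? args[0] : "csc";
        for (Shape s : build(spec)) {
            System.out.println(s.name());
        }
    }
}
```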
C++ allows you to determine whether the objects are created while you write the program
or at run time to allow the control of efficiency. You might think that since it’s more flexible,
you’d always want to create objects on the heap rather than the stack. There’s another issue,
however, and that’s the lifetime of an object. If you create an object on the stack or in static
storage, the compiler determines how long the object lasts and can automatically destroy it.
However, if you create it on the heap the compiler has no knowledge of its lifetime. There
are two options for destroying such objects: the programmer can determine programmatically
when to destroy the object, or the environment can provide a feature called a garbage
collector that automatically discovers when an object is no longer in use and destroys it. Of
course, a garbage collector is much more convenient, but it requires that all applications
must be able to tolerate the existence of the garbage collector and the other overhead for
garbage collection. This does not meet the design requirements of the C++ language and so
it was not included, but Java does have a garbage collector (as does Smalltalk; Delphi does
not, but one could be added, and third-party garbage collectors exist for C++).
The rest of this section looks at additional factors concerning object lifetimes and landscapes.