Everything is an object, Part 1
Learn to write your first Java program with these Java basics
Although it is based on C++, Java is more of a “pure” object-oriented language. Both C++ and Java are hybrid languages, but in Java the designers felt that the hybridization was not as important as it was in C++. A hybrid language allows multiple programming styles; the reason C++ is hybrid is to support backward compatibility with the C language. Because C++ is a superset of the C language, it includes many of that language’s undesirable features, which can make some aspects of C++ overly complicated.
The Java language assumes that you want to do only object-oriented programming. This means that before you can begin you must shift your mindset into an object-oriented world (unless it’s already there). The benefit of this initial effort is the ability to program in a language that is simpler to learn and to use than many other OOP languages. In this [two-part article] we’ll see the basic components of a Java program and we’ll learn that everything in Java is an object, even a Java program.
You manipulate objects with references
Each programming language has its own means of manipulating data. Sometimes the programmer must be constantly aware of what type of manipulation is going on. Are you manipulating the object directly, or are you dealing with some kind of indirect representation (a pointer in C or C++) that must be treated with a special syntax?
All this is simplified in Java. You treat everything as an object, so there is a single consistent syntax that you use everywhere. Although you treat everything as an object, the identifier you manipulate is actually a “reference” to an object. You might imagine this scene as a television (the object) with your remote control (the reference). As long as you’re holding this reference, you have a connection to the television, but when someone says “change the channel” or “lower the volume,” what you’re manipulating is the reference, which in turn modifies the object. If you want to move around the room and still control the television, you take the remote/reference with you, not the television.
Also, the remote control can stand on its own, with no television. That is, just because you have a reference doesn’t mean there’s necessarily an object connected to it. So if you want to hold a word or sentence, you create a String reference:
String s;
But here you’ve created only the reference, not an object. If you decide to send a message to s at this point, you’ll get an error (at run-time) because s isn’t actually attached to anything (there’s no television). A safer practice, then, is always to initialize a reference when you create it:
String s = "asdf";
However, this uses a special Java feature: strings can be initialized with quoted text. Normally, you must use a more general type of initialization for objects.
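The difference between an unattached reference and an initialized one can be sketched as follows. This is a minimal illustration, not code from the article: the class name ReferenceDemo and its members are hypothetical, and the unattached reference is a field rather than a local variable because the compiler refuses to compile a read of an unassigned local.

```java
// Hypothetical demo; the class and member names are illustrative only.
class ReferenceDemo {
    // A field that is never assigned starts out as null:
    // a remote control with no television attached.
    static String unattached;

    // Reports whether sending a message through the reference fails at run time.
    static boolean failsWhenUsed(String ref) {
        try {
            ref.length();      // send the length() message via the reference
            return false;      // an object was there to receive it
        } catch (NullPointerException e) {
            return true;       // the reference was not attached to any object
        }
    }

    public static void main(String[] args) {
        String s = "asdf";     // reference initialized when it is created
        System.out.println(failsWhenUsed(unattached)); // prints true
        System.out.println(failsWhenUsed(s));          // prints false
    }
}
```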
You must create all the objects
When you create a reference, you want to connect it with a new object. You do so, in general, with the new keyword, which says, “Make me a new one of these objects.” So in the above example, you can say:
String s = new String("asdf");
Not only does this mean “Make me a new String,” but it also gives information about how to make the String by supplying an initial character string.
Of course, String is not the only type that exists. Java comes with a plethora of ready-made types. What’s more important is that you can create your own types. In fact, that’s the fundamental activity in Java programming, and it’s what you’ll be learning about in the rest of this [article].
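One way to see that new really does make a new object is to compare references. The sketch below assumes a hypothetical helper class, NewStringDemo; it shows that a String created with new is a different object from the quoted text it was built from, even though both hold the same characters.

```java
// Hypothetical demo class, not part of the article.
class NewStringDemo {
    // True only when both references point at the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String quoted = "asdf";
        String created = new String("asdf"); // new always makes a fresh object
        System.out.println(sameObject(quoted, created)); // prints false
        System.out.println(quoted.equals(created));      // prints true
    }
}
```

Comparing with == asks whether two references lead to the same object, while equals() asks whether the two objects hold the same contents.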
Where storage lives
It’s useful to visualize some aspects of how things are laid out while the program is running, in particular how memory is arranged. There are six different places to store data:
- Registers. This is the fastest storage because it exists in a place different from that of other storage: inside the processor. However, the number of registers is severely limited, so registers are allocated by the compiler according to its needs. You don’t have direct control, nor do you see any evidence in your programs that registers even exist.
- The stack. This lives in the general RAM (random-access memory) area, but has direct support from the processor via its stack pointer. The stack pointer is moved down to create new memory and moved up to release that memory. This is an extremely fast and efficient way to allocate storage, second only to registers. The Java compiler must know, while it is creating the program, the exact size and lifetime of all the data that is stored on the stack, because it must generate the code to move the stack pointer up and down. This constraint places limits on the flexibility of your programs, so while some Java storage exists on the stack (in particular, object references), Java objects themselves are not placed on the stack.
- The heap. This is a general-purpose pool of memory (also in the RAM area) where all Java objects live. The nice thing about the heap is that, unlike the stack, the compiler doesn’t need to know how much storage it needs to allocate from the heap or how long that storage must stay on the heap. Thus, there’s a great deal of flexibility in using storage on the heap. Whenever you need to create an object, you simply write the code to create it using new, and the storage is allocated on the heap when that code is executed. Of course there’s a price you pay for this flexibility: it takes more time to allocate heap storage than it does to allocate stack storage (that is, if you even could create objects on the stack in Java, as you can in C++).
- Static storage. “Static” is used here in the sense of “in a fixed location” (although it’s also in RAM). Static storage contains data that is available for the entire time a program is running. You can use the static keyword to specify that a particular element of an object is static, but Java objects themselves are never placed in static storage.
- Constant storage. Constant values are often placed directly in the program code, which is safe since they can never change. Sometimes constants are cordoned off by themselves so that they can be optionally placed in read-only memory (ROM).
- Non-RAM storage. If data lives completely outside a program it can exist while the program is not running, outside the control of the program. The two primary examples of this are streamed objects, in which objects are turned into streams of bytes, generally to be sent to another machine, and persistent objects, in which the objects are placed on disk so they will hold their state even when the program is terminated. The trick with these types of storage is turning the objects into something that can exist on the other medium, and yet can be resurrected into a regular RAM-based object when necessary. Java provides support for lightweight persistence, and future versions of Java might provide more complete solutions for persistence.
Special case: primitive types
There is a group of types that gets special treatment; you can think of these as “primitive” types that you use quite often in your programming. The reason for the special treatment is that to create an object with new, especially a small, simple variable, isn’t very efficient because new places objects on the heap. For these types Java falls back on the approach taken by C and C++. That is, instead of creating the variable using new, an “automatic” variable is created that is not a reference. The variable holds the value, and it’s placed on the stack so it’s much more efficient.
Java determines the size of each primitive type. These sizes don’t change from one machine architecture to another as they do in most languages. This size invariance is one reason Java programs are so portable.
Primitive type | Size | Minimum | Maximum | Wrapper type |
---|---|---|---|---|
boolean | – | – | – | Boolean |
char | 16-bit | Unicode 0 | Unicode 2^16 - 1 | Character |
byte | 8-bit | -128 | +127 | Byte |
short | 16-bit | -2^15 | +2^15 - 1 | Short |
int | 32-bit | -2^31 | +2^31 - 1 | Integer |
long | 64-bit | -2^63 | +2^63 - 1 | Long |
float | 32-bit | IEEE754 | IEEE754 | Float |
double | 64-bit | IEEE754 | IEEE754 | Double |
void | – | – | – | Void |
All numeric types are signed, so don’t go looking for unsigned types. The size of the boolean type is not explicitly defined; it is only specified to be able to take the literal values true or false.
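The sizes and limits in the table can be checked from a program, because each wrapper class exposes them as constants. The following is a small sketch; the class PrimitiveSizes and the helper maxSigned are hypothetical names introduced only for this illustration.

```java
// Hypothetical demo class, not part of the article.
class PrimitiveSizes {
    // Largest value of a signed two's-complement type with the given bit width.
    // Intended for widths below 64 so the long arithmetic cannot overflow.
    static long maxSigned(int bits) {
        return (1L << (bits - 1)) - 1;
    }

    public static void main(String[] args) {
        System.out.println(Byte.SIZE);      // prints 8
        System.out.println(Short.SIZE);     // prints 16
        System.out.println(Character.SIZE); // prints 16
        System.out.println(Integer.SIZE);   // prints 32
        System.out.println(Long.SIZE);      // prints 64
        System.out.println(maxSigned(Integer.SIZE) == Integer.MAX_VALUE); // prints true
        System.out.println((int) Character.MAX_VALUE); // prints 65535
    }
}
```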
The primitive data types also have “wrapper” classes for them. That means that if you want to make a nonprimitive object on the heap to represent that primitive type, you use the associated wrapper. For example:
char c = 'x';
Character C = new Character(c);
Or you could also use:
Character C = new Character('x');
The reasons for doing this [are shown in the book].
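As one illustration (an assumption on our part, since the article defers the reasons to the book), the standard collection classes hold objects rather than primitives, so a char has to be wrapped before it can be stored in a list. The sketch below uses a hypothetical class, WrapperDemo. It wraps with Character.valueOf(), because the Character constructor shown above has been deprecated since Java 9.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the article.
class WrapperDemo {
    // Wraps every char of the text in a Character object and collects them.
    static List<Character> toList(String text) {
        List<Character> out = new ArrayList<>();
        for (char c : text.toCharArray()) {
            out.add(Character.valueOf(c)); // explicit wrapping of a primitive
        }
        return out;
    }

    public static void main(String[] args) {
        Character boxed = 'x';   // the compiler can also wrap automatically
        char back = boxed;       // and unwrap again
        System.out.println(back);          // prints x
        System.out.println(toList("xyz")); // prints [x, y, z]
    }
}
```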
High-precision numbers
Java includes two classes for performing high-precision arithmetic: BigInteger and BigDecimal. Although these approximately fit into the same category as the “wrapper” classes, neither one has a primitive analogue.
Both classes have methods that provide analogues for the operations that you perform on primitive types. That is, you can do anything with a BigInteger or BigDecimal that you can with an int or float; it’s just that you must use method calls instead of operators. Also, since there’s more involved, the operations will be slower. You’re exchanging speed for accuracy.
BigInteger supports arbitrary-precision integers. This means that you can accurately represent integral values of any size without losing any information during operations.
BigDecimal is for arbitrary-precision fixed-point numbers; you can use these for accurate monetary calculations, for example.
Consult your online documentation for details about the constructors and methods you can call for these two classes.
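A short sketch of what “method calls instead of operators” looks like in practice follows. The class BigNumberDemo and its two helper methods are hypothetical names chosen for this example; pow() and add() are methods of the standard java.math classes.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical demo class, not part of the article.
class BigNumberDemo {
    // 2 raised to the given power, exact even when it no longer fits in a long.
    static BigInteger twoToThe(int exponent) {
        return BigInteger.valueOf(2).pow(exponent);
    }

    // Adds two decimal amounts given as text, with no binary rounding error.
    static BigDecimal addMoney(String a, String b) {
        return new BigDecimal(a).add(new BigDecimal(b));
    }

    public static void main(String[] args) {
        System.out.println(twoToThe(64));             // prints 18446744073709551616
        System.out.println(addMoney("0.10", "0.20")); // prints 0.30
        System.out.println(0.10 + 0.20);              // prints 0.30000000000000004
    }
}
```

The last line shows why a double is a poor fit for money: the primitive addition picks up a rounding error that the BigDecimal version avoids.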
Arrays in Java
Virtually all programming languages support arrays. Using arrays in C and C++ is perilous because those arrays are only blocks of memory. If a program accesses the array outside of its memory block or uses the memory before initialization (common programming errors) there will be unpredictable results.
One of the primary goals of Java is safety, so many of the problems that plague programmers in C and C++ are not repeated in Java. A Java array is guaranteed to be initialized and cannot be accessed outside of its range. The range checking comes at the price of having a small amount of memory overhead on each array as well as verifying the index at run-time, but the assumption is that the safety and increased productivity are worth the expense.
When you create an array of objects, you are really creating an array of references, and each of those references is automatically initialized to a special value with its own keyword: null. When Java sees null, it recognizes that the reference in question isn’t pointing to an object. You must assign an object to each reference before you use it, and if you try to use a reference that’s still null, the problem will be reported at run-time. Thus, typical array errors are prevented in Java.
You can also create an array of primitives. Again, the compiler guarantees initialization because it zeroes the memory for that array.
Arrays are covered in detail [in the book].
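Until then, the guarantees described in this section can be sketched briefly. The class ArrayDemo and the helper isOutOfRange are hypothetical names used only for this illustration.

```java
// Hypothetical demo class, not part of the article.
class ArrayDemo {
    // Reports whether reading values[index] is rejected by the range check.
    static boolean isOutOfRange(int[] values, int index) {
        try {
            int ignored = values[index]; // the index is verified at run time
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] numbers = new int[3];       // primitives are zeroed
        String[] names = new String[3];   // references start out as null
        System.out.println(numbers[0]);               // prints 0
        System.out.println(names[0] == null);         // prints true
        System.out.println(isOutOfRange(numbers, 2)); // prints false
        System.out.println(isOutOfRange(numbers, 3)); // prints true
    }
}
```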
You never need to destroy an object
In most programming languages, the concept of the lifetime of a variable occupies a significant portion of the programming effort. How long does the variable last? If you are supposed to destroy it, when should you? Confusion over variable lifetimes can lead to a lot of bugs, and this section shows how Java greatly simplifies the issue by doing all the cleanup work for you.
Scoping
Most procedural languages have the concept of scope. This determines both the visibility and lifetime of the names defined within that scope. In C, C++, and Java, scope is determined by the placement of curly braces {}. So for example:
{
  int x = 12;
  /* only x available */
  {
    int q = 96;
    /* both x & q available */
  }
  /* only x available */
  /* q "out of scope" */
}
A variable defined within a scope is available only to the end of that scope.
Indentation makes Java code easier to read. Since Java is a free-form language, the extra spaces, tabs, and carriage returns do not affect the resulting program.
Note that you cannot do the following, even though it is legal in C and C++:
{
  int x = 12;
  {
    int x = 96; /* illegal */
  }
}
The compiler will announce that the variable x has already been defined. Thus the C and C++ ability to “hide” a variable in a larger scope is not allowed because the Java designers thought that it led to confusing programs.
Scope of objects
Java objects do not have the same lifetimes as primitives. When you create a Java object using new, it hangs around past the end of the scope. Thus if you use:
{
  String s = new String("a string");
} /* end of scope */
the reference s vanishes at the end of the scope. However, the String object that s was pointing to is still occupying memory. In this bit of code, there is no way to access the object because the only reference to it is out of scope. In [the book] you’ll see how the reference to the object can be passed around and duplicated during the course of a program.
It turns out that because objects created with new stay around for as long as you want them, a whole slew of C++ programming problems simply vanish in Java. The hardest problems seem to occur in C++ because you don’t get any help from the language in making sure that the objects are available when they’re needed. And more important, in C++ you must make sure that you destroy the objects when you’re done with them.
That brings up an interesting question. If Java leaves the objects lying around, what keeps them from filling up memory and halting your program? This is exactly the kind of problem that would occur in C++. This is where a bit of magic happens. Java has a garbage collector, which looks at all the objects that were created with new and figures out which ones are not being referenced anymore. Then it releases the memory for those objects, so the memory can be used for new objects. This means that you never need to worry about reclaiming memory yourself. You simply create objects, and when you no longer need them they will go away by themselves. This eliminates a certain class of programming problem: the so-called “memory leak,” in which a programmer forgets to release memory.
Creating new data types: class
If everything is an object, what determines how a particular class of object looks and behaves? Put another way, what establishes the type of an object? You might expect there to be a keyword called “type,” and that certainly would have made sense. Historically, however, most object-oriented languages have used the keyword class to mean “I’m about to tell you what a new type of object looks like.” The class keyword is followed by the name of the new type. For example:
class ATypeName { /* class body goes here */ }
This introduces a new type, so you can now create an object of this type using new:
ATypeName a = new ATypeName();
In ATypeName, the class body consists only of a comment (the stars and slashes and what is inside, which will be discussed later in this [article]), so there is not too much that you can do with it. In fact, you cannot tell it to do much of anything (that is, you cannot send it any interesting messages) until you define some methods for it.
Fields and methods
When you define a class (and all you do in Java is define classes, make objects of those classes, and send messages to those objects), you can put two types of elements in your class: data members (sometimes called fields), and member functions (typically called methods). A data member is an object of any type that you can communicate with via its reference. It can also be one of the primitive types (which isn’t a reference). If it is a reference to an object, you must initialize that reference to connect it to an actual object (using new, as seen earlier) in a special function called a constructor (described fully in Chapter 4 [of the book]). If it is a primitive type you can initialize it directly at the point of definition in the class. (As you’ll see later, references can also be initialized at the point of definition.)
Each object keeps its own storage for its data members; the data members are not shared among objects. Here is an example of a class with some data members:
class DataOnly {
  int i;
  float f;
  boolean b;
}
This class doesn’t do anything, but you can create an object:
DataOnly d = new DataOnly();
You can assign values to the data members, but you must first know how to refer to a member of an object. This is accomplished by stating the name of the object reference, followed by a period (dot), followed by the name of the member inside the object:
objectReference.member
For example:
d.i = 47;
d.f = 1.1f;
d.b = false;
It is also possible that your object might contain other objects that contain data you’d like to modify. For this, you just keep “connecting the dots.” For example:
myPlane.leftTank.capacity = 100;
The DataOnly class cannot do much of anything except hold data, because it has no member functions (methods). To understand how those work, you must first understand arguments and return values, which will be described shortly.
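The “connecting the dots” expression above presupposes classes that the article does not show. A minimal sketch of what they might look like follows; the class names Tank, Plane, and DotDemo, and the helper fillLeft, are all assumptions made for this illustration.

```java
// Hypothetical classes; the article shows only the dotted expression.
class Tank {
    int capacity;                 // primitive data member
}

class Plane {
    Tank leftTank = new Tank();   // reference initialized at the point of definition
}

class DotDemo {
    // Sets the left tank's capacity through two dots and reads it back.
    static int fillLeft(Plane plane, int amount) {
        plane.leftTank.capacity = amount;
        return plane.leftTank.capacity;
    }

    public static void main(String[] args) {
        Plane myPlane = new Plane();
        myPlane.leftTank.capacity = 100;
        System.out.println(myPlane.leftTank.capacity); // prints 100
    }
}
```

Each object keeps its own storage, so a second Plane created with new would have its own leftTank whose capacity is unaffected.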
Default values for primitive members
When a primitive data type is a member of a class, it is guaranteed to get a default value if you do not initialize it:
Primitive type | Default |
---|---|
boolean | false |
char | '\u0000' (null) |
byte | (byte)0 |
short | (short)0 |
int | 0 |
long | 0L |
float | 0.0f |
double | 0.0d |
Note carefully that the default values are what Java guarantees when the variable is used as a member of a class. This ensures that member variables of primitive types will always be initialized (something C++ doesn’t do), reducing a source of bugs. However, this initial value may not be correct or even legal for the program you are writing. It’s best to always explicitly initialize your variables.
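The member defaults in the table can be observed directly. In this sketch the class names Defaults and DefaultsDemo are hypothetical; a String field is included as well to show that an uninitialized reference member defaults to null.

```java
// Hypothetical classes, not part of the article.
class Defaults {
    boolean b;
    char c;
    byte by;
    short s;
    int i;
    long l;
    float f;
    double d;
    String ref;   // a reference member, for comparison
}

class DefaultsDemo {
    public static void main(String[] args) {
        Defaults d = new Defaults();   // no member is explicitly initialized
        System.out.println(d.b);              // prints false
        System.out.println(d.c == '\u0000');  // prints true
        System.out.println(d.i);              // prints 0
        System.out.println(d.l);              // prints 0
        System.out.println(d.f);              // prints 0.0
        System.out.println(d.d);              // prints 0.0
        System.out.println(d.ref);            // prints null
    }
}
```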
This guarantee doesn’t apply to “local” variables — those that are not fields of a class. Thus, if within a function definition you have:
int x;
Then x is not automatically initialized to zero (in C and C++ it would simply hold some arbitrary value). You are responsible for assigning an appropriate value before you use x. If you forget, Java definitely improves on C++: you get a compile-time error telling you the variable might not have been initialized. (Many C++ compilers will warn you about uninitialized variables, but in Java these are errors.)
Methods, arguments, and return values
Up until now, the term function has been used to describe a named subroutine. The term that is more commonly used in Java is method, as in “a way to do something.” If you want, you can continue thinking in terms of functions. It’s really only a syntactic difference, but from now on “method” will be used in this [article] rather than “function.”
Methods in Java determine the messages an object can receive. In this section you will learn how simple it is to define a method.
The fundamental parts of a method are the name, the arguments, the return type, and the body. Here is the basic form:
returnType methodName( /* argument list */ ) {
  /* Method body */
}
The return type is the type of the value that pops out of the method after you call it. The argument list gives the types and names for the information you want to pass into the method. The method name and argument list together uniquely identify the method.
Methods in Java can be created only as part of a class. A method can be called only for an object, and that object must be able to perform that method call. If you try to call the wrong method for an object, you’ll get an error message at compile-time. You call a method for an object by naming the object followed by a period (dot), followed by the name of the method and its argument list, like this: objectName.methodName(arg1, arg2, arg3). For example, suppose you have a method f( ) that takes no arguments and returns a value of type int. Then, if you have an object called a for which f( ) can be called, you can say this:
int x = a.f();
The type of the return value must be compatible with the type of x.
This act of calling a method is commonly referred to as sending a message to an object. In the above example, the message is f( ) and the object is a. Object-oriented programming is often summarized as simply “sending messages to objects.”
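To make the a.f() example concrete, here is a small sketch of a class whose objects can receive the message f( ). The class names Counter and MessageDemo are hypothetical, as is the choice to have f( ) count its own calls.

```java
// Hypothetical class; the article names only the method f( ) and the object a.
class Counter {
    int count;          // each Counter object keeps its own count

    // Takes no arguments and returns an int: the number of calls so far.
    int f() {
        count = count + 1;
        return count;
    }
}

class MessageDemo {
    public static void main(String[] args) {
        Counter a = new Counter();
        int x = a.f();         // send the message f( ) to the object a
        System.out.println(x); // prints 1
        x = a.f();
        System.out.println(x); // prints 2
    }
}
```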
The argument list
The method argument list specifies what information you pass into the method. As you might guess, this information, like everything else in Java, takes the form of objects. So, what you must specify in the argument list are the types of the objects to pass in and the name to use for each one. As in any situation in Java where you seem to be handing objects around, you are actually passing references. The type of the reference must be correct, however. If the argument is supposed to be a String, what you pass in must be a String.
Consider a method that takes a String as its argument. Here is the definition, which must be placed within a class definition for it to be compiled:
int storage(String s) {
  return s.length() * 2;
}
This method tells you how many bytes are required to hold the information in a particular String. (Each char in a String is 16 bits, or two bytes, long, to support Unicode characters.) The argument is of type String and is called s. Once s is passed into the method, you can treat it just like any other object. (You can send messages to it.) Here, the length() method is called, which is one of the methods for Strings; it returns the number of characters in a string.
You can also see the use of the return keyword, which does two things. First, it means “leave the method, I’m done.” Second, if the method produces a value, that value is placed right after the return statement. In this case, the return value is produced by evaluating the expression s.length( ) * 2.
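Because storage() must live inside a class, a complete version needs a surrounding class and an object to call it on. The sketch below supplies both; the class name StorageDemo is an assumption, while the method body is the one from the article.

```java
// Hypothetical enclosing class; storage() itself is the article's method.
class StorageDemo {
    // Bytes needed for the characters of s, at two bytes per char.
    int storage(String s) {
        return s.length() * 2;
    }

    public static void main(String[] args) {
        StorageDemo demo = new StorageDemo();
        System.out.println(demo.storage("hello")); // prints 10
        System.out.println(demo.storage(""));      // prints 0
    }
}
```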
You can return any type you want, but if you don’t want to return anything at all, you do so by indicating that the method returns void. Here are some examples:
boolean flag() { return true; }
float naturalLogBase() { return 2.718f; }
void nothing() { return; }
void nothing2() {}
When the return type is void, then the return keyword is used only to exit the method, and is therefore unnecessary when you reach the end of the method. You can return from a method at any point, but if you’ve given a non-void return type then the compiler will force you (with error messages) to return the appropriate type of value regardless of where you return.
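Returning “at any point” can be sketched with two small methods, one that produces a value and one that is void. The class ReturnDemo and both method names are hypothetical, introduced only for this illustration.

```java
// Hypothetical demo class, not part of the article.
class ReturnDemo {
    // Non-void: every path out of the method must return a String.
    static String sign(int n) {
        if (n < 0) {
            return "negative";   // leave early with a value
        }
        if (n == 0) {
            return "zero";
        }
        return "positive";
    }

    // void: a bare return only exits the method.
    static void appendIfPositive(int n, StringBuilder out) {
        if (n <= 0) {
            return;              // nothing to do, so leave early
        }
        out.append(n);
        // no return needed at the end of a void method
    }

    public static void main(String[] args) {
        System.out.println(sign(-5)); // prints negative
        System.out.println(sign(0));  // prints zero
        System.out.println(sign(7));  // prints positive
        StringBuilder out = new StringBuilder();
        appendIfPositive(-1, out);
        appendIfPositive(42, out);
        System.out.println(out);      // prints 42
    }
}
```

Removing the final return statement from sign() would make the compiler reject the method, because one path would then end without producing a String.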
At this point, it can look like a program is just a bunch of objects with methods that take other objects as arguments and send messages to those other objects. That is indeed much of what goes on, but in the next chapter [of the book] you’ll learn how to do the detailed low-level work by making decisions within a method. For this [article], sending messages will suffice.
Stay tuned for the second part of “Everything is an object,” which will continue to help you build a Java program. The article will be featured in the October issue of JavaWorld.