Creating names is a fundamental
activity in programming, and when a project gets large, the number of names can
easily be overwhelming.
C++ allows you a great deal of control
over the creation and visibility of names, where storage for those names is
placed, and linkage for names.
The static
keyword was overloaded in C before people knew what the
term “overload” meant, and C++ has added yet another meaning. The
underlying concept with all uses of static seems to be “something
that holds its position” (like static electricity), whether that means a
physical location in memory or visibility within a file.
In this chapter, you’ll learn how
static controls storage and visibility, and an improved way to control
access to names via C++’s namespace feature. You’ll also find
out how to use functions that were written and compiled in
C.
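In both C and C++ the keyword
static has two basic meanings, which unfortunately often step on each
other’s toes. The first is static storage: the object is allocated once at a
fixed address in a special static data area, rather than on the stack each time
a function is called. The second is visibility: the name is local to a
particular translation unit, so it cannot be seen outside that translation
unit.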
When you create a local variable inside a
function, the compiler allocates storage for that variable each time the
function is called by moving the stack pointer down an
appropriate amount. If there is an initializer for the variable, the
initialization is performed each time that sequence point is
passed.
Sometimes, however, you want to retain a
value between function calls. You could accomplish this by making a global
variable, but then that variable would not be under the sole control of the
function. C and C++ allow you to create a static object inside a
function; the storage for this object is not on the stack but instead in the
program’s static data area. This object is initialized only once, the
first time the function is called, and then retains its value between function
invocations. For example, the following function returns the next character in
the array each time the function is called:
//: C10:StaticVariablesInfunctions.cpp
#include "../require.h"
#include <iostream>
using namespace std;

char oneChar(const char* charArray = 0) {
  static const char* s;
  if(charArray) {
    s = charArray;
    return *s;
  }
  else
    require(s, "un-initialized s");
  if(*s == '\0')
    return 0;
  return *s++;
}

// A string literal is an array of const char:
const char* a = "abcdefghijklmnopqrstuvwxyz";

int main() {
  // oneChar(); // require() fails
  oneChar(a); // Initializes s to a
  char c;
  while((c = oneChar()) != 0)
    cout << c << endl;
}
///:~
The static const char* s holds its value
between calls of oneChar( ) because its storage is not part of the
stack frame of the function, but is in the static storage area of the program.
When you call oneChar( ) with a char* argument, s is
assigned to that argument, and the first character of the array is returned.
Each subsequent call to oneChar( ) without an argument
produces the default value of zero for charArray, which indicates to the
function that you are still extracting characters from the previously
initialized value of s. The function will continue to produce characters
until it reaches the null terminator of the character array, at which point it
stops incrementing the pointer so it doesn’t overrun the end of the
array.
But what happens if you call
oneChar( ) with no arguments and without previously initializing the
value of s? In the definition for s, you could have provided an
initializer,
static const char* s = 0;
but if you do not provide an initializer
for a static variable of a built-in type,
the compiler guarantees that variable will be initialized to zero (converted to
the proper type) at program start-up. So in oneChar( ), the first
time the function is called, s is zero. In this case, the require( )
call will catch it.
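As a minimal sketch of that guarantee (the function name nextTicket is hypothetical and not one of the book's listings), the local static below has no initializer, so it starts at zero and keeps its value from one call to the next:

```cpp
// Hypothetical helper: hands out 1, 2, 3, ... on successive calls.
int nextTicket() {
  static int count; // No initializer: zero-initialized at start-up
  return ++count;   // The value survives between calls
}
```

Three successive calls to nextTicket( ) produce 1, 2, and 3.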
The initialization above for s is
very simple, but initialization for static objects (like all other objects) can
be arbitrary expressions involving constants and previously declared variables
and functions.
You should be aware that the function
above is very vulnerable to multithreading problems; whenever you design
functions containing static variables you should keep multithreading issues in
mind.
The rules are the same for static objects
of user-defined types, including the fact that some initialization is required
for the object. However, assignment to zero has meaning only for built-in types;
user-defined types must be initialized with constructor calls. Thus, if you
don’t specify constructor arguments when you define the static object, the
class must have a default
constructor. For
example,
//: C10:StaticObjectsInFunctions.cpp
#include <iostream>
using namespace std;

class X {
  int i;
public:
  X(int ii = 0) : i(ii) {} // Default
  ~X() { cout << "X::~X()" << endl; }
};

void f() {
  static X x1(47);
  static X x2; // Default constructor required
}

int main() {
  f();
}
///:~
The static objects of type X
inside f( ) can be initialized either with the constructor argument
list or with the default constructor. This construction occurs the first time
control passes through the definition, and only the first time.
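One way to convince yourself of the "only the first time" rule is to count constructor calls. The sketch below is illustrative only; the names constructions, Probe, visit, and constructionsAfter are assumptions made for this example:

```cpp
int constructions = 0; // Hypothetical counter, for illustration only

class Probe {
public:
  Probe() { ++constructions; } // Runs once per constructed object
};

void visit() {
  static Probe p; // Constructed the first time control reaches here
}

// Calls visit() n times and reports how many Probe objects were built.
int constructionsAfter(int n) {
  for(int i = 0; i < n; i++)
    visit();
  return constructions;
}
```

No matter how many times visit( ) runs, the counter never goes past one.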
Destructors for static objects (that is,
all objects with static storage, not just local static objects as in the example
above) are called when main( ) exits or when the Standard C library
function
exit( )
is explicitly called. In most implementations, main( ) just
calls exit( ) when it terminates. This means that it can be
dangerous to call exit( ) inside a destructor because you can end up
with infinite recursion. Static object destructors are not called if you
exit the program using the Standard C library function
abort( ).
You can specify actions to take place
when leaving main( ) (or calling exit( )) by using the
Standard C library function
atexit( ).
In this case, the functions registered by atexit( ) may be called
before the destructors for any objects constructed before leaving
main( ) (or calling exit( )).
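A minimal sketch of registering an exit handler might look like this. The function names sayGoodbye and registerGoodbye are hypothetical; the only library facts relied on are that atexit( ) takes a pointer to a function with no arguments and a void return, and that it returns zero when the registration succeeds:

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical cleanup routine: no arguments, no return value.
void sayGoodbye() {
  std::puts("sayGoodbye() called at exit");
}

// Registers sayGoodbye() with atexit(); true means it was accepted.
bool registerGoodbye() {
  return std::atexit(sayGoodbye) == 0;
}
```

Once registerGoodbye( ) has succeeded, sayGoodbye( ) runs when main( ) returns or exit( ) is called, but not if the program leaves through abort( ).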
Like ordinary destruction, destruction of
static objects
occurs
in the reverse order of initialization. However, only objects that have been
constructed are destroyed. Fortunately, the C++ development tools keep track of
initialization order and the objects that have been constructed. Global objects
are always
constructed
before main( ) is entered and destroyed as main( )
exits, but if a function containing a local static object
is never
called, the constructor for that object is never executed, so the destructor is
also not executed. For example,
//: C10:StaticDestructors.cpp
// Static object destructors
#include <fstream>
using namespace std;
ofstream out("statdest.out"); // Trace file

class Obj {
  char c; // Identifier
public:
  Obj(char cc) : c(cc) {
    out << "Obj::Obj() for " << c << endl;
  }
  ~Obj() {
    out << "Obj::~Obj() for " << c << endl;
  }
};

Obj a('a'); // Global (static storage)
// Constructor & destructor always called

void f() {
  static Obj b('b');
}

void g() {
  static Obj c('c');
}

int main() {
  out << "inside main()" << endl;
  f(); // Calls static constructor for b
  // g() not called
  out << "leaving main()" << endl;
}
///:~
In Obj, the char c acts as
an identifier so the constructor and destructor can print out information about
the object they’re working on. The Obj a is a global object, so the
constructor is always called for it before main( ) is entered, but
the constructors for the static Obj b inside f( ) and the
static Obj c inside g( ) are called only if those functions
are called.
To demonstrate which constructors and
destructors are called, only f( ) is called. The output of the
program is
Obj::Obj() for a
inside main()
Obj::Obj() for b
leaving main()
Obj::~Obj() for b
Obj::~Obj() for a
The constructor for a is called
before main( ) is entered, and the constructor for b is
called only because f( ) is called. When main( ) exits,
the destructors for the objects that have been constructed are called in reverse
order of their construction. This means that if g( ) is
called, the order in which the destructors for b and c are called
depends on whether f( ) or g( ) is called
first.
Notice that the trace file
ofstream object out is also a static object – since it is
defined outside of all functions, it lives in the static storage
area. It is important that its definition (as opposed to
an extern declaration) appear at the beginning of the file, before there
is any possible use of out. Otherwise, you’ll be using an object
before it is properly initialized.
In C++, the constructor for a global
static object is called before main( ) is entered, so you now have a
simple and portable way to execute code before entering
main( ) and to
execute code with the destructor after exiting
main( ).
In C, this was always a trial that required you to root around in the compiler
vendor’s assembly-language startup
code.
Ordinarily, any name at file scope
(that is, not nested inside a
class or function) is visible throughout all translation units in a program.
This is often called external linkage
because at link time the name is
visible to the linker everywhere, external to that translation unit. Global
variables and ordinary functions have external linkage.
There are times when you’d like to
limit the visibility of a name. You might like to have a variable at file scope
so all the functions in that file can use it, but you don’t want functions
outside that file to see or access that variable, or to inadvertently cause name
clashes with identifiers outside the file.
An object or function name at file scope
that is explicitly declared static is local to its translation unit (in
the terms of this book, the cpp file where the declaration occurs). That
name has internal
linkage. This means that you
can use the same name in other translation units without a name
clash.
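Here is a small sketch of file-static names in a single implementation file. The identifiers eventCount, bump, and recordEvent are hypothetical:

```cpp
// Everything marked static here has internal linkage: another
// translation unit may define its own eventCount or bump()
// without causing a clash at link time.
static int eventCount = 0;           // Visible only in this file

static void bump() { ++eventCount; } // Also local to this file

// Ordinary function with external linkage; other files can call it.
int recordEvent() {
  bump();
  return eventCount;
}
```

Code in other files can reach the counter only through recordEvent( ); the names eventCount and bump stay private to this translation unit.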
One advantage to internal linkage is that
the name can be placed in a header file without worrying
that there will be a clash at link time. const definitions, which are
commonly placed in header files, default to internal linkage. (However, const
defaults to internal linkage only in C++; in C it defaults to external
linkage.) Inline functions are also commonly placed in header files; in
Standard C++ they have external linkage unless declared static, but the
language allows the same inline definition to appear in every translation unit
that uses it, so they do not clash either. Note that linkage refers
only to elements that have addresses at link/load time; thus, class declarations
and local variables have no
linkage.
Here’s an example of how the two
meanings of static can
cross over each other. All global objects implicitly have static storage
class, so if you say (at file scope),
int a = 0;
then storage for a will be in the
program’s static data area, and the initialization for a will occur
once, before main( ) is entered. In addition, the visibility of
a is global across all translation units. In terms of visibility, the
opposite of static (visible only in this translation unit) is
extern,
which explicitly states that the visibility of the name is across all
translation units. So the definition above is equivalent to
saying
extern int a = 0;
But if you say instead,
static int a = 0;
all you’ve done is change the
visibility, so a has internal linkage. The storage class is unchanged
– the object resides in the static data area whether the visibility is
static or extern.
Once you get into local variables,
static stops altering the visibility and instead alters the storage
class.
If you declare what appears to be a local
variable as extern, it means that the storage exists elsewhere (so the
variable is actually global to the function). For example:
//: C10:LocalExtern.cpp
//{L} LocalExtern2
#include <iostream>

int main() {
  extern int i;
  std::cout << i;
}
///:~

//: C10:LocalExtern2.cpp {O}
int i = 5;
///:~
With function names (for non-member
functions), static and extern can only alter visibility, so if you
say
extern void f();
it’s the same as the unadorned
declaration
void f();
and if you say,
static void f();
it means f( ) is visible only within this translation unit. This is
sometimes called file static.
You will see static and
extern used commonly. There are two other storage class specifiers that
occur less often. The auto
specifier is almost never used
because it tells the compiler that this is a local variable. auto is
short for “automatic” and it refers to the way the compiler
automatically allocates storage for the variable. The compiler can always
determine this fact from the context in which the variable is defined, so
auto is redundant.
A register
variable is a local
(auto) variable, along with a hint to the compiler that this particular
variable will be heavily used so the compiler ought to keep it in a register if
it can. Thus, it is an optimization aid. Various compilers respond differently
to this hint; they have the option to ignore it. If you take the address of the
variable, the register specifier will almost certainly be ignored. You
should avoid using register because the compiler can usually do a better
job of optimization than
you.
Although names can be nested inside
classes, the names of global functions, global variables, and classes are still
in a single global name space. The static keyword gives you some control
over this by allowing you to give variables and functions internal linkage (that
is, to make them file static).
But in a large project, lack of control over the global name space can cause
problems. To solve these problems for classes, vendors often create long
complicated names that are unlikely to clash, but then you’re stuck typing
those names. (A typedef is often used to simplify
this.) It’s not an elegant, language-supported solution.
You can subdivide the global name space
into more manageable pieces using the namespace
feature of C++. The
namespace keyword, like class,
struct, enum, and union, puts the names of its members in a
distinct space. While the other keywords have additional purposes, the creation
of a new name space is the only purpose for
namespace.
The creation of a namespace is notably
similar to the creation of a class:
//: C10:MyLib.cpp
namespace MyLib {
  // Declarations
}
int main() {}
///:~
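This produces a new namespace containing
the enclosed declarations. There are significant differences from class,
struct, union and enum, however. For instance, a namespace
definition can be “continued” over multiple header files using a
syntax that, for a class, would appear to be a redefinition: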
//: C10:Header1.h
#ifndef HEADER1_H
#define HEADER1_H
namespace MyLib {
  extern int x;
  void f();
  // ...
}
#endif // HEADER1_H ///:~

//: C10:Header2.h
#ifndef HEADER2_H
#define HEADER2_H
#include "Header1.h"
// Add more names to MyLib
namespace MyLib { // NOT a redefinition!
  extern int y;
  void g();
  // ...
}
#endif // HEADER2_H ///:~

//: C10:Continuation.cpp
#include "Header2.h"
int main() {}
///:~
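A namespace name can also be aliased to another name, so you don’t have
to type an unwieldy name created by a library vendor:
//: C10:BobsSuperDuperLibrary.cpp
namespace BobsSuperDuperLibrary {
  class Widget { /* ... */ };
  class Poppit { /* ... */ };
  // ...
}
// Too much to type! I'll alias it:
namespace Bob = BobsSuperDuperLibrary;
int main() {}
///:~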
Each translation unit contains an unnamed
namespace that you can add to by
saying “namespace” without an identifier:
//: C10:UnnamedNamespaces.cpp
namespace {
  class Arm  { /* ... */ };
  class Leg  { /* ... */ };
  class Head { /* ... */ };
  class Robot {
    Arm arm[4];
    Leg leg[16];
    Head head[3];
    // ...
  } xanthan;
  int i, j, k;
}
int main() {}
///:~
The names in this space are automatically
available in that translation unit without qualification. It is guaranteed that
an unnamed space is unique for each translation unit. If you put local names in
an unnamed namespace, you don’t need to give them internal linkage by
making them static.
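As a rough sketch of using an unnamed namespace in place of file statics (the names hits, countHit, and registerHit are hypothetical):

```cpp
namespace {
  // These names are usable without qualification in this file,
  // but cannot be named from any other translation unit.
  int hits = 0;
  void countHit() { ++hits; }
}

// Ordinary function with external linkage that uses the hidden names.
int registerHit() {
  countHit();
  return hits;
}
```

The effect for the rest of the program is the same as declaring hits and countHit( ) static, without having to repeat the keyword on every declaration.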
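You can inject a friend declaration into a namespace by declaring it
within an enclosed class:
//: C10:FriendInjection.cpp
namespace Me {
  class Us {
    //...
    friend void you();
  };
}
int main() {}
///:~
Now the function you( ) is a member of the namespace Me.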
You can refer to a name within a
namespace in two ways: one name at a time, using the
scope resolution operator, or more expediently with the
using
keyword.
Any name in a namespace can be explicitly
specified using the
scope
resolution operator in the same way that you can refer to the names within a
class:
//: C10:ScopeResolution.cpp
namespace X {
  class Y {
    static int i;
  public:
    void f();
  };
  class Z;
  void func();
}
int X::Y::i = 9;

class X::Z {
  int u, v, w;
public:
  Z(int i);
  int g();
};

X::Z::Z(int i) { u = v = w = i; }
int X::Z::g() { return u = v = w = 0; }

void X::func() {
  X::Z a(1);
  a.g();
}
int main(){}
///:~
Notice that the definition X::Y::i
could just as easily be referring to a data member of a class Y nested in
a class X instead of a namespace X.
So far, namespaces look very much like
classes.
Because it can rapidly get tedious to
type the full qualification for an identifier in a namespace, the using
keyword allows you to import an entire namespace at once. When used in
conjunction with the namespace keyword this is called a
using
directive. The using
directive declares all the names of a namespace to be in the current scope, so
you can conveniently use the unqualified names. If we start with a simple
namespace:
//: C10:NamespaceInt.h
#ifndef NAMESPACEINT_H
#define NAMESPACEINT_H
namespace Int {
  enum sign { positive, negative };
  class Integer {
    int i;
    sign s;
  public:
    Integer(int ii = 0)
      : i(ii),
        s(i >= 0 ? positive : negative)
    {}
    sign getSign() const { return s; }
    void setSign(sign sgn) { s = sgn; }
    // ...
  };
}
#endif // NAMESPACEINT_H ///:~
One use of the using directive is
to bring all of the names in Int into another namespace, leaving those
names nested within the namespace:
//: C10:NamespaceMath.h
#ifndef NAMESPACEMATH_H
#define NAMESPACEMATH_H
#include "NamespaceInt.h"
namespace Math {
  using namespace Int;
  Integer a, b;
  Integer divide(Integer, Integer);
  // ...
}
#endif // NAMESPACEMATH_H ///:~
You can also declare all of the names in
Int inside a function, but leave those names nested within the
function:
//: C10:Arithmetic.cpp
#include "NamespaceInt.h"
void arithmetic() {
  using namespace Int;
  Integer x;
  x.setSign(positive);
}
int main(){}
///:~
Without the using directive, all
the names in the namespace would need to be fully qualified.
One aspect of the using directive
may seem slightly counterintuitive at first. The visibility of the names
introduced with a using directive is the scope in which the directive is
made. But you can override the names from the using directive as if
they’ve been declared globally to that scope!
//: C10:NamespaceOverriding1.cpp
#include "NamespaceMath.h"
int main() {
  using namespace Math;
  Integer a; // Hides Math::a;
  a.setSign(negative);
  // Now scope resolution is necessary
  // to select Math::a :
  Math::a.setSign(positive);
}
///:~
Suppose you have a second namespace that
contains some of the names in namespace Math:
//: C10:NamespaceOverriding2.h
#ifndef NAMESPACEOVERRIDING2_H
#define NAMESPACEOVERRIDING2_H
#include "NamespaceInt.h"
namespace Calculation {
  using namespace Int;
  Integer divide(Integer, Integer);
  // ...
}
#endif // NAMESPACEOVERRIDING2_H ///:~
Since this namespace is also introduced
with a using directive, you have the possibility of a collision. However,
the ambiguity appears at the
point of use of the name, not at the using
directive:
//: C10:OverridingAmbiguity.cpp
#include "NamespaceMath.h"
#include "NamespaceOverriding2.h"
void s() {
  using namespace Math;
  using namespace Calculation;
  // Everything's ok until:
  //! divide(1, 2); // Ambiguity
}
int main() {}
///:~
Thus, it’s possible to write
using directives to introduce a number of namespaces with conflicting
names without ever producing an ambiguity.
You can introduce names one at a time
into the current scope with a using
declaration.
Unlike the using directive, which treats names as if they were declared
globally to the scope, a using declaration is a declaration within the
current scope. This means it can override names from a using
directive:
//: C10:UsingDeclaration.h
#ifndef USINGDECLARATION_H
#define USINGDECLARATION_H
namespace U {
  inline void f() {}
  inline void g() {}
}
namespace V {
  inline void f() {}
  inline void g() {}
}
#endif // USINGDECLARATION_H ///:~
//: C10:UsingDeclaration1.cpp
#include "UsingDeclaration.h"
void h() {
  using namespace U; // Using directive
  using V::f; // Using declaration
  f(); // Calls V::f();
  U::f(); // Must fully qualify to call
}
int main() {}
///:~
The using declaration just gives
the fully specified name of the identifier, but no type information. This means
that if the namespace contains a set of
overloaded functions with the
same name, the using declaration declares all the functions in the
overloaded set.
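To see a single using declaration bring in a whole overloaded set, consider this sketch. The namespace Geometry and the functions in it are hypothetical:

```cpp
namespace Geometry { // Hypothetical namespace
  int area(int side) { return side * side; }
  int area(int width, int height) { return width * height; }
}

int squareArea(int side) {
  using Geometry::area; // Declares both overloads of area()
  return area(side);    // Selects area(int)
}

int rectangleArea(int width, int height) {
  using Geometry::area;
  return area(width, height); // Selects area(int, int)
}
```

Neither using declaration mentions an argument list, yet ordinary overload resolution picks the right function at each call.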
You can put a using declaration
anywhere a normal declaration can occur. A using declaration works like a
normal declaration in all ways but one: because you don’t give an argument
list, it’s possible for a using declaration to cause the overload
of a function with the same argument types (which
isn’t allowed with normal overloading). This ambiguity, however,
doesn’t show up until the point of use, rather than the point of
declaration.
A using declaration can also
appear within a namespace, and it has the same effect as anywhere else –
that name is declared within the space:
//: C10:UsingDeclaration2.cpp
#include "UsingDeclaration.h"
namespace Q {
  using U::f;
  using V::g;
  // ...
}
void m() {
  using namespace Q;
  f(); // Calls U::f();
  g(); // Calls V::g();
}
int main() {}
///:~
A using declaration is an alias,
and it allows you to declare the same function in separate namespaces. If you
end up re-declaring the same function by importing different namespaces,
it’s OK – there won’t be any ambiguities or
duplications.
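The following sketch shows the same function reaching a scope through two different namespaces without any ambiguity. All of the names (Core, Left, Right, answer, callThroughBoth) are made up for the illustration:

```cpp
namespace Core {
  int answer() { return 42; }
}
namespace Left {
  using Core::answer; // Alias for Core::answer
}
namespace Right {
  using Core::answer; // Another alias for the same function
}

int callThroughBoth() {
  using namespace Left;
  using namespace Right;
  return answer(); // OK: both names refer to Core::answer
}
```

Contrast this with two namespaces that each define their own function with the same name and signature, where the unqualified call would be ambiguous.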
Some of the rules above may seem a bit
daunting at first, especially if you get the impression that you’ll be
using them all the time. In general, however, you can get away with very simple
usage of namespaces as long as you understand how they work. The key thing to
remember is that when you introduce a global using directive (via a
“using namespace” outside of any scope) you have thrown open
the namespace for that file. This is usually fine for an implementation file (a
“cpp” file) because the using directive is only in
effect until the end of the compilation of that file. That is, it doesn’t
affect any other files, so you can adjust the control of the namespaces one
implementation file at a time. For example, if you discover a name clash because
of too many using directives in a particular implementation file, it is a
simple matter to change that file so that it uses explicit qualifications or
using declarations to eliminate the clash, without modifying other
implementation files.
Header
files are a different issue. You virtually never want to introduce a global
using directive into a header file, because that would mean that any
other file that included your header would also have the namespace thrown open
(and header files can include other header files).
So, in header files you should either use
explicit qualification or scoped using directives and using
declarations. This is the practice that you will find in this book, and by
following it you will not “pollute” the global namespace and throw
yourself back into the pre-namespace world of C++.
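As an illustration of that guideline, a header written with explicit qualification might look like the sketch below. The file name Greeting.h and the function greet( ) are hypothetical:

```cpp
// Greeting.h (hypothetical header)
#ifndef GREETING_H
#define GREETING_H
#include <string>

// No "using namespace std;" here: names are qualified explicitly,
// so files that include this header keep control of their own names.
inline std::string greet(const std::string& name) {
  return "Hello, " + name;
}

#endif // GREETING_H
```

An implementation file that includes this header is still free to add its own using directive or using declarations; the header has not made that choice for it.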
There are times when you need a single
storage space to be used by all objects of a class. In C, you would use a global
variable, but this is not very safe. Global data can be modified by anyone, and
its name can clash with other identical names in a large project. It would be
ideal if the data could be stored as if it were global, but be hidden inside a
class, and clearly associated with that class.
This is accomplished with static
data members inside a class. There is a single piece of storage for a
static data member, regardless of how many objects of that class you
create. All objects share the same static storage space for that data
member, so it is a way for them to “communicate” with each other.
But the static data belongs to the class; its name is scoped inside the
class and it can be public, private, or
protected.
Because static data has a single
piece of storage regardless of how many objects are created, that storage must
be defined in a single place. The compiler will not allocate storage for you.
The linker will report an error if a static data member is declared but
not defined.
The definition must occur outside the
class (no inlining is allowed), and only one definition is allowed. Thus, it is
common to put it in the implementation file for the class. The syntax sometimes
gives people trouble, but it is actually quite logical. For example, if you
create a static data member inside a class like this:
class A {
  static int i;
public:
  //...
};
Then you must define storage for that
static data member in the definition file like this:
int A::i = 1;
If you were to define an ordinary global
variable, you would say
int i = 1;
but here, the scope resolution operator
and the class name are used to specify A::i.
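A common use of a static data member is to count the objects of a class that currently exist. The sketch below assumes a hypothetical class Widget and ignores copying for brevity:

```cpp
class Widget { // Hypothetical class
  static int count; // Declaration only: one copy shared by all Widgets
public:
  Widget() { ++count; }
  ~Widget() { --count; }
  int liveCount() const { return count; }
};

int Widget::count = 0; // The single definition, outside the class
```

Every Widget reports the same value from liveCount( ), because they all read the one piece of storage defined as Widget::count.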
Some people have trouble with the idea
that A::i is private, and yet here’s something that seems to
be manipulating it right out in the open. Doesn’t this break the
protection mechanism? It’s a completely safe practice for two reasons.
First, the only place this initialization is legal is in the definition. Indeed,
if the static data were an object with a constructor, you would call the
constructor instead of using the = operator. Second, once the definition
has been made, the end-user cannot make a second definition – the linker
will report an error. And the class creator is forced to create the definition
or the code won’t link during testing. This ensures that the definition
happens only once and that it’s in the hands of the class
creator.
//: C10:Statinit.cpp
// Scope of static initializer
#include <iostream>
using namespace std;

int x = 100;

class WithStatic {
  static int x;
  static int y;
public:
  void print() const {
    cout << "WithStatic::x = " << x << endl;
    cout << "WithStatic::y = " << y << endl;
  }
};

int WithStatic::x = 1;
int WithStatic::y = x + 1;
// WithStatic::x NOT ::x

int main() {
  WithStatic ws;
  ws.print();
}
///:~
Here, the qualification
WithStatic:: extends the scope of WithStatic to the entire
definition.
Chapter 8 introduced the static
const variable that allows you to define a constant value inside a class
body.
It’s
also possible to create arrays of static objects, both const and
non-const. The syntax is reasonably consistent:
//: C10:StaticArray.cpp
// Initializing static arrays in classes
class Values {
  // static consts are initialized in-place:
  static const int scSize = 100;
  static const long scLong = 100;
  // Automatic counting works with static arrays.
  // Arrays, Non-integral and non-const statics
  // must be initialized externally:
  static const int scInts[];
  static const long scLongs[];
  static const float scTable[];
  static const char scLetters[];
  static int size;
  static const float scFloat;
  static float table[];
  static char letters[];
};

int Values::size = 100;
const float Values::scFloat = 1.1;

const int Values::scInts[] = {
  99, 47, 33, 11, 7
};

const long Values::scLongs[] = {
  99, 47, 33, 11, 7
};

const float Values::scTable[] = {
  1.1, 2.2, 3.3, 4.4
};

const char Values::scLetters[] = {
  'a', 'b', 'c', 'd', 'e',
  'f', 'g', 'h', 'i', 'j'
};

float Values::table[4] = {
  1.1, 2.2, 3.3, 4.4
};

char Values::letters[10] = {
  'a', 'b', 'c', 'd', 'e',
  'f', 'g', 'h', 'i', 'j'
};

int main() { Values v; }
///:~
With static consts of integral
types you can provide the definitions inside the class, but for everything else
(including arrays of integral types, even if they are static const)
you must provide a single external definition for the member. These
definitions have external linkage, like the class they belong to, so each one
belongs in a single implementation file rather than in a header file that
several translation units include. The
syntax for initializing static arrays is the same as for any aggregate,
including automatic counting.
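Automatic counting can be checked with sizeof once the external definition has been seen. This is only a sketch; the struct Primes and the function primeCount( ) are hypothetical names:

```cpp
struct Primes { // Hypothetical class
  static const int table[]; // Size left for the definition to supply
};

// The compiler counts the initializers, so table has five elements.
const int Primes::table[] = { 2, 3, 5, 7, 11 };

// After the definition the array type is complete, so sizeof works.
int primeCount() {
  return static_cast<int>(
    sizeof(Primes::table) / sizeof(Primes::table[0]));
}
```

Note that primeCount( ) must appear after the definition of Primes::table; before that point the array has no known size.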
You can also create static const
objects of class types and arrays of such objects. However, you cannot
initialize them using the “inline syntax” allowed for static
consts of integral built-in types:
//: C10:StaticObjectArrays.cpp
// Static arrays of class objects
class X {
  int i;
public:
  X(int ii) : i(ii) {}
};

class Stat {
  // This doesn't work, although
  // you might want it to:
//!  static const X x(100);
  // Both const and non-const static class
  // objects must be initialized externally:
  static X x2;
  static X xTable2[];
  static const X x3;
  static const X xTable3[];
};

X Stat::x2(100);

X Stat::xTable2[] = {
  X(1), X(2), X(3), X(4)
};

const X Stat::x3(100);

const X Stat::xTable3[] = {
  X(1), X(2), X(3), X(4)
};

int main() { Stat v; }
///:~
The initialization of both const
and non-const static arrays of class objects must be performed the
same way, following the typical static definition
syntax.
You can easily put static data members in
classes that are nested inside
other classes. The definition of such members is an intuitive and obvious
extension – you simply use another level of scope resolution. However, you
cannot have static data members inside local classes
(a local class is a class
defined inside a function). Thus,
//: C10:Local.cpp
// Static members & local classes
#include <iostream>
using namespace std;

// Nested class CAN have static data members:
class Outer {
  class Inner {
    static int i; // OK
  };
};

int Outer::Inner::i = 47;

// Local class cannot have static data members:
void f() {
  class Local {
  public:
//! static int i;  // Error
    // (How would you define i?)
  } x;
}

int main() { Outer x; f(); }
///:~
You can see the immediate problem with a
static member in a local class: How do you describe the data member at
file scope in order to define it? In practice, local classes are used very
rarely.
You can also create static member
functions
that,
like static data members, work for the class as a whole rather than for a
particular object of a class. Instead of making a global function that lives in
and “pollutes” the global or local namespace, you bring the function
inside the class. When you create a static member function, you are
expressing an association with a particular class.
You can call a static member
function in the ordinary way, with the dot or the arrow, in association with an
object. However, it’s more typical to call a static member function
by itself, without any specific object, using the scope-resolution operator,
like
this:
//: C10:SimpleStaticMemberFunction.cpp
class X {
public:
  static void f(){};
};

int main() {
  X::f();
}
///:~
When you see static member functions in a
class, remember that the designer intended that function to be conceptually
associated with the class as a whole.
A static member function cannot
access ordinary data members, only static data members. It can call only
other static member functions. Normally, the address of the current
object (this) is quietly passed in when any
member function is called, but a static member has no
this, which is the reason it cannot access
ordinary members. Thus, you get the tiny increase in speed afforded by a global
function because a static member function doesn’t have the extra
overhead of passing this. At the same time you get the benefits of having
the function inside the class.
For data members, static indicates
that only one piece of storage for member data exists for all objects of a
class. This parallels the use of static to define objects inside a
function to mean that only one copy of a local variable is used for all calls of
that function.
//: C10:StaticMemberFunctions.cpp
class X {
  int i;
  static int j;
public:
  X(int ii = 0) : i(ii) {
    // Non-static member function can access
    // static member function or data:
    j = i;
  }
  int val() const { return i; }
  static int incr() {
    //! i++; // Error: static member function
    // cannot access non-static member data
    return ++j;
  }
  static int f() {
    //! val(); // Error: static member function
    // cannot access non-static member function
    return incr(); // OK -- calls static
  }
};

int X::j = 0;

int main() {
  X x;
  X* xp = &x;
  x.f();
  xp->f();
  X::f(); // Only works with static members
}
///:~
Because they have no this pointer,
static member functions can neither access non-static data members
nor call non-static member functions.
Notice in main( ) that a
static member can be selected using the usual dot or arrow syntax,
associating that function with an object, but also with no object (because a
static member is associated with a class, not a particular object), using
the class name and scope resolution operator.
Here’s an interesting feature:
Because of the way initialization happens for static member objects, you
can put a static data member of the same class inside that class.
Here’s an example that allows only a single object of type Egg to
exist by making the constructor private. You can access that object, but you
can’t create any new Egg objects:
//: C10:Singleton.cpp
// Static member of same type, ensures that
// only one object of this type exists.
// Also referred to as the "singleton" pattern.
#include <iostream>
using namespace std;

class Egg {
  static Egg e;
  int i;
  Egg(int ii) : i(ii) {}
  Egg(const Egg&); // Prevent copy-construction
public:
  static Egg* instance() { return &e; }
  int val() const { return i; }
};

Egg Egg::e(47);

int main() {
  //! Egg x(1); // Error -- can't create an Egg
  // You can access the single instance:
  cout << Egg::instance()->val() << endl;
} ///:~
The initialization for e happens
after the class declaration is complete, so the compiler has all the information
it needs to allocate storage and make the constructor call.
To completely prevent the creation of any
other objects, something else has been added: a second private constructor
called the
copy-constructor. At this
point in the book, you cannot know why this is necessary since the copy
constructor will not be introduced until the next chapter. However, as a sneak
preview, if you were to remove the copy-constructor defined in the example
above, you’d be able to create an Egg object like
this:
Egg e = *Egg::instance(); Egg e2(*Egg::instance());
Both of these use the copy-constructor,
so to seal off that possibility the copy-constructor is declared as private (no
definition is necessary because it never gets called). A large portion of the
next chapter is a discussion of the copy-constructor so it should become clear
to you
then.
Within a specific translation unit, the
order of initialization of static objects is guaranteed to be the order in which
the object definitions appear in that translation unit.
The order of destruction is guaranteed to be the reverse of the order of
initialization.
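The within-one-file guarantee can be observed with a short sketch. Everything here (the Tracer class, the trace string, and checkOrder( )) is hypothetical and exists only for illustration; the three Tracer objects are defined in one translation unit, so they are constructed in the order written.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Defined first in this translation unit, so it is constructed
// before the Tracer objects below and destroyed after them.
std::string trace;

// Hypothetical helper that records when it is constructed
// and reports when it is destroyed.
class Tracer {
  char id;
public:
  explicit Tracer(char c) : id(c) { trace += id; }
  ~Tracer() { std::printf("~Tracer %c\n", id); }
};

Tracer a('a');
Tracer b('b');
Tracer c('c');

// Intended to be called from main(); by then every static
// object above has already been constructed, in order.
void checkOrder() { assert(trace == "abc"); }
```

When the program exits, the destructor messages appear for c, then b, then a, which is the reverse of the construction order.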
However, there is no guarantee concerning
the order of initialization of static objects across translation units,
and the language provides no way to specify this order. This can cause
significant problems. As an example of an instant disaster (which will halt
primitive operating systems and kill the process on sophisticated ones), if one
file contains
// First file
#include <fstream>
std::ofstream out("out.txt");
and another file uses the out
object in one of its initializers
// Second file
#include <fstream>
extern std::ofstream out;
class Oof {
public:
  Oof() { out << "ouch"; }
} oof;
the program may work, and it may not. If
the programming environment builds the program so that the first file is
initialized before the second file, then there will be no problem. However, if
the second file is initialized before the first, the constructor for Oof
relies upon the existence of out, which hasn’t been constructed yet
and this causes chaos.
This problem only occurs with static
object initializers that depend on each other, because by the time you
get into main( ), all constructors for static objects have already
been called.
A subtler example can be found in the
ARM.[47]
In one file you have at the global scope:
extern int y; int x = y + 1;
and in a second file you have at the
global scope:
extern int x; int y = x + 1;
For all static objects, the
linking-loading mechanism guarantees a static initialization to
zero before the dynamic
initialization specified by the programmer takes place. In the previous example,
zeroing of the storage occupied by the ofstream out object has no special
meaning, so it is truly undefined until the constructor is called. However, with
built-in types, initialization to zero does have meaning, and if the
files are initialized in the order they are shown above, y begins as
statically initialized to zero, so x becomes one, and y is
dynamically initialized to two. However, if the files are initialized in the
opposite order, x is statically initialized to zero, y is
dynamically initialized to one, and x then becomes two.
Programmers must be aware of this because
they can create a program with static initialization dependencies and get it
working on one platform, but move it to another compiling environment where it
suddenly, mysteriously, doesn’t
work.
There are three approaches to dealing
with this problem:
This technique was pioneered by Jerry
Schwarz while creating the iostream library (because the
definitions for cin, cout, and cerr are static and
live in a separate file). It’s actually inferior to the second technique
but it’s been around a long time and so you may come across code that uses
it; thus it’s important that you understand how it works.
This technique requires an additional
class in your library header file. This class is responsible for the dynamic
initialization of your library’s static objects. Here is a simple
example:
//: C10:Initializer.h
// Static initialization technique
#ifndef INITIALIZER_H
#define INITIALIZER_H
#include <iostream>
extern int x; // Declarations, not definitions
extern int y;

class Initializer {
  static int initCount;
public:
  Initializer() {
    std::cout << "Initializer()" << std::endl;
    // Initialize first time only
    if(initCount++ == 0) {
      std::cout << "performing initialization"
                << std::endl;
      x = 100;
      y = 200;
    }
  }
  ~Initializer() {
    std::cout << "~Initializer()" << std::endl;
    // Clean up last time only
    if(--initCount == 0) {
      std::cout << "performing cleanup"
                << std::endl;
      // Any necessary cleanup here
    }
  }
};

// The following creates one object in each
// file where Initializer.h is included, but that
// object is only visible within that file:
static Initializer init;
#endif // INITIALIZER_H ///:~
The declarations for x and
y announce only that these objects exist, but they don’t allocate
storage for the objects. However, the definition for the Initializer init
allocates storage for that object in every file where the header is included.
But because the name is static (controlling visibility this time, not the
way storage is allocated; storage is at file scope by default), it is visible
only within that translation unit, so the linker will not complain about
multiple definition errors.
Here is the file containing the
definitions for x, y, and initCount:
//: C10:InitializerDefs.cpp {O}
// Definitions for Initializer.h
#include "Initializer.h"
// Static initialization will force
// all these values to zero:
int x;
int y;
int Initializer::initCount;
///:~
(Of course, a file static instance of
init is also placed in this file when the header is included.) Suppose
that two other files are created by the library user:
//: C10:Initializer.cpp {O}
// Static initialization
#include "Initializer.h"
///:~
and
//: C10:Initializer2.cpp
//{L} InitializerDefs Initializer
// Static initialization
#include "Initializer.h"
using namespace std;

int main() {
  cout << "inside main()" << endl;
  cout << "leaving main()" << endl;
} ///:~
Now it doesn’t matter which
translation unit is initialized first. The first time a translation unit
containing Initializer.h is initialized, initCount will be zero so
the initialization will be performed. (This depends heavily on the fact that the
static storage area is set to zero before any dynamic initialization takes
place.) For all the rest of the translation units, initCount will be
nonzero and the initialization will be skipped. Cleanup happens in the reverse
order, and ~Initializer( ) ensures that it will happen only
once.
This example used built-in types as the
global static objects. The technique also works with classes, but those objects
must then be dynamically initialized by the Initializer class. One way to
do this is to create the classes without constructors and destructors, but
instead with initialization and cleanup member functions using different names.
A more common approach, however, is to have pointers to objects and to create
them using new inside Initializer( ).
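The pointer-based variant just described might look roughly like the following sketch. It is collapsed into a single file so it can be tried on its own, and the names Logger, logger, LoggerInit, and useLibrary( ) are hypothetical; in a real library the class and the file-static object would sit in the header, with the definitions of logger and initCount in one implementation file.

```cpp
#include <cassert>
#include <string>

// Hypothetical library class that needs a real constructor.
class Logger {
  std::string prefix;
public:
  explicit Logger(const std::string& p) : prefix(p) {}
  std::string tag(const std::string& msg) const {
    return prefix + msg;
  }
};

// A pointer is a built-in type, so it is statically
// initialized to zero before any dynamic initialization.
Logger* logger = 0;

class LoggerInit {
  static int initCount;
public:
  LoggerInit() {
    // Create the object the first time only
    if(initCount++ == 0) logger = new Logger("[lib] ");
  }
  ~LoggerInit() {
    // Destroy it the last time only
    if(--initCount == 0) { delete logger; logger = 0; }
  }
  static int count() { return initCount; }
};

int LoggerInit::initCount; // Static storage: starts at zero

// In the real technique this line lives in the header, so each
// translation unit that includes it gets its own copy.
static LoggerInit loggerInit;

// Intended to be called from main() or later; by then
// logger points at a fully constructed Logger.
std::string useLibrary() {
  assert(logger != 0);
  return logger->tag("ready");
}
```

Creating further LoggerInit objects only raises the count; the Logger itself is created once and deleted when the last LoggerInit is destroyed.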
Long after technique one was in use,
someone (I don’t know who) came up with the technique explained in this
section, which is much simpler and cleaner than technique one. The fact that it
took so long to discover is a tribute to the complexity of C++.
This technique relies on the fact that
static
objects inside functions are initialized the first time (only) that the function
is called. Keep in mind that the problem we’re really trying to solve here
is not when the static objects are initialized (that can be controlled
separately) but rather making sure that the initialization happens in the proper
order.
This technique is very neat and clever.
For any initialization dependency, you place a static object inside a function
that returns a reference to that object. This way, the only way you can access
the static object is by calling the function, and if that object needs to access
other static objects on which it is dependent it must call their
functions. And the first time a function is called, it forces the initialization
to take place. The order of static initialization is guaranteed to be correct
because of the design of the code, not because of an arbitrary order established
by the linker.
To set up an example, here are two
classes that depend on each other. The first one contains a bool that is
initialized only by the constructor, so you can tell if the constructor has been
called for a static instance of the class (the static storage area is
initialized to zero at program startup, which produces a false value for
the bool if the constructor has not been called):
//: C10:Dependency1.h
#ifndef DEPENDENCY1_H
#define DEPENDENCY1_H
#include <iostream>

class Dependency1 {
  bool init;
public:
  Dependency1() : init(true) {
    std::cout << "Dependency1 construction"
              << std::endl;
  }
  void print() const {
    std::cout << "Dependency1 init: "
              << init << std::endl;
  }
};
#endif // DEPENDENCY1_H ///:~
The constructor also announces when it is
being called, and you can print( ) the state of the object to find
out if it has been initialized.
The second class is initialized from an
object of the first class, which is what will cause the
dependency:
//: C10:Dependency2.h
#ifndef DEPENDENCY2_H
#define DEPENDENCY2_H
#include "Dependency1.h"

class Dependency2 {
  Dependency1 d1;
public:
  Dependency2(const Dependency1& dep1): d1(dep1){
    std::cout << "Dependency2 construction ";
    print();
  }
  void print() const { d1.print(); }
};
#endif // DEPENDENCY2_H ///:~
The constructor announces itself and
prints the state of the d1 object so you can see if it has been
initialized by the time the constructor is called.
To demonstrate what can go wrong, the
following file first puts the static object definitions in the wrong order, as
they would occur if the linker happened to initialize the Dependency2
object before the Dependency1 object. Then the order is reversed to show
how it works correctly if the order happens to be “right.” Lastly,
technique two is demonstrated.
To provide more readable output, the
function separator( ) is created. The trick is that you can’t
call a function globally unless that function is being used to perform the
initialization of a variable, so separator( ) returns a dummy value
that is used to initialize a couple of global variables.
//: C10:Technique2.cpp
#include "Dependency2.h"
using namespace std;

// Returns a value so it can be called as
// a global initializer:
int separator() {
  cout << "---------------------" << endl;
  return 1;
}

// Simulate the dependency problem:
extern Dependency1 dep1;
Dependency2 dep2(dep1);
Dependency1 dep1;
int x1 = separator();

// But if it happens in this order it works OK:
Dependency1 dep1b;
Dependency2 dep2b(dep1b);
int x2 = separator();

// Wrapping static objects in functions succeeds
Dependency1& d1() {
  static Dependency1 dep1;
  return dep1;
}

Dependency2& d2() {
  static Dependency2 dep2(d1());
  return dep2;
}

int main() {
  Dependency2& dep2 = d2();
} ///:~
The functions d1( ) and
d2( ) wrap static instances of Dependency1 and
Dependency2 objects. Now, the only way you can get to the static objects
is by calling the functions and that forces static initialization on the first
function call. This means that initialization is guaranteed to be correct, which
you’ll see when you run the program and look at the
output.
Here’s how you would actually
organize the code to use the technique. Ordinarily, the static objects would be
defined in separate files (because you’re forced to for some reason;
remember that defining the static objects in separate files is what causes the
problem), so instead you define the wrapping functions in separate files. But
they’ll need to be declared in header files:
//: C10:Dependency1StatFun.h
#ifndef DEPENDENCY1STATFUN_H
#define DEPENDENCY1STATFUN_H
#include "Dependency1.h"
extern Dependency1& d1();
#endif // DEPENDENCY1STATFUN_H ///:~
Actually, the “extern” is
redundant for the function declaration. Here’s the second header
file:
//: C10:Dependency2StatFun.h
#ifndef DEPENDENCY2STATFUN_H
#define DEPENDENCY2STATFUN_H
#include "Dependency2.h"
extern Dependency2& d2();
#endif // DEPENDENCY2STATFUN_H ///:~
Now, in the implementation files where
you would previously have placed the static object definitions, you instead
place the wrapping function definitions:
//: C10:Dependency1StatFun.cpp {O}
#include "Dependency1StatFun.h"
Dependency1& d1() {
  static Dependency1 dep1;
  return dep1;
} ///:~
Presumably, other code might also be
placed in these files. Here’s the other file:
//: C10:Dependency2StatFun.cpp {O}
#include "Dependency1StatFun.h"
#include "Dependency2StatFun.h"
Dependency2& d2() {
  static Dependency2 dep2(d1());
  return dep2;
} ///:~
So now there are two files that could be
linked in any order; if they contained ordinary static objects, any order
of initialization could result. But since they contain the wrapping functions,
there’s no threat of incorrect initialization:
//: C10:Technique2b.cpp
//{L} Dependency1StatFun Dependency2StatFun
#include "Dependency2StatFun.h"
int main() { d2(); } ///:~
When you run this program you’ll
see that the initialization of the Dependency1 static object always
happens before the initialization of the Dependency2 static object. You
can also see that this is a much simpler approach than technique
one.
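To check the ordering without reading console output, here is a compact single-file sketch of the same idea. The Engine and Car classes, the initLog( ) function, and the wrapper names are hypothetical stand-ins for Dependency1 and Dependency2; the log is itself wrapped in a function so that it is ready whenever a constructor first touches it.

```cpp
#include <cassert>
#include <string>

// The log is a function-local static too, so it is
// constructed on first use, whoever gets there first.
std::string& initLog() {
  static std::string log;
  return log;
}

// Hypothetical classes standing in for Dependency1/Dependency2.
struct Engine {
  Engine() { initLog() += "Engine;"; }
};

struct Car {
  explicit Car(const Engine&) { initLog() += "Car;"; }
};

// Wrapping functions: the only way to reach the static objects.
Engine& engine() {
  static Engine e;
  return e;
}

Car& car() {
  static Car c(engine()); // Forces Engine to be built first
  return c;
}

// Intended to be called from main(): asks for the dependent
// object first and confirms what was constructed, and when.
void checkDependencyOrder() {
  car();
  assert(initLog() == "Engine;Car;");
  engine(); // Later calls construct nothing new
  car();
  assert(initLog() == "Engine;Car;");
}
```

Even though car( ) is requested first, its initializer calls engine( ), so the Engine is always constructed before the Car that depends on it.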
You might be tempted to write
d1( ) and d2( ) as inline functions inside their
respective header files, but this calls for caution. An
inline function can be duplicated in every file in which it appears, and
with compilers that give inline functions internal linkage (as pre-Standard
compilers did by default, and as a function declared both static and inline
still gets), this duplication includes the static object definition. The result
is multiple static objects across the various translation units, which would
certainly cause problems. Standard C++ gives inline functions external linkage
by default and requires a static local object in such a function to be the same
object in every translation unit, but unless you know your compiler conforms,
the safe course is to ensure that there is only one definition
of each wrapping function, and this means not making the wrapping functions
inline.
What happens if you’re writing a
program in C++ and you want to use a C library? If you make the C function
declaration,
float f(int a, char b);
the C++ compiler will decorate this name
to something like _f_int_char to support function overloading (and
type-safe linkage). However, the C compiler that compiled your C library has
most definitely not decorated the name, so its internal name will be
_f. Thus, the linker will not be able to resolve your C++ calls to
f( ).
The escape mechanism provided in C++ is
the alternate linkage specification, which was produced in the language
by overloading the extern
keyword. The extern is followed by a string that
specifies the linkage you want for the declaration, followed by the
declaration:
extern "C" float f(int a, char b);
This tells the compiler to give C linkage
to f( ) so that the compiler doesn’t decorate the
name. The only two types of
linkage specifications supported by the standard are “C” and
“C++,” but compiler vendors have the option of supporting
other languages in the same way.
If you have a group of declarations with
alternate linkage, put them inside braces, like this:
extern "C" { float f(int a, char b); double d(int a, char b); }
Or, for a header file,
extern "C" {
#include "Myheader.h"
}
Most C++ compiler vendors handle the
alternate linkage specifications inside their header files that work with both C
and C++, so you don’t have to worry about
it.
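Headers that serve both languages typically guard the linkage specification with the predefined __cplusplus macro, which only a C++ compiler defines. The following is a sketch of that arrangement, not any particular vendor's header: the function scale( ) is hypothetical, and a stand-in definition is included only so the example links by itself (in practice the definition would come from the compiled C library).

```cpp
#include <cassert>

/* ---- What a dual-language header might contain ---- */
#ifdef __cplusplus
extern "C" {   /* Seen only by a C++ compiler */
#endif

/* Hypothetical C library function */
float scale(int a, char b);

#ifdef __cplusplus
}
#endif
/* ---- End of header portion ---- */

// Stand-in for the C library's definition, given C linkage so
// its name is not decorated and matches the declaration above.
extern "C" float scale(int a, char b) {
  return a * 0.5f + b;
}

// Intended to be called from main(): the call resolves to the
// undecorated name, exactly as it would for a real C library.
void checkLinkage() { assert(scale(4, 2) == 4.0f); }
```

A C compiler never sees the extern "C" lines, so the same header can be included from either language without change.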
The static keyword can be
confusing because in some situations it controls the location of storage, and in
others it controls visibility and linkage of a name.
With the introduction of C++ namespaces,
you have an improved and more flexible alternative to control the proliferation
of names in large projects.
The use of static inside classes
is one more way to control names in a program. The names do not clash with
global names, and visibility and access are kept within the class, giving
you greater control in the maintenance of your
code.
Solutions to selected exercises
can be found in the electronic document The Thinking in C++ Annotated
Solution Guide, available for a small fee from
www.BruceEckel.com.
[47] Bjarne Stroustrup and Margaret Ellis, The Annotated C++ Reference Manual,
Addison-Wesley, 1990, pp. 20-21.