Rules for properly structuring and developing a C++ project

There are many different methods programmers have picked up for laying out a C++ project or library.

After a number of years of implementing, interacting with and improving C++ projects, I’ve found the following to be the best structure to use.

These rules and patterns are intended to increase consistency, readability and modularity, and to reduce programmer error.

Credit for the refinement of my coding style and practices goes to a number of people I’ve worked with (read: leeched their brains), including but not limited to David Hogan and Jeremy Kincaid.

Last updated 2011/02/16

Separate your code

The first thing you need to do is ensure your project doesn’t clash with any other libraries. The best way to do this is to use namespaces. Use them everywhere.

Got a library? Put it in a namespace!

Got a module? Put it in a namespace!

Got a common set of libraries? Put them all in a namespace!

Namespace for projects and modules

All of your project’s code (with the exception of main) should be in a namespace. This will ensure that any function you define will not clash with any other library, now or in the future.

Twisted Pair have developed a C++ library called Shock.

Shock is divided into separate modules. All code is within the “Shock” namespace. All modules are then given their own namespace within this one.

For example, our types module defines common type definitions for all platforms. This is in the Shock::Types namespace.

This makes coding more verbose, but it’s better to have code that is clear about its intentions.
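As a rough sketch of that layout: the Shock and Types names come from the text above, while the Byte alias, the Maths module and clampToByte are made up here purely for illustration and are not part of the real library.

```cpp
#include <cstdint>

// Project namespace from the text; everything inside is illustrative only.
namespace Shock {

namespace Types {

// Assumed alias, standing in for the module's common type definitions.
typedef std::uint8_t Byte;

} // namespace Types

namespace Maths { // hypothetical second module

// Clamps an int into the range a Types::Byte can hold.
inline Types::Byte clampToByte( int iValue )
{
	if ( iValue < 0 )
	{
		return 0;
	}
	if ( iValue > 255 )
	{
		return 255;
	}
	return static_cast<Types::Byte>( iValue );
}

} // namespace Maths

} // namespace Shock
```

A caller outside the project then has to spell out Shock::Maths::clampToByte, which makes it obvious where the function comes from.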

Enumerations

C++ scopes enumeration values poorly. Enumeration values convert implicitly to integers, can easily be cast to other enumeration types, and share the scope of the enumeration’s parent.

For example:

enum MyEnum

{

MyValue,

};

enum YourEnum

{

YourValue,

};

class Example

{

public:

enum LanguageEnum

{

English,

French,

};

enum Currency

{

Dollars,

Francs,

};

};

int main( int argc, char** argv )

{

MyEnum eValue( MyValue ); // enumeration values are at global scope!

Example::LanguageEnum eLanguage( Example::French );

eLanguage = (Example::LanguageEnum)Example::Dollars; // Woah! Hold on there!

}

So not only are all enumeration values in the scope of the class (Example::French) instead of the scope of the enumeration itself (Example::LanguageEnum::French), but they are also easily interchangeable.

Particularly bad are the global enumeration values which can conflict with global functions, namespaces and class names.

The first step to improve this is to change the scoping of enumeration values.

Enumeration values that are considered global (not related to any one class) should be put inside a namespace.

namespace MyEnum

{

enum _t

{

MyValue,

};

};

namespace YourEnum

{

enum _t

{

YourValue,

};

};

When enumerations are related to a class, the simplest way is to define them inside non-instantiable nested classes.

class Example

{

public:

class Language

{

public:

enum _t // this can be called anything, but pick something consistent

{

English,

French,

};

private:

Language(); // private constructor to prevent instantiation

};

class Currency

{

public:

enum _t

{

Dollars,

Francs,

};

private:

Currency();

};

 

};

int main( int argc, char** argv )

{

Example::Language::_t eLanguage( Example::Language::French );

}

We have now separated the enumerations to a point where we must explicitly state which enumeration the value belongs to. This will reduce programmer error. A programmer can still cast from one enumeration value to another, but they aren’t the type of programmers we want to cater for =P.

The enumeration is still an integer when it comes down to the compiler. So we can still make mistakes like the following.

void setLanguage( int iValue );

<snip>

setLanguage( (int)Example::Currency::Dollars ); // Oops!

With our above verbose scoping of enumerations, the chance of this happening is reduced, but still present.

We can take this further by making each enumeration into its own object. Enumerations would no longer be passed around as values, but as references to objects.

class Language

{

public:

enum _t

{

English,

French,

};

explicit Language( _t eInitialValue ); // constructor must match the class name

~Language();

void setValue( _t eValue );

_t getValue() const;

};

void setLanguage( Language & kLang );

<snip>

Language kLang( Language::French );

Currency kCurrency( Currency::Dollars );

setLanguage( kLang ); // Ok

setLanguage( kCurrency ); // Boom! Compile error!

The setValue and constructor parameters can still be cast from another enumeration, but a programmer would have to do so explicitly. And once again, we don’t cater to them!

By doing this, we’ve created a number of buffers against programming error.
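To make the idea concrete, here is a minimal compilable sketch of the wrapper approach. Language follows the declaration above; the Currency class, the m_eValue member and the describe function are filled in here as assumptions.

```cpp
#include <string>

class Language
{
public:
	enum _t
	{
		English,
		French
	};

	explicit Language( _t eInitialValue ) : m_eValue( eInitialValue ) {}

	void setValue( _t eValue ) { m_eValue = eValue; }
	_t getValue() const { return m_eValue; }

private:
	_t m_eValue; // assumed storage for the wrapped value
};

class Currency
{
public:
	enum _t
	{
		Dollars,
		Francs
	};

	explicit Currency( _t eInitialValue ) : m_eValue( eInitialValue ) {}

	_t getValue() const { return m_eValue; }

private:
	_t m_eValue;
};

// Hypothetical consumer: accepts a Language only.
// Passing a Currency object here is rejected by the compiler.
inline std::string describe( const Language& kLang )
{
	return ( Language::French == kLang.getValue() ) ? "French" : "English";
}
```

If a C++11 compiler is available, the language's own enum class gives similar scoping and blocks implicit conversion to int without the wrapper, though that is a different technique from the one described here.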

Hash Defines

For the love of all things shiny, don’t define macros or #define values with common names. For example, don’t define “HashMap”, as another library will most likely use that name. Once your header is included, the #define will replace every use of that name and confuse the compiler.

An example of this occurs when Ogre3D is used with Poco. A #define for HashMap introduces compile errors in Poco with its HashMap type.

As macros cannot be given scope, the only reliable method is to prefix all macros with your project name.

For example

#define SHOCK_ASSERT( x )

This also makes it clear which project the macro is being used from.

Also, avoid using macros in the first place. Unless you need to generate code, or need to ensure the stack doesn’t change with a function call (such as in an assert), you probably don’t need a macro. That said, macros are sometimes unavoidable, so minimise their potential damage and misuse.
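As an illustration only, a project-prefixed assert macro might look something like this. The body of SHOCK_ASSERT is my guess at a reasonable definition, not the one from the actual Shock library, and safeDivide is a made-up caller.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical definition. A macro is used so that __FILE__ and __LINE__
// refer to the place the assert was written, not to a helper function.
#define SHOCK_ASSERT( x ) \
	do \
	{ \
		if ( !( x ) ) \
		{ \
			std::fprintf( stderr, "%s:%d: assertion failed: %s\n", __FILE__, __LINE__, #x ); \
			std::abort(); \
		} \
	} while ( false )

// Made-up caller: checks the divisor before dividing.
inline int safeDivide( int iNumerator, int iDenominator )
{
	SHOCK_ASSERT( 0 != iDenominator );
	return iNumerator / iDenominator;
}
```

The do / while wrapper makes the macro behave like a single statement, so it is safe inside an if without braces.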

Directory structure

Source separation

Projects should separate their headers and implementation files.

The typical layout for this is as follows:

<project>/include

<project>/src

This makes it trivial to release a library without its source while keeping it linkable by other projects: simply remove the /src directory from the release.

Match your namespaces

I recommend matching your header file include paths with the namespaces they exist in.

For example:

MyClass.h

namespace MyProject {

namespace MyModule {

class MyClass

{

<snip>

};

}; // namespace MyModule

}; // namespace MyProject

This class would exist in the following files:

/include/MyProject/MyModule/MyClass.h

/src/MyClass.cpp

To use MyClass in your source, you just use the following include:

#include "MyProject/MyModule/MyClass.h"

You could add the “MyProject/MyModule” directory to your project’s include path, but this would remove information on where this file comes from. And it’s always better to be explicit about your intentions.

You will notice that the source file does not reside in a separate directory. This is not so important but if your modules are very complex it is probably worth doing the same for /src.

That being said, if your modules are getting that complex, then they are probably too complex and should be divided. But that’s just my opinion =).

Coding Style

Line length

Don’t let lines of code get too long!

Long methods with large lists of parameters can create some very hard to read code like the following function declaration:

MyClass::ContainerConstIterator MyClass::iterate( MyProject::Types::uint8_t* pBuffer, size_t iLength )

Some people align the parameters with the opening parenthesis of the function declaration:

MyClass::ContainerConstIterator MyClass::iterate(

MyProject::Types::uint8_t* pBuffer,

size_t iLength )

{

<snip>

Sure, that’s…. kind of readable. But it also becomes inconsistent once the next method is defined as all the parameters will be indented at a different level! Eugh!

I picked up the following coding style from a friend of mine (David Hogan, you’re a hero!)

MyClass::ContainerConstIterator MyClass::iterate(

	MyProject::Types::uint8_t* pBuffer,

	size_t iLength

)

{

<snip>

You get the occasional long function name that can overflow your text editor. But your parameters will always be indented exactly one tab.

The same goes for calling functions and conditional statements. If the line of code has too many characters, put them on a new line. It’s also good to separate different statements when combined to form a single conditional statement.

container.iterate(

	pBuffer,

	iBufferSize

);

<snip>

if (

	(iBufferSize < container.max()) &&

	(true == container.isEmpty())

)

{

<snip>

It does take a little getting used to. But once you do, your code will be consistent and readable.

For God’s sake, use Tabs!

Don’t be one of these “use spaces for indentation” people.

If you indent with spaces, you fail.

Also, don’t mix tabs and spaces. If your editor doesn’t support auto-indent with tabs, then it’s time for a new editor.

I’m sick of code that looks like this:

if ( 0 == blah )

{

++blah;

if ( 0 == bar )

{

++bar;

}

All because someone mixed tabs and spaces and had their tab width set to a different value than mine!

Tab width can be changed by a user’s editor to fit their preference. Don’t force people to adhere to your arbitrary preference of 2, 4 or 8 spaced tabs.

Remember: Spaces are for separation. Tabs are for indentation.

Be explicit about your logic

Be clear about your conditional logic.

I personally find the following to have reduced readability:

if ( ! container.isEmpty() )

I prefer to state clearly what I am checking:

if ( false == container.isEmpty() )

Avoid double negatives

Always define API calls as positives.

For example: isEmpty, isDeleted, isRunning.

Avoid negatives such as:

isNotEmpty, isNotDeleted, isNotRunning.

This leads to logic such as:

if ( false == object.isNotRunning() )

People have trouble resolving double negatives. This will reduce readability and increase programmer error.

Keep your definitions together

Group related constants in namespaces that describe what they are. For example, the minimum and maximum values of a 32-bit integer could be defined as follows:

namespace ProjectName {

namespace Types {

namespace Limits {

namespace Int32 {

namespace Min {

const int32_t value( 0x80000000 ); // const, so it is safe to define in a header

}; // Min

namespace Max {

const int32_t value( 0x7FFFFFFF );

}; // Max

}; // Int32

}; // Limits

}; // Types

}; // ProjectName

<snip>

int32_t iValue( ProjectName::Types::Limits::Int32::Max::value ); // Very clear

Verbose? Yes, you could probably kill a few namespaces in there. But it’s very clear what the value is and makes the intentions of your code much clearer.
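For reference, a compilable variant of the same layout might look like the sketch below. It assumes a compiler that provides <cstdint>, writes the minimum as an expression so the literal fits in a signed type, and adds a fitsInInt32 helper that is purely hypothetical.

```cpp
#include <cstdint>

namespace ProjectName {
namespace Types {
namespace Limits {
namespace Int32 {

namespace Min {
// const at namespace scope has internal linkage, so a header can define it.
const std::int32_t value( -2147483647 - 1 );
} // Min

namespace Max {
const std::int32_t value( 2147483647 );
} // Max

} // Int32
} // Limits

// Hypothetical helper showing the constants in use.
inline bool fitsInInt32( long long iValue )
{
	return
		( iValue >= Limits::Int32::Min::value ) &&
		( iValue <= Limits::Int32::Max::value );
}

} // Types
} // ProjectName
```

Note that the constant itself is the name value inside Min or Max, so usage reads ProjectName::Types::Limits::Int32::Max::value.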

Larger projects

Use namespaces. Use them like there’s no tomorrow.

The larger your project, the more explicit your code should be. Sure, it will take a few seconds more to type some things out. But your code will be easier to read and you will reduce programmer error, which is a productivity gain in the end.

Yes, you can go overboard. But you can always “find / replace” later.

Avoid Global Variables

Global variables are a hidden evil.

Yes, it’s handy to just declare a value at the top of your .cpp files for convenience. But look at this.

MyClass.cpp

int    g_iMyGlobal( 0 );

<snip>

MyApp.cpp

#include "MyClass.h"

int g_iMyGlobal( 1 ); // ok at compile time… Boom at link time!

The compiler will let MyApp have its own global with the same name.

But at link time you will get errors about “multiply defined symbols”!

With global variables you need to be aware of the following:

  • No other function, class or variable in the variable’s scope can have the same name. For the global scope, this can be very disruptive.
  • If it’s not declared in the header, other files won’t be aware of its existence until link time, and then they’ll receive “multiply defined symbols” errors without any real information.
  • Using globals makes it harder to enforce concurrency in multi-threaded code.

A better solution is to try and use static member variables or member variables when possible.
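A minimal sketch of the static member approach, assuming a made-up instance counter; the MyClass body and the s_iInstanceCount name are illustrative only.

```cpp
// The counter lives in MyClass's scope instead of the global one,
// so it cannot clash with a name in another file.
class MyClass
{
public:
	MyClass() { ++s_iInstanceCount; }
	~MyClass() { --s_iInstanceCount; }

	// Read-only access; nothing outside the class can modify the counter.
	static int getInstanceCount() { return s_iInstanceCount; }

private:
	static int s_iInstanceCount;
};

// Exactly one definition, which would normally sit in MyClass.cpp.
int MyClass::s_iInstanceCount = 0;
```

This sketch only counts default-constructed objects; a real class would also need to account for copies.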

If you must insist on using global variables (hey, I still use them!) then use an anonymous namespace!

MyClass.cpp

namespace {

int    g_iMyGlobal( 0 );

};

void myFunc()

{

std::cout

<< "MyClass: I’m using a variable in an anonymous namespace with a value of "

<< g_iMyGlobal

<< std::endl;

}

MyApp.cpp

#include "MyClass.h"

namespace {

int g_iMyGlobal( 1 ); // ok at compile and link time!

};

void myAppFunc() // named differently: a second external myFunc() would clash at link time

{

std::cout

<< "MyApp: I’m using a variable in an anonymous namespace with a value of "

<< g_iMyGlobal

<< std::endl;

}

When you provide an unnamed namespace, the compiler treats everything inside it as if it were in a uniquely named namespace that only the current file can see. You don’t need to change how your code refers to the variable.

A variable in an anonymous namespace cannot be referred to by name from another file, so use anonymous namespaces only for global variables that are used within a single file.
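Condensed into a single file, the pattern looks roughly like this. The g_iCallCount variable and the countCall function are hypothetical names chosen for the sketch.

```cpp
namespace {

// Visible only inside this .cpp file. Another file can define its own
// g_iCallCount without causing a link error.
int g_iCallCount( 0 );

} // anonymous namespace

// Ordinary function with external linkage that uses the file-local variable.
// Other files can call it, but cannot touch g_iCallCount directly.
int countCall()
{
	++g_iCallCount;
	return g_iCallCount;
}
```

Code inside the file uses g_iCallCount exactly as it would a normal global; only code in other files loses access to it.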

I’m sure there are more topics I can cover here about my coding style. But these are my primary guiding principles when I write C++.
