赞
踩
软件工程,就像建房子一样,包括规划、调研、计划、需求、调查、团队、分工、材料、框架、装饰、人力、资金、周期、后期使用、保养、维护、退出、更新、风险、质量、测试、漏洞、文档、稳定性、性能、品牌等等。
设计一个软件,不是一个人的活。做好一个软件,是一件难事。一步一步,管理好各方面因素,方可完成。
来源于
April 24, 2019// by Deb Haldar// 12 Comments
For many C++ developers, API Design probably makes number 3 or 4 on their priority list. Majority of developers flock to C++ for the raw power and control it provides. Consequently, performance and optimization is what occupies the thoughts of these devs eighty percent of the time.
Of course, there are aspects of header file design that every C++ dev thinks about – but API design is so much more than just header file design. In fact, I highly recommend that every developer give some though on the design of their API, whether public facing or internal, as it can save you a lot in maintenance cost, provide a smooth upgrade path and save headaches for your customers.
Many of the mistakes cataloged below are a combination of my own experience and things I learnt from Martin Reddy’s fantastic book C++ API Design, which I highly recommend. If you really want a deep understanding of C++ API Design, you should read Martin Reddy’s book and then use the list below as more of a checklist to enforce for code review purposes.
Why is this a mistake?
Because you have no idea in which code base your API will be used, especially for external APIs. If you do not constrain your API functionality to a namespace, it might cause name conflicts with other APIs used in that system.
Example:
Let’s consider a very simple API and a client class that uses it
- //API - In Location.h
- class vector
- {
- public:
- vector(double x, double y, double z);
- private:
- double xCoordinate;
- double yCoordinate;
- double zCoordinate;
- };
-
-
- //Client Program
- #include "stdafx.h"
- #include "Location.h"
- #include <vector>
-
- using namespace std;
-
- int main()
- {
- vector<int> myVector;
- myVector.push_back(99);
-
- return 0;
- }
If someone tries to use this class in a project that also uses std::vector, they’ll get an error “error C2872: ‘vector’: ambiguous symbol“. This is because the compiler can’t decide which vector the client code is referring to – the std::vector or the vector object defined in Location.h
How to Fix this?
Always put your API in a custom namespace like:
- //API
- namespace LocationAPI
- {
- class vector
- {
- public:
- vector(double x, double y, double z);
- private:
- double xCoordinate;
- double yCoordinate;
- double zCoordinate;
- };
- }
The other alternative is to put a unique prefix to all your public API symbols.If following this convention we’d call our class “lvector” instead of “vector”. This method is used in OpenGL and QT.
In my opinion, this makes sense if you are developing a pure C API. It’s an additional headache to ensure that all your public symbols conform to this unique naming convention. If you’re using C++, you should just group your API functionality in a namespace and let the compiler do the heavy lifting for you.
I’d also highly encourage you to use nested namespaces for grouping functionalities or segregating public APIs from internal ones. A great example of this is the The Boost libraries which liberally makes use of nested namespaces. Inside of the root “boost” namespace, for example, boost::variant contains the public symbols for the Boost Variant API and boost::detail::variant contains the internal details for that API.
Why is this a mistake?
This’ll cause all the symbols in the referenced namespace to become visible in the global namespace and nullify the benefits of using namespaces in the first place.
Additionally:
How to fix this ?
1. Try to avoid putting any using namespace declarations in your header files. If you absolutely need some namespace objects to get your headers to compile, please use the fully qualified names (Eg. std::cout , std::string )in the header files.
- //File:MyHeader.h:
- class MyClass
- {
- private:
- Microsoft::WRL::ComPtr _parent;
- Microsoft::WRL::ComPtr _child;
- }
2. If recommendation #1 above causes too much code clutter – restrict your “using namespace” usage to within the class or namespace defined in the header file. Another option is using scoped aliases in your header files as shown below.
- //File:MyHeader.h:
-
- class MyClass
- {
- namespace wrl = Microsoft::WRL; // note the aliasing here !
- private:
- wrl::ComPtr _parent;
- wrl::ComPtr _child;
- }
For additional gotchas associated with C++ header files, please refer to the post “Top 10 C++ header file mistakes and how to fix them“.
What is Rule of Three?
The Rule of three states that If a class defines a destructor, copy constructor or copy assignment operator then it should probably explicitly define all three, and not rely on their default implementation.
Why is Ignoring rule of three a mistake ?
If you define any of them, chances are that your class is managing a resource( memory, fileHandle, socket etc.). Thus:
Let’s take a look at an example – in the API below, we’ve a resource int* managed by MyArray class. We created a destructor for the class because we know that we have to deallocate the memory for the int* when we destroy the managing class. So far so good.
Now let’s assume the client of your API uses it as below.
- int main()
- {
- int vals[4] = { 1, 2, 3, 4 };
-
- MyArray a1(4, vals); // Object on stack - will call destructor once out of scope
- MyArray a2(a1); // DANGER !!! - We're copyin the reference to the same object
-
- return 0;
- }
So what happened here?
The client created an instance of the class a1 on eth stack via the constructor. Then he created another instance a2 by copying from a1. When a1 goes out of scope, the destructor deletes the memory for the underlying int*. But then when a2 goes out of scope, it invokes the destructor again and tries to free the memory for the int* again [ this problem is known as a double free] which leads to a heap corruption.
Since we did not provide an copy constructor and did not mark our API as non-copyable, there was no way for the client to know that he should not copy MyArray objects.
How to Fix this?
There’s essentially a few things we can do:
Here’s the code to fix the issue by providing the copy constructor and copy assignment operator:
- // File: RuleOfThree.h
-
- class MyArray
- {
- private:
- int size;
- int* vals;
-
- public:
- ~MyArray();
- MyArray(int s, int* v);
- MyArray(const MyArray& a); // Copy Constructor
- MyArray& operator=(const MyArray& a); // Copy assignment operator
- };
-
- // Copy constructor
- MyArray::MyArray(const MyArray &v)
- {
- size = v.size;
- vals = new int[v.size];
- std::copy(v.vals, v.vals + size, checked_array_iterator<int*>(vals, size));
- }
-
- // Copy Assignment operator
- MyArray& MyArray::operator =(const MyArray &v)
- {
- if (&v != this)
- {
- size = v.size;
- vals = new int[v.size];
- std::copy(v.vals, v.vals + size, checked_array_iterator<int*>(vals, size));
- }
- return *this;
- }
The second way to fix this is to make the class non-copyable by deleting the copy constructor and copy assignment operator.
- // File: RuleOfThree.h
-
- class MyArray
- {
- private:
- int size;
- int* vals;
-
- public:
- ~MyArray();
- MyArray(int s, int* v);
- MyArray(const MyArray& a) = delete;
- MyArray& operator=(const MyArray& a) = delete;
- };
Now when the client tries to make a copy of the class, he’ll encounter a compile time error : error C2280: ‘MyArray::MyArray(const MyArray &)’: attempting to reference a deleted function
[click_to_tweet tweet=”ALWAYS PREFER COMPILE TIME and LINK TIME ERRORS TO RUN TIME ERRORS” quote=”ALWAYS PREFER COMPILE TIME and LINK TIME ERRORS TO RUN TIME ERRORS”]
Addendum for C++11:
The rule of three has now transformed into the rule of 5 to factor in the move constructor and move assignment operator. So in our case, if we’re to make the classnon-copyable and non-movable, we’ll mark the Move constructors and movbe assignment operators as deleted.
- class MyArray
- {
- private:
- int size;
- int* vals;
-
- public:
- ~MyArray();
- MyArray(int s, int* v);
- //The class is Non-Copyable
- MyArray(const MyArray& a) = delete;
- MyArray& operator=(const MyArray& a) = delete;
- // The class is non-movable
- MyArray(MyArray&& a) = delete;
- MyArray& operator=(MyArray&& a) = delete;
- };
ADDITIONAL WARNING: If you define a copy constructor for the class (including marking it as deleted), no move constructor is created for that class. So if your class just contains simple data types and you planned on using the implicitly generated move constructor, it’ll not be possible if you define a copy constructor. In that case, you must explicitly define the move constructor.
In general, a move operation is not expected to throw. You’re basically stealing a bunch of pointers from the source object and gicing it to your destination object – which theoretically should not throw.
Why is this a mistake ?
An STL container can only use the move constructor in it’s resizing operation if that constructor does not break its strong exception safety guarantee. For example, std::vector wont use the move constructor of an your API object if that can throw an exception. This is because if an exception is thrown in the move then the data that was being processed could be lost, whereas in a copy constructor the original will not be changed.
So if you don’t mark your MOVE CONSTRUCTOR and MOVE ASSIGNMENT OPERATOR in your API as noexcept, it can have deep performance ramifications for your client if they plan to use the STL containers. This article shows that a class that cannot be moved takes about twice the time to be placed in a vector and experience unpredictable memory spikes when compared to a class that can be moved.
How to fix it ?
Simply mark the move constructor and move assignment operator as “noexcept”
- class Tool
- {
- public:
- Tool(Tool &&) noexcept;
- };
Why is this an API Design mistake?
There are multiple ramifications of marking an API as noexcept including certain compiler optimizations such as the one for move constructors. However, from an API design perspective, if your API truly does not throw, it reduces the code complexity on your client because now they don’t need to have multiple try/catch blocks in their code. This also has two additional benefits:
How to fix it ?
Just mark APIs that does not throw as noexcept.
Why is this an API design mistake ?
The compiler is allowed to make one implicit conversion to resolve the parameters to a function. This implies that the compiler can use constructors callable with *single argument* to convert from one type to another in order to get the right type of parameter.
For example, if we have the following single parameter constructor in the location API:
- namespace LocationAPI
- {
- class vector
- {
- public:
- vector(double x);
- // .....
- };
- }
We can invoke the following code:
LocationAPI::vector myVect = 21.0;
This will call the vector single-argument constructor with the double argument of 21.0. However, this type of implicit behavior can be confusing, unintuitive, and, in most cases, unintended.
As a further example of this kind of undesired implicit conversion, consider the following function signature:
CheckXCoordinate(20.0, 20.0);
This weakens the type safety of your API because now the compiler will not enforce the type of the first argument to be an explicit vector object.
As a result, there’s the potential for the user to forget the correct order of arguments and pass them in the wrong order.
How to Fix this ?
This is why you should always use the explicit keyword for any single-argument constructors unless you know that you want to support implicit conversion.
- class vector
- {
- public:
- explicit vector(double x);
- //.....
- }
Why is this a mistake?
Sometimes, your API will take as input some data structure from your clients. Marking the methods and method parameters as const indicates to the client that you’ll be using that data in a read only mode. Conversely, if you do not mark your APIs methods and parameters as const , your client might be inclined to pass you a copy of the data because you’re making no such guarantees. Depending on how frequently the client code is acalling your API, the performance implication can run from minor to severe.
How to fix this?
When your API needs read-only access to client data, mark the API methods and/or parameters as const.
Let’s assume you need a function to just *check* of two coordinates are the same.
- //Don't do this:
- bool AreCoordinatesSame(vector& vect1, vector& vect2);
Instead, mark the method as const such that the client knows you’ll not modify the vector objects the client passes in.
bool AreCoordinatesSame(vector& vect1, vector& vect2) const;
Const correctness is a huge topic – please refer to a good C++ text book or read the FAQ section in Standard C++.
Why is this a mistake ?
On the face of it, returning an object by const reference seems like a win-win. This is because:
However, this could lead to some searious issues – namely:
How to fix this ?
Follow the three step rule:
In implicit instantiation, the the internals of your template code has to be put in the header files. There is no way around it. However, you can separate the template declaration ( which your API users will refer to) from the template instantiation by putting the instantiation in separate header file as follows:
- // File: Stack.h ( Public interface)
- #pragma once
-
- #ifndef STACK_H
- #define STACK_H
-
- #include <vector>
-
- template <typename T>
- class Stack
- {
- public:
- void Push(T val);
- T Pop();
- bool IsEmpty() const;
-
- private:
- std::vector<T> mStack;
- };
-
- typedef Stack<int> IntStack;
- typedef Stack<double> DoubleStack;
- typedef Stack<std::string> StringStack;
-
- // isolate all implementation details within a separate header
- #include "stack_priv.h"
-
- #endif
- // File: Stack_priv.h ( hides implementation details of the Stack class)
- #pragma once
- #ifndef STACK_PRIV_H
- #define STACK_PRIV_H
-
- template <typename T>
- void Stack<T>::Push(T val)
- {
- mStack.push_back(val);
- }
-
- template <typename T>
- T Stack<T>::Pop()
- {
- if (IsEmpty())
- {
- return T();
- }
-
- T val = mStack.back();
- mStack.pop_back();
-
- return val;
- }
-
- template <typename T>
- bool Stack<T>::IsEmpty() const
- {
- return mStack.empty();
- }
-
- #endif
This technique is used by many high-quality template-based APIs, such as various Boost headers. It has the benefit of keeping the main public header uncluttered by implementation details while isolating the necessary exposure of internal details to a separate header that is clearly designated as containing private details.
Why is this a mistake?
Implicit instantiation is plagued by the following problems from an API design perspective:
How to fix this ?
If you know that your template will only ever be used with int, double and string – you can use explicit instantiation to generate template specializations for those three types. It brings down your client’s build time, isolates you from having to seal with untested types in your templates and keeps your template code logic hidden in your cpp files.
To do this is simple – just follow the three step process:
Step 1: Move the implementation of the stack template code in a cpp file
At this point, Let’s try to insstantiate and use the push() method of a stack,
- Stack<int> myStack;
- myStack.Push(31);
We’ll get a linker error:
error LNK2001: unresolved external symbol "public: void __thiscall Stack<int>::Push(int)" (?Push@?$Stack@H@@QAEXH@Z)
This is the linker telling us that it could not locate the definition of the push method anywhere. No wonder, because we have not instantiated it yet.
Step 2: Create a template instance of the int, double and string types at the bottom of your cpp file:
- // explicit template instantiations
-
- template class Stack<int>;
-
- template class Stack<double>;
-
- template class Stack<std::string>;
Now you’ll be able to build and run the stack code.
Step 3: Tell the client that your API supports the three specializations for int, double and string by putting the following typedefs at the end your header file:
- typedef Stack<int> IntStack;
-
- typedef Stack<double> DoubleStack;
-
- typedef Stack<std::string> StringStack;
WARNING: If you do explicit specialization, the client will not be able to create further specializations (and the compiler will not be able to create implicit instantiations for the user either) because the implementation details are hidden in our .cpp file. Please make sure that this is the intended use case for your API.
Why is this a problem ?
Default arguments are often used to extend an API in newer version to augment functionality in a way that does not break backwards compatibility of the API.
For example, let’s say that you released an API with the following signature:
- //Constructor
- Circle(double x, double y);
Later you decide that specifying the radius as an argument will be useful. So you release a new version of the API with the radius as the third argument. However, you do not want to break existing clients – so you prive the radius a default argument:
- // New API constructor
- Circle(double x, double y, double radius=10.0);
In this way, any client that was using the API with just the x and y coordinates can keep using it. The approach sounds like a good idea.
However, it suffers from multiple issues:
- Circle(double x=0, double y=0, double radius=10.0);
- Circle c2(2.3); // Does it make sense to have an x value without an Y value? May or may not !
How to Fix this ?
Provide multiple overloaded methods instead of using default arguments. For example,
- Circle();
-
- Circle(double x, double y);
-
- Circle(double x, double y, double radius);
The implementation of the first two constructors can use a default value for the attributes that are not specified. ut importantly, these default values are specified in the .cpp file and are not exposed in the .h file. As a result, a later version of the API could change these values without any impact on the public interface.
Additional Notes:
#defines were used in C code to define constants. For example:
#define GRAVITY 9.8f
Why is this a mistake ?
In C++, you should not use #defines for internal constants because of the following reasons:
How to Fix this?
Use static consts in code instead of #defines for simple constants. For example:
static const float Gravity;
Even better, if the value is known at compile time, use a constexpr:
constexpr double Gravity = 9.81;
For more details on consts vs constexpr – please check out : c++ - const vs constexpr on variables - Stack Overflow
In C code, sometimes #defines are used to define network states like :
- #define BATCHING 1
- #define SENDING 2
- #define WAITING 3
In C++, always use an enum class to do this:
enum class NetworkState { Batching, Sending, Waiting }; // enum class
In C++, friendship is a way for your class to grant full access privileges to another class or function. The friend class or function can then access all protected and private members of your class.
While this flies in the face of Object Oriented Design and Encapsulation, this can be useful in practice. If you’re developing a large system with many components and want to expose functionality in one compoennet to only selected clients(Test classes?), this can really make things a lot easier.
In fact, the [InternalsVisible] attribute in .Net does serve a similar objective.
However, friend classes should not be exposed in public APIs.
Why is using friends in C++ a mistake?
Because the friendship in a public API can allow a client to break your encapsulation and use your system objects in a way that was not intended.
Even if we set aside the general problem of internal discovery/ IP, the client might use the API in a way not intended, tank their system and then call your support team to fix the problem they created by not using the API in an unintended way in the first place.
So is it their fault ? No ! It is your fault for allowing them to shoot themselves in the foot in the first place by exposing the friend class.
How to fix it ?
Avoid using friends in Public API classes. They tend to indicate a poor design and can allow users to gain access to all protected and private members of your API.
Why is this a mistake ?
Unnecessary header files can significantly increase the compilation times. This not only causes time lost for developers who need to build the code with your APIs locally, but also incurss heavy cost by consuming cycles on automated build agents which probably needs to build the code thousands of times every day.
Additionally,anecdotally speaking, having large monolithic headers will compromise the effectiveness of build parallelization systems like Incredibuild and FastBuild.
How to fix this?
Why is this a mistake?
Using forward declaration for API objects not owned by you can break the client code in unexpected ways. For example, If the client decides to move to a different version of the foreign API header, your forward declaration will break if the forward declared class has been changed to a typedef or a templated class.
Viewed another way, if you forward declare a class from a foreign header, you’re basically locking in your client to always use the version of the foreign header you’ve declared — so basically he can’t upgrade that foreign dependency anymore !!!
How to fix this ?
You should only forward declare symbols from your on API. Also, never forward declare STL types etc.
Please see this question on stackoverflow for additional discussion on this topic: c++ - What are the risks to massively forward declaration classes in header files? - Stack Overflow
A header file should have everything it needs to compile by itself , i.e., it should explicitly #include or forward declare the types/ structs it needs to compile.
If a header file does not have everything it needs to compile but the program incorporating the header file compiles, it indicates that somehow the header file is getting what it needs because of an include order dependency. This typically happens because another header file gets included in the compile chain before this incompilable header file which provides the missing functionality.
If the include order/build order dependency changes, then the whole program might break in unexpected ways. The C++ compiler is notorious for misleading error messages and it might not be easy to locate the error at that point.
How to fix this ?
Check your header filies by compiling them in isolation via a testMain.cpp that includes nothing but the header file under test. If it produces a compilation error, then something either needs to get included in the header file or forward declared. The process should be repeated for all header files in the project using a bottoms-up approach. This’ll help prevent random build break as the code base grows larger and code blocks are moved around.
The client should be able to check both at compile time and runtime what version of your API is integrated into their system. If such information is lacking, they’ll not be able to take effective updates/patches.
It’ll also be difficult to add backwards compatibility for their code on different platforms.
Also, version number of the product is the first thing our escalation engineers ask when a customer reports an issue.
Whether your clients prefers a static library or a dynamic link library should dictate a lot of your design choices. For example:
How to avoid this?
There is no magic to this – it boils down to plain old requirements gathering – just make sure to bring up the static vs dynamic library implications with you client in the early stages of discussion.
Wikipedia defines application binary interface (ABI) is an interface between two binary program modules; often, one of these modules is a library or operating system facility, and the other is a program that is being run by a user.
A library is binary compatible, if a program linked dynamically to a former version of the library continues running with newer versions of the library without the need to recompile.
Binary compatibility saves a lot of trouble. It makes it much easier to distribute software for a certain platform. Without ensuring binary compatibility between releases, people will be forced to provide statically linked binaries. Static binaries are bad because they waste resources (especially memory) don’t allow the program to benefit from bug fixes or extensions in the libraries. There is a reason why the windows subsystem is packaged as a collection of DLLs — this makes those windows updates(patching) a breeze – well, maybe not really, but that’s because of other issues
For example, here are the mangled names of two different functions (i.e., the symbol names that are used to identify a function in an object or library file):
- // version 1.0
-
- void SetAudio(IAudio *audioStream) //[Name Mangling] ->_Z8SetAudioP5Audio
-
- // version 1.1
-
- void SetAudio(IAudio *audioStream, bool high_frequency = false) // [Name Mangling] ->_Z8SetAudioP5Audiob
These two methods are source compatible, but they are not binary compatible, as evidenced by the different mangled names that each produces. This means that code compiled against version 1.0 cannot simply use version 1.1 libraries because the _Z8SetAudioP5Audio symbol is no longer defined.
How to be ABI Compatible?
First of all, familiarize yourself with the ABI compatible and ABI breaking changes . Then, follow the additional guidance given by Martin Reddy in his book:
Why is this a mistake?
Consider the following code:
- class SubClassMe
- {
- public:
- virtual ~SubClassMe();
-
- virtual void ExistingCall() = 0;
-
- virtual void NewCall() = 0; // added in new release of API
- };
This is an API breaking change for all your existing clients because now they must now define an implementation for this new method, as otherwise their derived classes will not be concrete and their code will not compile.
How to fix this ?
The fix is simple – provide a default implementation for any new methods that you add to an abstract base class, that is, to make them virtual but not pure virtual.
- class SubClassMe
- {
- public:
- virtual ~SubClassMe();
-
- virtual void ExistingCall() = 0;
-
- virtual void NewCall(); // added in new release of API
- };
Consider the following piece of code in a public header file:
static void ExecuteRequest(CallRequestContainer& reqContainer);
When i look at this, I have absolutely no idea whether this method will return immediately ( async) or block ( synchronous). This’ll immesely influence of how and where i can use this code. For example, if this is a synchronous call, I’d never use it in a time critical code path like a game scene render loop.
How to fix this ?
There are a couple of things that can help:
std::future<StatusCode> ExecuteRequest(CallRequestContainer& reqContainer);
static void ExecuteRequestAsync(CallRequestContainer& reqContainer);
You should always have a good idea about what compiler/C++ standards your customers are primarily using. For example, if you know that a lot of your customers are adding functionality to their existing product which is using C++11, do not take a dependency on any C++14 features.
We had a recent support request submitted to us where the customer was using an older version of visual studio and the C++14 function make_unique wasn’t available. We had to make a conditional compilation fix for the customer – luckily this was in just a few places.
If you distribute your API as source code, please consider using header only libraries.
There are several advantages to distributing header only libraries:
This is a very popular model used by many open source projects including Boost and RapidJson.
This came up as part of a recent review of some legacy code we inherited( exact code changed changed for privacy).
The header file had the following typedefs:
- typedef Stack<int> IntStack;
- typedef Stack<double> DoubleStack;
- typedef Stack<std::string> StringStack;
There were a few methods scattered over the codebase that did not use the typedefs and used Stack<T> types explicitly. One of the public methods, if i recollect correctly had the following signature:
void CheckStackFidelity(IntStack testIntStack, Stack<std::string> testStringStack);
How to fix this ?
It doesn’t really matter if you choose the typedef version or non-typedef version. The key thing is “STAY CONSISTENT” – just pick one convention and stick with it.
I’ve very often seen and personally guity of not holding an API review early in the development process.This was due to not having any structured directives in place for API reviews in place.
I’ve seen multiple problems crop up when there is no process in place including:
Marketing wanted it and it caused a lot of late stage refactoring and delays.
How to fix this ?
In order to avoid the type of hassles pointed out above, you should establish a process which will at least do the following:
Well, there you go – those were the top 25 mistakes I’d watch out for C++ APIs. The list is in no way comprehensive – you should definitely pick up a copy of Martin Reddy’s book for an in-depth treatment of the subject. Happy API Reviewing
Category: ALL, CODE CRAFTTag: C++, C++ 11, C++ 14
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。