Announcing the C++ FAQ

AndrewBissell · on March 23, 2014

I recently started working my way through the C++ Primer after years of avoiding C++ and its reputation for complexity. My first impression is that it has actually matured into quite a nice language, one that strikes a balance between simplicity and programmer power which really isn't all that bad, especially given its genesis.

There are some annoyances for sure -- lots of mentions of "subtle" differences in how certain concepts work are already cropping up in introductory chapters. I'm sure there's some reason that "auto" discards top-level qualifiers while "decltype" does not, but trying to ingrain so many of those distinctions is a pain.

I'm also glad to have learned both Java and C before trying C++. The former so that I know some OO mistakes to avoid (I would not want to discover the pain of over-enthusiastic use of generics while knee deep in compiler template errors), and the latter because learning manual memory management and the stack/heap distinction is enough of a cognitive load without layering all of C++'s extra features on top. It's also good to have a bit deeper understanding of what's really going on with the memory and pointers under the hood, before handing over control to automatic resource management and smart pointers and such.

saurik · on March 23, 2014

Generally, the goal of auto is to create a variable whose purpose is to store a value of the type you have: you called a function, got a constant reference, but to store that you need a concrete value. Generally, the goal of decltype is to compute a type based on other code whose type you don't already know: a simple canonical example is to make a function that does something and transparently returns the result "as if" it were the original code. These are two drastically different use cases and, given the semantics of the various qualifiers, it makes sense to strip them for one and not for the other, leading to two different syntaxes. If you find yourself needing to transparently store the result of an expression, in C++14 you can use "decltype(auto)" ;P.
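A minimal sketch of that distinction, assuming a C++14 compiler. The names greeting, call_transparently and deduction_example are made up for illustration and are not from any library:

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical function for illustration: hands back a constant reference.
const std::string& greeting() {
    static const std::string text = "hello";
    return text;
}

// decltype(auto) (C++14) returns exactly what the wrapped call returns,
// reference and const included: the "as if" wrapper described above.
template <typename F>
decltype(auto) call_transparently(F&& f) {
    return std::forward<F>(f)();
}

void deduction_example() {
    auto copy = greeting();                 // std::string: a fresh value
    decltype(greeting()) ref = greeting();  // const std::string&
    decltype(auto) same = greeting();       // const std::string&

    static_assert(std::is_same<decltype(copy), std::string>::value,
                  "auto drops the reference and the const");
    static_assert(std::is_same<decltype(ref), const std::string&>::value,
                  "decltype keeps both");
    static_assert(std::is_same<decltype(same), const std::string&>::value,
                  "decltype(auto) keeps both");

    assert(&copy != &greeting());  // copy is a separate object
    assert(&ref == &greeting());   // ref aliases the original
    assert(&same == &greeting());  // so does same
    assert(&call_transparently(greeting) == &greeting());
}
```

Calling deduction_example() exercises all three forms; the static_asserts fail at compile time if the deduced types are not the ones in the comments.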

jpgvm · on March 23, 2014

I too have started learning C++11 recently after having never really had success picking up the language in the past.

Coming from C I found C++ to be really difficult when I tried to pick it up a few years ago. I have since learnt Python, Ruby and C# and now I am having a lot more success. So I really do second your opinion that learning something like Java (in my case C#) really helps a ton.

I also highly recommend checking out the books recommended here: http://isocpp.org/get-started I am currently reading C++ Primer. :)

userbinator · on March 23, 2014

C++ is mostly an extension of C (there are some subtle differences like changes in the type system), so I also recommend C before C++; understanding how classes and methods are implemented (which would be the natural way to do it in C anyway) gives you better knowledge of practical, pragmatic use of OOP.
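As a rough sketch of what "how classes and methods are implemented" can mean in practice, here is the same counter written C-style (a struct plus a free function taking an explicit pointer) and as a C++ class. CounterC, counter_c_add and Counter are hypothetical names used only for this illustration:

```cpp
#include <cassert>

// C-style: plain data plus a free function that receives the object explicitly.
struct CounterC {
    int value;
};

void counter_c_add(CounterC* self, int n) {
    assert(self != nullptr);  // an explicit pointer can be null, so check it
    self->value += n;
}

// C++ class: the object pointer becomes the implicit `this`,
// and the data can be hidden behind the member functions.
class Counter {
public:
    void add(int n) { value_ += n; }
    int value() const { return value_; }

private:
    int value_ = 0;
};
```

Both versions do the same work; the class version mainly changes who is allowed to touch the data and how the object is passed.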

As you mention, the layering of C++ is another thing that can guide the approach to learning the language: begin with the simple concepts that C++ added, like classes with methods, and then move on to inheritance, virtual methods, overloading/overriding, operator overloading, exceptions, and then templates, making sure you understand the previous ones before proceeding to the next extension. It's a big language, with many features, not all of which you'll need to use for any one application.

pjmlp · on March 23, 2014

> C++ is mostly an extension of C (there are some subtle differences like changes in the type system), so I also recommend C before C++

Never ever do this in 2014. It is the path of writing unsecure code.

Modern C++ has better, more secure approaches to writing code.

Leave C-like coding to legacy code.

userbinator · on March 23, 2014

> It is the path of writing unsecure code.

Why? If anything you will learn more about the types of bugs that lead to security issues, and thus gain a better understanding of how to avoid them.

I feel like "security" is the new "think of the children" phrase these days.

pjmlp · on March 23, 2014

- Prefer STL data types (vector, string...) over the native ones

- Use references for modifiable parameters. No need to check for null

- Templates, enum classes, const/constexpr, inline calls instead of preprocessor macros

- Use of RAII and smart pointers over manual memory management
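A small sketch touching each item in the list above, assuming C++14 (for std::make_unique). All names here (Color, kMaxNames, square, append_name, Widget, make_widget) are invented for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Scoped enum and typed constant instead of preprocessor #defines.
enum class Color { Red, Green };
constexpr std::size_t kMaxNames = 3;

// Template + constexpr instead of a function-like macro.
template <typename T>
constexpr T square(T x) { return x * x; }
static_assert(square(4) == 16, "evaluated at compile time");

// Reference for a modifiable parameter: there is no null case to check.
// std::vector and std::string manage their own storage.
void append_name(std::vector<std::string>& names, const std::string& name) {
    if (names.size() < kMaxNames) {
        names.push_back(name);
    }
}

struct Widget {
    Color color = Color::Red;
};

// RAII: the unique_ptr releases the Widget when it goes out of scope.
std::unique_ptr<Widget> make_widget(Color color) {
    auto widget = std::make_unique<Widget>();
    widget->color = color;
    return widget;
}

void modern_example() {
    std::vector<std::string> names;
    for (const char* n : {"a", "b", "c", "d"}) {
        append_name(names, n);
    }
    assert(names.size() == kMaxNames);  // the fourth name was rejected

    auto widget = make_widget(Color::Green);
    assert(widget->color == Color::Green);
}  // widget and names are cleaned up here without any delete or free
```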

> I feel like "security" is the new "think of the children" phrase these days.

It is nothing new. It has been present since the early days in the form of languages like Mesa (1970) and Modula-2 (1978), among others.

It is just that developers forgot about them when UNIX moved out of the universities and into the enterprise. Nowadays we pay for it as we discover how important it is to have secure software.

edit: typo (import => important )

userbinator · on March 23, 2014

> Nowadays we pay for it as we discover how important it is to have secure software.

I'm going to give a very controversial perspective here: Things like jailbreaks and homebrew on consoles are possible because of certain "security flaws" that are more common in languages like C and C++. If everyone used "safer" languages, wouldn't these become much rarer? If it weren't for this "insecurity", would our computing devices and software be even more locked-down and restricted against us and more control be put in the hands of large corporate entities or governments? Making software more secure against malicious attackers also makes it more secure against those fighting for freedom and control. The world is not quite so black-and-white...

mpyne · on March 23, 2014

> Things like jailbreaks and homebrew on consoles are possible because of certain "security flaws" that are more common in languages like C and C++.

Using C++ as if it were C but s/malloc/new/ and s/free/delete/ is certainly a good way to get security vulns. But as pjmlp mentioned, that isn't how you should be using C++ nowadays.

pjmlp · on March 23, 2014

I never did any jailbreak or homebrew on my devices. I would rather have memory-safe languages.

To put it in perspective, I have been coding since the days when C was UNIX-only, so I had the pleasure of knowing better-designed languages for systems programming, which sadly went out of fashion as UNIX spread into the enterprise.

balls187 · on March 23, 2014

> If anything you will learn more about the types of bugs that lead to security issues

Usually you will learn this lesson the hard way.

kristiandupont · on March 23, 2014

>..I also recommend C before C++..

I'm going to agree with pjmlp and advise against this. It will lead you to things like relying on char*'s instead of std::string because it works, it's what you know and plenty of examples on the web do so.
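To make the char* versus std::string contrast concrete, here is a hedged sketch of joining two strings both ways. join_c and join_cpp are hypothetical helpers written for this comparison:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// C-style: the caller owns the buffer and must pass its size along.
// Returns false rather than overflowing when the result does not fit.
bool join_c(char* out, std::size_t out_size, const char* a, const char* b) {
    assert(out != nullptr && a != nullptr && b != nullptr);
    const std::size_t needed = std::strlen(a) + std::strlen(b) + 1;  // +1 for '\0'
    if (needed > out_size) {
        return false;
    }
    std::strcpy(out, a);
    std::strcat(out, b);
    return true;
}

// std::string: storage grows as needed, so there is no size to get wrong.
std::string join_cpp(const std::string& a, const std::string& b) {
    return a + b;
}
```

The C version is only safe because of the explicit size check; forgetting it is exactly the kind of bug the char* habit tends to produce.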

mpyne · on March 23, 2014

> C++ is mostly an extension of C

This is absolutely not the case. C++ is a huge monolith of a language that happens to have a fairly large intersection with C.

Understanding technical things like the stack, heap, how new types would be laid out, etc. are all wonderful things to know for C++, that much is true. So if learning C is conducive to learning about those topics then by all means, make it a prereq.

But please don't use language that would give the impression that C++ in 2014 is sort of "C with Classes".

acc01 · on March 22, 2014

An essential companion to the consolidated C++ FAQ is the indispensable C++ FQA.

http://yosefk.com/c++fqa/

eps · on March 22, 2014

The only thing "essential" about FQA is that someone absolutely has to drop a link to it whenever there's "C++" and "FAQ" in the same sentence. It's known for being known. The actual content is utter junk - just try reading through it once.

mikeash · on March 23, 2014

It's strident and biased (and makes no attempt to hide either) but IMO it's quite informative. I've learned a lot about C++ from it.

plorkyeran · on March 23, 2014

I thought I learned a lot about C++ from it when I first read it, but I have gradually come to realize that the things I learned were mostly a mix of incorrect and irrelevant.

benched · on March 23, 2014

It's a highly suspect source at best. The original FAQ (lite) is worlds better in spirit and in content.

barrkel · on March 23, 2014

I read through it once - some years ago - and agreed with its spirit. Unfortunately, until they've been enlightened, C++ acolytes are unlikely to understand it.

Niten · on March 23, 2014

C++ is an ugly, flawed language. Despite this, it can also genuinely be the best choice for certain applications, and so there is a need for a FAQ such as this one.

But it's not "essential" that any such reference be accompanied by a recitation of the reasons the language sucks (some of them legitimate, even). The FAQ is for people using a tool to make things; the FQA seems primarily used to score points in discussions such as this one.* I think working programmers generally have more use for the former.

* There is also a need for constructive criticism of language design choices, of course, but the FQA predominantly rejects C++'s design premises (such as static compile-time polymorphism and lack of a memory-managed runtime) rather than recognize their utility and suggest better ways to accomplish them (as Mozilla is trying to do with Rust).

userbinator · on March 22, 2014

It's also worth reading this:

http://stackoverflow.com/questions/3171647/errors-in-c-fqa

barrkel · on March 23, 2014

No, that answer isn't particularly worth reading - unless you were doubting your C++ faith and wanted it shored up without much critical thought.

reality_czech · on March 23, 2014

Having a different opinion than yours is not an error.

benched · on March 23, 2014

For those who don't know, the FQA, unlike the FAQ, is well known to be extremely biased and exaggerated. It is not a neutral resource - it was expressly written as a takedown, with exaggeration used to great effect.

balls187 · on March 22, 2014

Communities like Node, Python, RoR, and even iOS (obj-c) have benefited from a de facto centralized point of reference, and I think devs who come from those communities will find it easier to be more productive in C++ with something like this.

I always referred to Marshall Cline's FAQ as my reference point, but pulling together Bjarne's and Stack Overflow's FAQs is a great step.

brudgers · on March 23, 2014

    Until now, there have been several different overlapping FAQs, including notably:

         Bjarne Stroustrup’s FAQ pages
         Marshall Cline’s popular C++ FAQs Online
         at least three different StackOverflow C++ FAQs in some form....
         and many others

What could be more apropos than multiple inheritance?

pjmlp · on March 23, 2014

Given that all modern languages do support some form of MI, maybe it is time to stop bashing C++ about it.

andrewflnr · on March 23, 2014

Maybe, but how many other modern languages have the same diamond inheritance problem? Most modern languages put more limits on MI.

pjmlp · on March 23, 2014

All of them have the same problem. It just shows up in different ways.

In C++'s case you have two main issues:

- the decision whether or not to use virtual base classes: because of C++'s design of only paying for what you use, the common OO-language solution to duplicated data members is explicit in C++

- even if one restricts him/herself to pure abstract classes without data members, as a way to implement interfaces, there is the problem of member function name collisions.
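Both issues can be sketched in a few lines. The class names below are invented for illustration only:

```cpp
#include <cassert>

struct Base {
    int id = 0;
};

// Issue 1a: non-virtual inheritance gives the diamond two separate Base parts.
struct LeftCopy : Base {};
struct RightCopy : Base {};
struct Duplicated : LeftCopy, RightCopy {};

// Issue 1b: virtual inheritance opts in to a single shared Base part.
struct LeftShared : virtual Base {};
struct RightShared : virtual Base {};
struct Shared : LeftShared, RightShared {};

// Issue 2: two pure interfaces that happen to declare the same signature.
struct Task {
    virtual ~Task() = default;
    virtual int run() = 0;
};
struct Animation {
    virtual ~Animation() = default;
    virtual int run() = 0;
};
struct Both : Task, Animation {
    // One definition overrides both, whether or not the two interfaces
    // meant the same thing by run().
    int run() override { return 42; }
};

void diamond_example() {
    Duplicated d;
    d.LeftCopy::id = 1;   // plain d.id would be ambiguous here
    d.RightCopy::id = 2;
    assert(d.LeftCopy::id == 1 && d.RightCopy::id == 2);  // two copies

    Shared s;
    s.id = 7;             // unambiguous: there is only one Base
    assert(s.LeftShared::id == 7 && s.RightShared::id == 7);

    Both b;
    Task& task = b;
    Animation& animation = b;
    assert(task.run() == 42 && animation.run() == 42);
}
```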

Some OO languages with inheritance of state get around the data member duplication by doing what C++ compilers do with virtual base classes.

The ones that adopt only some form of interfaces/traits have to deal with name clashes the same way as C++.

All the existing solutions are kind of good enough, but not perfect:

- Using some form of linearisation algorithm that every developer needs to be aware of in order to understand which method wins;

- Members need to be renamed, thus changing the real interface meaning

- Implementations are provided for each collided method, requiring either casts to specific interfaces or explicit calls for the right method

- Provide just one implementation, with the caveat the semantics might not match all the interfaces that share the same method signature

- Not allowing collisions, forcing the creation of an adapter interface

So all is not well in the MI solutions outside C++ either. Each approach has its pluses and minuses.

shmerl · on March 23, 2014

Marshall Cline’s C++ FAQ is really great. Good to see it will be merged into the new one.

Other resources I use regularly are:

* http://en.cppreference.com

* http://www.cplusplus.com

omaranto · on March 23, 2014

I enjoyed the wording of this question: "I’m from Missouri. Can you give me a simple reason why virtual functions (dynamic binding, dynamic polymorphism) and templates (static polymorphism) make a big difference?"

elnate · on March 23, 2014

Did anyone else think of the XKCD comic on new standards?

https://xkcd.com/927/

rkuykendall-com · on March 23, 2014

I immediately thought of this, but dismissed it after reading that 2 of the largest FAQs were being merged. This is made less important by the fact that neither of those FAQs are going offline, but at least it gives the impression that this will be replacing 2 and not just creating a third.

snogglethorpe · on March 23, 2014

What will happen to the other FAQs...? It seems like if they're pretty much entirely incorporated into the official FAQ, they could (and maybe should) be replaced by links to and/or mirrors of the official FAQ.

[At least in the case of Marshall Cline's FAQ, I've always found it somewhat confusingly organized, although the actual content was pretty good once you managed to find what you were looking for. My (brief!) initial impression is that this official FAQ seems better organized, at least...]

xymostech · on March 23, 2014

They say that the previous FAQs will be replaced by links:

> Both Marshall and Bjarne will be updating their FAQs to either forward or link to the new FAQ on a per-FAQ level.

e12e · on March 23, 2014

I found the link to Stroustrup's "Learning Standard C++ as a New Language" [edit: from 1998, which explains why I found it kind of familiar to my late 90s run-in with c++...] interesting[2,1]. But then I tried to find out what's the current state of the art for idiomatic, cross-platform text processing (in unicode, probably utf-8) -- and got a little sad.

There's still no way to write a ten(ish) line c++ program that reads and writes text, that works both on the windows console, and under OS X/*bsd and Linux?

(Let's go crazy: say you have to implement a minimal cat and tac (which reverses stdin on stdout), and also a corresponding echo and ohce (I made that up: something that takes string(s) as input and outputs the characters reversed, i.e. "ohce olé there" outputs "ereht élo").)

[2][edit] The direct link to Stroustrup's paper is:

http://stroustrup.com/new_learning.pdf

code:

http://isocpp.org/wiki/faq/newbie#simple-program

[1]

The FAQ briefly touches on Unicode, but it doesn't seem very helpful (to me): http://isocpp.org/wiki/faq/cpp11-language-misc (search page for unicode)

Trying to look for a (simple, generally accepted) solution, I came across:

http://www.utf8everywhere.org/ (If I'm reading this right, it says assume std::string is utf8, but I'm not sure if there are std-lib functions for doing stuff like getting the index of a glyph, and reversing strings by glyph? And will they work on windows?)

http://stackoverflow.com/questions/2037765/what-is-the-optim...

Which points to: http://utfcpp.sourceforge.net/ which is the best I'm aware of so far.

http://stackoverflow.com/questions/8513249/handling-utf-8-in...

http://stackoverflow.com/questions/402283/stdwstring-vs-stds...

(Suggests using wstring on Windows and string on Linux -- for simple programs that would effectively mean writing two versions, one for each platform ..)

saurik · on March 23, 2014

Writing tac in C++ is trivial: std::deque<std::string> lines; for (std::string line; std::getline(std::cin, line); ) lines.push_front(line); for (const auto &line : lines) std::cout << line << std::endl; // Note that I typed this quickly off the top of my head: I'm willing to believe there is a trivial typo the compiler would catch, but the overall implementation should be fine ;P.
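Spelled out as a compilable unit, the one-liner above might look like the sketch below. Taking the streams as parameters (instead of hard-coding std::cin and std::cout) and the tac_example self-check are additions made here for testability, and '\n' replaces std::endl only to avoid flushing on every line:

```cpp
#include <cassert>
#include <deque>
#include <iostream>
#include <sstream>
#include <string>

// Read every line from `in`, then write the lines to `out` in reverse order.
void tac(std::istream& in, std::ostream& out) {
    std::deque<std::string> lines;
    for (std::string line; std::getline(in, line); ) {
        lines.push_front(line);  // newest line goes to the front
    }
    for (const auto& line : lines) {
        out << line << '\n';
    }
}

// Small self-check showing the intended behaviour on an in-memory stream.
void tac_example() {
    std::istringstream in("one\ntwo\nthree\n");
    std::ostringstream out;
    tac(in, out);
    assert(out.str() == "three\ntwo\none\n");
}
```

A command-line tool would then just call tac(std::cin, std::cout) from main. Note that this treats lines as opaque bytes, so it says nothing about the harder ohce problem.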

As for ohce, you have defined a very very hard problem, one that it does not seem you realize is quite as hard as it actually is: if you have a sequence of Unicode codepoints and reverse their order you do not end up with a string of reversed characters, not in the general case, and not even for some reasonable encodings of seemingly-simple cases like an accented letter e.
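A short sketch of the accented-e case, using std::u32string so that each element is exactly one code point. reverse_code_points and accent_example are hypothetical names for this illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Reverses code point by code point, NOT grapheme by grapheme.
std::u32string reverse_code_points(std::u32string text) {
    std::reverse(text.begin(), text.end());
    return text;
}

void accent_example() {
    // Precomposed form: U+00E9 is a single code point, so this case works.
    const std::u32string precomposed = {U'o', U'l', U'\u00e9'};
    const std::u32string precomposed_expected = {U'\u00e9', U'l', U'o'};
    assert(reverse_code_points(precomposed) == precomposed_expected);

    // Decomposed form: 'e' followed by U+0301 COMBINING ACUTE ACCENT.
    // It renders the same as above but is two code points.
    const std::u32string decomposed = {U'o', U'l', U'e', U'\u0301'};
    const std::u32string reversed = reverse_code_points(decomposed);

    // The accent now comes first, detached from the 'e' it belonged to.
    const std::u32string detached = {U'\u0301', U'e', U'l', U'o'};
    // What a grapheme-aware reversal would have produced instead.
    const std::u32string grapheme_reversed = {U'e', U'\u0301', U'l', U'o'};
    assert(reversed == detached);
    assert(reversed != grapheme_reversed);
}
```

Getting the grapheme-aware result requires knowing where the cluster boundaries are, which the C++ standard library does not provide.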

Like, I challenge you to provide a working version of ohce in Python (2 or 3: your choice). Virtually no language actually provides a string type that makes this problem reasonable. It simply isn't fair to pick on C++ in this regard when no language "gets this right": at least C++ is being honest about the lack of guarantees it is making about string manipulation.

For more information, I recommend reading this article:

http://mortoray.com/2013/11/27/the-string-type-is-broken/

e12e · on March 24, 2014

> Like, I challenge you to provide a working version of ohce in Python (2 or 3: your choice).

Not a full implementation, but wouldn't this approach actually work (note, doesn't work for python2):

    $ python3
    Python 3.2.3 (default, Feb 20 2013, 14:44:27) 
    [GCC 4.7.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> "abc"[::-1]
    'cba'
    >>> "øæåにほ言"[::-1]
    '言ほにåæø'
    [edit: with accents]:
    >>> "eẽêëèøæåにほ言"[::-1]
    '言ほにåæøèëêẽe'

    [edit2: formatting, indentation]

There might very well be problems with this, but I'm not aware of any?

[edit4: This is indeed broken for the ligature (baﬄe) case in python3. I'm not entirely sure that is an entirely fair test (but it is very interesting). I would argue that the ligature should probably be a replacement done for display/print, not in a text file. Just like the reverse of "æ" isn't a reverse composition of "e" and "a" (even if "æ" might be seen as a composition of "a" and "e").

I'm not sure how it deals with changing direction (left-to-right, right-to-left) -- comments welcome.]

[edit3, sorry for the many edits]

To be clear, I do not wish to "pick on c++", nor do I think the example is trivial. I do think it probably should be trivial -- it is something that should be supported in a canonical way by a standard library/implementation.

Working with graphemes is a very fundamental part of working with text -- the fact that half(?) of developers have been able to hide behind ascii isn't a good excuse for not fixing it. How would one implement an editor if you can't access graphemes in a reasonable way?

And more importantly, how would you test for palindromes? ;-)

mpyne · on March 23, 2014

For the "reversing Unicode text" problem the easiest C++ solution is probably to use an external library such as QtCore or ICU (Qt uses ICU internally).

Unfortunately even in UTF-16 grapheme clusters do not correspond 1:1 with Unicode code points so you wouldn't be able to just reverse a list. But Qt can split up a QString into its grapheme clusters (a quick example I had made a couple of months ago):

    static QString reverse(QString src)
    {
        auto src_nfc = src.normalized(QString::NormalizationForm_C);
        QChar *start = src_nfc.data();
        int length = src_nfc.length();

        QTextBoundaryFinder finder(QTextBoundaryFinder::Grapheme, start, length);
        finder.toStart();

        // Reverse the code units inside each grapheme cluster that spans more
        // than one code unit (possible even in UCS-4, e.g. with combining
        // marks), so that the whole-string reverse below restores their order.
        while(finder.position() < src_nfc.length()) {
            int oldPos = finder.position();
            finder.toNextBoundary();
            int newPos = finder.position();

            if(newPos - oldPos > 1) {
                std::reverse(start + oldPos, start + newPos);
            }
        }

        std::reverse(start, start + length);
        return src_nfc;
    }

yati · on March 23, 2014

This. I am very happy with how C++ has evolved and how expressive it has become today, but decent Unicode support at least in the stdlib is something any programmer would (hopefully) look for in a modern language. Thanks for the interesting links, though.

e12e · on March 23, 2014

FWIW I've submitted an improvement request to the c++ faq along these lines.

a8da6b0c91d · on March 23, 2014

There's ICU and boost.locale but unicode in C++ truly remains a pain in the ass.

I feel like C++ is still better as the "fast parts" language under a layer of Perl or whatever for the shoveling data and text around stuff.

e12e · on March 23, 2014

Any canonical examples to go with those two? As I understand it now, I can pretty much get away with utf8 and some locale code, as long as I stick to (possibly some subset of distributions of) Linux. Which really is fine for my use case, but it's not really a very nice stance to take (it's all fun and games until you need to work in an environment where you for some reason or other can't change the OS, and need that clever utility that wasn't quite as standard/cross-platform as it maybe should've been...).