Today I Learned - lessons from building a C++ development stack
Table of Contents
- 1. wsl (windows subsystem for linux)
- 2. git
- 3. webhosting + forgejo + CI
- 4. cmake
- 5. lsp (language server protocol)
- 6. iostream
  - 6.1. general api rant (25feb2024)
    - 6.1.1. istream.read() doesn't report the number of bytes/chars read.
    - 6.1.2. istream.read(s, n) expects always to read n chars.
    - 6.1.3. Iostream get isn't monotonic
    - 6.1.4. Iostream position reporting isn't monotonic.
    - 6.1.5. Streambuf not responsible for eofbit.
    - 6.1.6. Stream position reporting from streambuf
- 7. zlib
- 8. gcc
- 9. llvm
1. wsl (windows subsystem for linux)
- (Nov? 2023) x11 apps wedged after wsl update
- (Sep 2024) getting fonts from nix
2. git
- (May 2026) bugfix for git subtree
3. webhosting + forgejo + CI
- (May 2026) forgejo setup
- (May 2026) forgejo runner setup
4. cmake
4.1. inconsistent eigen package (25oct2023)
4.1.1. Setup
- library xo-kalmanfilter depends on eigen
- library xo-pykalmanfilter depends on xo-kalmanfilter
- the project xo incorporates both codebases (along with others) into a single umbrella source tree (using git submodules).
- alternatively, can build+install libraries independently (in bottom-up dependency order).
We call this last a 'vanilla build', since it follows standard practice for installing 3rd-party libraries.
In a vanilla build, we use cmake's find_package() support to acquire each dependency:
when cmake builds a library, it obtains its dependencies from their final install location.
Contrast this with the submodule build: here, when cmake builds a library, it obtains its dependencies from the build tree.
4.1.2. Problem
submodule build works as expected; compile flags include the required -Ipath/to/eigen/eigen3, so c++ code like this compiles:
    #include <Eigen/Dense>
vanilla build fails: when compiling xo-pykalmanfilter, the header path for eigen is given as -Ipath/to/eigen instead of -Ipath/to/eigen/eigen3, so we now need this to compile instead:
    #include <eigen3/Eigen/Dense>
4.1.3. Details
xo-kalmanfilter specifies eigen dependency in the approved manner:
    # xo-kalmanfilter/src/kalmanfilter/CMakeLists.txt
    set(SELF_LIB xo_kalmanfilter)
    ..
    xo_external_target_dependency(${SELF_LIB} Eigen3 Eigen3::Eigen)
which expands as if we had written:
    find_package(Eigen3 CONFIG REQUIRED)
    target_link_libraries(${SELF_LIB} PUBLIC Eigen3::Eigen)
This works as expected in submodule build. In submodule build, codebases xo-kalmanfilter and xo-pykalmanfilter (amongst others) are incorporated into a single source tree:
    # xo-sm2/CMakeLists.txt
    set(XO_SUBMODULE_BUILD True)
    ..
    add_subdirectory(repo/xo-kalmanfilter)
    add_subdirectory(repo/xo-pykalmanfilter)
xo-pykalmanfilter specifies xo-kalmanfilter dependency:
    # xo-pykalmanfilter/src/pykalmanfilter/CMakeLists.txt
    set(SELF_LIB pykalmanfilter)
    ..
    xo_pybind11_dependency(${SELF_LIB} xo_kalmanfilter)
which expands differently, depending on build type. In submodule build, as if we had written:
    target_include_directories(${SELF_LIB} PUBLIC
        $<BUILD_INTERFACE:${CMAKE_SOURCE_DIR}/repo/xo_kalmanfilter/include>)
    target_include_directories(${SELF_LIB} PUBLIC
        $<BUILD_INTERFACE:${CMAKE_BINARY_DIR}/repo/xo_kalmanfilter/include>)
    target_link_libraries(${SELF_LIB} PUBLIC xo_kalmanfilter)
In vanilla build, xo_pybind11_dependency() expands differently:
    find_package(xo_kalmanfilter CONFIG REQUIRED)
    ..
    target_link_libraries(${SELF_LIB} PUBLIC xo_kalmanfilter)
xo-kalmanfilter provides support for cmake find_package():
    # xo-kalmanfilter/cmake/xo_kalmanfilterConfig.cmake.in
    @PACKAGE_INIT@
    include(CMakeFindDependencyMacro)
    find_dependency(reactor)
    find_dependency(eigen3)
    include("${CMAKE_CURRENT_LIST_DIR}/@PROJECT_NAME@Targets.cmake")
    check_required_components("@PROJECT_NAME@")
and the generated xo_kalmanfilterTargets.cmake file contains:
    # Create imported target xo_kalmanfilter
    add_library(xo_kalmanfilter SHARED IMPORTED)
    set_target_properties(xo_kalmanfilter PROPERTIES
      INTERFACE_INCLUDE_DIRECTORIES "${_IMPORT_PREFIX}/include;${_IMPORT_PREFIX}/include/xo/kalmanfilter"
      INTERFACE_LINK_LIBRARIES "reactor;Eigen3::Eigen"
    )
which.. doesn't look wrong :)
Evidence points to an inconsistency in Eigen-provided cmake support, if not in cmake proper.
4.1.4. Workaround
It's sufficient to restate the eigen dependency in xo-pykalmanfilter:
    # xo_pykalmanfilter/src/pykalmanfilter/CMakeLists.txt
    set(SELF_LIB pykalmanfilter)
    ...
    xo_external_target_dependency(${SELF_LIB} Eigen3 Eigen3::Eigen)
4.2. cmake handling of header-only library dependencies (7oct2023)
4.2.1. Update
My original investigation was mistaken.
It turns out I didn't understand that this cmake error from target_link_libraries():
INTERFACE library can only be used with the INTERFACE keyword of target_link_libraries
applies to the depended-on library (3rd argument), not the depending library (1st argument).
4.2.2. Setup
- NOTE: also asked on stack overflow here
- cmake version 3.25.3
Must introduce a header-only library like this:
    add_library(foo INTERFACE)
(instead of add_library(foo SHARED) or add_library(foo STATIC))
Must specify include directories for a header-only library like this:
    target_include_directories(
        foo INTERFACE
        $<INSTALL_INTERFACE:path/to/include>
        $<BUILD_INTERFACE:${PROJECT_SOURCE_DIR}/path/to/include>)
cmake enforces this explicitly; gives error
    target_include_directories may only set INTERFACE properties on INTERFACE targets
Must specify dependency on a header-only library like this:
    target_link_libraries(bar INTERFACE foo)
(instead of target_link_libraries(bar PUBLIC foo))
cmake enforces this explicitly; gives error
    INTERFACE library can only be used with the INTERFACE keyword of target_link_libraries
Expected behavior: this is sufficient for compilation of bar to tell compiler about include paths for foo:
    gcc -Ipath/to/foo bar.cpp
This expectation is satisfied for regular non-INTERFACE libraries.
4.2.3. Problem
- compilation fails to supply include paths for depended-on foo when compiling depending-on bar, if bar is a regular library
- predicted cause:
  - for an INTERFACE library, cmake uses property INTERFACE_INCLUDE_DIRECTORIES; it does not populate INCLUDE_DIRECTORIES.
  - target_link_libraries, when applied to a STATIC or SHARED target, picks up the INCLUDE_DIRECTORIES property for the depended-on target, while ignoring the INTERFACE_INCLUDE_DIRECTORIES property.
4.2.4. Workaround
when depending on a header-only library, explicitly incorporate depended-on INTERFACE_INCLUDE_DIRECTORIES into INCLUDE_DIRECTORIES:
    macro(dependency_headeronly target dep)
        target_link_libraries(${target} INTERFACE ${dep})
        get_target_property(dependency_headeronly__tmp ${dep} INTERFACE_INCLUDE_DIRECTORIES)
        set_property(
            TARGET ${target}
            APPEND PROPERTY INCLUDE_DIRECTORIES ${dependency_headeronly__tmp})
    endmacro()
4.3. pybind11 link difficulties with transitive library dependencies (7oct2023)
4.3.1. Setup
- cmake version 3.25.3
- pybind11 version ???
- nix build (see https://github.com:rconybea/xo-nix2)
  Consequences of nix build:
  - Each package installed to a separate directory; no "common swimming pool" like /usr/lib
    - Implies install directory always distinct from any directory containing build inputs
    - Tends to reveal oversights in toolchain, as we'll see below
- pybind library (xo-pyreflect) with dependency on a separate library (xo-reflect), that in turn has secondary dependencies (xo-refcnt, xo-indentlog). Note that xo-indentlog is header-only.
Expect this cmake script to work:
    find_package(pybind11)
    pybind11_add_module(pyreflect pyreflect.cpp)
    find_package(reflect CONFIG REQUIRED)
    target_link_libraries(pyreflect PUBLIC reflect)
4.3.2. Problem
Instead, link fails. Link line something like:
g++ -fPIC ... -o pyreflect.cpython-311-x86_64-linux-gnu.so /path/to/libreflect.so -lrefcnt -lindentlog
Two problems here:
- directory containing librefcnt.so isn't on the link line (no -L/path/to/refcnt/dir for example).
- libindentlog.so does not exist, since indentlog is header-only
Looked into intermediate outputs like lib/cmake/reflectTargets.cmake, excerpt:
    set_target_properties(reflect PROPERTIES
      INTERFACE_INCLUDE_DIRECTORIES "${_IMPORT_PREFIX}/include"
      INTERFACE_LINK_LIBRARIES "indentlog;refcnt"
    )
It's not obvious how xo_pyreflect can know that indentlog is header-only, while refcnt isn't (though could presumably extract the relevant libdir from find_package() with some work).
4.3.3. Workaround
Recognize that pyreflect link shouldn't need refcnt on the link line, since libreflect.so has a DT_NEEDED entry for it.
    $ readelf -d /path/to/libreflect.so
    Dynamic section at offset 0x17860 contains 34 entries:
      Tag                Type      Name/Value
     0x0000000000000001 (NEEDED)   Shared library: [librefcnt.so.1]
     ...
When building pyreflect, suppress transitive dependencies. For example:
    # xo_cxx.cmake
    macro(xo_pybind11_dependency target dep)
        find_package(${dep} CONFIG REQUIRED)
        set_property(TARGET ${dep} PROPERTY INTERFACE_LINK_LIBRARIES "")
        target_link_libraries(${target} PUBLIC ${dep})
    endmacro()
Then in .cmake for pyreflect, something equivalent to:
    pybind11_add_module(pyreflect pyreflect.cpp)
    xo_pybind11_dependency(pyreflect reflect)
5. lsp (language server protocol)
5.1. mysterious lsp complaints about iostream headers (22feb2024)
5.1.1. Setup
- working on bespoke streambuf implementation (cmake-examples/zstream/include/zstream/zstreambuf.hpp)
5.1.2. Problem
- getting mysterious errors from flymake + emacs, e.g. message "<fstream> not found"
5.1.3. Solution
- Had discarded cmake-examples/build directory, using cmake-examples/.build instead. This was to prevent build tree showing up when running tree in the cmake-examples directory
- Left cmake-examples/compile_commands.json as a broken symlink referring to old build directory
- This causes lsp understanding of compiler invocation to deteriorate as project acquires new files
- Fix by swinging symlink to cmake-examples/.build/compile_commands.json, duh!
6. iostream
6.1. general api rant (25feb2024)
6.1.1. istream.read() doesn't report the number of bytes/chars read.
Instead of:
istream & istream::read (char_type * s, std::streamsize count);
I'd prefer signature
istream & istream::read (char_type * s, std::streamsize count, std::streamsize * p_gcount);
As it stands, developers are expected to follow each read with a call to:
std::streamsize istream::gcount () const;
I think this is inferior, since it relies on state held by istream,
which gets overwritten by the next read operation.
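For illustration, here's a minimal sketch of the preferred signature written as a free function. The name read_counted is made up for this note (nothing like it exists in the standard library); it just captures gcount() immediately after the read, before anything else can overwrite it.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical helper (not part of the standard library):
// wraps istream::read() and reports the count through an out-parameter.
template <typename CharT, typename Traits>
std::basic_istream<CharT, Traits> &
read_counted(std::basic_istream<CharT, Traits> & in,
             CharT * s,
             std::streamsize count,
             std::streamsize * p_gcount)
{
    assert(p_gcount != nullptr);
    in.read(s, count);
    // capture the count now; the next unformatted read overwrites it
    *p_gcount = in.gcount();
    return in;
}

// Reads 4 chars, then 3 more, from "abcdefg".
// Returns the first count if gcount() now reflects only the second read.
inline std::streamsize demo_read_counted()
{
    std::istringstream in("abcdefg");
    char buf[8] = {};
    std::streamsize n1 = 0;
    std::streamsize n2 = 0;
    read_counted(in, buf, 4, &n1);
    read_counted(in, buf, 3, &n2);
    return (in.gcount() == n2) ? n1 : -1;
}
```

After the second call, in.gcount() only knows about the second read; the first count survives because the caller owns it.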
6.1.2. istream.read(s, n) expects always to read n chars.
It sets failbit (along with eofbit) if fewer than n chars are read.
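A small sketch of what that looks like in practice, using a std::istringstream as a stand-in for any input stream (the struct and function names are only for this example):

```cpp
#include <cassert>
#include <ios>
#include <sstream>

// Illustrative result type: stream state after a short read.
struct short_read_result {
    std::streamsize gcount;
    bool eof;
    bool fail;
};

// Ask for 8 chars from a 3-char stream and report what happened.
inline short_read_result demo_short_read()
{
    std::istringstream in("abc");
    assert(in.good());
    char buf[8] = {};
    in.read(buf, sizeof(buf));
    return {in.gcount(), in.eof(), in.fail()};
}
```

The three available chars are delivered, but the stream ends up with both eofbit and failbit set, so a plain `if (in.read(...))` test treats the partial read as a failure.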
Apparent alternatives are unsatisfactory:
- istream::readsome(s, n) isn't required to do any physical i/o; instead reports what's available already in memory
- istream::get(s, n, delim) only reads up to first occurrence of delim.
- istream::get(s, n) is just a convenience for istream::get(s, n, '\n').
- could try writing a loop using combination of istream::sync(), istream::readsome(), but that won't work if istream is actually unbuffered.
- given istream in, in.rdbuf()->sgetn(s, n) bypasses istream code for sentry object etc, and can't set istream's eofbit.
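To make the last point concrete, here's a small sketch (again with std::istringstream standing in for a real stream; names are illustrative only) showing that going straight to the streambuf leaves the istream's state flags untouched even on a short read:

```cpp
#include <cassert>
#include <ios>
#include <sstream>

// Illustrative result type: what sgetn() returned, and stream state afterwards.
struct sgetn_result {
    std::streamsize n_read;
    bool good;
    bool eof;
};

// Ask the streambuf directly for 8 chars when only 3 exist.
inline sgetn_result demo_sgetn_bypass()
{
    std::istringstream in("abc");
    assert(in.rdbuf() != nullptr);
    char buf[8] = {};
    std::streamsize n = in.rdbuf()->sgetn(buf, sizeof(buf));
    // the istream never saw the short read, so no flags were set
    return {n, in.good(), in.eof()};
}
```

The caller gets the right count, but has to work out end-of-file itself, since the stream still reports good().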
The following workaround is viable, except that it will read one-byte-at-a-time if input alternates between byte values '\x0' and '\xff':
    template<typename istream>
    std::streamsize
    read_upto(istream & in, typename istream::char_type * s, std::streamsize n)
    {
        // note: s needs room for n+1 chars, since get() appends a terminating null
        std::streamsize n_read = 0;
        constexpr char c_bits = '\x0'; /*any char value will do here*/
        char delim = c_bits;
        for (; in.good() && (n_read < n); delim = delim ^ '\xff') {
            // each iteration alternates between {c_bits, ~c_bits} as delimiter,
            // so guarantees at least one byte progress every two iterations
            in.get(s, n - n_read + 1, delim);
            std::streamsize nr = in.gcount();
            if (nr > 0) {
                n_read += nr;
                s += nr;
            } else if (!in.eof()) {
                // get() sets failbit when next char is delim (see 6.1.3); undo that
                in.clear(in.rdstate() & ~std::ios::failbit);
            }
        }
        return n_read;
    }
I'd prefer to support this behavior (without the performance-accident-waiting-to-happen) directly from istream.
Another strategy is to use istream::peek() to check for input and istream::readsome() to fetch it:
    template<typename istream>
    std::streamsize
    read_upto(istream & in, typename istream::char_type * s, std::streamsize n)
    {
        std::streamsize n_read = 0;
        while (in.good() && (n_read < n)) {
            /* ensure at least one byte available in streambuf; stop at eof */
            if (in.peek() == istream::traits_type::eof())
                break;
            std::streamsize nr = in.readsome(s + n_read, n - n_read);
            n_read += nr;
        }
        return n_read;
    }
This works if streambuf actually does buffering. It may be very slow if streambuf is unbuffered.
istream::sentry looks interesting, but doesn't do any reading (except to possibly skip whitespace).
gcc 12.2.0's implementation:
    template<typename _CharT, typename _Traits>
      basic_istream<_CharT, _Traits>::sentry::
      sentry(basic_istream<_CharT, _Traits>& __in, bool __noskip) : _M_ok(false)
      {
        ios_base::iostate __err = ios_base::goodbit;
        if (__in.good())
          {
            __try
              {
                if (__in.tie())
                  __in.tie()->flush();
                if (!__noskip && bool(__in.flags() & ios_base::skipws))
                  {
                    const __int_type __eof = traits_type::eof();
                    __streambuf_type* __sb = __in.rdbuf();
                    __int_type __c = __sb->sgetc();
                    const __ctype_type& __ct = __check_facet(__in._M_ctype);
                    while (!traits_type::eq_int_type(__c, __eof)
                           && __ct.is(ctype_base::space,
                                      traits_type::to_char_type(__c)))
                      __c = __sb->snextc();
                    // _GLIBCXX_RESOLVE_LIB_DEFECTS
                    // 195. Should basic_istream::sentry's constructor ever
                    // set eofbit?
                    if (traits_type::eq_int_type(__c, __eof))
                      __err |= ios_base::eofbit;    // (A)
                  }
              }
            __catch(__cxxabiv1::__forced_unwind&)
              {
                __in._M_setstate(ios_base::badbit);
                __throw_exception_again;
              }
            __catch(...)
              { __in._M_setstate(ios_base::badbit); }
          }
        if (__in.good() && __err == ios_base::goodbit)    // (B)
          _M_ok = true;
        else
          {
            __err |= ios_base::failbit;    // (C)
            __in.setstate(__err);
          }
      }
with
    template<typename _Facet>
      inline const _Facet&
      __check_facet(const _Facet* __f)
      {
        if (!__f)
          __throw_bad_cast();
        return *__f;
      }
Note that if __noskip is false and sentry encounters eof,
then the line marked (A) executes -> test (B) fails -> (C) executes,
flagging stream as in an 'unrecoverable error state'.
The line (A) appears to be mandatory (in spite of the inline comment).
From https://cppreference.com:
explicit sentry( std::basic_istream<CharT, Traits>& is, bool noskipws = false );
Prepares the stream for formatted input.
If is.good() is false, calls is.setstate(std::ios_base::failbit) and returns. Otherwise, if is.tie() is not a null pointer, calls is.tie()->flush() to synchronize the output sequence with external streams. This call can be suppressed if the put area of is.tie() is empty. The implementation may defer the call to flush() until a call of is.rdbuf()->underflow() occurs. If no such call occurs before the sentry object is destroyed, it may be eliminated entirely.
If noskipws is zero and is.flags() & std::ios_base::skipws is nonzero, the function extracts and discards all whitespace characters until the next available character is not a whitespace character (as determined by the currently imbued locale in is). If is.rdbuf()->sbumpc() or is.rdbuf()->sgetc() returns traits::eof(), the function calls setstate(std::ios_base::failbit | std::ios_base::eofbit) (which may throw std::ios_base::failure).
Additional implementation-defined preparation may take place, which may call setstate(std::ios_base::failbit) (which may throw std::ios_base::failure).
If after preparation is completed, is.good() == true, then any subsequent calls to operator bool will return true.
However we can bypass this with __noskip set to true:
    template<typename istream>
    std::streamsize
    read_upto(istream & in, typename istream::char_type * s, std::streamsize n)
    {
        typename istream::sentry sentry(in, true /*noskipws*/);
        std::streamsize n_read = 0;
        if (sentry) {
            try {
                n_read = in.rdbuf()->sgetn(s, n);
                if (n_read < n) {
                    /* short read: streambuf ran out of input */
                    in.setstate(std::ios::eofbit);
                }
            } catch(__cxxabiv1::__forced_unwind &) {
                in.setstate(std::ios::failbit);
                throw;
            } catch(...) {
                in.setstate(std::ios::failbit);
            }
        }
        return n_read;
    }
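The difference between the two sentry modes is easy to check directly. A small sketch, using a std::istringstream that holds only whitespace (the struct and function names are made up for this example):

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <sstream>

// Illustrative result type: sentry outcome plus stream flags afterwards.
struct sentry_result {
    bool ok;
    bool eof;
    bool fail;
};

// Construct a sentry on a stream that contains only whitespace.
inline sentry_result demo_sentry(bool noskipws)
{
    std::istringstream in("   ");
    assert(in.good());
    std::istream::sentry guard(in, noskipws);
    return {static_cast<bool>(guard), in.eof(), in.fail()};
}
```

With noskipws false, the sentry skips whitespace, runs into end-of-stream, and leaves both eofbit and failbit set. With noskipws true it doesn't touch the input at all, so the stream stays good.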
Another alternative would be to post-process read(), and clear failbit if set along with eofbit:
    template<typename istream>
    std::streamsize
    read_upto(istream & in, typename istream::char_type * s, std::streamsize n)
    {
        in.read(s, n);
        std::streamsize n_read = in.gcount();
        if ((n_read < n) && in.eof() && in.fail()) {
            /* clear failbit */
            in.clear(in.rdstate() & ~std::ios::failbit);
        }
        return n_read;
    }
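A usage sketch for the post-processing variant: copying a stream in fixed-size chunks, where the last chunk is allowed to be short. read_upto is repeated here so the example stands alone; demo_chunked_copy and the 4-char chunk size are just for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>

// Post-processing variant: call read(), then drop failbit when it was set
// only because of a short read at end-of-stream.
template <typename istream>
std::streamsize
read_upto(istream & in, typename istream::char_type * s, std::streamsize n)
{
    in.read(s, n);
    std::streamsize n_read = in.gcount();
    if ((n_read < n) && in.eof() && in.fail()) {
        in.clear(in.rdstate() & ~std::ios::failbit);
    }
    return n_read;
}

// Copies text through a stream in 4-char chunks; the last chunk may be short.
inline std::string demo_chunked_copy(const std::string & text)
{
    std::istringstream in(text);
    std::string out;
    char buf[4];
    while (in.good()) {
        std::streamsize nr = read_upto(in, buf, sizeof(buf));
        out.append(buf, static_cast<std::size_t>(nr));
    }
    assert(!in.fail());   // loop ended on eofbit alone
    return out;
}
```

The loop condition can stay a plain in.good() test, because a short final chunk now ends the loop via eofbit without also reporting failure.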
6.1.3. Iostream get isn't monotonic
iostream.get(s, n, delim) sets failbit if the first character matches delim (since no characters get extracted).
This interferes with using iostream.get as a building block for a longer i/o sequence.
Tripped over this while writing zstream.read_until for my cmake-examples project.
Instead of:
    std::streamsize
    read_until(char_type * s, std::streamsize n, bool check_delim_flag, char_type delim)
    {
        ...
        std::streamsize nr = 0;
        this->get(s, n, delim);
        nr = this->gcount();
        ...
        return nr;
    }
We need a carve-out:
    std::streamsize
    read_until(char_type * s, std::streamsize n, bool check_delim_flag, char_type delim)
    {
        ...
        std::streamsize nr = 0;
        int_type nextc = this->rdbuf_.sgetc();
        if (nextc == Traits::to_int_type(delim)) {
            nr = 0;
        } else {
            this->get(s, n, delim);
            nr = this->gcount();
        }
        ...
        return nr;
    }
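The failbit behavior is easy to reproduce in isolation. A small sketch with std::istringstream (names are illustrative only):

```cpp
#include <cassert>
#include <ios>
#include <sstream>

// Illustrative result type: state after get() meets the delimiter first.
struct get_result {
    std::streamsize gcount;
    bool fail_after_get;
    bool eof_after_get;
    int next_char;   // what a plain get() returns once the state is cleared
};

// Call get(s, n, delim) when the very next char is the delimiter.
inline get_result demo_get_at_delim()
{
    std::istringstream in(";rest");
    assert(in.good());
    char buf[16];
    in.get(buf, sizeof(buf), ';');
    get_result r{in.gcount(), in.fail(), in.eof(), 0};
    in.clear();              // recover from the spurious failbit
    r.next_char = in.get();  // the delimiter is still waiting in the stream
    return r;
}
```

Nothing was consumed and the stream isn't at end-of-file, yet failbit is set. The delimiter itself is still there to read once the state is cleared.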
6.1.4. Iostream position reporting isn't monotonic.
iostream.tellg() and iostream.tellp() report current position w.r.t.
beginning of stream for input (get) and output (put) respectively.
Unfortunately, they are not monotonic, and code like this is subtly broken:
    istream & input = ...; // some binary stream
    struct foo part1;
    struct foo part2;
    istream::pos_type p0 = input.tellg();
    input >> part1 >> part2;
    istream::pos_type p1 = input.tellg();
    istream::pos_type n_read = p1 - p0;
If stream reaches end-of-file at the end of part2, then in fact reading was successful,
but eofbit is now set, so tellg() fails: p1 will be -1, and n_read will be nonsense.
Presumably this is why iostream.gcount() exists: otherwise there'd be no way to
determine how many bytes/chars a preceding read obtained.
A correct (but awkward and error-prone) implementation, assuming operator>> for foo is built on unformatted reads (formatted extraction doesn't update gcount()):
    istream & input = ...;
    struct foo part1;
    struct foo part2;
    std::streamsize n_read = 0;
    input >> part1;
    n_read += input.gcount();
    input >> part2;
    n_read += input.gcount();
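Here's a small sketch of the tellg() surprise, using std::istringstream and plain int extraction instead of the binary struct foo above (names are illustrative only). It also shows one possible way out that I'd treat as an observation, not a recommendation: clearing the state flags before asking for the position.

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <sstream>

// Illustrative result type: positions before/after reading to end-of-stream.
struct tellg_result {
    std::streamoff p0;   // before reading
    std::streamoff p1;   // after reading, eofbit set
    std::streamoff p2;   // after clear()
    int a;
    int b;
};

inline tellg_result demo_tellg_after_eof()
{
    std::istringstream in("12 34");
    tellg_result r{};
    r.p0 = static_cast<std::streamoff>(in.tellg());
    in >> r.a >> r.b;        // second extraction runs into end-of-stream
    assert(!in.fail());      // both values were read successfully
    r.p1 = static_cast<std::streamoff>(in.tellg());   // fails: eofbit is set
    in.clear();
    r.p2 = static_cast<std::streamoff>(in.tellg());   // position is back
    return r;
}
```

Both reads succeed, but the position reported right afterwards is -1. Only after clear() does tellg() admit that 5 chars were consumed.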
6.1.5. Streambuf not responsible for eofbit.
istream.eofbit probably belongs in streambuf.
streambuf has to recognize end-of-file anyway, since it's responsible for physical I/O.
It might as well record and report it.
6.1.6. Stream position reporting from streambuf
It would be simpler for streambuf to support istream::tellg() and ostream::tellp() directly instead of relying on streambuf::seekoff().
Argument here is that even for a non-seekable stream buffer, it still makes sense to support tellg() and tellp().
As things stand, this requires the streambuf author to implement at least a restricted version of streambuf::seekoff(),
which muddies the waters.
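As a sketch of what such a restricted seekoff() might look like: the forward_only_buf class below is invented for this note (it is not the zstreambuf from cmake-examples). It hands out chars from a string in 4-char chunks, answers "where am I?" for the get area, and refuses everything else.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <ios>
#include <istream>
#include <streambuf>
#include <string>
#include <utility>

// Hypothetical forward-only input buffer; supports tellg() but no real seeks.
class forward_only_buf : public std::streambuf {
public:
    explicit forward_only_buf(std::string src) : src_(std::move(src)) {}

protected:
    int_type underflow() override {
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());
        consumed_ += egptr() - eback();   // retire the chunk just finished
        std::size_t n = std::min(sizeof(chunk_), src_.size() - next_);
        if (n == 0) {
            setg(chunk_, chunk_, chunk_); // empty get area, nothing left
            return traits_type::eof();
        }
        src_.copy(chunk_, n, next_);
        next_ += n;
        setg(chunk_, chunk_, chunk_ + n);
        return traits_type::to_int_type(chunk_[0]);
    }

    // Restricted seekoff(): only reports the current input position.
    pos_type seekoff(off_type off, std::ios_base::seekdir dir,
                     std::ios_base::openmode which) override {
        if (off == 0 && dir == std::ios_base::cur
            && (which & std::ios_base::in) != 0)
            return pos_type(consumed_ + (gptr() - eback()));
        return pos_type(off_type(-1));    // anything else: not seekable
    }

private:
    std::string src_;
    std::size_t next_ = 0;     // next unread index in src_
    off_type consumed_ = 0;    // chars in chunks already retired
    char chunk_[4];
};

// Reads n_to_read chars from "abcdefghij", then reports tellg().
inline std::streamoff demo_tell(std::size_t n_to_read)
{
    assert(n_to_read <= 10);
    forward_only_buf buf("abcdefghij");
    std::istream in(&buf);
    for (std::size_t i = 0; i < n_to_read; ++i)
        in.get();
    return static_cast<std::streamoff>(in.tellg());
}

// A genuine seek is refused, which shows up as failbit on the stream.
inline bool demo_seek_fails()
{
    forward_only_buf buf("abcdefghij");
    std::istream in(&buf);
    in.seekg(0);
    return in.fail();
}
```

The awkward part is visible in seekoff(): the buffer has to pattern-match the one argument combination that tellg() happens to use, and reject everything else by hand.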
(ed: switching to article-based format)
7. zlib
- (Jan? 2024) zlib z_stream cannot be moved
8. gcc
- (June 2024) get builtin preprocessor defines from gcc
9. llvm
- (June 2024) injecting symbols into llvm jit
- (June 2024) alloca and function pointers