diff --git a/docs/CMakePrimer.rst b/docs/CMakePrimer.rst
new file mode 100644
index 000000000000..034779022142
--- /dev/null
+++ b/docs/CMakePrimer.rst
@@ -0,0 +1,465 @@
+============
+CMake Primer
+============
+
+.. contents::
+   :local:
+
+.. warning::
+   Disclaimer: This documentation is written by LLVM project contributors,
+   *not* by anyone affiliated with the CMake project. This document may
+   contain inaccurate terminology, phrasing, or technical details. It is
+   provided with the best intentions.
+
+
+Introduction
+============
+
+The LLVM project and many of the core projects built on LLVM build using CMake.
+This document aims to provide a brief overview of CMake for developers modifying
+LLVM projects or building their own projects on top of LLVM.
+
+The official CMake language reference is available in the cmake-language
+manpage and `cmake-language online documentation
+<https://cmake.org/cmake/help/v3.4/manual/cmake-language.7.html>`_.
+
+10,000 ft View
+==============
+
+CMake is a tool that reads script files in its own language that describe how a
+software project builds. As CMake evaluates the scripts it constructs an
+internal representation of the software project. Once the scripts have been
+fully processed, if there are no errors, CMake will generate build files to
+actually build the project. CMake supports generating build files for a variety
+of command line build tools as well as for popular IDEs.
+
+When a user runs CMake it performs a variety of checks similar to how autoconf
+worked historically. During the checks and the evaluation of the build
+description scripts CMake caches values into the CMakeCache. This is useful
+because it allows the build system to skip long-running checks during
+incremental development. CMake caching also has some drawbacks, but those will
+be discussed later.
+
+Scripting Overview
+==================
+
+CMake's scripting language has a very simple grammar.
+Every language construct
+is a command that matches the pattern _name_(_args_). Commands come in three
+primary types: language-defined (commands implemented in C++ in CMake), defined
+functions, and defined macros. The CMake distribution also contains a suite of
+CMake modules that contain definitions for useful functionality.
+
+The example below is the full CMake build for building a C++ "Hello World"
+program. The example uses only CMake language-defined commands.
+
+.. code-block:: cmake
+
+   cmake_minimum_required(VERSION 3.2)
+   project(HelloWorld)
+   add_executable(HelloWorld HelloWorld.cpp)
+
+The CMake language provides control flow constructs in the form of foreach loops
+and if blocks. To make the example above more complicated you could add an if
+block to define "APPLE" when targeting Apple platforms:
+
+.. code-block:: cmake
+
+   cmake_minimum_required(VERSION 3.2)
+   project(HelloWorld)
+   add_executable(HelloWorld HelloWorld.cpp)
+   if(APPLE)
+     target_compile_definitions(HelloWorld PUBLIC APPLE)
+   endif()
+
+Variables, Types, and Scope
+===========================
+
+Dereferencing
+-------------
+
+In CMake variables are "stringly" typed. All variables are represented as
+strings throughout evaluation. Wrapping a variable in ``${}`` dereferences it
+and results in the name being literally replaced by its value. CMake refers to
+this as "variable evaluation" in its documentation. Dereferences are performed
+*before* the command being called receives the arguments. This means
+dereferencing a list results in multiple separate arguments being passed to the
+command.
+
+Variable dereferences can be nested and be used to model complex data. For
+example:
+
+.. code-block:: cmake
+
+   set(var_name var1)
+   set(${var_name} foo) # same as "set(var1 foo)"
+   set(${${var_name}}_var bar) # same as "set(foo_var bar)"
+
+Dereferencing an unset variable results in an empty expansion.
+It is a common
+pattern in CMake to set a variable conditionally, knowing that it will also be
+used in code paths where the variable isn't set. There are examples of this
+throughout the LLVM CMake build system.
+
+An example of variable empty expansion is:
+
+.. code-block:: cmake
+
+   if(APPLE)
+     set(extra_sources Apple.cpp)
+   endif()
+   add_executable(HelloWorld HelloWorld.cpp ${extra_sources})
+
+In this example the ``extra_sources`` variable is only defined if you're
+targeting an Apple platform. For all other targets ``extra_sources`` will be
+evaluated as empty before ``add_executable`` is given its arguments.
+
+One big "Gotcha" with variable dereferencing is that ``if`` commands implicitly
+dereference values. This has some unexpected results. For example:
+
+.. code-block:: cmake
+
+   if("${SOME_VAR}" STREQUAL "MSVC")
+
+In this code sample ``MSVC`` will be implicitly dereferenced, which will result
+in the if command comparing the values of the dereferenced variables
+``SOME_VAR`` and ``MSVC``. A common workaround for this problem is to prefix the
+strings being compared with an ``x``.
+
+.. code-block:: cmake
+
+   if("x${SOME_VAR}" STREQUAL "xMSVC")
+
+This works because while ``MSVC`` is a defined variable, ``xMSVC`` is not. This
+pattern is uncommon, but it does occur in LLVM's CMake scripts.
+
+.. note::
+
+   Once the LLVM project upgrades its minimum CMake version to 3.1 or later we
+   can prevent this behavior by setting CMP0054 to NEW. For more information on
+   CMake policies please see the cmake-policies manpage or the `cmake-policies
+   online documentation
+   <https://cmake.org/cmake/help/v3.4/manual/cmake-policies.7.html>`_.
+
+Lists
+-----
+
+In CMake lists are semi-colon delimited strings, and it is strongly advised that
+you avoid using semi-colons inside list elements; it doesn't go smoothly. A few
+examples of defining lists:
+
+.. code-block:: cmake
+
+   # Creates a list with members a, b, c, and d
+   set(my_list a b c d)
+   set(my_list "a;b;c;d")
+
+   # Creates a string "a b c d"
+   set(my_string "a b c d")
+
+Lists of Lists
+--------------
+
+One of the more complicated patterns in CMake is lists of lists. Because a list
+cannot contain an element with a semi-colon, to construct a list of lists you
+make a list of variable names that refer to other lists. For example:
+
+.. code-block:: cmake
+
+   set(list_of_lists a b c)
+   set(a 1 2 3)
+   set(b 4 5 6)
+   set(c 7 8 9)
+
+With this layout you can iterate through the list of lists printing each value
+with the following code:
+
+.. code-block:: cmake
+
+   foreach(list_name IN LISTS list_of_lists)
+     foreach(value IN LISTS ${list_name})
+       message(${value})
+     endforeach()
+   endforeach()
+
+You'll notice that the inner foreach loop's list is doubly dereferenced. This is
+because the first dereference turns ``list_name`` into the name of the sub-list
+(a, b, or c in the example), then the second dereference is to get the value of
+the list.
+
+This pattern is used throughout CMake; the most common example is the compiler
+flags options, which CMake refers to using the following variable expansions:
+``CMAKE_${LANGUAGE}_FLAGS`` and ``CMAKE_${LANGUAGE}_FLAGS_${CMAKE_BUILD_TYPE}``.
+
+Other Types
+-----------
+
+Variables that are cached or specified on the command line can have types
+associated with them. The variable's type is used by CMake's UI tool to display
+the right input field. The variable's type generally doesn't impact evaluation.
+One of the few exceptions is PATH variables, which CMake does have some special
+handling for. You can read more about the special handling in `CMake's set
+documentation
+<https://cmake.org/cmake/help/v3.5/command/set.html#set-cache-entry>`_.
+
+Scope
+-----
+
+CMake inherently has directory-based scoping. Setting a variable in a
+CMakeLists file will set the variable for that file and all subdirectories.
+Variables set in a CMake module that is included in a CMakeLists file will be
+set in the scope they are included from, and all subdirectories.
+
+When a variable that is already set is set again in a subdirectory it overrides
+the value in that scope and any deeper subdirectories.
+
+The CMake set command provides two scope-related options. PARENT_SCOPE sets a
+variable into the parent scope, and not the current scope. The CACHE option sets
+the variable in the CMakeCache, which results in it being set in all scopes. The
+CACHE option will not set a variable that already exists in the CACHE unless the
+FORCE option is specified.
+
+In addition to directory-based scope, CMake functions also have their own scope.
+This means variables set inside functions do not bleed into the parent scope.
+This is not true of macros, and it is for this reason LLVM prefers functions
+over macros whenever reasonable.
+
+.. note::
+   Unlike C-based languages, CMake's loop and control flow blocks do not have
+   their own scopes.
+
+Control Flow
+============
+
+CMake features the same basic control flow constructs you would expect in any
+scripting language, but there are a few quirks because, as with everything in
+CMake, control flow constructs are commands.
+
+If, ElseIf, Else
+----------------
+
+.. note::
+   For the full documentation on the CMake if command go
+   `here <https://cmake.org/cmake/help/v3.4/command/if.html>`_. That resource is
+   far more complete.
+
+In general CMake if blocks work the way you'd expect:
+
+.. code-block:: cmake
+
+   if(<condition>)
+     # do stuff
+   elseif(<condition>)
+     # do other stuff
+   else()
+     # do other other stuff
+   endif()
+
+The single most important thing to know about CMake's if blocks coming from a C
+background is that they do not have their own scope. Variables set inside
+conditional blocks persist after the ``endif()``.
+
+Loops
+-----
+
+The most common form of the CMake ``foreach`` block is:
+
+.. code-block:: cmake
+
+   foreach(var ...)
+     # do stuff
+   endforeach()
+
+The variable argument portion of the ``foreach`` block can contain dereferenced
+lists, values to iterate, or a mix of both:
+
+.. code-block:: cmake
+
+   foreach(var foo bar baz)
+     message(${var})
+   endforeach()
+   # prints:
+   #  foo
+   #  bar
+   #  baz
+
+   set(my_list 1 2 3)
+   foreach(var ${my_list})
+     message(${var})
+   endforeach()
+   # prints:
+   #  1
+   #  2
+   #  3
+
+   foreach(var ${my_list} out_of_bounds)
+     message(${var})
+   endforeach()
+   # prints:
+   #  1
+   #  2
+   #  3
+   #  out_of_bounds
+
+There is also a more modern CMake foreach syntax. The code below is equivalent
+to the code above:
+
+.. code-block:: cmake
+
+   foreach(var IN ITEMS foo bar baz)
+     message(${var})
+   endforeach()
+   # prints:
+   #  foo
+   #  bar
+   #  baz
+
+   set(my_list 1 2 3)
+   foreach(var IN LISTS my_list)
+     message(${var})
+   endforeach()
+   # prints:
+   #  1
+   #  2
+   #  3
+
+   foreach(var IN LISTS my_list ITEMS out_of_bounds)
+     message(${var})
+   endforeach()
+   # prints:
+   #  1
+   #  2
+   #  3
+   #  out_of_bounds
+
+Similar to the conditional statements, these generally behave how you would
+expect, and they do not have their own scope.
+
+CMake also supports ``while`` loops, although they are not widely used in LLVM.
+
+Modules, Functions and Macros
+=============================
+
+Modules
+-------
+
+Modules are CMake's vehicle for enabling code reuse. CMake modules are just
+CMake script files. They can contain code to execute on include as well as
+definitions for commands.
+
+In CMake macros and functions are universally referred to as commands, and they
+are the primary method of defining code that can be called multiple times.
+
+In LLVM we have several CMake modules that are included as part of our
+distribution for developers who don't build our project from source. Those
+modules are the fundamental pieces needed to build LLVM-based projects with
+CMake.
+We also rely on modules as a way of organizing the build system's
+functionality for maintainability and re-use within LLVM projects.
+
+Argument Handling
+-----------------
+
+When defining a CMake command, handling arguments is very useful. The examples
+in this section will all use the CMake ``function`` block, but this all applies
+to the ``macro`` block as well.
+
+CMake commands can have named arguments, but all commands implicitly accept
+variable arguments. If the command has named arguments they are required and
+must be specified at every call site. Below is a trivial example of providing a
+wrapper function for CMake's built-in command ``add_dependencies``.
+
+.. code-block:: cmake
+
+   function(add_deps target)
+     # ARGN holds only the arguments after the named ``target`` parameter;
+     # ARGV would also include ``target`` itself.
+     add_dependencies(${target} ${ARGN})
+   endfunction()
+
+This example defines a new function named ``add_deps`` which takes a required
+first argument, and just calls another command passing through the first
+argument and all trailing arguments. When variable arguments are present CMake
+places the arguments beyond the named ones in a list named ``ARGN``. The full
+argument list is available as ``ARGV``, and the total argument count is defined
+in ``ARGC``.
+
+CMake provides a module ``CMakeParseArguments`` which provides an implementation
+of advanced argument parsing. We use this all over LLVM, and it is recommended
+for any function that has complex argument-based behaviors or optional
+arguments. CMake's official documentation for the module is in the
+``cmake-modules`` manpage, and is also available at the
+`cmake-modules online documentation
+<https://cmake.org/cmake/help/v3.4/module/CMakeParseArguments.html>`_.
+
+.. note::
+   As of CMake 3.5 the cmake_parse_arguments command has become a native command
+   and the CMakeParseArguments module is empty and only left around for
+   compatibility.
+
+Functions Vs Macros
+-------------------
+
+Functions and Macros look very similar in how they are used, but there is one
+fundamental difference between the two. Functions have their own scope, and
+macros don't.
+This means variables set in macros will bleed out into the calling
+scope. That makes macros suitable for defining very small bits of functionality
+only.
+
+The other difference between CMake functions and macros is how arguments are
+passed. Arguments to macros are not set as variables; instead, dereferences to
+the parameters are resolved across the macro before executing it. This can
+result in some unexpected behavior when a parameter name is used without
+being dereferenced. For example:
+
+.. code-block:: cmake
+
+   macro(print_list my_list)
+     foreach(var IN LISTS my_list)
+       message("${var}")
+     endforeach()
+   endmacro()
+
+   set(my_list a b c d)
+   set(my_list_of_numbers 1 2 3 4)
+   print_list(my_list_of_numbers)
+   # prints:
+   # a
+   # b
+   # c
+   # d
+
+Generally speaking this issue is uncommon because it requires using
+non-dereferenced variables with names that overlap in the parent scope, but it
+is important to be aware of because it can lead to subtle bugs.
+
+LLVM Project Wrappers
+=====================
+
+LLVM projects provide lots of wrappers around critical CMake built-in commands.
+We use these wrappers to provide consistent behaviors across LLVM components
+and to reduce code duplication.
+
+We generally (but not always) follow the convention that commands prefaced with
+``llvm_`` are intended to be used only as building blocks for other commands.
+Wrapper commands that are intended for direct use are generally named with the
+project in the middle of the command name (e.g. ``add_llvm_executable`` is the
+wrapper for ``add_executable``). The LLVM ``add_*`` wrapper functions are
+all defined in ``AddLLVM.cmake`` which is installed as part of the LLVM
+distribution. It can be included and used by any LLVM sub-project that requires
+LLVM.
+
+.. note::
+
+   Not all LLVM projects require LLVM for all use cases. For example compiler-rt
+   can be built without LLVM, and the compiler-rt sanitizer libraries are used
+   with GCC.
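+
+As a rough sketch of how a project built against an installed LLVM might pick
+up these wrappers, the fragment below locates LLVM, includes ``AddLLVM.cmake``,
+and declares a tool with ``add_llvm_executable``. The project name ``MyTool``,
+the target ``my-tool``, the source file, and the choice of the ``Support``
+component are hypothetical, and the exact variables exported by an LLVM
+install may differ between versions.
+
+.. code-block:: cmake
+
+   cmake_minimum_required(VERSION 3.4.3)
+   project(MyTool)                       # hypothetical project name
+
+   # Locate an installed LLVM through its CMake package config file.
+   find_package(LLVM REQUIRED CONFIG)
+
+   # Make LLVM's installed CMake modules visible to include().
+   list(APPEND CMAKE_MODULE_PATH "${LLVM_CMAKE_DIR}")
+   include(AddLLVM)
+
+   include_directories(${LLVM_INCLUDE_DIRS})
+   add_definitions(${LLVM_DEFINITIONS})
+
+   # The wrapper reads the LLVM components to link from this variable.
+   set(LLVM_LINK_COMPONENTS Support)
+   add_llvm_executable(my-tool MyTool.cpp)   # hypothetical target and source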
+
+Useful Built-in Commands
+========================
+
+CMake has a bunch of useful built-in commands. This document isn't going to
+go into details about them because the CMake project has excellent
+documentation. To highlight a few useful commands see:
+
+* `add_custom_command <https://cmake.org/cmake/help/v3.4/command/add_custom_command.html>`_
+* `add_custom_target <https://cmake.org/cmake/help/v3.4/command/add_custom_target.html>`_
+* `file <https://cmake.org/cmake/help/v3.4/command/file.html>`_
+* `list <https://cmake.org/cmake/help/v3.4/command/list.html>`_
+* `math <https://cmake.org/cmake/help/v3.4/command/math.html>`_
+* `string <https://cmake.org/cmake/help/v3.4/command/string.html>`_
+
+The full documentation for CMake commands is in the ``cmake-commands`` manpage
+and available on `CMake's website <https://cmake.org/cmake/help/v3.4/manual/cmake-commands.7.html>`_.
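+
+To give a feel for a few of the commands listed above, here is a minimal
+sketch that combines ``list``, ``math``, and ``string``. The variable names
+and file names are made up for illustration. Saved to a file, it can be run in
+script mode with ``cmake -P``.
+
+.. code-block:: cmake
+
+   set(sources b.cpp a.cpp c.cpp)     # hypothetical file names
+
+   list(SORT sources)                 # sources is now a.cpp;b.cpp;c.cpp
+   list(LENGTH sources count)         # count is 3
+   math(EXPR doubled "${count} * 2")  # doubled is 6
+
+   string(TOUPPER "hello" shout)      # shout is HELLO
+   # Quoting the list keeps it as one argument, so every element is rewritten.
+   string(REPLACE ".cpp" ".o" objects "${sources}")  # a.o;b.o;c.o
+
+   message(STATUS "${count} sources, doubled: ${doubled}")
+   message(STATUS "${shout}: ${objects}")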