Diffstat (limited to 'docs/ShadowCallStack.rst')
-rw-r--r-- docs/ShadowCallStack.rst | 193
1 files changed, 193 insertions, 0 deletions
diff --git a/docs/ShadowCallStack.rst b/docs/ShadowCallStack.rst
new file mode 100644
index 000000000000..da609dcd9de4
--- /dev/null
+++ b/docs/ShadowCallStack.rst
@@ -0,0 +1,193 @@
+===============
+ShadowCallStack
+===============
+
+.. contents::
+  :local:
+
+Introduction
+============
+
+ShadowCallStack is an **experimental** instrumentation pass, currently only
+implemented for x86_64 and aarch64, that protects programs against return
+address overwrites (e.g. stack buffer overflows). It works by saving a
+function's return address to a separately allocated 'shadow call stack'
+in the function prolog and checking the return address on the stack against
+the shadow call stack in the function epilog.
+
+Comparison
+----------
+
+To optimize for memory consumption and cache locality, the shadow call stack
+stores an index followed by an array of return addresses. This is in contrast
+to other schemes, like :doc:`SafeStack`, that mirror the entire stack and
+trade off higher memory consumption for shorter function prologs and epilogs
+with fewer memory accesses. Similarly, `Return Flow Guard`_ consumes more
+memory than ShadowCallStack in exchange for shorter function prologs and
+epilogs, but suffers from the same race conditions (see `Security`_). Intel
+`Control-flow Enforcement Technology`_ (CET) is a proposed hardware extension
+that would add native support for using a shadow stack to store and check
+return addresses at call and return time. It would not suffer from race
+conditions at calls and returns and would not incur the overhead of function
+instrumentation, but it does require operating system support.
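+
+As a rough illustration of the index-plus-array layout described at the start
+of this section, the following C sketch models the x86_64 shadow call stack as
+an ordinary array. The names ``scs``, ``scs_push`` and ``scs_pop_and_check``
+and the fixed slot count are hypothetical and not part of any runtime; the
+real instrumentation addresses this memory through the ``gs`` segment rather
+than a global array. Like the generated code shown in the `Example`_ section,
+the sketch performs no bounds checking.
+
+.. code-block:: c
+
+    #include <assert.h>
+    #include <stdint.h>
+
+    /* Hypothetical model: slot 0 holds the index (a byte offset from the
+       start of the region); the remaining slots hold return addresses. */
+    enum { SCS_SLOTS = 64 };
+    static uintptr_t scs[SCS_SLOTS];
+
+    /* Prolog: bump the index, then store the return address at that offset. */
+    static void scs_push(uintptr_t ret) {
+      scs[0] += sizeof(uintptr_t);
+      scs[scs[0] / sizeof(uintptr_t)] = ret;
+    }
+
+    /* Epilog: load the saved address, drop the slot, and compare it with the
+       return address currently on the regular stack. Returns 1 on a match;
+       the instrumented code traps on a mismatch instead. */
+    static int scs_pop_and_check(uintptr_t ret_on_stack) {
+      uintptr_t saved = scs[scs[0] / sizeof(uintptr_t)];
+      scs[0] -= sizeof(uintptr_t);
+      return saved == ret_on_stack;
+    }
+
+Only one word of bookkeeping (the index) is needed in addition to one word per
+active call frame, which is where the memory and cache-locality advantage over
+mirroring the whole stack comes from.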
+
+.. _`Return Flow Guard`: https://xlab.tencent.com/en/2016/11/02/return-flow-guard/
+.. _`Control-flow Enforcement Technology`: https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf
+
+Compatibility
+-------------
+
+ShadowCallStack currently only supports x86_64 and aarch64. A runtime is not
+currently provided in compiler-rt, so one must be provided by the compiled
+application.
+
+On aarch64, the instrumentation makes use of the platform register ``x18``.
+On some platforms, ``x18`` is reserved, and on others, it is designated as
+a scratch register. This generally means that any code that may run on the
+same thread as code compiled with ShadowCallStack must either target one
+of the platforms whose ABI reserves ``x18`` (currently Darwin, Fuchsia and
+Windows) or be compiled with the flag ``-ffixed-x18``.
+
+Security
+========
+
+ShadowCallStack is intended to be a stronger alternative to
+``-fstack-protector``. It protects against non-linear overflows and arbitrary
+memory writes to the return address slot; however, like
+``-fstack-protector``, this protection suffers from race conditions because of
+the call-return semantics on x86_64. There is a short race between the call
+instruction and the first instruction in the function that reads the return
+address where an attacker could overwrite the return address and bypass
+ShadowCallStack. Similarly, there is a time-of-check-to-time-of-use race in the
+function epilog where an attacker could overwrite the return address after it
+has been checked and before it has been returned to. Modifying the call-return
+semantics to fix this on x86_64 would incur an unacceptable performance overhead
+due to return branch prediction.
+
+The instrumentation makes use of the ``gs`` segment register on x86_64,
+or the ``x18`` register on aarch64, to reference the shadow call stack,
+meaning that references to the shadow call stack do not have to be stored in
+memory. This makes it possible to implement a runtime that avoids exposing
+the address of the shadow call stack to attackers that can read arbitrary
+memory. However, attackers could still try to exploit side channels exposed
+by the operating system `[1]`_ `[2]`_ or processor `[3]`_ to discover the
+address of the shadow call stack.
+
+.. _`[1]`: https://eyalitkin.wordpress.com/2017/09/01/cartography-lighting-up-the-shadows/
+.. _`[2]`: https://www.blackhat.com/docs/eu-16/materials/eu-16-Goktas-Bypassing-Clangs-SafeStack.pdf
+.. _`[3]`: https://www.vusec.net/projects/anc/
+
+On x86_64, leaf functions are optimized to store the return address in a
+free register and avoid writing to the shadow call stack if a register is
+available. Very short leaf functions are uninstrumented if their execution
+is judged to be shorter than the race condition window intrinsic to the
+instrumentation.
+
+On aarch64, the architecture's call and return instructions (``bl`` and
+``ret``) operate on a register rather than the stack, which means that
+leaf functions are generally protected from return address overwrites even
+without ShadowCallStack. It also means that ShadowCallStack on aarch64 is not
+vulnerable to the same types of time-of-check-to-time-of-use races as x86_64.
+
+Usage
+=====
+
+To enable ShadowCallStack, pass the ``-fsanitize=shadow-call-stack`` flag on
+both the compile and link command lines. On aarch64, you also need to pass
+``-ffixed-x18`` unless your target already reserves ``x18``.
+
+Low-level API
+-------------
+
+``__has_feature(shadow_call_stack)``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In some cases one may need to execute different code depending on whether
+ShadowCallStack is enabled. The macro ``__has_feature(shadow_call_stack)`` can
+be used for this purpose.
+
+.. code-block:: c
+
+    #if defined(__has_feature)
+    #  if __has_feature(shadow_call_stack)
+    // code that builds only under ShadowCallStack
+    #  endif
+    #endif
+
+``__attribute__((no_sanitize("shadow-call-stack")))``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Use ``__attribute__((no_sanitize("shadow-call-stack")))`` on a function
+declaration to specify that the shadow call stack instrumentation should not be
+applied to that function, even if enabled globally.
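+
+A minimal sketch of how the attribute might be applied. The function names
+and the ``NO_SCS`` macro are hypothetical; the macro is only an illustrative
+convenience so that the same source still builds when ShadowCallStack is
+disabled or unsupported.
+
+.. code-block:: c
+
+    #if defined(__has_feature)
+    #  if __has_feature(shadow_call_stack)
+    #    define NO_SCS __attribute__((no_sanitize("shadow-call-stack")))
+    #  endif
+    #endif
+    #ifndef NO_SCS
+    #  define NO_SCS /* expands to nothing when ShadowCallStack is off */
+    #endif
+
+    /* Hypothetical function that should not be instrumented, for example
+       because it runs before the shadow call stack has been set up. */
+    NO_SCS int early_init(void) { return 0; }
+
+    /* Instrumented as usual when -fsanitize=shadow-call-stack is passed. */
+    int regular_work(int x) { return x + early_init(); }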
+
+Example
+=======
+
+The following example code:
+
+.. code-block:: c++
+
+    int bar();  // defined elsewhere
+
+    int foo() {
+      return bar() + 1;
+    }
+
+generates the following x86_64 assembly when compiled with ``-O2``:
+
+.. code-block:: gas
+
+    push   %rax
+    callq  bar
+    add    $0x1,%eax
+    pop    %rcx
+    retq
+
+or the following aarch64 assembly:
+
+.. code-block:: none
+
+    stp     x29, x30, [sp, #-16]!
+    mov     x29, sp
+    bl      bar
+    add     w0, w0, #1
+    ldp     x29, x30, [sp], #16
+    ret
+
+
+Adding ``-fsanitize=shadow-call-stack`` produces the following x86_64
+assembly:
+
+.. code-block:: gas
+
+    mov    (%rsp),%r10
+    xor    %r11,%r11
+    addq   $0x8,%gs:(%r11)
+    mov    %gs:(%r11),%r11
+    mov    %r10,%gs:(%r11)
+    push   %rax
+    callq  bar
+    add    $0x1,%eax
+    pop    %rcx
+    xor    %r11,%r11
+    mov    %gs:(%r11),%r10
+    mov    %gs:(%r10),%r10
+    subq   $0x8,%gs:(%r11)
+    cmp    %r10,(%rsp)
+    jne    trap
+    retq
+
+    trap:
+    ud2
+
+or the following aarch64 assembly:
+
+.. code-block:: none
+
+    str     x30, [x18], #8
+    stp     x29, x30, [sp, #-16]!
+    mov     x29, sp
+    bl      bar
+    add     w0, w0, #1
+    ldp     x29, x30, [sp], #16
+    ldr     x30, [x18, #-8]!
+    ret