20 files changed, 247 insertions, 91 deletions
diff --git a/docs/AMDGPUUsage.rst b/docs/AMDGPUUsage.rst
index 57822ae9ab0a..41c7ecba527f 100644
--- a/docs/AMDGPUUsage.rst
+++ b/docs/AMDGPUUsage.rst
@@ -190,9 +190,7 @@ names from both the *Processor* and *Alternative Processor* can be used.
      gfx810     - stoney    amdgcn       APU
      **GCN GFX9**
      --------------------------------------------------------------------
-     gfx900                 amdgcn       dGPU          - FirePro W9500
-                                                       - FirePro S9500
-                                                       - FirePro S9500x2
+     gfx900                 amdgcn       dGPU          - Radeon Vega Frontier Edition
      gfx901                 amdgcn       dGPU  ROCm    Same as gfx900
                                                        except XNACK is
                                                        enabled
diff --git a/docs/CMake.rst b/docs/CMake.rst
index aeebc8f6acf9..bf97e9173158 100644
--- a/docs/CMake.rst
+++ b/docs/CMake.rst
@@ -536,6 +536,11 @@ LLVM-specific variables
   during the build. Enabling this option can significantly speed up build times
   especially when building LLVM in Debug configurations.
 
+**LLVM_REVERSE_ITERATION**:BOOL
+  If enabled, all supported unordered llvm containers would be iterated in
+  reverse order. This is useful for uncovering non-determinism caused by
+  iteration of unordered containers.
+
 CMake Caches
 ============
 
diff --git a/docs/CMakePrimer.rst b/docs/CMakePrimer.rst
index 1e3a09e4d98a..c29d627ee62c 100644
--- a/docs/CMakePrimer.rst
+++ b/docs/CMakePrimer.rst
@@ -112,33 +112,6 @@ In this example the ``extra_sources`` variable is only defined if you're
 targeting an Apple platform. For all other targets the ``extra_sources`` will be
 evaluated as empty before add_executable is given its arguments.
 
-One big "Gotcha" with variable dereferencing is that ``if`` commands implicitly
-dereference values. This has some unexpected results. For example:
-
-.. code-block:: cmake
-
-   if("${SOME_VAR}" STREQUAL "MSVC")
-
-In this code sample MSVC will be implicitly dereferenced, which will result in
-the if command comparing the value of the dereferenced variables ``SOME_VAR``
-and ``MSVC``. A common workaround to this solution is to prepend strings being
-compared with an ``x``.
-
-.. code-block:: cmake
-
-   if("x${SOME_VAR}" STREQUAL "xMSVC")
-
-This works because while ``MSVC`` is a defined variable, ``xMSVC`` is not. This
-pattern is uncommon, but it does occur in LLVM's CMake scripts.
-
-.. note::
-   
-   Once the LLVM project upgrades its minimum CMake version to 3.1 or later we
-   can prevent this behavior by setting CMP0054 to new. For more information on
-   CMake policies please see the cmake-policies manpage or the `cmake-policies
-   online documentation
-   <https://cmake.org/cmake/help/v3.4/manual/cmake-policies.7.html>`_.
-
 Lists
 -----
 
diff --git a/docs/CommandGuide/lit.rst b/docs/CommandGuide/lit.rst
index b8299d44d48e..b4d15ef57b73 100644
--- a/docs/CommandGuide/lit.rst
+++ b/docs/CommandGuide/lit.rst
@@ -169,6 +169,13 @@ SELECTION OPTIONS
  must be in the range ``1..M``. The environment variable
  ``LIT_RUN_SHARD`` can also be used in place of this option.
 
+.. option:: --filter=REGEXP
+
+  Run only those tests whose name matches the regular expression specified in
+  ``REGEXP``. The environment variable ``LIT_FILTER`` can be also used in place
+  of this option, which is especially useful in environments where the call
+  to ``lit`` is issued indirectly.
+
 ADDITIONAL OPTIONS
 ------------------
 
diff --git a/docs/CommandGuide/llvm-cov.rst b/docs/CommandGuide/llvm-cov.rst
index ea2e625bc4d2..47db8d04e0b2 100644
--- a/docs/CommandGuide/llvm-cov.rst
+++ b/docs/CommandGuide/llvm-cov.rst
@@ -262,6 +262,12 @@ OPTIONS
  The demangler is expected to read a newline-separated list of symbols from
  stdin and write a newline-separated list of the same length to stdout.
 
+.. option:: -num-threads=N, -j=N
+
+ Use N threads to write file reports (only applicable when -output-dir is
+ specified). When N=0, llvm-cov auto-detects an appropriate number of threads to
+ use. This is the default.
+
 .. option:: -line-coverage-gt=<N>
 
  Show code coverage only for functions with line coverage greater than the
diff --git a/docs/CommandGuide/llvm-profdata.rst b/docs/CommandGuide/llvm-profdata.rst
index f7aa8309485b..5b6330b5dc40 100644
--- a/docs/CommandGuide/llvm-profdata.rst
+++ b/docs/CommandGuide/llvm-profdata.rst
@@ -192,6 +192,12 @@ OPTIONS
  information is dumped in a more human readable form (also in text) with
  annotations.
 
+.. option:: -topn=n
+	     
+ Instruct the profile dumper to show the top ``n`` functions with the
+ hottest basic blocks in the summary section. By default, the topn functions
+ are not dumped.
+
 .. option:: -sample
 
  Specify that the input profile is a sample-based profile.
diff --git a/docs/Coroutines.rst b/docs/Coroutines.rst
index f7a38577fe8e..1bea04ebdd2a 100644
--- a/docs/Coroutines.rst
+++ b/docs/Coroutines.rst
@@ -846,7 +846,7 @@ Overview:
 """""""""
 
 The '``llvm.coro.alloc``' intrinsic returns `true` if dynamic allocation is
-required to obtain a memory for the corutine frame and `false` otherwise.
+required to obtain a memory for the coroutine frame and `false` otherwise.
 
 Arguments:
 """"""""""
diff --git a/docs/Docker.rst b/docs/Docker.rst
index d873e1ebeeb4..e606e1b71a2c 100644
--- a/docs/Docker.rst
+++ b/docs/Docker.rst
@@ -88,15 +88,11 @@ compiled by the system compiler in the debian8 image:
     ./llvm/utils/docker/build_docker_image.sh \
 	--source debian8 \
 	--docker-repository clang-debian8 --docker-tag "staging" \
-	-- \
 	-p clang -i install-clang -i install-clang-headers \
 	-- \
 	-DCMAKE_BUILD_TYPE=Release
 
-Note there are two levels of ``--`` indirection. First one separates
-``build_docker_image.sh`` arguments from ``llvm/utils/build_install_llvm.sh``
-arguments. Second one separates CMake arguments from ``build_install_llvm.sh``
-arguments. Note that build like that doesn't use a 2-stage build process that
+Note that a build like that doesn't use a 2-stage build process that
 you probably want for clang. Running a 2-stage build is a little more intricate,
 this command will do that:
 
@@ -108,7 +104,6 @@ this command will do that:
     ./build_docker_image.sh \
 	--source debian8 \
 	--docker-repository clang-debian8 --docker-tag "staging" \
-	-- \
 	-p clang -i stage2-install-clang -i stage2-install-clang-headers \
 	-- \
 	-DLLVM_TARGETS_TO_BUILD=Native -DCMAKE_BUILD_TYPE=Release \
@@ -178,7 +173,6 @@ debian8-based image using the latest ``google/stable`` sources for you:
 
     ./llvm/utils/docker/build_docker_image.sh \
 	-s debian8 --d clang-debian8 -t "staging" \
-	-- \
 	--branch branches/google/stable \
 	-p clang -i install-clang -i install-clang-headers \
 	-- \
diff --git a/docs/Dummy.html b/docs/Dummy.html
deleted file mode 100644
index e69de29bb2d1..000000000000
--- a/docs/Dummy.html
+++ /dev/null
diff --git a/docs/HowToAddABuilder.rst b/docs/HowToAddABuilder.rst
index 08cbecdc2a57..201c71b21391 100644
--- a/docs/HowToAddABuilder.rst
+++ b/docs/HowToAddABuilder.rst
@@ -62,6 +62,9 @@ Here are the steps you can follow to do so:
                     lab.llvm.org:9990 \
                     <buildslave-access-name> <buildslave-access-password>
 
+   To point a slave to silent master please use lab.llvm.org:9994 instead
+   of lab.llvm.org:9990.
+
 #. Fill the buildslave description and admin name/e-mail.  Here is an
    example of the buildslave description::
 
diff --git a/docs/LangRef.rst b/docs/LangRef.rst
index 2a0812ab930f..44efc1498060 100644
--- a/docs/LangRef.rst
+++ b/docs/LangRef.rst
@@ -2209,12 +2209,21 @@ For a simpler introduction to the ordering constraints, see the
     same address in this global order. This corresponds to the C++0x/C1x
     ``memory_order_seq_cst`` and Java volatile.
 
-.. _singlethread:
+.. _syncscope:
 
-If an atomic operation is marked ``singlethread``, it only *synchronizes
-with* or participates in modification and seq\_cst total orderings with
-other operations running in the same thread (for example, in signal
-handlers).
+If an atomic operation is marked ``syncscope("singlethread")``, it only
+*synchronizes with* and only participates in the seq\_cst total orderings of
+other operations running in the same thread (for example, in signal handlers).
+
+If an atomic operation is marked ``syncscope("<target-scope>")``, where
+``<target-scope>`` is a target specific synchronization scope, then it is target
+dependent if it *synchronizes with* and participates in the seq\_cst total
+orderings of other operations.
+
+Otherwise, an atomic operation that is not marked ``syncscope("singlethread")``
+or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the
+seq\_cst total orderings of other operations that are not marked
+``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.
 
 .. _fastmath:
 
@@ -5034,7 +5043,7 @@ which is the string ``llvm.loop.licm_versioning.disable``. For example:
 
 Loop distribution allows splitting a loop into multiple loops.  Currently,
 this is only performed if the entire loop cannot be vectorized due to unsafe
-memory dependencies.  The transformation will atempt to isolate the unsafe
+memory dependencies.  The transformation will attempt to isolate the unsafe
 dependencies into their own loop.
 
 This metadata can be used to selectively enable or disable distribution of the
@@ -7380,7 +7389,7 @@ Syntax:
 ::
 
       <result> = load [volatile] <ty>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.load !<index>][, !invariant.group !<index>][, !nonnull !<index>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>]
-      <result> = load atomic [volatile] <ty>, <ty>* <pointer> [singlethread] <ordering>, align <alignment> [, !invariant.group !<index>]
+      <result> = load atomic [volatile] <ty>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<index>]
       !<index> = !{ i32 1 }
       !<deref_bytes_node> = !{i64 <dereferenceable_bytes>}
       !<align_node> = !{ i64 <value_alignment> }
@@ -7401,14 +7410,14 @@ modify the number or order of execution of this ``load`` with other
 :ref:`volatile operations <volatile>`.
 
 If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
-<ordering>` and optional ``singlethread`` argument. The ``release`` and
-``acq_rel`` orderings are not valid on ``load`` instructions. Atomic loads
-produce :ref:`defined <memmodel>` results when they may see multiple atomic
-stores. The type of the pointee must be an integer, pointer, or floating-point
-type whose bit width is a power of two greater than or equal to eight and less
-than or equal to a target-specific size limit.  ``align`` must be explicitly
-specified on atomic loads, and the load has undefined behavior if the alignment
-is not set to a value which is at least the size in bytes of the
+<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
+``release`` and ``acq_rel`` orderings are not valid on ``load`` instructions.
+Atomic loads produce :ref:`defined <memmodel>` results when they may see
+multiple atomic stores. The type of the pointee must be an integer, pointer, or
+floating-point type whose bit width is a power of two greater than or equal to
+eight and less than or equal to a target-specific size limit.  ``align`` must be
+explicitly specified on atomic loads, and the load has undefined behavior if the
+alignment is not set to a value which is at least the size in bytes of the
 pointee. ``!nontemporal`` does not have any defined semantics for atomic loads.
 
 The optional constant ``align`` argument specifies the alignment of the
@@ -7509,7 +7518,7 @@ Syntax:
 ::
 
       store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.group !<index>]        ; yields void
-      store atomic [volatile] <ty> <value>, <ty>* <pointer> [singlethread] <ordering>, align <alignment> [, !invariant.group !<index>] ; yields void
+      store atomic [volatile] <ty> <value>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<index>] ; yields void
 
 Overview:
 """""""""
@@ -7529,14 +7538,14 @@ allowed to modify the number or order of execution of this ``store`` with other
 structural type <t_opaque>`) can be stored.
 
 If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
-<ordering>` and optional ``singlethread`` argument. The ``acquire`` and
-``acq_rel`` orderings aren't valid on ``store`` instructions. Atomic loads
-produce :ref:`defined <memmodel>` results when they may see multiple atomic
-stores. The type of the pointee must be an integer, pointer, or floating-point
-type whose bit width is a power of two greater than or equal to eight and less
-than or equal to a target-specific size limit.  ``align`` must be explicitly
-specified on atomic stores, and the store has undefined behavior if the
-alignment is not set to a value which is at least the size in bytes of the
+<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
+``acquire`` and ``acq_rel`` orderings aren't valid on ``store`` instructions.
+Atomic loads produce :ref:`defined <memmodel>` results when they may see
+multiple atomic stores. The type of the pointee must be an integer, pointer, or
+floating-point type whose bit width is a power of two greater than or equal to
+eight and less than or equal to a target-specific size limit.  ``align`` must be
+explicitly specified on atomic stores, and the store has undefined behavior if
+the alignment is not set to a value which is at least the size in bytes of the
 pointee. ``!nontemporal`` does not have any defined semantics for atomic stores.
 
 The optional constant ``align`` argument specifies the alignment of the
@@ -7597,7 +7606,7 @@ Syntax:
 
 ::
 
-      fence [singlethread] <ordering>                   ; yields void
+      fence [syncscope("<target-scope>")] <ordering>  ; yields void
 
 Overview:
 """""""""
@@ -7631,17 +7640,17 @@ A ``fence`` which has ``seq_cst`` ordering, in addition to having both
 ``acquire`` and ``release`` semantics specified above, participates in
 the global program order of other ``seq_cst`` operations and/or fences.
 
-The optional ":ref:`singlethread <singlethread>`" argument specifies
-that the fence only synchronizes with other fences in the same thread.
-(This is useful for interacting with signal handlers.)
+A ``fence`` instruction can also take an optional
+":ref:`syncscope <syncscope>`" argument.
 
 Example:
 """"""""
 
 .. code-block:: llvm
 
-      fence acquire                          ; yields void
-      fence singlethread seq_cst             ; yields void
+      fence acquire                                        ; yields void
+      fence syncscope("singlethread") seq_cst              ; yields void
+      fence syncscope("agent") seq_cst                     ; yields void
 
 .. _i_cmpxchg:
 
@@ -7653,7 +7662,7 @@ Syntax:
 
 ::
 
-      cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [singlethread] <success ordering> <failure ordering> ; yields  { ty, i1 }
+      cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [syncscope("<target-scope>")] <success ordering> <failure ordering> ; yields  { ty, i1 }
 
 Overview:
 """""""""
@@ -7682,10 +7691,8 @@ must be at least ``monotonic``, the ordering constraint on failure must be no
 stronger than that on success, and the failure ordering cannot be either
 ``release`` or ``acq_rel``.
 
-The optional "``singlethread``" argument declares that the ``cmpxchg``
-is only atomic with respect to code (usually signal handlers) running in
-the same thread as the ``cmpxchg``. Otherwise the cmpxchg is atomic with
-respect to all other code in the system.
+A ``cmpxchg`` instruction can also take an optional
+":ref:`syncscope <syncscope>`" argument.
 
 The pointer passed into cmpxchg must have alignment greater than or
 equal to the size in memory of the operand.
@@ -7739,7 +7746,7 @@ Syntax:
 
 ::
 
-      atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [singlethread] <ordering>                   ; yields ty
+      atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [syncscope("<target-scope>")] <ordering>                   ; yields ty
 
 Overview:
 """""""""
@@ -7773,6 +7780,9 @@ be a pointer to that type. If the ``atomicrmw`` is marked as
 order of execution of this ``atomicrmw`` with other :ref:`volatile
 operations <volatile>`.
 
+A ``atomicrmw`` instruction can also take an optional
+":ref:`syncscope <syncscope>`" argument.
+
 Semantics:
 """"""""""
 
@@ -10272,6 +10282,8 @@ overlap. It copies "len" bytes of memory over. If the argument is known
 to be aligned to some boundary, this can be specified as the fourth
 argument, otherwise it should be set to 0 or 1 (both meaning no alignment).
 
+.. _int_memmove:
+
 '``llvm.memmove``' Intrinsic
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -10327,6 +10339,8 @@ copies "len" bytes of memory over. If the argument is known to be
 aligned to some boundary, this can be specified as the fourth argument,
 otherwise it should be set to 0 or 1 (both meaning no alignment).
 
+.. _int_memset:
+
 '``llvm.memset.*``' Intrinsics
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -14168,4 +14182,154 @@ In the most general case call to the '``llvm.memcpy.element.unordered.atomic.*``
 lowered to a call to the symbol ``__llvm_memcpy_element_unordered_atomic_*``. Where '*'
 is replaced with an actual element size.
 
+Optimizer is allowed to inline memory copy when it's profitable to do so.
+
+'``llvm.memmove.element.unordered.atomic``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+This is an overloaded intrinsic. You can use
+``llvm.memmove.element.unordered.atomic`` on any integer bit width and for
+different address spaces. Not all targets support all bit widths however.
+
+::
+
+      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
+                                                                        i8* <src>,
+                                                                        i32 <len>,
+                                                                        i32 <element_size>)
+      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
+                                                                        i8* <src>,
+                                                                        i64 <len>,
+                                                                        i32 <element_size>)
+
+Overview:
+"""""""""
+
+The '``llvm.memmove.element.unordered.atomic.*``' intrinsic is a specialization
+of the '``llvm.memmove.*``' intrinsic. It differs in that the ``dest`` and
+``src`` are treated as arrays with elements that are exactly ``element_size``
+bytes, and the copy between buffers uses a sequence of
+:ref:`unordered atomic <ordering>` load/store operations that are a positive
+integer multiple of the ``element_size`` in size.
+
+Arguments:
+""""""""""
+
+The first three arguments are the same as they are in the
+:ref:`@llvm.memmove <int_memmove>` intrinsic, with the added constraint that
+``len`` is required to be a positive integer multiple of the ``element_size``.
+If ``len`` is not a positive integer multiple of ``element_size``, then the
+behaviour of the intrinsic is undefined.
+
+``element_size`` must be a compile-time constant positive power of two no
+greater than a target-specific atomic access size limit.
+
+For each of the input pointers the ``align`` parameter attribute must be
+specified. It must be a power of two no less than the ``element_size``. Caller
+guarantees that both the source and destination pointers are aligned to that
+boundary.
+
+Semantics:
+""""""""""
+
+The '``llvm.memmove.element.unordered.atomic.*``' intrinsic copies ``len`` bytes
+of memory from the source location to the destination location. These locations
+are allowed to overlap. The memory copy is performed as a sequence of load/store
+operations where each access is guaranteed to be a multiple of ``element_size``
+bytes wide and aligned at an ``element_size`` boundary. 
+
+The order of the copy is unspecified. The same value may be read from the source
+buffer many times, but only one write is issued to the destination buffer per
+element. It is well defined to have concurrent reads and writes to both source
+and destination provided those reads and writes are unordered atomic when
+specified.
+
+This intrinsic does not provide any additional ordering guarantees over those
+provided by a set of unordered loads from the source location and stores to the
+destination.
+
+Lowering:
+"""""""""
+
+In the most general case call to the
+'``llvm.memmove.element.unordered.atomic.*``' is lowered to a call to the symbol
+``__llvm_memmove_element_unordered_atomic_*``. Where '*' is replaced with an
+actual element size.
+
 The optimizer is allowed to inline the memory copy when it's profitable to do so.
+
+.. _int_memset_element_unordered_atomic:
+
+'``llvm.memset.element.unordered.atomic``' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+This is an overloaded intrinsic. You can use ``llvm.memset.element.unordered.atomic`` on
+any integer bit width and for different address spaces. Not all targets
+support all bit widths however.
+
+::
+
+      declare void @llvm.memset.element.unordered.atomic.p0i8.i32(i8* <dest>,
+                                                                  i8 <value>,
+                                                                  i32 <len>,
+                                                                  i32 <element_size>)
+      declare void @llvm.memset.element.unordered.atomic.p0i8.i64(i8* <dest>,
+                                                                  i8 <value>,
+                                                                  i64 <len>,
+                                                                  i32 <element_size>)
+
+Overview:
+"""""""""
+
+The '``llvm.memset.element.unordered.atomic.*``' intrinsic is a specialization of the
+'``llvm.memset.*``' intrinsic. It differs in that the ``dest`` is treated as an array
+with elements that are exactly ``element_size`` bytes, and the assignment to that array
+uses uses a sequence of :ref:`unordered atomic <ordering>` store operations
+that are a positive integer multiple of the ``element_size`` in size.
+
+Arguments:
+""""""""""
+
+The first three arguments are the same as they are in the :ref:`@llvm.memset <int_memset>`
+intrinsic, with the added constraint that ``len`` is required to be a positive integer
+multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
+``element_size``, then the behaviour of the intrinsic is undefined.
+
+``element_size`` must be a compile-time constant positive power of two no greater than
+target-specific atomic access size limit.
+
+The ``dest`` input pointer must have the ``align`` parameter attribute specified. It
+must be a power of two no less than the ``element_size``. Caller guarantees that
+the destination pointer is aligned to that boundary.
+
+Semantics:
+""""""""""
+
+The '``llvm.memset.element.unordered.atomic.*``' intrinsic sets the ``len`` bytes of
+memory starting at the destination location to the given ``value``. The memory is
+set with a sequence of store operations where each access is guaranteed to be a
+multiple of ``element_size`` bytes wide and aligned at an ``element_size`` boundary. 
+
+The order of the assignment is unspecified. Only one write is issued to the
+destination buffer per element. It is well defined to have concurrent reads and
+writes to the destination provided those reads and writes are unordered atomic
+when specified.
+
+This intrinsic does not provide any additional ordering guarantees over those
+provided by a set of unordered stores to the destination.
+
+Lowering:
+"""""""""
+
+In the most general case call to the '``llvm.memset.element.unordered.atomic.*``' is
+lowered to a call to the symbol ``__llvm_memset_element_unordered_atomic_*``. Where '*'
+is replaced with an actual element size.
+
+The optimizer is allowed to inline the memory assignment when it's profitable to do so.
+
diff --git a/docs/LibFuzzer.rst b/docs/LibFuzzer.rst
index 5acfa04ce1f4..0f0b0e2e6fbd 100644
--- a/docs/LibFuzzer.rst
+++ b/docs/LibFuzzer.rst
@@ -587,7 +587,7 @@ The simplest way is to have a statically initialized global object inside
 
 Alternatively, you may define an optional init function and it will receive
 the program arguments that you can read and modify. Do this **only** if you
-realy need to access ``argv``/``argc``.
+really need to access ``argv``/``argc``.
 
 .. code-block:: c++
 
diff --git a/docs/tutorial/BuildingAJIT1.rst b/docs/tutorial/BuildingAJIT1.rst
index 625cbbba1a5c..88f7aa5abbc7 100644
--- a/docs/tutorial/BuildingAJIT1.rst
+++ b/docs/tutorial/BuildingAJIT1.rst
@@ -12,7 +12,7 @@ Welcome to Chapter 1 of the "Building an ORC-based JIT in LLVM" tutorial. This
 tutorial runs through the implementation of a JIT compiler using LLVM's
 On-Request-Compilation (ORC) APIs. It begins with a simplified version of the
 KaleidoscopeJIT class used in the
-`Implementing a language with LLVM <LangImpl1.html>`_ tutorials and then
+`Implementing a language with LLVM <LangImpl01.html>`_ tutorials and then
 introduces new features like optimization, lazy compilation and remote
 execution.
 
@@ -41,7 +41,7 @@ The structure of the tutorial is:
   a remote process with reduced privileges using the JIT Remote APIs.
 
 To provide input for our JIT we will use the Kaleidoscope REPL from
-`Chapter 7 <LangImpl7.html>`_ of the "Implementing a language in LLVM tutorial",
+`Chapter 7 <LangImpl07.html>`_ of the "Implementing a language in LLVM tutorial",
 with one minor modification: We will remove the FunctionPassManager from the
 code for that chapter and replace it with optimization support in our JIT class
 in Chapter #2.
@@ -91,8 +91,8 @@ KaleidoscopeJIT
 
 In the previous section we described our API, now we examine a simple
 implementation of it: The KaleidoscopeJIT class [1]_ that was used in the
-`Implementing a language with LLVM <LangImpl1.html>`_ tutorials. We will use
-the REPL code from `Chapter 7 <LangImpl7.html>`_ of that tutorial to supply the
+`Implementing a language with LLVM <LangImpl01.html>`_ tutorials. We will use
+the REPL code from `Chapter 7 <LangImpl07.html>`_ of that tutorial to supply the
 input for our JIT: Each time the user enters an expression the REPL will add a
 new IR module containing the code for that expression to the JIT. If the
 expression is a top-level expression like '1+1' or 'sin(x)', the REPL will also
diff --git a/docs/tutorial/BuildingAJIT2.rst b/docs/tutorial/BuildingAJIT2.rst
index 839875266a24..2f22bdad6c14 100644
--- a/docs/tutorial/BuildingAJIT2.rst
+++ b/docs/tutorial/BuildingAJIT2.rst
@@ -25,7 +25,7 @@ IRTransformLayer, to add IR optimization support to KaleidoscopeJIT.
 Optimizing Modules using the IRTransformLayer
 =============================================
 
-In `Chapter 4 <LangImpl4.html>`_ of the "Implementing a language with LLVM"
+In `Chapter 4 <LangImpl04.html>`_ of the "Implementing a language with LLVM"
 tutorial series the llvm *FunctionPassManager* is introduced as a means for
 optimizing LLVM IR. Interested readers may read that chapter for details, but
 in short: to optimize a Module we create an llvm::FunctionPassManager
@@ -148,7 +148,7 @@ At the bottom of our JIT we add a private method to do the actual optimization:
 *optimizeModule*. This function sets up a FunctionPassManager, adds some passes
 to it, runs it over every function in the module, and then returns the mutated
 module. The specific optimizations are the same ones used in
-`Chapter 4 <LangImpl4.html>`_ of the "Implementing a language with LLVM"
+`Chapter 4 <LangImpl04.html>`_ of the "Implementing a language with LLVM"
 tutorial series. Readers may visit that chapter for a more in-depth
 discussion of these, and of IR optimization in general.
 
diff --git a/docs/tutorial/LangImpl02.rst b/docs/tutorial/LangImpl02.rst
index 4be447eb5ba3..d72c8dc9add4 100644
--- a/docs/tutorial/LangImpl02.rst
+++ b/docs/tutorial/LangImpl02.rst
@@ -10,7 +10,7 @@ Chapter 2 Introduction
 
 Welcome to Chapter 2 of the "`Implementing a language with
 LLVM <index.html>`_" tutorial. This chapter shows you how to use the
-lexer, built in `Chapter 1 <LangImpl1.html>`_, to build a full
+lexer, built in `Chapter 1 <LangImpl01.html>`_, to build a full
 `parser <http://en.wikipedia.org/wiki/Parsing>`_ for our Kaleidoscope
 language. Once we have a parser, we'll define and build an `Abstract
 Syntax Tree <http://en.wikipedia.org/wiki/Abstract_syntax_tree>`_ (AST).
diff --git a/docs/tutorial/LangImpl03.rst b/docs/tutorial/LangImpl03.rst
index 1dfe10175c74..fab2ddaf8829 100644
--- a/docs/tutorial/LangImpl03.rst
+++ b/docs/tutorial/LangImpl03.rst
@@ -10,7 +10,7 @@ Chapter 3 Introduction
 
 Welcome to Chapter 3 of the "`Implementing a language with
 LLVM <index.html>`_" tutorial. This chapter shows you how to transform
-the `Abstract Syntax Tree <LangImpl2.html>`_, built in Chapter 2, into
+the `Abstract Syntax Tree <LangImpl02.html>`_, built in Chapter 2, into
 LLVM IR. This will teach you a little bit about how LLVM does things, as
 well as demonstrate how easy it is to use. It's much more work to build
 a lexer and parser than it is to generate LLVM IR code. :)
@@ -362,7 +362,7 @@ end of the new basic block. Basic blocks in LLVM are an important part
 of functions that define the `Control Flow
 Graph <http://en.wikipedia.org/wiki/Control_flow_graph>`_. Since we
 don't have any control flow, our functions will only contain one block
-at this point. We'll fix this in `Chapter 5 <LangImpl5.html>`_ :).
+at this point. We'll fix this in `Chapter 5 <LangImpl05.html>`_ :).
 
 Next we add the function arguments to the NamedValues map (after first clearing
 it out) so that they're accessible to ``VariableExprAST`` nodes.
@@ -540,7 +540,7 @@ functions referencing each other.
 
 This wraps up the third chapter of the Kaleidoscope tutorial. Up next,
 we'll describe how to `add JIT codegen and optimizer
-support <LangImpl4.html>`_ to this so we can actually start running
+support <LangImpl04.html>`_ to this so we can actually start running
 code!
 
 Full Code Listing
diff --git a/docs/tutorial/LangImpl04.rst b/docs/tutorial/LangImpl04.rst
index 16d7164ae15e..921c4dcc21ad 100644
--- a/docs/tutorial/LangImpl04.rst
+++ b/docs/tutorial/LangImpl04.rst
@@ -622,7 +622,7 @@ This completes the JIT and optimizer chapter of the Kaleidoscope
 tutorial. At this point, we can compile a non-Turing-complete
 programming language, optimize and JIT compile it in a user-driven way.
 Next up we'll look into `extending the language with control flow
-constructs <LangImpl5.html>`_, tackling some interesting LLVM IR issues
+constructs <LangImpl05.html>`_, tackling some interesting LLVM IR issues
 along the way.
 
 Full Code Listing
diff --git a/docs/tutorial/LangImpl05.rst b/docs/tutorial/LangImpl05.rst
index dcf45bcbf8d2..8650892e8f8b 100644
--- a/docs/tutorial/LangImpl05.rst
+++ b/docs/tutorial/LangImpl05.rst
@@ -269,7 +269,7 @@ Phi nodes:
 #. Values that are implicit in the structure of your AST, such as the
    Phi node in this case.
 
-In `Chapter 7 <LangImpl7.html>`_ of this tutorial ("mutable variables"),
+In `Chapter 7 <LangImpl07.html>`_ of this tutorial ("mutable variables"),
 we'll talk about #1 in depth. For now, just believe me that you don't
 need SSA construction to handle this case. For #2, you have the choice
 of using the techniques that we will describe for #1, or you can insert
@@ -790,7 +790,7 @@ of the tutorial. In this chapter we added two control flow constructs,
 and used them to motivate a couple of aspects of the LLVM IR that are
 important for front-end implementors to know. In the next chapter of our
 saga, we will get a bit crazier and add `user-defined
-operators <LangImpl6.html>`_ to our poor innocent language.
+operators <LangImpl06.html>`_ to our poor innocent language.
 
 Full Code Listing
 =================
diff --git a/docs/tutorial/LangImpl06.rst b/docs/tutorial/LangImpl06.rst
index c1035bce8559..cb8ec766bb26 100644
--- a/docs/tutorial/LangImpl06.rst
+++ b/docs/tutorial/LangImpl06.rst
@@ -41,7 +41,7 @@ The point of going into user-defined operators in a tutorial like this
 is to show the power and flexibility of using a hand-written parser.
 Thus far, the parser we have been implementing uses recursive descent
 for most parts of the grammar and operator precedence parsing for the
-expressions. See `Chapter 2 <LangImpl2.html>`_ for details. By
+expressions. See `Chapter 2 <LangImpl02.html>`_ for details. By
 using operator precedence parsing, it is very easy to allow
 the programmer to introduce new operators into the grammar: the grammar
 is dynamically extensible as the JIT runs.
@@ -734,7 +734,7 @@ side-effects, but it can't actually define and mutate a variable itself.
 
 Strikingly, variable mutation is an important feature of some languages,
 and it is not at all obvious how to `add support for mutable
-variables <LangImpl7.html>`_ without having to add an "SSA construction"
+variables <LangImpl07.html>`_ without having to add an "SSA construction"
 phase to your front-end. In the next chapter, we will describe how you
 can add variable mutation without building SSA in your front-end.
 
diff --git a/docs/tutorial/OCamlLangImpl5.rst b/docs/tutorial/OCamlLangImpl5.rst
index 6e17de4b2bde..d06bf6ec252a 100644
--- a/docs/tutorial/OCamlLangImpl5.rst
+++ b/docs/tutorial/OCamlLangImpl5.rst
@@ -258,7 +258,7 @@ a truth value as a 1-bit (bool) value.
           let then_bb = append_block context "then" the_function in
           position_at_end then_bb builder;
 
-As opposed to the `C++ tutorial <LangImpl5.html>`_, we have to build our
+As opposed to the `C++ tutorial <LangImpl05.html>`_, we have to build our
 basic blocks bottom up since we can't have dangling BasicBlocks. We
 start off by saving a pointer to the first block (which might not be the
 entry block), which we'll need to build a conditional branch later. We