========================
LLVM 4.0.0 Release Notes
========================

.. contents::
    :local:

Introduction
============

This document contains the release notes for the LLVM Compiler Infrastructure,
release 4.0.0.  Here we describe the status of LLVM, including major improvements
from the previous release, improvements in various subprojects of LLVM, and
some of the current users of the code.  All LLVM releases may be downloaded
from the `LLVM releases web site <http://llvm.org/releases/>`_.

For more information about LLVM, including information about the latest
release, please check out the `main LLVM web site <http://llvm.org/>`_.  If you
have questions or comments, the `LLVM Developer's Mailing List
<http://lists.llvm.org/mailman/listinfo/llvm-dev>`_ is a good place to send
them.

New Versioning Scheme
=====================
Starting with this release, LLVM is using a
`new versioning scheme <http://blog.llvm.org/2016/12/llvms-new-versioning-scheme.html>`_,
increasing the major version number with each major release. Stable updates to
this release will be versioned 4.0.x, and the next major release, six months
from now, will be version 5.0.0.

Non-comprehensive list of changes in this release
=================================================
* The minimum compiler versions required to build LLVM have been raised to
  GCC 4.8 and VS 2015.

* The C API functions ``LLVMAddFunctionAttr``, ``LLVMGetFunctionAttr``,
  ``LLVMRemoveFunctionAttr``, ``LLVMAddAttribute``, ``LLVMRemoveAttribute``,
  ``LLVMGetAttribute``, ``LLVMAddInstrAttribute`` and
  ``LLVMRemoveInstrAttribute`` have been removed.

* The C API enum ``LLVMAttribute`` has been deleted.

* The definition and uses of ``LLVM_ATTRIBUTE_UNUSED_RESULT`` in the LLVM source
  were replaced with ``LLVM_NODISCARD``, which matches the C++17 ``[[nodiscard]]``
  semantics rather than GCC's ``__attribute__((warn_unused_result))``.

* The Timer-related APIs now expect a name and a description. When upgrading
  code, the previously used names should become descriptions, and a short name
  in the style of a programming language identifier should be added.

* LLVM now handles ``invariant.group`` across different basic blocks, which makes
  it possible to devirtualize virtual calls inside loops.

* The aggressive dead code elimination phase ("adce") now removes
  branches which do not affect program behavior. Loops are retained by
  default since they may be infinite, but these can also be removed
  with the LLVM option ``-adce-remove-loops`` when the loop body otherwise has
  no live operations.

* The GVNHoist pass is now enabled by default. The new pass, based on Global
  Value Numbering, detects similar computations in branch code and replaces
  multiple instances of the same computation with a unique expression.  The
  transform benefits code size and generates better schedules.  GVNHoist is
  more aggressive at ``-Os`` and ``-Oz``, hoisting more expressions at the
  expense of execution time.

* The llvm-cov tool can now export coverage data as JSON. Its HTML output mode
  has also improved.
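
To illustrate the ``[[nodiscard]]`` semantics mentioned in the list above, the
following is a minimal sketch rather than LLVM's actual definition. The macro
``DEMO_NODISCARD`` and the functions ``parsePositive`` and ``demoNodiscard``
are hypothetical names invented for this example. One practical difference
from GCC's ``warn_unused_result`` is that a cast to ``void`` is enough to
silence a ``[[nodiscard]]`` diagnostic.

```cpp
#include <cassert>

// DEMO_NODISCARD is a hypothetical stand-in for LLVM_NODISCARD. The real
// macro is defined by LLVM and may be implemented differently.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(nodiscard)
#    define DEMO_NODISCARD [[nodiscard]]
#  endif
#endif
#ifndef DEMO_NODISCARD
#  define DEMO_NODISCARD
#endif

// Hypothetical function: returns true on failure, so callers are expected
// to look at the result.
DEMO_NODISCARD inline bool parsePositive(int Input, int &Out) {
  if (Input <= 0)
    return true;
  Out = Input;
  return false;
}

inline int demoNodiscard() {
  int Value = 0;
  // Normal use: the result is consumed, so no diagnostic is expected.
  bool Failed = parsePositive(42, Value);
  // Deliberate discard: with [[nodiscard]] semantics, the cast to void
  // tells the compiler that ignoring the result is intentional.
  (void)parsePositive(-1, Value);
  return Failed ? -1 : Value;
}
```

Calling ``parsePositive(42, Value);`` as a bare statement would be expected to
produce a warning on compilers that implement the attribute.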

Improvements to ThinLTO (-flto=thin)
------------------------------------
* Integration with profile data (PGO). When available, profile data
  enables more accurate function importing decisions, as well as
  cross-module indirect call promotion.

* Significant build-time and binary-size improvements when compiling with
  debug info (``-g``).

LLVM Coroutines
---------------

Experimental support for :doc:`Coroutines` was added, which can be enabled
with ``-enable-coroutines`` in the ``opt`` command-line tool or by using the
``addCoroutinePassesToExtensionPoints`` API when building the optimization
pipeline.

For more information on LLVM Coroutines and the LLVM implementation, see the
`2016 LLVM Developers’ Meeting talk on LLVM Coroutines
<http://llvm.org/devmtg/2016-11/#talk4>`_.

Regcall and Vectorcall Calling Conventions
------------------------------------------

Support was added for the ``__regcall`` calling convention.
Existing ``__vectorcall`` calling convention support was extended to include
correct handling of homogeneous vector aggregates (HVAs).

The ``__vectorcall`` calling convention was introduced by Microsoft to
enhance register usage when passing parameters.
For more information, please read the `__vectorcall documentation
<https://msdn.microsoft.com/en-us/library/dn375768.aspx>`_.

The ``__regcall`` calling convention was introduced by Intel to
optimize parameter transfer on function call.
This calling convention ensures that as many values as possible are
passed or returned in registers.
For more information, please read the `__regcall documentation
<https://software.intel.com/en-us/node/693069>`_.
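
As a rough sketch of how source code might opt into ``__regcall``, the example
below uses the GNU-style attribute spelling that Clang accepts for this
convention. The macro ``DEMO_REGCALL`` and the function ``scaleAdd`` are
hypothetical names chosen for illustration. The guard is an assumption made so
that the example still builds with compilers or targets lacking the
convention, in which case the default calling convention is used.

```cpp
#include <cassert>

// DEMO_REGCALL is a hypothetical helper macro. It expands to the attribute
// only on x86 targets where the compiler reports support for it, and to
// nothing otherwise.
#if defined(__has_attribute) && (defined(__x86_64__) || defined(__i386__))
#  if __has_attribute(regcall)
#    define DEMO_REGCALL __attribute__((regcall))
#  endif
#endif
#ifndef DEMO_REGCALL
#  define DEMO_REGCALL
#endif

// When the convention is in effect, the compiler tries to pass the three
// arguments and the return value in registers.
DEMO_REGCALL double scaleAdd(double A, double X, double Y) {
  return A * X + Y;
}
```

The calling convention changes only how arguments are transferred, not the
result: ``scaleAdd(2.0, 3.0, 4.0)`` evaluates to ``10.0`` either way.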

Code Generation Testing
-----------------------

Passes that work on the machine instruction representation can be tested with
the ``.mir`` serialization format. ``llc`` supports the ``-run-pass``,
``-stop-after``, ``-stop-before``, ``-start-after`` and ``-start-before``
options to run a single pass of the code generation pipeline, or to stop or
start the code generation pipeline at a given point.

Additional information can be found in the :doc:`MIRLangRef`. The format is
used by the tests ending in ``.mir`` in the ``test/CodeGen`` directory.

This feature has been available since 2015. It has been used more often
recently but had not yet been mentioned in the release notes.

Intrusive list API overhaul
---------------------------

The intrusive list infrastructure was substantially rewritten over the last
couple of releases, primarily to excise undefined behaviour.  The biggest
changes landed in this release.

* ``simple_ilist<T>`` is a lower-level intrusive list that never takes
  ownership of its nodes.  New intrusive-list clients should consider using it
  instead of ``ilist<T>``.

  * ``ilist_tag<class>`` allows a single data type to be inserted into two
    parallel intrusive lists.  A type can inherit twice from ``ilist_node``,
    first using ``ilist_node<T,ilist_tag<A>>`` (enabling insertion into
    ``simple_ilist<T,ilist_tag<A>>``) and second using
    ``ilist_node<T,ilist_tag<B>>`` (enabling insertion into
    ``simple_ilist<T,ilist_tag<B>>``), where ``A`` and ``B`` are arbitrary
    types.

  * ``ilist_sentinel_tracking<bool>`` controls whether an iterator knows
    whether it's pointing at the sentinel (``end()``).  By default, sentinel
    tracking is on when ABI-breaking checks are enabled, and off otherwise;
    this is used for an assertion when dereferencing ``end()`` (this assertion
    triggered often in practice, and many backend bugs were fixed).  Explicitly
    turning on sentinel tracking also enables ``iterator::isEnd()``.  This is
    used by ``MachineInstrBundleIterator`` to iterate over bundles.

* ``ilist<T>`` is built on top of ``simple_ilist<T>``, and supports the same
  configuration options.  As before (and unlike ``simple_ilist<T>``),
  ``ilist<T>`` takes ownership of its nodes.  However, it no longer supports
  *allocating* nodes, and is now equivalent to ``iplist<T>``.  ``iplist<T>``
  will likely be removed in the future.

  * ``ilist<T>`` now always uses ``ilist_traits<T>``.  Instead of passing a
    custom traits class in via a template parameter, clients that want to
    customize the traits should specialize ``ilist_traits<T>``.  Clients that
    want to avoid ownership can specialize ``ilist_alloc_traits<T>`` to inherit
    from ``ilist_noalloc_traits<T>`` (or to do something funky); clients that
    need callbacks can specialize ``ilist_callback_traits<T>`` directly.

* The underlying data structure is now a simple recursive linked list.  The
  sentinel node contains only a "next" (``begin()``) and "prev" (``rbegin()``)
  pointer and is stored in the same allocation as ``simple_ilist<T>``.
  Previously, it was malloc-allocated on-demand by default, although the
  now-defunct ``ilist_sentinel_traits<T>`` was sometimes specialized to avoid
  this.

* The ``reverse_iterator`` class no longer uses ``std::reverse_iterator``.
  Instead, it now has a handle to the same node that it dereferences to.
  Reverse iterators now have the same iterator invalidation semantics as
  forward iterators.

  * ``iterator`` and ``reverse_iterator`` have explicit conversion constructors
    that match ``std::reverse_iterator``'s off-by-one semantics, so that
    reversing the end points of an iterator range results in the same range
    (albeit in reverse).  I.e., ``reverse_iterator(begin())`` equals
    ``rend()``.

  * ``iterator::getReverse()`` and ``reverse_iterator::getReverse()`` return an
    iterator that dereferences to the *same* node.  I.e.,
    ``begin().getReverse()`` equals ``--rend()``.

  * ``ilist_node<T>::getIterator()`` and
    ``ilist_node<T>::getReverseIterator()`` return the forward and reverse
    iterators that dereference to the current node.  I.e.,
    ``begin()->getIterator()`` equals ``begin()`` and
    ``rbegin()->getReverseIterator()`` equals ``rbegin()``.

* ``iterator`` now stores an ``ilist_node_base*`` instead of a ``T*``.  The
  implicit conversions between ``ilist<T>::iterator`` and ``T*`` have been
  removed.  Clients may use ``N->getIterator()`` (if not ``nullptr``) or
  ``&*I`` (if not ``end()``); alternatively, clients may refactor to use
  references for known-good nodes.
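
The ideas of a non-owning list, tag-based membership in two parallel lists,
and a sentinel stored inside the list object can be sketched in a few lines.
The code below is a simplified illustration and not LLVM's implementation:
``NodeBase``, ``SimpleList``, ``TagA``, ``TagB``, ``Item`` and ``demoLists``
are hypothetical names, the sketch assumes a circular layout around the
sentinel, and iterators are omitted.

```cpp
#include <cassert>
#include <cstddef>

// Link fields only. The Tag parameter makes each base class a distinct
// type, so one object can carry several independent pairs of links.
template <class Tag> struct NodeBase {
  NodeBase *Prev = nullptr;
  NodeBase *Next = nullptr;
};

// A non-owning circular doubly linked list. The sentinel is a plain member
// holding only the two links, so it needs no separate allocation.
template <class T, class Tag> class SimpleList {
  NodeBase<Tag> Sentinel;

public:
  SimpleList() { Sentinel.Prev = Sentinel.Next = &Sentinel; }
  SimpleList(const SimpleList &) = delete;
  SimpleList &operator=(const SimpleList &) = delete;

  bool empty() const { return Sentinel.Next == &Sentinel; }

  // Links N before the sentinel; the list never allocates or deletes nodes.
  void push_back(T &N) {
    NodeBase<Tag> *B = &N; // the upcast selects the links for this Tag
    B->Prev = Sentinel.Prev;
    B->Next = &Sentinel;
    Sentinel.Prev->Next = B;
    Sentinel.Prev = B;
  }

  // Unlinks N without destroying it.
  void remove(T &N) {
    NodeBase<Tag> *B = &N;
    B->Prev->Next = B->Next;
    B->Next->Prev = B->Prev;
    B->Prev = B->Next = nullptr;
  }

  // Precondition for both accessors: the list is not empty.
  T &front() { return *static_cast<T *>(Sentinel.Next); }
  T &back() { return *static_cast<T *>(Sentinel.Prev); }

  std::size_t size() const {
    std::size_t Count = 0;
    for (const NodeBase<Tag> *I = Sentinel.Next; I != &Sentinel; I = I->Next)
      ++Count;
    return Count;
  }
};

struct TagA {};
struct TagB {};

// Inherits twice from the node base, once per tag.
struct Item : NodeBase<TagA>, NodeBase<TagB> {
  int Id;
  explicit Item(int TheId) : Id(TheId) {}
};

inline int demoLists() {
  Item X(1), Y(2);
  SimpleList<Item, TagA> ListA;
  SimpleList<Item, TagB> ListB;
  ListA.push_back(X);
  ListA.push_back(Y); // ListA: X, Y
  ListB.push_back(Y);
  ListB.push_back(X); // ListB: Y, X
  ListA.remove(X);    // ListB still contains X
  return static_cast<int>(ListA.size()) * 100 +
         static_cast<int>(ListB.size()) * 10 + ListB.back().Id;
}
```

Because the nodes are owned by the caller, removing ``X`` from one list leaves
both the object and its membership in the other list intact.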

Changes to the ARM Targets
--------------------------

**During this release the AArch64 target has:**

* Gained support for ILP32 relocations.
* Gained support for XRay.
* Made even more progress on GlobalISel. There is still some work left before
  it is production-ready though.
* Refined the support for Qualcomm's Falkor and Samsung's Exynos CPUs.
* Learned a few new tricks for lowering multiplications by constants, folding
  spilled/refilled copies etc.

**During this release the ARM target has:**

* Gained support for ROPI (read-only position independence) and RWPI
  (read-write position independence), which can be used to remove the need for
  a dynamic linker.
* Gained support for execute-only code, which is placed in pages without read
  permissions.
* Gained a machine scheduler for Cortex-R52.
* Gained support for XRay.
* Gained Thumb1 implementations for several compiler-rt builtins. It also
  has some support for building the builtins for HF targets.
* Started using the generic bitreverse intrinsic instead of rbit.
* Gained very basic support for GlobalISel.

A lot of work has also been done in LLD for ARM, which now supports more
relocations and TLS.

Changes to the AVR Target
-------------------------

This marks the first release where the AVR backend has been completely merged
from a fork into LLVM trunk. The backend is still marked experimental, but
is generally quite usable. All downstream development has halted on
`GitHub <https://github.com/avr-llvm/llvm>`_, and changes now go directly into
LLVM trunk.

* Instruction selector and pseudo instruction expansion pass landed
* ``read_register`` and ``write_register`` intrinsics are now supported
* Support stack stores greater than 63 bytes from the bottom of the stack
* A number of assertion errors have been fixed
* Support stores to ``undef`` locations
* Very basic support for the target has been added to clang
* Small optimizations to some 16-bit boolean expressions

Most of the work behind the scenes has been on correctness of generated
assembly, and also fixing some assertions we would hit on some well-formed
inputs.

Changes to the MIPS Target
--------------------------

**During this release the MIPS target has:**

* Added support for the two-operand form for many instructions.
* Added the following macros: unaligned load/store, seq, double word load/store for O32.
* Improved the parsing of complex memory offset expressions.
* Enabled the integrated assembler (IAS) by default for Debian mips64el.
* Added a generic scheduler based on the interAptiv CPU.
* Added support for thread local relocations.
* Added recip, rsqrt, evp, dvp, synci instructions in IAS.
* Optimized the generation of constants in some cases.

**The following issues have been fixed:**

* Thread local debug information is correctly recorded.
* MSA intrinsics are now range checked.
* Fixed an issue with MSA and the no-odd-spreg ABI.
* Fixed some corner cases in handling forbidden slots for MIPSR6.
* Fixed an issue with jumps not being converted to relative branches for assembly.
* Fixed the handling of local symbols and the jal instruction.
* N32/N64 no longer have their relocation tables sorted as per their ABIs.
* Fixed a crash when half-precision floating point conversion MSA intrinsics are used.
* Fixed several crashes involving FastISel.
* Corrected the definitions for aui/daui/dahi/dati for MIPSR6.

Changes to the OCaml bindings
-----------------------------

* The attribute API was completely overhauled, following the changes
  to the C API.


External Open Source Projects Using LLVM 4.0.0
==============================================

LDC - the LLVM-based D compiler
-------------------------------

`D <http://dlang.org>`_ is a language with C-like syntax and static typing. It
pragmatically combines efficiency, control, and modeling power, with safety and
programmer productivity. D supports powerful concepts like Compile-Time Function
Execution (CTFE) and Template Meta-Programming, provides an innovative approach
to concurrency and offers many classical paradigms.

`LDC <http://wiki.dlang.org/LDC>`_ uses the frontend from the reference compiler
combined with LLVM as backend to produce efficient native code. LDC targets
x86/x86_64 systems like Linux, OS X, FreeBSD and Windows and also Linux on ARM
and PowerPC (32/64 bit). Ports to other architectures like AArch64 and MIPS64
are underway.


Additional Information
======================

A wide variety of additional information is available on the `LLVM web page
<http://llvm.org/>`_, in particular in the `documentation
<http://llvm.org/docs/>`_ section.  The web page also contains versions of the
API documentation which are up to date with the Subversion version of the source
code.  You can access versions of these documents specific to this release by
going into the ``llvm/docs/`` directory in the LLVM tree.

If you have any questions or comments about LLVM, please feel free to contact
us via the `mailing lists <http://llvm.org/docs/#maillist>`_.