<feed xmlns='http://www.w3.org/2005/Atom'>
<title>src/lib/libc/amd64/string/memset.S, branch releng/13.1</title>
<subtitle>FreeBSD source tree</subtitle>
<link rel='alternate' type='text/html' href='http://cgit.freebsd.org/src/'/>
<entry>
<title>amd64: add a note about simd to libc memset, memmove and memcmp</title>
<updated>2021-02-04T17:59:05+00:00</updated>
<author>
<name>Mateusz Guzik</name>
<email>mjg@FreeBSD.org</email>
</author>
<published>2021-01-31T15:50:34+00:00</published>
<link rel='alternate' type='text/html' href='http://cgit.freebsd.org/src/commit/?id=068f2402d28bf2ddee884c83be1dff3a7631569b'/>
<id>068f2402d28bf2ddee884c83be1dff3a7631569b</id>
<content type='text'>
(cherry picked from commit 0db6aef407f30c138982b8cde43189aad098b337)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
(cherry picked from commit 0db6aef407f30c138982b8cde43189aad098b337)
</pre>
</div>
</content>
</entry>
<entry>
<title>amd64: add missing ALIGN_TEXT to loops in memset and memmove</title>
<updated>2021-02-01T12:39:18+00:00</updated>
<author>
<name>Mateusz Guzik</name>
<email>mjg@FreeBSD.org</email>
</author>
<published>2021-01-29T15:09:14+00:00</published>
<link rel='alternate' type='text/html' href='http://cgit.freebsd.org/src/commit/?id=3975d4c9e1be53c2a4977acaa314bcdc18c02416'/>
<id>3975d4c9e1be53c2a4977acaa314bcdc18c02416</id>
<content type='text'>
(cherry picked from commit 164c3b81848bc81dc200b12370999474225447a3)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
(cherry picked from commit 164c3b81848bc81dc200b12370999474225447a3)
</pre>
</div>
</content>
</entry>
<entry>
<title>amd64: handle small memset buffers with overlapping stores</title>
<updated>2018-11-16T00:44:22+00:00</updated>
<author>
<name>Mateusz Guzik</name>
<email>mjg@FreeBSD.org</email>
</author>
<published>2018-11-16T00:44:22+00:00</published>
<link rel='alternate' type='text/html' href='http://cgit.freebsd.org/src/commit/?id=088ac3ef4b7c315e5669a38197fd04f76a20b8f1'/>
<id>088ac3ef4b7c315e5669a38197fd04f76a20b8f1</id>
<content type='text'>
Instead of jumping to locations which store the exact number of bytes,
use displacement to move the destination.

In particular, the following fills a region of 8 to 16 bytes (inclusive)
without branching:

movq    %r10,(%rdi)
movq    %r10,-8(%rdi,%rcx)

For instance, with rcx = 10 the second store targets rdi + 10 - 8 = rdi + 2.
Writing 8 bytes starting at that offset overlaps 6 of the bytes already
written and writes 2 new ones, giving 10 in total.

This is a clear win for smaller stores. Results for other sizes are erratic
and depend on the microarchitecture.

General idea taken from NetBSD (restricted use of the trick) and bionic
string functions (use for various ranges like in this patch).

Reviewed by:	kib (previous version)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17660
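
The arithmetic above can be checked with a small model. The following is
only an illustrative sketch in Python, not the assembly from this commit:
the function name memset_8_to_16 and its parameters are hypothetical, a
bytearray stands in for memory, and each slice assignment stands in for
one 8-byte movq store.

```python
def memset_8_to_16(buf, start, value, n):
    """Fill buf[start:start+n] with value using two 8-byte stores.

    Hypothetical model of the two-store trick; n must be 8..16 inclusive.
    """
    if n not in range(8, 17):
        raise ValueError("n must be between 8 and 16 inclusive")
    if start not in range(len(buf)) or start + n not in range(len(buf) + 1):
        raise ValueError("region out of bounds")
    pattern = bytes([value]) * 8       # plays the role of %r10
    buf[start:start + 8] = pattern     # movq %r10,(%rdi)
    tail = start + n - 8               # address formed by -8(%rdi,%rcx)
    buf[tail:tail + 8] = pattern       # movq %r10,-8(%rdi,%rcx)
    return buf


if __name__ == "__main__":
    memory = bytearray(20)
    memset_8_to_16(memory, 3, 0xAB, 10)
    print(memory.hex())
```

With n = 10 the second store starts at start + 2, so it rewrites 6 bytes
of the first store and adds 2 new ones. For n = 8 the two stores coincide
and for n = 16 they do not overlap at all, which is why a single code path
covers the whole range.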
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instead of jumping to locations which store the exact number of bytes,
use displacement to move the destination.

In particular, the following fills a region of 8 to 16 bytes (inclusive)
without branching:

movq    %r10,(%rdi)
movq    %r10,-8(%rdi,%rcx)

For instance, with rcx = 10 the second store targets rdi + 10 - 8 = rdi + 2.
Writing 8 bytes starting at that offset overlaps 6 of the bytes already
written and writes 2 new ones, giving 10 in total.

This is a clear win for smaller stores. Results for other sizes are erratic
and depend on the microarchitecture.

General idea taken from NetBSD (restricted use of the trick) and bionic
string functions (use for various ranges like in this patch).

Reviewed by:	kib (previous version)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17660
</pre>
</div>
</content>
</entry>
<entry>
<title>amd64: sync up libc memset with the kernel version</title>
<updated>2018-11-15T20:28:35+00:00</updated>
<author>
<name>Mateusz Guzik</name>
<email>mjg@FreeBSD.org</email>
</author>
<published>2018-11-15T20:28:35+00:00</published>
<link rel='alternate' type='text/html' href='http://cgit.freebsd.org/src/commit/?id=ad2ff705a458c48d540a2fea4917cebad47deb82'/>
<id>ad2ff705a458c48d540a2fea4917cebad47deb82</id>
<content type='text'>
- tidy up memset to have rax set earlier for small sizes
- finish the tail in memset with an overlapping store
- align memset buffers to 16 bytes before using rep stos

Sponsored by:	The FreeBSD Foundation
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- tidy up memset to have rax set earlier for small sizes
- finish the tail in memset with an overlapping store
- align memset buffers to 16 bytes before using rep stos

Sponsored by:	The FreeBSD Foundation
</pre>
</div>
</content>
</entry>
<entry>
<title>amd64: convert libc bzero to a C func to avoid future bloat</title>
<updated>2018-11-15T20:20:39+00:00</updated>
<author>
<name>Mateusz Guzik</name>
<email>mjg@FreeBSD.org</email>
</author>
<published>2018-11-15T20:20:39+00:00</published>
<link rel='alternate' type='text/html' href='http://cgit.freebsd.org/src/commit/?id=6fff6344554e70f00588f2f26dcb223904096044'/>
<id>6fff6344554e70f00588f2f26dcb223904096044</id>
<content type='text'>
Reviewed by:	kib (previous version)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17549
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Reviewed by:	kib (previous version)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17549
</pre>
</div>
</content>
</entry>
<entry>
<title>amd64: import updated kernel memset to libc</title>
<updated>2018-10-05T19:27:42+00:00</updated>
<author>
<name>Mateusz Guzik</name>
<email>mjg@FreeBSD.org</email>
</author>
<published>2018-10-05T19:27:42+00:00</published>
<link rel='alternate' type='text/html' href='http://cgit.freebsd.org/src/commit/?id=167374a1620b4a0696702a25073fdfd88a0fbaff'/>
<id>167374a1620b4a0696702a25073fdfd88a0fbaff</id>
<content type='text'>
See r339205 for details.

Unused ERMS support is retained in the macro. It will be activated
after ifunc support lands.

Reviewed by:    kib
Approved by:    re (gjb)
Sponsored by:   The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17405
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
See r339205 for details.

Unused ERMS support is retained in the macro. It will be activated
after ifunc support lands.

Reviewed by:    kib
Approved by:    re (gjb)
Sponsored by:   The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17405
</pre>
</div>
</content>
</entry>
<entry>
<title>amd64: reimplement libc memset and bzero with kernel memset</title>
<updated>2018-10-01T20:39:17+00:00</updated>
<author>
<name>Mateusz Guzik</name>
<email>mjg@FreeBSD.org</email>
</author>
<published>2018-10-01T20:39:17+00:00</published>
<link rel='alternate' type='text/html' href='http://cgit.freebsd.org/src/commit/?id=7e02ad0769002c3270aa49b8a51dcf51f2d49d02'/>
<id>7e02ad0769002c3270aa49b8a51dcf51f2d49d02</id>
<content type='text'>
This is a depessimization; see r334537 for an explanation. Routines
remain significantly slower than they have to be.

bzero was removed from the kernel but remains in libc. The code is macroified
to accommodate its differences from memset (no return value, always setting to 0).

The bzero.S file is left in place due to libc build magic which pulls in
a C variant if a matching .S file is missing.

Reviewed by:	kib
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17355
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is a depessimization; see r334537 for an explanation. Routines
remain significantly slower than they have to be.

bzero was removed from the kernel but remains in libc. The code is macroified
to accommodate its differences from memset (no return value, always setting to 0).

The bzero.S file is left in place due to libc build magic which pulls in
a C variant if a matching .S file is missing.

Reviewed by:	kib
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17355
</pre>
</div>
</content>
</entry>
<entry>
<title>Add section .note.GNU-stack for assembly files used by 386 and amd64.</title>
<updated>2011-01-07T16:08:40+00:00</updated>
<author>
<name>Konstantin Belousov</name>
<email>kib@FreeBSD.org</email>
</author>
<published>2011-01-07T16:08:40+00:00</published>
<link rel='alternate' type='text/html' href='http://cgit.freebsd.org/src/commit/?id=93ab75867017bed8892f9f3b1e1bbd6120d49fcd'/>
<id>93ab75867017bed8892f9f3b1e1bbd6120d49fcd</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>We've been lax about matching END() macros in asm code for some time.  This</title>
<updated>2008-11-02T01:10:54+00:00</updated>
<author>
<name>Peter Wemm</name>
<email>peter@FreeBSD.org</email>
</author>
<published>2008-11-02T01:10:54+00:00</published>
<link rel='alternate' type='text/html' href='http://cgit.freebsd.org/src/commit/?id=5d053f461caeb73f6de165aa1e07b2003101605c'/>
<id>5d053f461caeb73f6de165aa1e07b2003101605c</id>
<content type='text'>
is used to set the ELF size attribute for functions.  It isn't normally
critical but some things can make use of it (gdb for stack traces).
Valgrind needs it so I'm adding it in.  The problem is present on all
branches and on both i386 and amd64.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
is used to set the ELF size attribute for functions.  It isn't normally
critical but some things can make use of it (gdb for stack traces).
Valgrind needs it so I'm adding it in.  The problem is present on all
branches and on both i386 and amd64.
</pre>
</div>
</content>
</entry>
<entry>
<title>Add machine-specific, optimized implementations of bcopy, bzero, memcpy,</title>
<updated>2005-04-07T03:56:03+00:00</updated>
<author>
<name>Alan Cox</name>
<email>alc@FreeBSD.org</email>
</author>
<published>2005-04-07T03:56:03+00:00</published>
<link rel='alternate' type='text/html' href='http://cgit.freebsd.org/src/commit/?id=91c09a383ab76991061b49c3e081c06b3da2a98c'/>
<id>91c09a383ab76991061b49c3e081c06b3da2a98c</id>
<content type='text'>
memmove, and memset.

PR: 73111
Submitted by: Ville-Pertti Keinonen &lt;will@iki.fi&gt; (taken from NetBSD)
MFC after: 3 weeks
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
memmove, and memset.

PR: 73111
Submitted by: Ville-Pertti Keinonen &lt;will@iki.fi&gt; (taken from NetBSD)
MFC after: 3 weeks
</pre>
</div>
</content>
</entry>
</feed>
