* Remove resolver_qual from DEFINE_IFUNC/DEFINE_UIFUNC macros. (Konstantin Belousov, 2019-05-16; 1 file, -1/+1)

In all practical situations, the resolver visibility is static.

Requested by: markj
Differential revision: https://reviews.freebsd.org/D20281
* x86 __vdso_gettc(): use machine/cpufunc.h function for CPUID. (Konstantin Belousov, 2019-02-14; 1 file, -27/+2)

Based on the discussion with: jkim
* Add comment noting that the strange spelling of GenuineIntel is for a reason. (Konstantin Belousov, 2019-02-07; 1 file, -0/+1)

Requested by: rpokala
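
A likely reason for the "strange spelling": CPUID leaf 0 returns the vendor string in EBX, EDX, ECX order, so code that stores the registers as EBX, ECX, EDX and compares the raw bytes sees the middle and last words swapped. The following is a minimal sketch of that effect, not the libc code; the helper names are hypothetical and the register values are the well-known Intel constants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Values CPUID leaf 0 returns on an Intel CPU.  Each register holds
 * four ASCII bytes, least significant byte first.
 */
#define INTEL_EBX 0x756e6547u /* "Genu" */
#define INTEL_EDX 0x49656e69u /* "ineI" */
#define INTEL_ECX 0x6c65746eu /* "ntel" */

/* Write the four bytes of reg to out, low byte first. */
static void
put_reg(char *out, uint32_t reg)
{
	for (int i = 0; i < 4; i++)
		out[i] = (char)((reg >> (8 * i)) & 0xff);
}

/* Hypothetical helper: lay the registers out in EBX, ECX, EDX order. */
static void
vendor_in_register_order(uint32_t ebx, uint32_t ecx, uint32_t edx,
    char out[13])
{
	assert(out != NULL);
	put_reg(out, ebx);
	put_reg(out + 4, ecx);
	put_reg(out + 8, edx);
	out[12] = '\0';
}

/* Hypothetical helper: Intel check on the raw register layout. */
static int
is_intel(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
	char id[13];

	vendor_in_register_order(ebx, ecx, edx, id);
	/* Deliberately "misspelled": ECX ("ntel") precedes EDX ("ineI"). */
	return (memcmp(id, "GenuntelineI", 12) == 0);
}
```

Comparing against the register-order spelling avoids shuffling the registers into reading order first, which is presumably why the odd constant deserves a comment.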
* Use ifunc to select the barrier instruction for RDTSC. (Konstantin Belousov, 2019-02-07; 1 file, -58/+19)

This optimizes out the runtime switch and removes yet another cpuid from libc.

Note that this is the first use of ifunc in i386 libc, so an ifunc-capable toolchain is required to build a runnable userspace on i386, same as on amd64.

Discussed with: emaste
* Fix a regression in r321608. (Konstantin Belousov, 2017-08-13; 1 file, -2/+2)

On i386 with CPUID but without SSE2, set lfence_works to LMB_NONE instead of looping.

Reported and tested by: Andre Albsmeier <andre@fbsd.e4m.org>
* Simplify flow control. (Konstantin Belousov, 2017-07-28; 1 file, -12/+11)

Also add an explicit comment explaining why libc cannot simply rely on open(2) failure in capability mode.
* Use MFENCE to serialize RDTSC on non-Intel CPUs. (Konstantin Belousov, 2017-07-27; 1 file, -38/+89)

The kernel already used the stronger barrier instruction on AMD CPUs; correct the userspace fast gettimeofday() implementation as well.

Differential revision: https://reviews.freebsd.org/D11728
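
The selection rule this commit describes, together with the later SSE2 fix, can be sketched as a small pure function. This is an illustration under assumptions, not the libc source: only LMB_NONE is named in this log, while LMB_LFENCE, LMB_MFENCE, select_lmb() and its parameters are hypothetical, and the use of LFENCE on Intel is inferred from the lfence_works variable name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Barrier kinds; LMB_NONE appears in the log, the others are assumed. */
enum lmb {
	LMB_NONE,	/* no usable fence, e.g. i386 without SSE2 */
	LMB_LFENCE,	/* assumed: LFENCE orders RDTSC on Intel */
	LMB_MFENCE	/* stronger MFENCE for non-Intel CPUs */
};

/*
 * Hypothetical helper: choose the barrier to issue before RDTSC.
 * vendor is the CPUID vendor string in reading order; has_sse2 says
 * whether the fence instructions exist at all.
 */
static enum lmb
select_lmb(const char *vendor, int has_sse2)
{
	assert(vendor != NULL);
	if (!has_sse2)
		return (LMB_NONE);
	if (strcmp(vendor, "GenuineIntel") == 0)
		return (LMB_LFENCE);
	return (LMB_MFENCE);
}
```

For example, select_lmb("AuthenticAMD", 1) yields LMB_MFENCE, while any vendor without SSE2 yields LMB_NONE rather than retrying detection.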
* Fix indent. (Konstantin Belousov, 2017-07-25; 1 file, -1/+1)
* Remove unneeded variable initialization from r314319. (Mariusz Zaborski, 2017-02-26; 1 file, -2/+0)

Pointed out by: kib
* Don't try to open devices in the gettc() function which will always (Mariusz Zaborski, 2017-02-26; 1 file, -11/+28)
Don't try to open devices in the gettc() function, since the open will always fail in capability mode. Instead, silently fall back to the syscall method, as is done, for example, in the gettimeofday(2) function.

Reviewed by: kib
* Conditionalize hyperv support in gettimeofday(2) based on MK_HYPERV (Enji Cooper, 2017-01-19; 1 file, -4/+4)

The effect at runtime is negligible, as the hyperv timer isn't available except when hyperv is loaded. This is a prerequisite for conditionally excluding the header from the build and install.

Reviewed by: sephe
Differential Revision: https://reviews.freebsd.org/D9242
* __vdso_gettc(): be extra careful with /dev/hpet mappings, never unmap (Konstantin Belousov, 2017-01-04; 1 file, -19/+35)
__vdso_gettc(): be extra careful with /dev/hpet mappings, never unmap the mapping which might be accessed by other threads.

If a pointer to the /dev/hpet register page mapping was stored into hpet_dev_map, other threads might access the page at any time. Never unmap it; instead, keep track of the mappings for all hpet units in a small array. Store the pointer to the newly mapped register page using CAS, to detect parallel mappings.

It proved relatively easy to demonstrate the problem by arranging for two threads to call gettimeofday(2) concurrently, for the first time in the process address space, when HPET is used as the timecounter.

PR: 215715
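
The publish-with-CAS pattern described in this commit can be sketched with C11 atomics. This is a minimal illustration, not the libc code: hpet_maps, HPET_MAX_UNITS, map_unit() and unmap_private() are hypothetical names, and calloc()/free() stand in for mmap(2)/munmap(2) of the /dev/hpet device so the sketch stays portable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define HPET_MAX_UNITS 4	/* assumed size of the small per-unit array */

/* One published mapping per hpet unit; NULL means "not mapped yet". */
static _Atomic(void *) hpet_maps[HPET_MAX_UNITS];

/* Stand-ins for mapping and unmapping the register page. */
static void *
map_unit(unsigned unit)
{
	(void)unit;
	return (calloc(1, 4096));
}

static void
unmap_private(void *p)
{
	free(p);
}

/* Return the shared mapping for a unit, creating it on first use. */
static void *
hpet_get_map(unsigned unit)
{
	void *cur, *expected, *new_map;

	assert(unit < HPET_MAX_UNITS);
	cur = atomic_load_explicit(&hpet_maps[unit], memory_order_acquire);
	if (cur != NULL)
		return (cur);
	new_map = map_unit(unit);
	if (new_map == NULL)
		return (NULL);
	expected = NULL;
	if (atomic_compare_exchange_strong_explicit(&hpet_maps[unit],
	    &expected, new_map, memory_order_acq_rel, memory_order_acquire))
		return (new_map);
	/*
	 * Another thread published first.  Our mapping was never visible
	 * to anyone else, so dropping it is safe; the published one is
	 * never unmapped.
	 */
	unmap_private(new_map);
	return (expected);
}
```

The key property is that a pointer is only ever released while it is still private to the mapping thread; once the CAS succeeds, the mapping stays in place for the life of the process.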
* hyperv: Implement userspace gettimeofday(2) with Hyper-V reference TSC (Sepherosa Ziehau, 2016-12-19; 1 file, -0/+73)

This makes gettimeofday 6 times faster, as measured by tools/tools/syscall_timing.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D8789
* Implement userspace gettimeofday(2) with HPET timecounter. (Konstantin Belousov, 2016-08-17; 1 file, -0/+179)

Right now, userspace (fast) gettimeofday(2) on x86 only works for RDTSC. For older machines, like Core2, where RDTSC is not C2/C3 invariant, and which fall back to HPET hardware, this means that the call has both the penalty of the syscall and of the uncached hardware access behind the QPI or PCIe connection to the south bridge. Nothing can be done about the access latency, but the syscall overhead can be removed.

The system already provides mappable /dev/hpetX devices, which give direct access to the HPET register page.

Add yet another algorithm to the x86 'vdso' timehands. Libc is updated to handle both RDTSC and HPET. For HPET, the index of the hpet device to mmap is passed from the kernel to userspace; the index might change, and libc invalidates its mapping as needed.

Remove the cpu_fill_vdso_timehands() KPI; instead, require timecounters that can be used from userspace to provide tc_fill_vdso_timehands{,32}() methods. Merge the i386 and amd64 libc/<arch>/sys/__vdso_gettc.c into one source file in the new libc/x86/sys location. The __vdso_gettc() internal interface is changed to move timecounter algorithm detection into the MD code.

Measurements show that RDTSC, even with the syscall overhead, is faster than userspace HPET access. Still, userspace HPET is three to four times faster than syscall HPET on several Core2 and SandyBridge machines.

Tested by: Howard Su <howard0su@gmail.com>
Differential revision: https://reviews.freebsd.org/D7473
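
The per-algorithm dispatch that this commit moves into the MD code might look roughly like the sketch below. Everything here is hypothetical: the ALGO_* constants, struct timehands_sketch, and the stub readers are invented for illustration, and returning ENOSYS to request the syscall fallback is an assumption about the convention, not something stated in this log.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical algorithm identifiers passed from kernel to userspace. */
enum { ALGO_SYSCALL = 0, ALGO_TSC = 1, ALGO_HPET = 2 };

/* Hypothetical, trimmed-down view of the shared timehands data. */
struct timehands_sketch {
	int algo;		/* which counter userspace should read */
	uint32_t hpet_idx;	/* index of the hpet device to mmap */
};

/* Stubs standing in for RDTSC and for a read of the HPET main counter. */
static uint32_t
read_tsc_stub(void)
{
	return (100);
}

static uint32_t
read_hpet_stub(uint32_t idx)
{
	return (200 + idx);
}

/*
 * Read the timecounter selected by the kernel.  Returns 0 on success,
 * or ENOSYS (assumed convention) so the caller can fall back to the
 * syscall path.
 */
static int
gettc_sketch(const struct timehands_sketch *th, uint32_t *tc)
{
	assert(th != NULL && tc != NULL);
	switch (th->algo) {
	case ALGO_TSC:
		*tc = read_tsc_stub();
		return (0);
	case ALGO_HPET:
		*tc = read_hpet_stub(th->hpet_idx);
		return (0);
	default:
		return (ENOSYS);
	}
}
```

Keeping the switch in machine-dependent code means the generic caller only needs to distinguish "got a counter value" from "use the syscall".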