Diffstat (limited to 'contrib/libgmp/mpn/sparc32/README')
1 files changed, 36 insertions, 0 deletions
diff --git a/contrib/libgmp/mpn/sparc32/README b/contrib/libgmp/mpn/sparc32/README
new file mode 100644
index 000000000000..7c19df7bc42d
--- /dev/null
+++ b/contrib/libgmp/mpn/sparc32/README
@@ -0,0 +1,36 @@
+This directory contains mpn functions for various SPARC chips. Code that
+runs only on version 8 SPARC implementations is in the v8 subdirectory.
+ Load and Store timing
+On most early SPARC implementations, the ST instruction takes multiple
+cycles, while an STD takes just a single cycle more than an ST. For the CPUs
+in SPARCstation I and II, the times are 3 and 4 cycles, respectively.
+Therefore, combining two ST instructions into an STD when possible is a
+significant optimization.
+Later SPARC implementations have single-cycle ST.
+For SuperSPARC, we can perform just one memory instruction per cycle, even
+if up to two integer instructions can be executed in its pipeline. For
+programs that perform so many memory operations that there are not enough
+non-memory operations to issue in parallel with all memory operations, using
+LDD and STD when possible helps.
+1. On a SuperSPARC, mpn_lshift and mpn_rshift run at 3 cycles/limb, or 2.5
+ cycles/limb asymptotically. We could optimize speed for special counts
+ by using ADDXCC.
+2. On a SuperSPARC, mpn_add_n and mpn_sub_n run at 2.5 cycles/limb, or 2
+ cycles/limb asymptotically.
+3. mpn_mul_1 runs at what is believed to be optimal speed.
+4. On SuperSPARC, mpn_addmul_1 and mpn_submul_1 could both be improved by a
+ cycle by avoiding one of the add instructions. See a29k/addmul_1.
+The speed of the code for other SPARC implementations is uncertain.
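
The store-timing argument in the README can be checked with a little arithmetic. The following is a minimal C sketch, not code from GMP: it assumes the 3-cycle ST and 4-cycle STD figures quoted for the SPARCstation I and II CPUs, counts store cycles only, and ignores the doubleword alignment that STD requires. The function names are hypothetical.

```c
#include <stddef.h>

/* Cycle figures quoted in the README for the SPARCstation I and II CPUs. */
enum { ST_CYCLES = 3, STD_CYCLES = 4 };

/* Store cycles when every 32-bit limb is written with its own ST. */
unsigned long store_cycles_st_only(size_t n_limbs)
{
    return (unsigned long)n_limbs * ST_CYCLES;
}

/* Store cycles when limbs are paired into STDs (assumed always possible
   here); an odd trailing limb still needs a single ST. */
unsigned long store_cycles_std_pairs(size_t n_limbs)
{
    return (unsigned long)(n_limbs / 2) * STD_CYCLES
         + (unsigned long)(n_limbs % 2) * ST_CYCLES;
}
```

Under these assumptions, storing 8 limbs costs 24 cycles with ST alone and 16 cycles with STD pairs, that is, 3 versus 2 store cycles per limb.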