Subject: Re: Quad math on sparc64
To: James Chacon <jchacon@genuity.net>
From: James Chacon <jchacon@genuity.net>
List: tech-toolchain
Date: 03/28/2002 03:25:17
I fixed a few bugs I ran into after a snapshot build, and the latest build
has no issues I can find, so I just committed this.

For folks who want a clean snapshot of everything, I'm uploading one to:

ftp.netbsd.org:/pub/NetBSD/arch/sparc64/snapshot/20020328

James

>
>The history:
>
>Quad math isn't implemented natively on sparc processors (at least any I'm
>aware of), so anything trying to use long doubles will end up in one of two
>places when the compiler runs:
>
>1. Generates instructions for quad math which trap to emulation in the kernel
>2. Generates soft-quad calls via the sparc 64bit ABI (the infamous _Qp* calls)
>
>Right now the in-tree gcc takes the first route unless one specifies
>-msoft-quad-float in the compiler options. Even then it doesn't do the right
>thing, because it tries to generate calls that look like the sparc 32bit quad
>math calls (which take floating point args). The sparc64 calls take pointers
>to floating point args, and the compiler today generates bad code for these.
>The ABI calls for option #2 are in libc now (I did these about a month back)
>and they work based on my testing.
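To make the difference in calling convention concrete, here is a minimal C
sketch. The names q32_add, qp64_add and add_via_pointers are hypothetical
stand-ins, not the real libc routines: q32_add mimics the by-value style of
the 32bit calls and qp64_add mimics the pointer style of the _Qp* calls. The
bodies just use the host's native long double so the sketch compiles anywhere;
the exact prototypes of the real routines are an assumption here and should be
checked against the ABI document.

```c
#include <assert.h>

/* Hypothetical stand-in for the 32bit style: operands are passed by
   value and the result is returned by value. */
long double
q32_add(long double a, long double b)
{
	return a + b;
}

/* Hypothetical stand-in for the 64bit _Qp* style: the caller passes a
   pointer to a result slot plus pointers to both operands. */
void
qp64_add(long double *result, const long double *a, const long double *b)
{
	*result = *a + *b;
}

/* Roughly what the compiler has to arrange for "x + y" in the pointer
   style: get both operands into memory, reserve a result slot, and
   pass the three addresses. */
long double
add_via_pointers(long double x, long double y)
{
	long double slot_x = x;	/* operand copied to a stack temporary */
	long double slot_y = y;	/* operand copied to a stack temporary */
	long double slot_r;	/* stack temporary for the result */

	qp64_add(&slot_r, &slot_x, &slot_y);
	return slot_r;
}
```

Emitting the by-value form where the library expects three addresses is the
kind of mismatch described above.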
>
>The patches I've attached will fix the code generation for the _Qp* calls. 
>These are the changes pulled back from gcc 3.x (it hasn't changed from 3.0.1 
>to -current really for these specific mods). Once these are in place and soft 
>quad math is used it'll generate the correct calls and the softfloat routines 
>generate the correct answers. 
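Comparisons follow the same pattern. The sparc_emit_float_lib_cmp function in
the patch below puts any operand that isn't already in memory into a stack
temporary, calls the matching _Qp_f* routine with the two addresses, and then
compares the integer result against zero. A rough C picture of that, with
hypothetical names (qp64_flt stands in for a _Qp_flt-style routine and is
implemented with the host's native long double, so this is only a sketch of
the call shape, not the real soft-float code):

```c
#include <assert.h>

/* Hypothetical stand-in for a _Qp_flt-style routine: takes pointers to
   both operands and returns nonzero when *a < *b. */
int
qp64_flt(const long double *a, const long double *b)
{
	return *a < *b;
}

/* Rough shape of what "x < y" turns into in the soft-quad case. */
int
lt_via_libcall(long double x, long double y)
{
	long double slot0 = x;	/* stack temporary for the first operand */
	long double slot1 = y;	/* stack temporary for the second operand */
	long result = qp64_flt(&slot0, &slot1);	/* result saved right away */

	return result != 0;	/* the emitted test is "result NE 0" */
}
```

The branch or set-on-condition that follows in sparc.md then only has to test
that nonzero result.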
>
>Now the problem...The code in gcc is basically f*cked from being able to 
>generate hard quad calls once these changes go in. The insn expansion gets
>nasty looking for the hard quad case and nothing today can reduce/handle it.
>The reality is that while it should be fixed it really just isn't that 
>important overall. 
>
>I've verified this is still the case even on gcc-current as of today, so it's
>not the patches, it's just gcc having its own issues.
>
>gcc-current sets soft-quad math as the default for Solaris, Linux, etc., so
>that's what I also see as the solution here.
>
>There's no point to hard quad math out of the compiler really (it's a lot
>slower to trap through the kernel for math emulation when a userland library
>and API can do it just fine). With these patches soft-quad is the default so
>other issues go away as well. We can take out the 3-4 sparc64 specific places
>in the tree like awk and libgcc that have special options. Plus pkgsrc should 
>get easier to deal with on some packages without them needing sparc64 specific
>patches. The only change is that the compiler would never output "faddq" on
>its own anymore. The assembler would still accept it, though, and the kernel
>would still trap/emulate it if presented, so this doesn't present any issues
>WRT compatibility/ABI changes.
>
>I'm compiling a snapshot now with this compiler to test things but in all
>my unit/regression tests I didn't find any issues. 
>
>I'd like comments back because I'd like to commit this soon unless there's
>a strong objection otherwise.
>
>James
>
>Index: netbsd64.h
>===================================================================
>RCS file: /cvsroot/gnusrc/gnu/dist/toolchain/gcc/config/sparc/netbsd64.h,v
>retrieving revision 1.7
>diff -u -r1.7 netbsd64.h
>--- netbsd64.h	2002/03/19 18:12:27	1.7
>+++ netbsd64.h	2002/03/22 10:19:56
>@@ -15,6 +15,11 @@
> 
> #include <sparc/sp64-elf.h>
> 
>+#undef TARGET_DEFAULT
>+#define TARGET_DEFAULT \
>+(MASK_V9 + MASK_PTR64 + MASK_64BIT + /* MASK_HARD_QUAD */ \
>+ + MASK_APP_REGS + MASK_EPILOGUE + MASK_FPU + MASK_STACK_BIAS)
>+
> #undef SPARC_DEFAULT_CMODEL
> #define SPARC_DEFAULT_CMODEL CM_MEDANY
> 
>Index: sparc.c
>===================================================================
>RCS file: /cvsroot/gnusrc/gnu/dist/toolchain/gcc/config/sparc/sparc.c,v
>retrieving revision 1.4
>diff -u -r1.4 sparc.c
>--- sparc.c	2001/04/23 12:23:28	1.4
>+++ sparc.c	2002/03/22 10:19:58
>@@ -4578,6 +4578,152 @@
>   return string;
> }
> 
>+/* Emit a library call comparison between floating point X and Y.
>+   COMPARISON is the rtl operator to compare with (EQ, NE, GT, etc.).
>+   TARGET_ARCH64 uses _Qp_* functions, which use pointers to TFmode
>+   values as arguments instead of the TFmode registers themselves,
>+   that's why we cannot call emit_float_lib_cmp.  */
>+void
>+sparc_emit_float_lib_cmp (x, y, comparison)
>+     rtx x, y;
>+     enum rtx_code comparison;
>+{
>+  char *qpfunc;
>+  rtx slot0, slot1, result, tem, tem2;
>+  enum machine_mode mode;
>+
>+  switch (comparison)
>+    {
>+    case EQ:
>+      qpfunc = (TARGET_ARCH64) ? "_Qp_feq" : "_Q_feq";
>+      break;
>+
>+    case NE:
>+      qpfunc = (TARGET_ARCH64) ? "_Qp_fne" : "_Q_fne";
>+      break;
>+
>+    case GT:
>+      qpfunc = (TARGET_ARCH64) ? "_Qp_fgt" : "_Q_fgt";
>+      break;
>+
>+    case GE:
>+      qpfunc = (TARGET_ARCH64) ? "_Qp_fge" : "_Q_fge";
>+      break;
>+
>+    case LT:
>+      qpfunc = (TARGET_ARCH64) ? "_Qp_flt" : "_Q_flt";
>+      break;
>+
>+    case LE:
>+      qpfunc = (TARGET_ARCH64) ? "_Qp_fle" : "_Q_fle";
>+      break;
>+
>+      /*    case UNORDERED:
>+    case UNGT:
>+    case UNLT:
>+    case UNEQ:
>+    case UNGE:
>+    case UNLE:
>+    case LTGT:
>+      qpfunc = (TARGET_ARCH64) ? "_Qp_cmp" : "_Q_cmp";
>+      break;
>+      */
>+    default:
>+      abort();
>+      break;
>+    }
>+
>+  if (TARGET_ARCH64)
>+    {
>+      if (GET_CODE (x) != MEM)
>+        {
>+          slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+          emit_insn (gen_rtx_SET (VOIDmode, slot0, x));
>+        }
>+      else
>+        slot0 = x;
>+
>+      if (GET_CODE (y) != MEM)
>+        {
>+          slot1 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+          emit_insn (gen_rtx_SET (VOIDmode, slot1, y));
>+        }
>+      else
>+        slot1 = y;
>+
>+      emit_library_call (gen_rtx_SYMBOL_REF (Pmode, qpfunc), 1,
>+                         DImode, 2,
>+                         XEXP (slot0, 0), Pmode,
>+                         XEXP (slot1, 0), Pmode);
>+
>+      mode = DImode;
>+    }
>+  else
>+    {
>+      emit_library_call (gen_rtx_SYMBOL_REF (Pmode, qpfunc), 1,
>+                         SImode, 2,
>+                         x, TFmode, y, TFmode);
>+
>+      mode = SImode;
>+    }
>+
>+
>+  /* Immediately move the result of the libcall into a pseudo
>+     register so reload doesn't clobber the value if it needs
>+     the return register for a spill reg.  */
>+  result = gen_reg_rtx (mode);
>+  emit_move_insn (result, hard_libcall_value (mode));
>+
>+  switch (comparison)
>+    {
>+    default:
>+      emit_cmp_insn (result, const0_rtx, NE,
>+                     NULL_RTX, mode, 0, 0);
>+      break;
>+      /*    case ORDERED:
>+    case UNORDERED:
>+      emit_cmp_insn (result, GEN_INT(3),
>+                     (comparison == UNORDERED) ? EQ : NE,
>+                     NULL_RTX, mode, 0, 0);
>+      break;
>+    case UNGT:
>+    case UNGE:
>+      emit_cmp_insn (result, const1_rtx,
>+                     (comparison == UNGT) ? GT : NE,
>+                     NULL_RTX, mode, 0, 0);
>+      break;
>+    case UNLE:
>+      emit_cmp_insn (result, const2_rtx, NE,
>+                     NULL_RTX, mode, 0, 0);
>+      break;
>+    case UNLT:
>+      tem = gen_reg_rtx (mode);
>+      if (TARGET_ARCH32)
>+        emit_insn (gen_andsi3 (tem, result, const1_rtx));
>+      else
>+        emit_insn (gen_anddi3 (tem, result, const1_rtx));
>+      emit_cmp_insn (tem, const0_rtx, NE,
>+                     NULL_RTX, mode, 0, 0);
>+      break;
>+    case UNEQ:
>+    case LTGT:
>+      tem = gen_reg_rtx (mode);
>+      if (TARGET_ARCH32)
>+        emit_insn (gen_addsi3 (tem, result, const1_rtx));
>+      else
>+        emit_insn (gen_adddi3 (tem, result, const1_rtx));
>+      tem2 = gen_reg_rtx (mode);
>+      if (TARGET_ARCH32)
>+        emit_insn (gen_andsi3 (tem2, tem, const2_rtx));
>+      else
>+        emit_insn (gen_anddi3 (tem2, tem, const2_rtx));
>+      emit_cmp_insn (tem2, const0_rtx,
>+                     (comparison == UNEQ) ? EQ : NE,
>+                     NULL_RTX, mode, 0, 0);
>+		     break;*/
>+    }
>+}
>+
> /* Return the string to output a conditional branch to LABEL, testing
>    register REG.  LABEL is the operand number of the label; REG is the
>    operand number of the reg.  OP is the conditional expression.  The mode
>Index: sparc.h
>===================================================================
>RCS file: /cvsroot/gnusrc/gnu/dist/toolchain/gcc/config/sparc/sparc.h,v
>retrieving revision 1.2
>diff -u -r1.2 sparc.h
>--- sparc.h	2001/03/06 05:21:48	1.2
>+++ sparc.h	2002/03/22 10:20:00
>@@ -2625,26 +2625,25 @@
> #define MULSI3_LIBCALL "*.umul"
> 
> /* Define library calls for quad FP operations.  These are all part of the
>-   SPARC ABI.
>-   ??? ARCH64 still does not work as the _Qp_* routines take pointers.  */
>-#define ADDTF3_LIBCALL (TARGET_ARCH64 ? "_Qp_add" : "_Q_add")
>-#define SUBTF3_LIBCALL (TARGET_ARCH64 ? "_Qp_sub" : "_Q_sub")
>-#define NEGTF2_LIBCALL (TARGET_ARCH64 ? "_Qp_neg" : "_Q_neg")
>-#define MULTF3_LIBCALL (TARGET_ARCH64 ? "_Qp_mul" : "_Q_mul")
>-#define DIVTF3_LIBCALL (TARGET_ARCH64 ? "_Qp_div" : "_Q_div")
>-#define FLOATSITF2_LIBCALL (TARGET_ARCH64 ? "_Qp_itoq" : "_Q_itoq")
>-#define FIX_TRUNCTFSI2_LIBCALL (TARGET_ARCH64 ? "_Qp_qtoi" : "_Q_qtoi")
>-#define FIXUNS_TRUNCTFSI2_LIBCALL (TARGET_ARCH64 ? "_Qp_qtoui" : "_Q_qtou")
>-#define EXTENDSFTF2_LIBCALL (TARGET_ARCH64 ? "_Qp_stoq" : "_Q_stoq")
>-#define TRUNCTFSF2_LIBCALL (TARGET_ARCH64 ? "_Qp_qtos" :  "_Q_qtos")
>-#define EXTENDDFTF2_LIBCALL (TARGET_ARCH64 ? "_Qp_dtoq" : "_Q_dtoq")
>-#define TRUNCTFDF2_LIBCALL (TARGET_ARCH64 ? "_Qp_qtod" : "_Q_qtod")
>-#define EQTF2_LIBCALL (TARGET_ARCH64 ? "_Qp_feq" : "_Q_feq")
>-#define NETF2_LIBCALL (TARGET_ARCH64 ? "_Qp_fne" : "_Q_fne")
>-#define GTTF2_LIBCALL (TARGET_ARCH64 ? "_Qp_fgt" : "_Q_fgt")
>-#define GETF2_LIBCALL (TARGET_ARCH64 ? "_Qp_fge" : "_Q_fge")
>-#define LTTF2_LIBCALL (TARGET_ARCH64 ? "_Qp_flt" : "_Q_flt")
>-#define LETF2_LIBCALL (TARGET_ARCH64 ? "_Qp_fle" : "_Q_fle")
>+   SPARC 32bit ABI. */
>+#define ADDTF3_LIBCALL "_Q_add"
>+#define SUBTF3_LIBCALL "_Q_sub"
>+#define NEGTF2_LIBCALL "_Q_neg"
>+#define MULTF3_LIBCALL "_Q_mul"
>+#define DIVTF3_LIBCALL "_Q_div"
>+#define FLOATSITF2_LIBCALL "_Q_itoq"
>+#define FIX_TRUNCTFSI2_LIBCALL "_Q_qtoi"
>+#define FIXUNS_TRUNCTFSI2_LIBCALL "_Q_qtou"
>+#define EXTENDSFTF2_LIBCALL "_Q_stoq"
>+#define TRUNCTFSF2_LIBCALL "_Q_qtos"
>+#define EXTENDDFTF2_LIBCALL "_Q_dtoq"
>+#define TRUNCTFDF2_LIBCALL "_Q_qtod"
>+#define EQTF2_LIBCALL "_Q_feq"
>+#define NETF2_LIBCALL "_Q_fne"
>+#define GTTF2_LIBCALL "_Q_fgt"
>+#define GETF2_LIBCALL "_Q_fge"
>+#define LTTF2_LIBCALL "_Q_flt"
>+#define LETF2_LIBCALL "_Q_fle"
> 
> /* We can define the TFmode sqrt optab only if TARGET_FPU.  This is because
>    with soft-float, the SFmode and DFmode sqrt instructions will be absent,
>@@ -2652,34 +2651,37 @@
>    for calls to the builtin function sqrt, but this fails.  */
> #define INIT_TARGET_OPTABS						\
>   do {									\
>-    add_optab->handlers[(int) TFmode].libfunc				\
>-      = gen_rtx_SYMBOL_REF (Pmode, ADDTF3_LIBCALL);			\
>-    sub_optab->handlers[(int) TFmode].libfunc				\
>-      = gen_rtx_SYMBOL_REF (Pmode, SUBTF3_LIBCALL);			\
>-    neg_optab->handlers[(int) TFmode].libfunc				\
>-      = gen_rtx_SYMBOL_REF (Pmode, NEGTF2_LIBCALL);			\
>-    smul_optab->handlers[(int) TFmode].libfunc				\
>-      = gen_rtx_SYMBOL_REF (Pmode, MULTF3_LIBCALL);			\
>-    flodiv_optab->handlers[(int) TFmode].libfunc			\
>-      = gen_rtx_SYMBOL_REF (Pmode, DIVTF3_LIBCALL);			\
>-    eqtf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, EQTF2_LIBCALL);		\
>-    netf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, NETF2_LIBCALL);		\
>-    gttf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, GTTF2_LIBCALL);		\
>-    getf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, GETF2_LIBCALL);		\
>-    lttf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, LTTF2_LIBCALL);		\
>-    letf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, LETF2_LIBCALL);		\
>-    trunctfsf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, TRUNCTFSF2_LIBCALL);   \
>-    trunctfdf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, TRUNCTFDF2_LIBCALL);   \
>-    extendsftf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, EXTENDSFTF2_LIBCALL); \
>-    extenddftf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, EXTENDDFTF2_LIBCALL); \
>-    floatsitf_libfunc = gen_rtx_SYMBOL_REF (Pmode, FLOATSITF2_LIBCALL);    \
>-    fixtfsi_libfunc = gen_rtx_SYMBOL_REF (Pmode, FIX_TRUNCTFSI2_LIBCALL);  \
>-    fixunstfsi_libfunc							\
>-      = gen_rtx_SYMBOL_REF (Pmode, FIXUNS_TRUNCTFSI2_LIBCALL);		\
>-    if (TARGET_FPU)							\
>-      sqrt_optab->handlers[(int) TFmode].libfunc			\
>-	= gen_rtx_SYMBOL_REF (Pmode, "_Q_sqrt");			\
>-    INIT_SUBTARGET_OPTABS;						\
>+    if (TARGET_ARCH32)                                                  \
>+      {                                                                 \
>+        add_optab->handlers[(int) TFmode].libfunc                       \
>+          = gen_rtx_SYMBOL_REF (Pmode, ADDTF3_LIBCALL);                 \
>+        sub_optab->handlers[(int) TFmode].libfunc                       \
>+          = gen_rtx_SYMBOL_REF (Pmode, SUBTF3_LIBCALL);                 \
>+        neg_optab->handlers[(int) TFmode].libfunc                       \
>+          = gen_rtx_SYMBOL_REF (Pmode, NEGTF2_LIBCALL);                 \
>+        smul_optab->handlers[(int) TFmode].libfunc                      \
>+          = gen_rtx_SYMBOL_REF (Pmode, MULTF3_LIBCALL);                 \
>+        flodiv_optab->handlers[(int) TFmode].libfunc                    \
>+          = gen_rtx_SYMBOL_REF (Pmode, DIVTF3_LIBCALL);                 \
>+        eqtf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, EQTF2_LIBCALL);      \
>+        netf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, NETF2_LIBCALL);      \
>+        gttf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, GTTF2_LIBCALL);      \
>+        getf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, GETF2_LIBCALL);      \
>+        lttf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, LTTF2_LIBCALL);      \
>+        letf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, LETF2_LIBCALL);      \
>+        trunctfsf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, TRUNCTFSF2_LIBCALL);  \
>+        trunctfdf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, TRUNCTFDF2_LIBCALL);  \
>+        extendsftf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, EXTENDSFTF2_LIBCALL);\
>+        extenddftf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, EXTENDDFTF2_LIBCALL);\
>+        floatsitf_libfunc = gen_rtx_SYMBOL_REF (Pmode, FLOATSITF2_LIBCALL);   \
>+        fixtfsi_libfunc = gen_rtx_SYMBOL_REF (Pmode, FIX_TRUNCTFSI2_LIBCALL); \
>+        fixunstfsi_libfunc                                              \
>+          = gen_rtx_SYMBOL_REF (Pmode, FIXUNS_TRUNCTFSI2_LIBCALL);      \
>+        if (TARGET_FPU)                                                 \
>+          sqrt_optab->handlers[(int) TFmode].libfunc                    \
>+            = gen_rtx_SYMBOL_REF (Pmode, "_Q_sqrt");                    \
>+      }                                                                 \
>+    INIT_SUBTARGET_OPTABS;                                              \
>   } while (0)
> 
> /* This is meant to be redefined in the host dependent files */
>Index: sparc.md
>===================================================================
>RCS file: /cvsroot/gnusrc/gnu/dist/toolchain/gcc/config/sparc/sparc.md,v
>retrieving revision 1.4
>diff -u -r1.4 sparc.md
>--- sparc.md	2001/04/23 12:23:28	1.4
>+++ sparc.md	2002/03/22 10:20:01
>@@ -837,7 +837,7 @@
>     }
>   else if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
>     {
>-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, EQ);
>+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, EQ);
>       emit_insn (gen_sne (operands[0]));
>       DONE;
>     }      
>@@ -890,7 +890,7 @@
>     }
>   else if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
>     {
>-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, NE);
>+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, NE);
>       emit_insn (gen_sne (operands[0]));
>       DONE;
>     }      
>@@ -911,7 +911,7 @@
> {
>   if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
>     {
>-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, GT);
>+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, GT);
>       emit_insn (gen_sne (operands[0]));
>       DONE;
>     }
>@@ -932,7 +932,7 @@
> {
>   if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
>     {
>-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, LT);
>+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, LT);
>       emit_insn (gen_sne (operands[0]));
>       DONE;
>     }
>@@ -953,7 +953,7 @@
> {
>   if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
>     {
>-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, GE);
>+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, GE);
>       emit_insn (gen_sne (operands[0]));
>       DONE;
>     }
>@@ -974,7 +974,7 @@
> {
>   if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
>     {
>-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, LE);
>+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, LE);
>       emit_insn (gen_sne (operands[0]));
>       DONE;
>     }
>@@ -1608,7 +1608,7 @@
>     }
>   else if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
>     {
>-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, EQ);
>+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, EQ);
>       emit_jump_insn (gen_bne (operands[0]));
>       DONE;
>     }      
>@@ -1632,7 +1632,7 @@
>     }
>   else if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
>     {
>-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, NE);
>+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, NE);
>       emit_jump_insn (gen_bne (operands[0]));
>       DONE;
>     }      
>@@ -1656,7 +1656,7 @@
>     }
>   else if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
>     {
>-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, GT);
>+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, GT);
>       emit_jump_insn (gen_bne (operands[0]));
>       DONE;
>     }      
>@@ -1690,7 +1690,7 @@
>     }
>   else if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
>     {
>-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, LT);
>+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, LT);
>       emit_jump_insn (gen_bne (operands[0]));
>       DONE;
>     }      
>@@ -1724,7 +1724,7 @@
>     }
>   else if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
>     {
>-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, GE);
>+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, GE);
>       emit_jump_insn (gen_bne (operands[0]));
>       DONE;
>     }      
>@@ -1758,7 +1758,7 @@
>     }
>   else if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
>     {
>-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, LE);
>+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, LE);
>       emit_jump_insn (gen_bne (operands[0]));
>       DONE;
>     }      
>@@ -1774,6 +1774,145 @@
>   "
> { operands[1] = gen_compare_reg (LEU, sparc_compare_op0, sparc_compare_op1);
> }")
>+
>+;;(define_expand "bunordered"
>+;;  [(set (pc)
>+;;        (if_then_else (unordered (match_dup 1) (const_int 0))
>+;;                      (label_ref (match_operand 0 "" ""))
>+;;                      (pc)))]
>+;;  ""
>+;;  "
>+;;{
>+;;  if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
>+;;    {
>+;;      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1,
>+;;                                UNORDERED);
>+;;      emit_jump_insn (gen_beq (operands[0]));
>+;;      DONE;
>+;;    }
>+;;  operands[1] = gen_compare_reg (UNORDERED, sparc_compare_op0,
>+;;                                 sparc_compare_op1);
>+;;}")
>+
>+;;(define_expand "bordered"
>+;;  [(set (pc)
>+;;        (if_then_else (ordered (match_dup 1) (const_int 0))
>+;;                      (label_ref (match_operand 0 "" ""))
>+;;                      (pc)))]
>+;;  ""
>+;;  "
>+;;{
>+;;  if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
>+;;    {
>+;;      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, ORDERED);
>+;;      emit_jump_insn (gen_bne (operands[0]));
>+;;      DONE;
>+;;    }
>+;;  operands[1] = gen_compare_reg (ORDERED, sparc_compare_op0,
>+;;                                 sparc_compare_op1);
>+;;}")
>+;;
>+;;(define_expand "bungt"
>+;;  [(set (pc)
>+;;        (if_then_else (ungt (match_dup 1) (const_int 0))
>+;;                      (label_ref (match_operand 0 "" ""))
>+;;                      (pc)))]
>+;;  ""
>+;;  "
>+;;{
>+;;  if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
>+;;    {
>+;;      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, UNGT);
>+;;      emit_jump_insn (gen_bgt (operands[0]));
>+;;      DONE;
>+;;    }
>+;;  operands[1] = gen_compare_reg (UNGT, sparc_compare_op0, sparc_compare_op1);
>+;;}")
>+;;
>+;;(define_expand "bunlt"
>+;;  [(set (pc)
>+;;        (if_then_else (unlt (match_dup 1) (const_int 0))
>+;;                      (label_ref (match_operand 0 "" ""))
>+;;                      (pc)))]
>+;;  ""
>+;;  "
>+;;{
>+;;  if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
>+;;    {
>+;;      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, UNLT);
>+;;      emit_jump_insn (gen_bne (operands[0]));
>+;;      DONE;
>+;;    }
>+;;  operands[1] = gen_compare_reg (UNLT, sparc_compare_op0, sparc_compare_op1);
>+;;}")
>+;;
>+;;(define_expand "buneq"
>+;;  [(set (pc)
>+;;        (if_then_else (uneq (match_dup 1) (const_int 0))
>+;;                      (label_ref (match_operand 0 "" ""))
>+;;                      (pc)))]
>+;;  ""
>+;;  "
>+;;{
>+;;  if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
>+;;    {
>+;;      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, UNEQ);
>+;;      emit_jump_insn (gen_beq (operands[0]));
>+;;      DONE;
>+;;    }
>+;;  operands[1] = gen_compare_reg (UNEQ, sparc_compare_op0, sparc_compare_op1);
>+;;}")
>+;;
>+;;(define_expand "bunge"
>+;;  [(set (pc)
>+;;        (if_then_else (unge (match_dup 1) (const_int 0))
>+;;                      (label_ref (match_operand 0 "" ""))
>+;;                      (pc)))]
>+;;  ""
>+;;  "
>+;;{
>+;;  if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
>+;;    {
>+;;      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, UNGE);
>+;;      emit_jump_insn (gen_bne (operands[0]));
>+;;      DONE;
>+;;    }
>+;;  operands[1] = gen_compare_reg (UNGE, sparc_compare_op0, sparc_compare_op1);
>+;;}")
>+;;
>+;;(define_expand "bunle"
>+;;  [(set (pc)
>+;;        (if_then_else (unle (match_dup 1) (const_int 0))
>+;;                      (label_ref (match_operand 0 "" ""))
>+;;                      (pc)))]
>+;;  ""
>+;;  "
>+;;{
>+;;  if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
>+;;    {
>+;;      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, UNLE);
>+;;      emit_jump_insn (gen_bne (operands[0]));
>+;;      DONE;
>+;;    }
>+;;  operands[1] = gen_compare_reg (UNLE, sparc_compare_op0, sparc_compare_op1);
>+;;}")
>+;;
>+;;(define_expand "bltgt"
>+;;  [(set (pc)
>+;;        (if_then_else (ltgt (match_dup 1) (const_int 0))
>+;;                      (label_ref (match_operand 0 "" ""))
>+;;                      (pc)))]
>+;;  ""
>+;;  "
>+;;{
>+;;  if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
>+;;    {
>+;;      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, LTGT);
>+;;      emit_jump_insn (gen_bne (operands[0]));
>+;;      DONE;
>+;;    }
>+;;  operands[1] = gen_compare_reg (LTGT, sparc_compare_op0, sparc_compare_op1);
>+;;}")
> 
> ;; Now match both normal and inverted jump.
> 
>@@ -4518,16 +4657,70 @@
>   [(set_attr "type" "fp")
>    (set_attr "length" "1")])
> 
>-(define_insn "extendsftf2"
>+(define_expand "extendsftf2"
>   [(set (match_operand:TF 0 "register_operand" "=e")
> 	(float_extend:TF
> 	 (match_operand:SF 1 "register_operand" "f")))]
>+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
>+  "
>+{
>+  if (! TARGET_HARD_QUAD)
>+    {
>+      rtx slot0;
>+
>+      if (GET_CODE (operands[0]) != MEM)
>+        slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+      else
>+        slot0 = operands[0];
>+
>+      emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_stoq\"), 0,
>+                         VOIDmode, 2,
>+                         XEXP (slot0, 0), Pmode,
>+                         operands[1], SFmode);
>+
>+      if (GET_CODE (operands[0]) != MEM)
>+        emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
>+      DONE;
>+    }
>+}")
>+
>+(define_insn "*extendsftf2_hq"
>+  [(set (match_operand:TF 0 "register_operand" "=e")
>+        (float_extend:TF
>+         (match_operand:SF 1 "register_operand" "f")))]
>   "TARGET_FPU && TARGET_HARD_QUAD"
>   "fstoq\\t%1, %0"
>   [(set_attr "type" "fp")
>    (set_attr "length" "1")])
>+
>+(define_expand "extenddftf2"
>+  [(set (match_operand:TF 0 "register_operand" "=e")
>+        (float_extend:TF
>+         (match_operand:DF 1 "register_operand" "e")))]
>+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
>+  "
>+{
>+  if (! TARGET_HARD_QUAD)
>+    {
>+      rtx slot0;
>+
>+      if (GET_CODE (operands[0]) != MEM)
>+        slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+      else
>+        slot0 = operands[0];
>+
>+      emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_dtoq\"), 0,
>+                         VOIDmode, 2,
>+                         XEXP (slot0, 0), Pmode,
>+                         operands[1], DFmode);
>+
>+      if (GET_CODE (operands[0]) != MEM)
>+        emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
>+      DONE;
>+    }
>+}")
> 
>-(define_insn "extenddftf2"
>+(define_insn "*extenddftf2_hq"
>   [(set (match_operand:TF 0 "register_operand" "=e")
> 	(float_extend:TF
> 	 (match_operand:DF 1 "register_operand" "e")))]
>@@ -4544,8 +4737,34 @@
>   "fdtos\\t%1, %0"
>   [(set_attr "type" "fp")
>    (set_attr "length" "1")])
>+
>+(define_expand "trunctfsf2"
>+  [(set (match_operand:SF 0 "register_operand" "=f")
>+        (float_truncate:SF
>+         (match_operand:TF 1 "register_operand" "e")))]
>+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
>+  "
>+{
>+  if (! TARGET_HARD_QUAD)
>+    {
>+      rtx slot0;
>+
>+      if (GET_CODE (operands[1]) != MEM)
>+        {
>+          slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+          emit_insn (gen_rtx_SET (VOIDmode, slot0, operands[1]));
>+        }
>+      else
>+        slot0 = operands[1];
>+
>+      emit_library_call_value (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_qtos\"),
>+                               operands[0], 0, SFmode, 1,
>+                               XEXP (slot0, 0), Pmode);
>+      DONE;
>+    }
>+}")
> 
>-(define_insn "trunctfsf2"
>+(define_insn "*trunctfsf2_hq"
>   [(set (match_operand:SF 0 "register_operand" "=f")
> 	(float_truncate:SF
> 	 (match_operand:TF 1 "register_operand" "e")))]
>@@ -4554,7 +4773,33 @@
>   [(set_attr "type" "fp")
>    (set_attr "length" "1")])
> 
>-(define_insn "trunctfdf2"
>+(define_expand "trunctfdf2"
>+  [(set (match_operand:DF 0 "register_operand" "=f")
>+        (float_truncate:DF
>+         (match_operand:TF 1 "register_operand" "e")))]
>+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
>+  "
>+{
>+  if (! TARGET_HARD_QUAD)
>+    {
>+      rtx slot0;
>+
>+      if (GET_CODE (operands[1]) != MEM)
>+        {
>+          slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+          emit_insn (gen_rtx_SET (VOIDmode, slot0, operands[1]));
>+        }
>+      else
>+        slot0 = operands[1];
>+
>+      emit_library_call_value (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_qtod\"),
>+                               operands[0], 0, DFmode, 1,
>+                               XEXP (slot0, 0), Pmode);
>+      DONE;
>+    }
>+}")
>+
>+(define_insn "*trunctfdf2_hq"
>   [(set (match_operand:DF 0 "register_operand" "=e")
> 	(float_truncate:DF
> 	 (match_operand:TF 1 "register_operand" "e")))]
>@@ -4580,8 +4825,34 @@
>   "fitod\\t%1, %0"
>   [(set_attr "type" "fp")
>    (set_attr "length" "1")])
>+
>+(define_expand "floatsitf2"
>+  [(set (match_operand:TF 0 "register_operand" "=e")
>+        (float:TF (match_operand:SI 1 "register_operand" "f")))]
>+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
>+  "
>+{
>+  if (! TARGET_HARD_QUAD)
>+    {
>+      rtx slot0;
>+
>+      if (GET_CODE (operands[1]) != MEM)
>+        slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+      else
>+        slot0 = operands[1];
>+
>+      emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_itoq\"), 0,
>+                         VOIDmode, 2,
>+                         XEXP (slot0, 0), Pmode,
>+                         operands[1], SImode);
> 
>-(define_insn "floatsitf2"
>+      if (GET_CODE (operands[0]) != MEM)
>+        emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
>+      DONE;
>+    }
>+}")
>+
>+(define_insn "*floatsitf2_hq"
>   [(set (match_operand:TF 0 "register_operand" "=e")
> 	(float:TF (match_operand:SI 1 "register_operand" "f")))]
>   "TARGET_FPU && TARGET_HARD_QUAD"
>@@ -4589,6 +4860,29 @@
>   [(set_attr "type" "fp")
>    (set_attr "length" "1")])
> 
>+(define_expand "floatunssitf2"
>+  [(set (match_operand:TF 0 "register_operand" "=e")
>+        (unsigned_float:TF (match_operand:SI 1 "register_operand" "e")))]
>+  "TARGET_FPU && TARGET_ARCH64 && ! TARGET_HARD_QUAD"
>+  "
>+{
>+  rtx slot0;
>+
>+  if (GET_CODE (operands[1]) != MEM)
>+    slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+  else
>+    slot0 = operands[1];
>+
>+  emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_uitoq\"), 0,
>+                     VOIDmode, 2,
>+                     XEXP (slot0, 0), Pmode,
>+                     operands[1], SImode);
>+
>+  if (GET_CODE (operands[0]) != MEM)
>+    emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
>+  DONE;
>+}")
>+
> ;; Now the same for 64 bit sources.
> 
> (define_insn "floatdisf2"
>@@ -4606,8 +4900,34 @@
>   "fxtod\\t%1, %0"
>   [(set_attr "type" "fp")
>    (set_attr "length" "1")])
>+
>+(define_expand "floatditf2"
>+  [(set (match_operand:TF 0 "register_operand" "=e")
>+        (float:TF (match_operand:DI 1 "register_operand" "e")))]
>+  "TARGET_FPU && TARGET_V9 && (TARGET_HARD_QUAD || TARGET_ARCH64)"
>+  "
>+{
>+  if (! TARGET_HARD_QUAD)
>+    {
>+      rtx slot0;
>+
>+      if (GET_CODE (operands[1]) != MEM)
>+        slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+      else
>+        slot0 = operands[1];
>+
>+      emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_xtoq\"), 0,
>+                         VOIDmode, 2,
>+                         XEXP (slot0, 0), Pmode,
>+                         operands[1], DImode);
>+
>+      if (GET_CODE (operands[0]) != MEM)
>+        emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
>+      DONE;
>+    }
>+}")
> 
>-(define_insn "floatditf2"
>+(define_insn "*floatditf2_hq"
>   [(set (match_operand:TF 0 "register_operand" "=e")
> 	(float:TF (match_operand:DI 1 "register_operand" "e")))]
>   "TARGET_V9 && TARGET_FPU && TARGET_HARD_QUAD"
>@@ -4615,6 +4935,29 @@
>   [(set_attr "type" "fp")
>    (set_attr "length" "1")])
> 
>+(define_expand "floatunsditf2"
>+  [(set (match_operand:TF 0 "register_operand" "=e")
>+        (unsigned_float:TF (match_operand:DI 1 "register_operand" "e")))]
>+  "TARGET_FPU && TARGET_ARCH64 && ! TARGET_HARD_QUAD"
>+  "
>+{
>+  rtx slot0;
>+
>+  if (GET_CODE (operands[1]) != MEM)
>+    slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+  else
>+    slot0 = operands[1];
>+
>+  emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_uxtoq\"), 0,
>+                     VOIDmode, 2,
>+                     XEXP (slot0, 0), Pmode,
>+                     operands[1], DImode);
>+
>+  if (GET_CODE (operands[0]) != MEM)
>+    emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
>+  DONE;
>+}")
>+
> ;; Convert a float to an actual integer.
> ;; Truncation is performed as part of the conversion.
> 
>@@ -4634,14 +4977,61 @@
>   [(set_attr "type" "fp")
>    (set_attr "length" "1")])
> 
>-(define_insn "fix_trunctfsi2"
>+(define_expand "fix_trunctfsi2"
>   [(set (match_operand:SI 0 "register_operand" "=f")
>+        (fix:SI (fix:TF (match_operand:TF 1 "register_operand" "e"))))]
>+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
>+  "
>+{
>+  if (! TARGET_HARD_QUAD)
>+    {
>+      rtx slot0;
>+
>+      if (GET_CODE (operands[1]) != MEM)
>+        {
>+          slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+          emit_insn (gen_rtx_SET (VOIDmode, slot0, operands[1]));
>+        }
>+      else
>+        slot0 = operands[1];
>+
>+      emit_library_call_value (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_qtoi\"),
>+                               operands[0], 0, SImode, 1,
>+                               XEXP (slot0, 0), Pmode);
>+      DONE;
>+    }
>+}")
>+
>+(define_insn "*fix_trunctfsi2_hq"
>+  [(set (match_operand:SI 0 "register_operand" "=f")
> 	(fix:SI (fix:TF (match_operand:TF 1 "register_operand" "e"))))]
>   "TARGET_FPU && TARGET_HARD_QUAD"
>   "fqtoi\\t%1, %0"
>   [(set_attr "type" "fp")
>    (set_attr "length" "1")])
> 
>+(define_expand "fixuns_trunctfsi2"
>+  [(set (match_operand:SI 0 "register_operand" "=f")
>+        (unsigned_fix:SI (fix:TF (match_operand:TF 1 "register_operand" "e"))))]
>+  "TARGET_FPU && TARGET_ARCH64 && ! TARGET_HARD_QUAD"
>+  "
>+{
>+  rtx slot0;
>+
>+  if (GET_CODE (operands[1]) != MEM)
>+    {
>+      slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+      emit_insn (gen_rtx_SET (VOIDmode, slot0, operands[1]));
>+    }
>+  else
>+    slot0 = operands[1];
>+
>+  emit_library_call_value (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_qtoui\"),
>+                           operands[0], 0, SImode, 1,
>+                           XEXP (slot0, 0), Pmode);
>+  DONE;
>+}")
>+
> ;; Now the same, for V9 targets
> 
> (define_insn "fix_truncsfdi2"
>@@ -4660,13 +5050,61 @@
>   [(set_attr "type" "fp")
>    (set_attr "length" "1")])
> 
>-(define_insn "fix_trunctfdi2"
>+(define_expand "fix_trunctfdi2"
>   [(set (match_operand:DI 0 "register_operand" "=e")
>+        (fix:DI (fix:TF (match_operand:TF 1 "register_operand" "e"))))]
>+  "TARGET_V9 && TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
>+  "
>+{
>+  if (! TARGET_HARD_QUAD)
>+    {
>+      rtx slot0;
>+
>+      if (GET_CODE (operands[1]) != MEM)
>+        {
>+          slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+          emit_insn (gen_rtx_SET (VOIDmode, slot0, operands[1]));
>+        }
>+      else
>+        slot0 = operands[1];
>+
>+      emit_library_call_value (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_qtox\"),
>+                               operands[0], 0, DImode, 1,
>+                               XEXP (slot0, 0), Pmode);
>+      DONE;
>+    }
>+}")
>+
>+(define_insn "*fix_trunctfdi2_hq"
>+  [(set (match_operand:DI 0 "register_operand" "=e")
> 	(fix:DI (fix:TF (match_operand:TF 1 "register_operand" "e"))))]
>   "TARGET_V9 && TARGET_FPU && TARGET_HARD_QUAD"
>   "fqtox\\t%1, %0"
>   [(set_attr "type" "fp")
>    (set_attr "length" "1")])
>+
>+(define_expand "fixuns_trunctfdi2"
>+  [(set (match_operand:DI 0 "register_operand" "=f")
>+        (unsigned_fix:DI (fix:TF (match_operand:TF 1 "register_operand" "e"))))]
>+  "TARGET_FPU && TARGET_ARCH64 && ! TARGET_HARD_QUAD"
>+  "
>+{
>+  rtx slot0;
>+
>+  if (GET_CODE (operands[1]) != MEM)
>+    {
>+      slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+      emit_insn (gen_rtx_SET (VOIDmode, slot0, operands[1]));
>+    }
>+  else
>+    slot0 = operands[1];
>+
>+  emit_library_call_value (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_qtoux\"),
>+                           operands[0], 0, DImode, 1,
>+                           XEXP (slot0, 0), Pmode);
>+  DONE;
>+}")
>+
> 
> ;;- arithmetic instructions
> 
>@@ -6591,8 +7029,50 @@
>    (set_attr "length" "1")])
> 
> ;; Floating point arithmetic instructions.
>+
>+(define_expand "addtf3"
>+  [(set (match_operand:TF 0 "nonimmediate_operand" "")
>+        (plus:TF (match_operand:TF 1 "general_operand" "")
>+                 (match_operand:TF 2 "general_operand" "")))]
>+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
>+  "
>+{
>+  if (! TARGET_HARD_QUAD)
>+    {
>+      rtx slot0, slot1, slot2;
>+
>+      if (GET_CODE (operands[0]) != MEM)
>+        slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+      else
>+        slot0 = operands[0];
>+      if (GET_CODE (operands[1]) != MEM)
>+        {
>+          slot1 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+          emit_insn (gen_rtx_SET (VOIDmode, slot1, operands[1]));
>+        }
>+      else
>+        slot1 = operands[1];
>+      if (GET_CODE (operands[2]) != MEM)
>+        {
>+          slot2 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+          emit_insn (gen_rtx_SET (VOIDmode, slot2, operands[2]));
>+        }
>+      else
>+        slot2 = operands[2];
> 
>-(define_insn "addtf3"
>+      emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_add\"), 0,
>+                         VOIDmode, 3,
>+                         XEXP (slot0, 0), Pmode,
>+                         XEXP (slot1, 0), Pmode,
>+                         XEXP (slot2, 0), Pmode);
>+
>+      if (GET_CODE (operands[0]) != MEM)
>+        emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
>+      DONE;
>+    }
>+}")
>+
>+(define_insn "*addtf3_hq"
>   [(set (match_operand:TF 0 "register_operand" "=e")
> 	(plus:TF (match_operand:TF 1 "register_operand" "e")
> 		 (match_operand:TF 2 "register_operand" "e")))]
>@@ -6618,8 +7098,50 @@
>   "fadds\\t%1, %2, %0"
>   [(set_attr "type" "fp")
>    (set_attr "length" "1")])
>+
>+(define_expand "subtf3"
>+  [(set (match_operand:TF 0 "nonimmediate_operand" "")
>+        (minus:TF (match_operand:TF 1 "general_operand" "")
>+                  (match_operand:TF 2 "general_operand" "")))]
>+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
>+  "
>+{
>+  if (! TARGET_HARD_QUAD)
>+    {
>+      rtx slot0, slot1, slot2;
> 
>-(define_insn "subtf3"
>+      if (GET_CODE (operands[0]) != MEM)
>+        slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+      else
>+        slot0 = operands[0];
>+      if (GET_CODE (operands[1]) != MEM)
>+        {
>+          slot1 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+          emit_insn (gen_rtx_SET (VOIDmode, slot1, operands[1]));
>+        }
>+      else
>+        slot1 = operands[1];
>+      if (GET_CODE (operands[2]) != MEM)
>+        {
>+          slot2 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+          emit_insn (gen_rtx_SET (VOIDmode, slot2, operands[2]));
>+        }
>+      else
>+        slot2 = operands[2];
>+
>+      emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_sub\"), 0,
>+                         VOIDmode, 3,
>+                         XEXP (slot0, 0), Pmode,
>+                         XEXP (slot1, 0), Pmode,
>+                         XEXP (slot2, 0), Pmode);
>+
>+      if (GET_CODE (operands[0]) != MEM)
>+        emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
>+      DONE;
>+    }
>+}")
>+
>+(define_insn "*subtf3_hq"
>   [(set (match_operand:TF 0 "register_operand" "=e")
> 	(minus:TF (match_operand:TF 1 "register_operand" "e")
> 		  (match_operand:TF 2 "register_operand" "e")))]
>@@ -6646,7 +7168,49 @@
>   [(set_attr "type" "fp")
>    (set_attr "length" "1")])
> 
>-(define_insn "multf3"
>+(define_expand "multf3"
>+  [(set (match_operand:TF 0 "nonimmediate_operand" "")
>+        (mult:TF (match_operand:TF 1 "general_operand" "")
>+                 (match_operand:TF 2 "general_operand" "")))]
>+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
>+  "
>+{
>+  if (! TARGET_HARD_QUAD)
>+    {
>+      rtx slot0, slot1, slot2;
>+
>+      if (GET_CODE (operands[0]) != MEM)
>+        slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+      else
>+        slot0 = operands[0];
>+      if (GET_CODE (operands[1]) != MEM)
>+        {
>+          slot1 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+          emit_insn (gen_rtx_SET (VOIDmode, slot1, operands[1]));
>+        }
>+      else
>+        slot1 = operands[1];
>+      if (GET_CODE (operands[2]) != MEM)
>+        {
>+          slot2 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+          emit_insn (gen_rtx_SET (VOIDmode, slot2, operands[2]));
>+        }
>+      else
>+        slot2 = operands[2];
>+
>+      emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_mul\"), 0,
>+                         VOIDmode, 3,
>+                         XEXP (slot0, 0), Pmode,
>+                         XEXP (slot1, 0), Pmode,
>+                         XEXP (slot2, 0), Pmode);
>+
>+      if (GET_CODE (operands[0]) != MEM)
>+        emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
>+      DONE;
>+    }
>+}")
>+
>+(define_insn "*multf3_hq"
>   [(set (match_operand:TF 0 "register_operand" "=e")
> 	(mult:TF (match_operand:TF 1 "register_operand" "e")
> 		 (match_operand:TF 2 "register_operand" "e")))]
>@@ -6691,8 +7255,50 @@
>   [(set_attr "type" "fpmul")
>    (set_attr "length" "1")])
> 
>+(define_expand "divtf3"
>+  [(set (match_operand:TF 0 "nonimmediate_operand" "")
>+        (div:TF (match_operand:TF 1 "general_operand" "")
>+                (match_operand:TF 2 "general_operand" "")))]
>+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
>+  "
>+{
>+  if (! TARGET_HARD_QUAD)
>+    {
>+      rtx slot0, slot1, slot2;
>+
>+      if (GET_CODE (operands[0]) != MEM)
>+        slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+      else
>+        slot0 = operands[0];
>+      if (GET_CODE (operands[1]) != MEM)
>+        {
>+          slot1 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+          emit_insn (gen_rtx_SET (VOIDmode, slot1, operands[1]));
>+        }
>+      else
>+        slot1 = operands[1];
>+      if (GET_CODE (operands[2]) != MEM)
>+        {
>+          slot2 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+          emit_insn (gen_rtx_SET (VOIDmode, slot2, operands[2]));
>+        }
>+      else
>+        slot2 = operands[2];
>+
>+      emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_div\"), 0,
>+                         VOIDmode, 3,
>+                         XEXP (slot0, 0), Pmode,
>+                         XEXP (slot1, 0), Pmode,
>+                         XEXP (slot2, 0), Pmode);
>+
>+      if (GET_CODE (operands[0]) != MEM)
>+        emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
>+      DONE;
>+    }
>+}")
>+
> ;; don't have timing for quad-prec. divide.
>-(define_insn "divtf3"
>+(define_insn "*divtf3_hq"
>   [(set (match_operand:TF 0 "register_operand" "=e")
> 	(div:TF (match_operand:TF 1 "register_operand" "e")
> 		(match_operand:TF 2 "register_operand" "e")))]
>@@ -6962,8 +7568,41 @@
>   "fabss\\t%1, %0"
>   [(set_attr "type" "fpmove")
>    (set_attr "length" "1")])
>+
>+(define_expand "sqrttf2"
>+  [(set (match_operand:TF 0 "register_operand" "=e")
>+        (sqrt:TF (match_operand:TF 1 "register_operand" "e")))]
>+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
>+  "
>+{
>+  if (! TARGET_HARD_QUAD)
>+    {
>+      rtx slot0, slot1;
>+
>+      if (GET_CODE (operands[0]) != MEM)
>+        slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+      else
>+        slot0 = operands[0];
>+      if (GET_CODE (operands[1]) != MEM)
>+        {
>+          slot1 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
>+          emit_insn (gen_rtx_SET (VOIDmode, slot1, operands[1]));
>+        }
>+      else
>+        slot1 = operands[1];
>+
>+      emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_sqrt\"), 0,
>+                         VOIDmode, 2,
>+                         XEXP (slot0, 0), Pmode,
>+                         XEXP (slot1, 0), Pmode);
>+
>+      if (GET_CODE (operands[0]) != MEM)
>+        emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
>+      DONE;
>+    }
>+}")
> 
>-(define_insn "sqrttf2"
>+(define_insn "*sqrttf2_hq"
>   [(set (match_operand:TF 0 "register_operand" "=e")
> 	(sqrt:TF (match_operand:TF 1 "register_operand" "e")))]
>   "TARGET_FPU && TARGET_HARD_QUAD"
>
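
For anyone reading the expanders in the patch without the ABI document at hand, here is a rough C-level picture of what the soft-quad path arranges. This is only a sketch: the sketch_Qp_* functions are hypothetical stand-ins I made up so the example builds anywhere, and their pointer-based prototypes reflect my understanding of the sparc64 _Qp_* convention (result and operands passed by address) rather than the libc sources. The slot variables play the role of the stack temporaries the expanders get from assign_stack_temp.

```c
#include <assert.h>

/*
 * Hypothetical stand-ins for the libc soft-quad routines.  The real
 * entry points are named _Qp_add, _Qp_qtoi and so on; these sketch_*
 * versions just use the host's long double so the example is portable.
 */
void sketch_Qp_add(long double *res, const long double *a,
                   const long double *b)
{
    *res = *a + *b;
}

int sketch_Qp_qtoi(const long double *a)
{
    return (int)*a;          /* truncating conversion, like fix:SI */
}

/*
 * Rough equivalent of the addtf3 expander when hard quad is off:
 * spill both operands to memory, pass three addresses, then copy the
 * result back out of its slot.
 */
long double add_via_slots(long double x, long double y)
{
    long double slot0, slot1, slot2;

    slot1 = x;               /* operand 1 moved into a stack temp */
    slot2 = y;               /* operand 2 moved into a stack temp */
    sketch_Qp_add(&slot0, &slot1, &slot2);
    return slot0;            /* result copied from slot0 to operand 0 */
}

/*
 * Rough equivalent of the fix_trunctfsi2 expander: only the source
 * needs a slot, because the integer result comes back by value.
 */
int fix_via_slot(long double x)
{
    long double slot0 = x;

    return sketch_Qp_qtoi(&slot0);
}

/* Small self-check using values that are exact in binary. */
int soft_quad_sketch_selfcheck(void)
{
    assert(add_via_slots(1.5L, 2.25L) == 3.75L);
    assert(fix_via_slot(3.75L) == 3);
    return 0;
}
```

The point of the shape is that nothing quad-sized travels in registers across the call: TF values go by address, while integer results (the qtoi/qtox style conversions) are returned directly, which is why those expanders use emit_library_call_value instead of emit_library_call.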