Subject: Quad math on sparc64
To: None <tech-toolchain@netbsd.org, port-sparc64@netbsd.org>
From: James Chacon <jchacon@genuity.net>
List: port-sparc64
Date: 03/22/2002 05:36:27
The history:

Quad math isn't implemented natively on sparc processors (at least any I'm
aware of), so the compiler handles anything using long doubles in one of
two ways:

1. Generate instructions for quad math, which trap to emulation in the kernel
2. Generate soft-quad calls via the sparc 64bit ABI (the infamous _Qp* calls)

Right now the in-tree gcc takes the first route unless one specifies
-msoft-quad-float in the compiler options. Even then it doesn't do the right
thing, because it tries to generate calls that look like the sparc 32bit quad
math calls (which take floating point args). The sparc64 calls take pointers
to floating point args, and the compiler today generates bad code for these.
The ABI routines for option #2 are in libc now (I did these about a month back)
and they work based on my testing.
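
The difference between the two calling conventions can be pictured in plain
C. This is only a sketch: sketch_q_add and sketch_qp_add are hypothetical
stand-ins, not the real libc entry points, and the arithmetic is simply done
with the host's long double.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a 32bit-style routine: operands and result
   are passed by value. */
static long double
sketch_q_add(long double a, long double b)
{
	return a + b;
}

/* Hypothetical stand-in for a 64bit-style _Qp_* routine: the caller
   passes pointers to the operands and to the result slot. */
static void
sketch_qp_add(long double *result, const long double *a, const long double *b)
{
	assert(result != NULL && a != NULL && b != NULL);
	*result = *a + *b;
}

/* What the compiler has to arrange for "x + y" in the pointer case:
   put both operands in memory, pass their addresses, then reload the
   result.  Returns 1 if both conventions give the same answer. */
static int
sketch_conventions_agree(long double x, long double y)
{
	long double slot_x = x, slot_y = y, slot_result;

	sketch_qp_add(&slot_result, &slot_x, &slot_y);
	return slot_result == sketch_q_add(x, y);
}
```

Emitting a by-value style call against a routine that expects pointers is the
kind of mismatch described above.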

The patches I've attached will fix the code generation for the _Qp* calls. 
These are the changes pulled back from gcc 3.x (it hasn't changed from 3.0.1 
to -current really for these specific mods). Once these are in place and soft 
quad math is used it'll generate the correct calls and the softfloat routines 
generate the correct answers. 
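
For comparisons, the sparc_emit_float_lib_cmp routine in the patch below
passes the addresses of both operands to the helper and then tests the
returned value against zero. A rough C picture of that lowering, assuming a
hypothetical helper sketch_qp_flt that stands in for a _Qp_flt-style routine:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a _Qp_flt-style helper: takes pointers to
   both operands and returns nonzero when *a < *b. */
static int
sketch_qp_flt(const long double *a, const long double *b)
{
	assert(a != NULL && b != NULL);
	return *a < *b;
}

/* Roughly what "x < y" becomes with soft quad on a 64bit target:
   place the operands in memory slots, call the helper with their
   addresses, and compare the returned value against 0. */
static int
sketch_less_than(long double x, long double y)
{
	long double slot0 = x, slot1 = y;
	int result = sketch_qp_flt(&slot0, &slot1);

	return result != 0;
}
```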

Now the problem... The code in gcc is basically f*cked as far as generating
hard quad code goes once these changes go in. The insn expansion gets
nasty looking for the hard quad case and nothing today can reduce/handle it.
In reality, while it should be fixed, it just isn't that important overall.

I've verified this is still the case even on gcc-current as of today so it's
not the patches, it's just gcc having its own issues.

gcc-current sets soft-quad math as the default for solaris, linux, etc., so
that's what I see as the solution here too.

There's no point to hard quad math out of the compiler really (it's a lot
slower to trap through the kernel for math emulation when a userland library
and API can do it just fine). With these patches soft-quad is the default so
other issues go away as well. We can take out the 3-4 sparc64 specific places
in the tree like awk and libgcc that have special options. Plus pkgsrc should 
get easier to deal with on some packages without them needing sparc64 specific
patches. It's just the compiler would never output "faddq" on it's own anymore.
The assembler would still take it though and the kernel would still 
trap/emulate it if presented so this doesn't present any issues WRT 
compatability/ABI changes.

I'm compiling a snapshot now with this compiler to test things but in all
my unit/regression tests I didn't find any issues. 
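
For anyone who wants to poke at this themselves, here is a minimal sketch of
the kind of check one could run. It is not my actual test suite; every
sketch_* name is made up for illustration, and all the values are chosen to
be exactly representable so the comparisons are exact.

```c
#include <assert.h>

/* Exercise add, subtract, multiply and divide on long double.  The
   operands are volatile so the compiler cannot fold the arithmetic at
   compile time. */
static int
sketch_check_arith(void)
{
	volatile long double a = 1.5L, b = 0.25L;

	return (a + b == 1.75L) && (a - b == 1.25L) &&
	    (a * b == 0.375L) && (a / b == 6.0L);
}

/* Exercise the six ordered comparisons. */
static int
sketch_check_compare(void)
{
	volatile long double a = 1.5L, b = 0.25L;

	return (a > b) && (a >= b) && !(a < b) && !(a <= b) &&
	    (a != b) && !(a == b);
}

/* Exercise conversions between long double and int, long, float and
   double. */
static int
sketch_check_convert(void)
{
	volatile long double q = 42.0L;
	volatile int i = 7;
	volatile long l = -9L;
	volatile double d = 2.5;
	volatile float f = 0.5f;

	return ((int)q == 42) && ((long)q == 42L) &&
	    ((double)q == 42.0) && ((float)q == 42.0f) &&
	    ((long double)i == 7.0L) && ((long double)l == -9.0L) &&
	    ((long double)d == 2.5L) && ((long double)f == 0.5L);
}

/* Abort on the first group that fails. */
static void
sketch_run_all(void)
{
	assert(sketch_check_arith());
	assert(sketch_check_compare());
	assert(sketch_check_convert());
}
```

Call sketch_run_all() from a main of your own; building it once with the
default options and once with -msoft-quad-float should give the same result
either way.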

I'd like comments back because I'd like to commit this soon unless there's
a strong objection otherwise.

James

Index: netbsd64.h
===================================================================
RCS file: /cvsroot/gnusrc/gnu/dist/toolchain/gcc/config/sparc/netbsd64.h,v
retrieving revision 1.7
diff -u -r1.7 netbsd64.h
--- netbsd64.h	2002/03/19 18:12:27	1.7
+++ netbsd64.h	2002/03/22 10:19:56
@@ -15,6 +15,11 @@
 
 #include <sparc/sp64-elf.h>
 
+#undef TARGET_DEFAULT
+#define TARGET_DEFAULT \
+(MASK_V9 + MASK_PTR64 + MASK_64BIT + /* MASK_HARD_QUAD */ \
+ + MASK_APP_REGS + MASK_EPILOGUE + MASK_FPU + MASK_STACK_BIAS)
+
 #undef SPARC_DEFAULT_CMODEL
 #define SPARC_DEFAULT_CMODEL CM_MEDANY
 
Index: sparc.c
===================================================================
RCS file: /cvsroot/gnusrc/gnu/dist/toolchain/gcc/config/sparc/sparc.c,v
retrieving revision 1.4
diff -u -r1.4 sparc.c
--- sparc.c	2001/04/23 12:23:28	1.4
+++ sparc.c	2002/03/22 10:19:58
@@ -4578,6 +4578,152 @@
   return string;
 }
 
+/* Emit a library call comparison between floating point X and Y.
+   COMPARISON is the rtl operator to compare with (EQ, NE, GT, etc.).
+   TARGET_ARCH64 uses _Qp_* functions, which use pointers to TFmode
+   values as arguments instead of the TFmode registers themselves,
+   that's why we cannot call emit_float_lib_cmp.  */
+void
+sparc_emit_float_lib_cmp (x, y, comparison)
+     rtx x, y;
+     enum rtx_code comparison;
+{
+  char *qpfunc;
+  rtx slot0, slot1, result, tem, tem2;
+  enum machine_mode mode;
+
+  switch (comparison)
+    {
+    case EQ:
+      qpfunc = (TARGET_ARCH64) ? "_Qp_feq" : "_Q_feq";
+      break;
+
+    case NE:
+      qpfunc = (TARGET_ARCH64) ? "_Qp_fne" : "_Q_fne";
+      break;
+
+    case GT:
+      qpfunc = (TARGET_ARCH64) ? "_Qp_fgt" : "_Q_fgt";
+      break;
+
+    case GE:
+      qpfunc = (TARGET_ARCH64) ? "_Qp_fge" : "_Q_fge";
+      break;
+
+    case LT:
+      qpfunc = (TARGET_ARCH64) ? "_Qp_flt" : "_Q_flt";
+      break;
+
+    case LE:
+      qpfunc = (TARGET_ARCH64) ? "_Qp_fle" : "_Q_fle";
+      break;
+
+      /*    case UNORDERED:
+    case UNGT:
+    case UNLT:
+    case UNEQ:
+    case UNGE:
+    case UNLE:
+    case LTGT:
+      qpfunc = (TARGET_ARCH64) ? "_Qp_cmp" : "_Q_cmp";
+      break;
+      */
+    default:
+      abort();
+      break;
+    }
+
+  if (TARGET_ARCH64)
+    {
+      if (GET_CODE (x) != MEM)
+        {
+          slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+          emit_insn (gen_rtx_SET (VOIDmode, slot0, x));
+        }
+      else
+        slot0 = x;
+
+      if (GET_CODE (y) != MEM)
+        {
+          slot1 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+          emit_insn (gen_rtx_SET (VOIDmode, slot1, y));
+        }
+      else
+        slot1 = y;
+
+      emit_library_call (gen_rtx_SYMBOL_REF (Pmode, qpfunc), 1,
+                         DImode, 2,
+                         XEXP (slot0, 0), Pmode,
+                         XEXP (slot1, 0), Pmode);
+
+      mode = DImode;
+    }
+  else
+    {
+      emit_library_call (gen_rtx_SYMBOL_REF (Pmode, qpfunc), 1,
+                         SImode, 2,
+                         x, TFmode, y, TFmode);
+
+      mode = SImode;
+    }
+
+
+  /* Immediately move the result of the libcall into a pseudo
+     register so reload doesn't clobber the value if it needs
+     the return register for a spill reg.  */
+  result = gen_reg_rtx (mode);
+  emit_move_insn (result, hard_libcall_value (mode));
+
+  switch (comparison)
+    {
+    default:
+      emit_cmp_insn (result, const0_rtx, NE,
+                     NULL_RTX, mode, 0, 0);
+      break;
+      /*    case ORDERED:
+    case UNORDERED:
+      emit_cmp_insn (result, GEN_INT(3),
+                     (comparison == UNORDERED) ? EQ : NE,
+                     NULL_RTX, mode, 0, 0);
+      break;
+    case UNGT:
+    case UNGE:
+      emit_cmp_insn (result, const1_rtx,
+                     (comparison == UNGT) ? GT : NE,
+                     NULL_RTX, mode, 0, 0);
+      break;
+    case UNLE:
+      emit_cmp_insn (result, const2_rtx, NE,
+                     NULL_RTX, mode, 0, 0);
+      break;
+    case UNLT:
+      tem = gen_reg_rtx (mode);
+      if (TARGET_ARCH32)
+        emit_insn (gen_andsi3 (tem, result, const1_rtx));
+      else
+        emit_insn (gen_anddi3 (tem, result, const1_rtx));
+      emit_cmp_insn (tem, const0_rtx, NE,
+                     NULL_RTX, mode, 0, 0);
+      break;
+    case UNEQ:
+    case LTGT:
+      tem = gen_reg_rtx (mode);
+      if (TARGET_ARCH32)
+        emit_insn (gen_addsi3 (tem, result, const1_rtx));
+      else
+        emit_insn (gen_adddi3 (tem, result, const1_rtx));
+      tem2 = gen_reg_rtx (mode);
+      if (TARGET_ARCH32)
+        emit_insn (gen_andsi3 (tem2, tem, const2_rtx));
+      else
+        emit_insn (gen_anddi3 (tem2, tem, const2_rtx));
+      emit_cmp_insn (tem2, const0_rtx,
+                     (comparison == UNEQ) ? EQ : NE,
+                     NULL_RTX, mode, 0, 0);
+		     break;*/
+    }
+}
+
 /* Return the string to output a conditional branch to LABEL, testing
    register REG.  LABEL is the operand number of the label; REG is the
    operand number of the reg.  OP is the conditional expression.  The mode
Index: sparc.h
===================================================================
RCS file: /cvsroot/gnusrc/gnu/dist/toolchain/gcc/config/sparc/sparc.h,v
retrieving revision 1.2
diff -u -r1.2 sparc.h
--- sparc.h	2001/03/06 05:21:48	1.2
+++ sparc.h	2002/03/22 10:20:00
@@ -2625,26 +2625,25 @@
 #define MULSI3_LIBCALL "*.umul"
 
 /* Define library calls for quad FP operations.  These are all part of the
-   SPARC ABI.
-   ??? ARCH64 still does not work as the _Qp_* routines take pointers.  */
-#define ADDTF3_LIBCALL (TARGET_ARCH64 ? "_Qp_add" : "_Q_add")
-#define SUBTF3_LIBCALL (TARGET_ARCH64 ? "_Qp_sub" : "_Q_sub")
-#define NEGTF2_LIBCALL (TARGET_ARCH64 ? "_Qp_neg" : "_Q_neg")
-#define MULTF3_LIBCALL (TARGET_ARCH64 ? "_Qp_mul" : "_Q_mul")
-#define DIVTF3_LIBCALL (TARGET_ARCH64 ? "_Qp_div" : "_Q_div")
-#define FLOATSITF2_LIBCALL (TARGET_ARCH64 ? "_Qp_itoq" : "_Q_itoq")
-#define FIX_TRUNCTFSI2_LIBCALL (TARGET_ARCH64 ? "_Qp_qtoi" : "_Q_qtoi")
-#define FIXUNS_TRUNCTFSI2_LIBCALL (TARGET_ARCH64 ? "_Qp_qtoui" : "_Q_qtou")
-#define EXTENDSFTF2_LIBCALL (TARGET_ARCH64 ? "_Qp_stoq" : "_Q_stoq")
-#define TRUNCTFSF2_LIBCALL (TARGET_ARCH64 ? "_Qp_qtos" :  "_Q_qtos")
-#define EXTENDDFTF2_LIBCALL (TARGET_ARCH64 ? "_Qp_dtoq" : "_Q_dtoq")
-#define TRUNCTFDF2_LIBCALL (TARGET_ARCH64 ? "_Qp_qtod" : "_Q_qtod")
-#define EQTF2_LIBCALL (TARGET_ARCH64 ? "_Qp_feq" : "_Q_feq")
-#define NETF2_LIBCALL (TARGET_ARCH64 ? "_Qp_fne" : "_Q_fne")
-#define GTTF2_LIBCALL (TARGET_ARCH64 ? "_Qp_fgt" : "_Q_fgt")
-#define GETF2_LIBCALL (TARGET_ARCH64 ? "_Qp_fge" : "_Q_fge")
-#define LTTF2_LIBCALL (TARGET_ARCH64 ? "_Qp_flt" : "_Q_flt")
-#define LETF2_LIBCALL (TARGET_ARCH64 ? "_Qp_fle" : "_Q_fle")
+   SPARC 32bit ABI. */
+#define ADDTF3_LIBCALL "_Q_add"
+#define SUBTF3_LIBCALL "_Q_sub"
+#define NEGTF2_LIBCALL "_Q_neg"
+#define MULTF3_LIBCALL "_Q_mul"
+#define DIVTF3_LIBCALL "_Q_div"
+#define FLOATSITF2_LIBCALL "_Q_itoq"
+#define FIX_TRUNCTFSI2_LIBCALL "_Q_qtoi"
+#define FIXUNS_TRUNCTFSI2_LIBCALL "_Q_qtou"
+#define EXTENDSFTF2_LIBCALL "_Q_stoq"
+#define TRUNCTFSF2_LIBCALL "_Q_qtos"
+#define EXTENDDFTF2_LIBCALL "_Q_dtoq"
+#define TRUNCTFDF2_LIBCALL "_Q_qtod"
+#define EQTF2_LIBCALL "_Q_feq"
+#define NETF2_LIBCALL "_Q_fne"
+#define GTTF2_LIBCALL "_Q_fgt"
+#define GETF2_LIBCALL "_Q_fge"
+#define LTTF2_LIBCALL "_Q_flt"
+#define LETF2_LIBCALL "_Q_fle"
 
 /* We can define the TFmode sqrt optab only if TARGET_FPU.  This is because
    with soft-float, the SFmode and DFmode sqrt instructions will be absent,
@@ -2652,34 +2651,37 @@
    for calls to the builtin function sqrt, but this fails.  */
 #define INIT_TARGET_OPTABS						\
   do {									\
-    add_optab->handlers[(int) TFmode].libfunc				\
-      = gen_rtx_SYMBOL_REF (Pmode, ADDTF3_LIBCALL);			\
-    sub_optab->handlers[(int) TFmode].libfunc				\
-      = gen_rtx_SYMBOL_REF (Pmode, SUBTF3_LIBCALL);			\
-    neg_optab->handlers[(int) TFmode].libfunc				\
-      = gen_rtx_SYMBOL_REF (Pmode, NEGTF2_LIBCALL);			\
-    smul_optab->handlers[(int) TFmode].libfunc				\
-      = gen_rtx_SYMBOL_REF (Pmode, MULTF3_LIBCALL);			\
-    flodiv_optab->handlers[(int) TFmode].libfunc			\
-      = gen_rtx_SYMBOL_REF (Pmode, DIVTF3_LIBCALL);			\
-    eqtf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, EQTF2_LIBCALL);		\
-    netf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, NETF2_LIBCALL);		\
-    gttf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, GTTF2_LIBCALL);		\
-    getf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, GETF2_LIBCALL);		\
-    lttf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, LTTF2_LIBCALL);		\
-    letf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, LETF2_LIBCALL);		\
-    trunctfsf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, TRUNCTFSF2_LIBCALL);   \
-    trunctfdf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, TRUNCTFDF2_LIBCALL);   \
-    extendsftf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, EXTENDSFTF2_LIBCALL); \
-    extenddftf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, EXTENDDFTF2_LIBCALL); \
-    floatsitf_libfunc = gen_rtx_SYMBOL_REF (Pmode, FLOATSITF2_LIBCALL);    \
-    fixtfsi_libfunc = gen_rtx_SYMBOL_REF (Pmode, FIX_TRUNCTFSI2_LIBCALL);  \
-    fixunstfsi_libfunc							\
-      = gen_rtx_SYMBOL_REF (Pmode, FIXUNS_TRUNCTFSI2_LIBCALL);		\
-    if (TARGET_FPU)							\
-      sqrt_optab->handlers[(int) TFmode].libfunc			\
-	= gen_rtx_SYMBOL_REF (Pmode, "_Q_sqrt");			\
-    INIT_SUBTARGET_OPTABS;						\
+    if (TARGET_ARCH32)                                                  \
+      {                                                                 \
+        add_optab->handlers[(int) TFmode].libfunc                       \
+          = gen_rtx_SYMBOL_REF (Pmode, ADDTF3_LIBCALL);                 \
+        sub_optab->handlers[(int) TFmode].libfunc                       \
+          = gen_rtx_SYMBOL_REF (Pmode, SUBTF3_LIBCALL);                 \
+        neg_optab->handlers[(int) TFmode].libfunc                       \
+          = gen_rtx_SYMBOL_REF (Pmode, NEGTF2_LIBCALL);                 \
+        smul_optab->handlers[(int) TFmode].libfunc                      \
+          = gen_rtx_SYMBOL_REF (Pmode, MULTF3_LIBCALL);                 \
+        flodiv_optab->handlers[(int) TFmode].libfunc                    \
+          = gen_rtx_SYMBOL_REF (Pmode, DIVTF3_LIBCALL);                 \
+        eqtf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, EQTF2_LIBCALL);      \
+        netf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, NETF2_LIBCALL);      \
+        gttf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, GTTF2_LIBCALL);      \
+        getf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, GETF2_LIBCALL);      \
+        lttf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, LTTF2_LIBCALL);      \
+        letf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, LETF2_LIBCALL);      \
+        trunctfsf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, TRUNCTFSF2_LIBCALL);  \
+        trunctfdf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, TRUNCTFDF2_LIBCALL);  \
+        extendsftf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, EXTENDSFTF2_LIBCALL);\
+        extenddftf2_libfunc = gen_rtx_SYMBOL_REF (Pmode, EXTENDDFTF2_LIBCALL);\
+        floatsitf_libfunc = gen_rtx_SYMBOL_REF (Pmode, FLOATSITF2_LIBCALL);   \
+        fixtfsi_libfunc = gen_rtx_SYMBOL_REF (Pmode, FIX_TRUNCTFSI2_LIBCALL); \
+        fixunstfsi_libfunc                                              \
+          = gen_rtx_SYMBOL_REF (Pmode, FIXUNS_TRUNCTFSI2_LIBCALL);      \
+        if (TARGET_FPU)                                                 \
+          sqrt_optab->handlers[(int) TFmode].libfunc                    \
+            = gen_rtx_SYMBOL_REF (Pmode, "_Q_sqrt");                    \
+      }                                                                 \
+    INIT_SUBTARGET_OPTABS;                                              \
   } while (0)
 
 /* This is meant to be redefined in the host dependent files */
Index: sparc.md
===================================================================
RCS file: /cvsroot/gnusrc/gnu/dist/toolchain/gcc/config/sparc/sparc.md,v
retrieving revision 1.4
diff -u -r1.4 sparc.md
--- sparc.md	2001/04/23 12:23:28	1.4
+++ sparc.md	2002/03/22 10:20:01
@@ -837,7 +837,7 @@
     }
   else if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
     {
-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, EQ);
+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, EQ);
       emit_insn (gen_sne (operands[0]));
       DONE;
     }      
@@ -890,7 +890,7 @@
     }
   else if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
     {
-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, NE);
+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, NE);
       emit_insn (gen_sne (operands[0]));
       DONE;
     }      
@@ -911,7 +911,7 @@
 {
   if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
     {
-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, GT);
+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, GT);
       emit_insn (gen_sne (operands[0]));
       DONE;
     }
@@ -932,7 +932,7 @@
 {
   if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
     {
-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, LT);
+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, LT);
       emit_insn (gen_sne (operands[0]));
       DONE;
     }
@@ -953,7 +953,7 @@
 {
   if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
     {
-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, GE);
+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, GE);
       emit_insn (gen_sne (operands[0]));
       DONE;
     }
@@ -974,7 +974,7 @@
 {
   if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
     {
-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, LE);
+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, LE);
       emit_insn (gen_sne (operands[0]));
       DONE;
     }
@@ -1608,7 +1608,7 @@
     }
   else if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
     {
-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, EQ);
+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, EQ);
       emit_jump_insn (gen_bne (operands[0]));
       DONE;
     }      
@@ -1632,7 +1632,7 @@
     }
   else if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
     {
-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, NE);
+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, NE);
       emit_jump_insn (gen_bne (operands[0]));
       DONE;
     }      
@@ -1656,7 +1656,7 @@
     }
   else if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
     {
-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, GT);
+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, GT);
       emit_jump_insn (gen_bne (operands[0]));
       DONE;
     }      
@@ -1690,7 +1690,7 @@
     }
   else if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
     {
-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, LT);
+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, LT);
       emit_jump_insn (gen_bne (operands[0]));
       DONE;
     }      
@@ -1724,7 +1724,7 @@
     }
   else if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
     {
-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, GE);
+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, GE);
       emit_jump_insn (gen_bne (operands[0]));
       DONE;
     }      
@@ -1758,7 +1758,7 @@
     }
   else if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
     {
-      emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, LE);
+      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, LE);
       emit_jump_insn (gen_bne (operands[0]));
       DONE;
     }      
@@ -1774,6 +1774,145 @@
   "
 { operands[1] = gen_compare_reg (LEU, sparc_compare_op0, sparc_compare_op1);
 }")
+
+;;(define_expand "bunordered"
+;;  [(set (pc)
+;;        (if_then_else (unordered (match_dup 1) (const_int 0))
+;;                      (label_ref (match_operand 0 "" ""))
+;;                      (pc)))]
+;;  ""
+;;  "
+;;{
+;;  if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
+;;    {
+;;      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1,
+;;                                UNORDERED);
+;;      emit_jump_insn (gen_beq (operands[0]));
+;;      DONE;
+;;    }
+;;  operands[1] = gen_compare_reg (UNORDERED, sparc_compare_op0,
+;;                                 sparc_compare_op1);
+;;}")
+
+;;(define_expand "bordered"
+;;  [(set (pc)
+;;        (if_then_else (ordered (match_dup 1) (const_int 0))
+;;                      (label_ref (match_operand 0 "" ""))
+;;                      (pc)))]
+;;  ""
+;;  "
+;;{
+;;  if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
+;;    {
+;;      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, ORDERED);
+;;      emit_jump_insn (gen_bne (operands[0]));
+;;      DONE;
+;;    }
+;;  operands[1] = gen_compare_reg (ORDERED, sparc_compare_op0,
+;;                                 sparc_compare_op1);
+;;}")
+;;
+;;(define_expand "bungt"
+;;  [(set (pc)
+;;        (if_then_else (ungt (match_dup 1) (const_int 0))
+;;                      (label_ref (match_operand 0 "" ""))
+;;                      (pc)))]
+;;  ""
+;;  "
+;;{
+;;  if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
+;;    {
+;;      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, UNGT);
+;;      emit_jump_insn (gen_bgt (operands[0]));
+;;      DONE;
+;;    }
+;;  operands[1] = gen_compare_reg (UNGT, sparc_compare_op0, sparc_compare_op1);
+;;}")
+;;
+;;(define_expand "bunlt"
+;;  [(set (pc)
+;;        (if_then_else (unlt (match_dup 1) (const_int 0))
+;;                      (label_ref (match_operand 0 "" ""))
+;;                      (pc)))]
+;;  ""
+;;  "
+;;{
+;;  if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
+;;    {
+;;      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, UNLT);
+;;      emit_jump_insn (gen_bne (operands[0]));
+;;      DONE;
+;;    }
+;;  operands[1] = gen_compare_reg (UNLT, sparc_compare_op0, sparc_compare_op1);
+;;}")
+;;
+;;(define_expand "buneq"
+;;  [(set (pc)
+;;        (if_then_else (uneq (match_dup 1) (const_int 0))
+;;                      (label_ref (match_operand 0 "" ""))
+;;                      (pc)))]
+;;  ""
+;;  "
+;;{
+;;  if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
+;;    {
+;;      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, UNEQ);
+;;      emit_jump_insn (gen_beq (operands[0]));
+;;      DONE;
+;;    }
+;;  operands[1] = gen_compare_reg (UNEQ, sparc_compare_op0, sparc_compare_op1);
+;;}")
+;;
+;;(define_expand "bunge"
+;;  [(set (pc)
+;;        (if_then_else (unge (match_dup 1) (const_int 0))
+;;                      (label_ref (match_operand 0 "" ""))
+;;                      (pc)))]
+;;  ""
+;;  "
+;;{
+;;  if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
+;;    {
+;;      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, UNGE);
+;;      emit_jump_insn (gen_bne (operands[0]));
+;;      DONE;
+;;    }
+;;  operands[1] = gen_compare_reg (UNGE, sparc_compare_op0, sparc_compare_op1);
+;;}")
+;;
+;;(define_expand "bunle"
+;;  [(set (pc)
+;;        (if_then_else (unle (match_dup 1) (const_int 0))
+;;                      (label_ref (match_operand 0 "" ""))
+;;                      (pc)))]
+;;  ""
+;;  "
+;;{
+;;  if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
+;;    {
+;;      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, UNLE);
+;;      emit_jump_insn (gen_bne (operands[0]));
+;;      DONE;
+;;    }
+;;  operands[1] = gen_compare_reg (UNLE, sparc_compare_op0, sparc_compare_op1);
+;;}")
+;;
+;;(define_expand "bltgt"
+;;  [(set (pc)
+;;        (if_then_else (ltgt (match_dup 1) (const_int 0))
+;;                      (label_ref (match_operand 0 "" ""))
+;;                      (pc)))]
+;;  ""
+;;  "
+;;{
+;;  if (GET_MODE (sparc_compare_op0) == TFmode && ! TARGET_HARD_QUAD)
+;;    {
+;;      sparc_emit_float_lib_cmp (sparc_compare_op0, sparc_compare_op1, LTGT);
+;;      emit_jump_insn (gen_bne (operands[0]));
+;;      DONE;
+;;    }
+;;  operands[1] = gen_compare_reg (LTGT, sparc_compare_op0, sparc_compare_op1);
+;;}")
 
 ;; Now match both normal and inverted jump.
 
@@ -4518,16 +4657,70 @@
   [(set_attr "type" "fp")
    (set_attr "length" "1")])
 
-(define_insn "extendsftf2"
+(define_expand "extendsftf2"
   [(set (match_operand:TF 0 "register_operand" "=e")
 	(float_extend:TF
 	 (match_operand:SF 1 "register_operand" "f")))]
+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
+  "
+{
+  if (! TARGET_HARD_QUAD)
+    {
+      rtx slot0;
+
+      if (GET_CODE (operands[0]) != MEM)
+        slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+      else
+        slot0 = operands[0];
+
+      emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_stoq\"), 0,
+                         VOIDmode, 2,
+                         XEXP (slot0, 0), Pmode,
+                         operands[1], SFmode);
+
+      if (GET_CODE (operands[0]) != MEM)
+        emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
+      DONE;
+    }
+}")
+
+(define_insn "*extendsftf2_hq"
+  [(set (match_operand:TF 0 "register_operand" "=e")
+        (float_extend:TF
+         (match_operand:SF 1 "register_operand" "f")))]
   "TARGET_FPU && TARGET_HARD_QUAD"
   "fstoq\\t%1, %0"
   [(set_attr "type" "fp")
    (set_attr "length" "1")])
+
+(define_expand "extenddftf2"
+  [(set (match_operand:TF 0 "register_operand" "=e")
+        (float_extend:TF
+         (match_operand:DF 1 "register_operand" "e")))]
+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
+  "
+{
+  if (! TARGET_HARD_QUAD)
+    {
+      rtx slot0;
+
+      if (GET_CODE (operands[0]) != MEM)
+        slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+      else
+        slot0 = operands[0];
+
+      emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_dtoq\"), 0,
+                         VOIDmode, 2,
+                         XEXP (slot0, 0), Pmode,
+                         operands[1], DFmode);
+
+      if (GET_CODE (operands[0]) != MEM)
+        emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
+      DONE;
+    }
+}")
 
-(define_insn "extenddftf2"
+(define_insn "*extenddftf2_hq"
   [(set (match_operand:TF 0 "register_operand" "=e")
 	(float_extend:TF
 	 (match_operand:DF 1 "register_operand" "e")))]
@@ -4544,8 +4737,34 @@
   "fdtos\\t%1, %0"
   [(set_attr "type" "fp")
    (set_attr "length" "1")])
+
+(define_expand "trunctfsf2"
+  [(set (match_operand:SF 0 "register_operand" "=f")
+        (float_truncate:SF
+         (match_operand:TF 1 "register_operand" "e")))]
+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
+  "
+{
+  if (! TARGET_HARD_QUAD)
+    {
+      rtx slot0;
+
+      if (GET_CODE (operands[1]) != MEM)
+        {
+          slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+          emit_insn (gen_rtx_SET (VOIDmode, slot0, operands[1]));
+        }
+      else
+        slot0 = operands[1];
+
+      emit_library_call_value (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_qtos\"),
+                               operands[0], 0, SFmode, 1,
+                               XEXP (slot0, 0), Pmode);
+      DONE;
+    }
+}")
 
-(define_insn "trunctfsf2"
+(define_insn "*trunctfsf2_hq"
   [(set (match_operand:SF 0 "register_operand" "=f")
 	(float_truncate:SF
 	 (match_operand:TF 1 "register_operand" "e")))]
@@ -4554,7 +4773,33 @@
   [(set_attr "type" "fp")
    (set_attr "length" "1")])
 
-(define_insn "trunctfdf2"
+(define_expand "trunctfdf2"
+  [(set (match_operand:DF 0 "register_operand" "=f")
+        (float_truncate:DF
+         (match_operand:TF 1 "register_operand" "e")))]
+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
+  "
+{
+  if (! TARGET_HARD_QUAD)
+    {
+      rtx slot0;
+
+      if (GET_CODE (operands[1]) != MEM)
+        {
+          slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+          emit_insn (gen_rtx_SET (VOIDmode, slot0, operands[1]));
+        }
+      else
+        slot0 = operands[1];
+
+      emit_library_call_value (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_qtod\"),
+                               operands[0], 0, DFmode, 1,
+                               XEXP (slot0, 0), Pmode);
+      DONE;
+    }
+}")
+
+(define_insn "*trunctfdf2_hq"
   [(set (match_operand:DF 0 "register_operand" "=e")
 	(float_truncate:DF
 	 (match_operand:TF 1 "register_operand" "e")))]
@@ -4580,8 +4825,34 @@
   "fitod\\t%1, %0"
   [(set_attr "type" "fp")
    (set_attr "length" "1")])
+
+(define_expand "floatsitf2"
+  [(set (match_operand:TF 0 "register_operand" "=e")
+        (float:TF (match_operand:SI 1 "register_operand" "f")))]
+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
+  "
+{
+  if (! TARGET_HARD_QUAD)
+    {
+      rtx slot0;
+
+      if (GET_CODE (operands[1]) != MEM)
+        slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+      else
+        slot0 = operands[1];
+
+      emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_itoq\"), 0,
+                         VOIDmode, 2,
+                         XEXP (slot0, 0), Pmode,
+                         operands[1], SImode);
 
-(define_insn "floatsitf2"
+      if (GET_CODE (operands[0]) != MEM)
+        emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
+      DONE;
+    }
+}")
+
+(define_insn "*floatsitf2_hq"
   [(set (match_operand:TF 0 "register_operand" "=e")
 	(float:TF (match_operand:SI 1 "register_operand" "f")))]
   "TARGET_FPU && TARGET_HARD_QUAD"
@@ -4589,6 +4860,29 @@
   [(set_attr "type" "fp")
    (set_attr "length" "1")])
 
+(define_expand "floatunssitf2"
+  [(set (match_operand:TF 0 "register_operand" "=e")
+        (unsigned_float:TF (match_operand:SI 1 "register_operand" "e")))]
+  "TARGET_FPU && TARGET_ARCH64 && ! TARGET_HARD_QUAD"
+  "
+{
+  rtx slot0;
+
+  if (GET_CODE (operands[1]) != MEM)
+    slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+  else
+    slot0 = operands[1];
+
+  emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_uitoq\"), 0,
+                     VOIDmode, 2,
+                     XEXP (slot0, 0), Pmode,
+                     operands[1], SImode);
+
+  if (GET_CODE (operands[0]) != MEM)
+    emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
+  DONE;
+}")
+
 ;; Now the same for 64 bit sources.
 
 (define_insn "floatdisf2"
@@ -4606,8 +4900,34 @@
   "fxtod\\t%1, %0"
   [(set_attr "type" "fp")
    (set_attr "length" "1")])
+
+(define_expand "floatditf2"
+  [(set (match_operand:TF 0 "register_operand" "=e")
+        (float:TF (match_operand:DI 1 "register_operand" "e")))]
+  "TARGET_FPU && TARGET_V9 && (TARGET_HARD_QUAD || TARGET_ARCH64)"
+  "
+{
+  if (! TARGET_HARD_QUAD)
+    {
+      rtx slot0;
+
+      if (GET_CODE (operands[1]) != MEM)
+        slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+      else
+        slot0 = operands[1];
+
+      emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_xtoq\"), 0,
+                         VOIDmode, 2,
+                         XEXP (slot0, 0), Pmode,
+                         operands[1], DImode);
+
+      if (GET_CODE (operands[0]) != MEM)
+        emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
+      DONE;
+    }
+}")
 
-(define_insn "floatditf2"
+(define_insn "*floatditf2_hq"
   [(set (match_operand:TF 0 "register_operand" "=e")
 	(float:TF (match_operand:DI 1 "register_operand" "e")))]
   "TARGET_V9 && TARGET_FPU && TARGET_HARD_QUAD"
@@ -4615,6 +4935,29 @@
   [(set_attr "type" "fp")
    (set_attr "length" "1")])
 
+(define_expand "floatunsditf2"
+  [(set (match_operand:TF 0 "register_operand" "=e")
+        (unsigned_float:TF (match_operand:DI 1 "register_operand" "e")))]
+  "TARGET_FPU && TARGET_ARCH64 && ! TARGET_HARD_QUAD"
+  "
+{
+  rtx slot0;
+
+  if (GET_CODE (operands[1]) != MEM)
+    slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+  else
+    slot0 = operands[1];
+
+  emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_uxtoq\"), 0,
+                     VOIDmode, 2,
+                     XEXP (slot0, 0), Pmode,
+                     operands[1], DImode);
+
+  if (GET_CODE (operands[0]) != MEM)
+    emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
+  DONE;
+}")
+
 ;; Convert a float to an actual integer.
 ;; Truncation is performed as part of the conversion.
 
@@ -4634,14 +4977,61 @@
   [(set_attr "type" "fp")
    (set_attr "length" "1")])
 
-(define_insn "fix_trunctfsi2"
+(define_expand "fix_trunctfsi2"
   [(set (match_operand:SI 0 "register_operand" "=f")
+        (fix:SI (fix:TF (match_operand:TF 1 "register_operand" "e"))))]
+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
+  "
+{
+  if (! TARGET_HARD_QUAD)
+    {
+      rtx slot0;
+
+      if (GET_CODE (operands[1]) != MEM)
+        {
+          slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+          emit_insn (gen_rtx_SET (VOIDmode, slot0, operands[1]));
+        }
+      else
+        slot0 = operands[1];
+
+      emit_library_call_value (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_qtoi\"),
+                               operands[0], 0, SImode, 1,
+                               XEXP (slot0, 0), Pmode);
+      DONE;
+    }
+}")
+
+(define_insn "*fix_trunctfsi2_hq"
+  [(set (match_operand:SI 0 "register_operand" "=f")
 	(fix:SI (fix:TF (match_operand:TF 1 "register_operand" "e"))))]
   "TARGET_FPU && TARGET_HARD_QUAD"
   "fqtoi\\t%1, %0"
   [(set_attr "type" "fp")
    (set_attr "length" "1")])
 
+(define_expand "fixuns_trunctfsi2"
+  [(set (match_operand:SI 0 "register_operand" "=f")
+        (unsigned_fix:SI (fix:TF (match_operand:TF 1 "register_operand" "e"))))]
+  "TARGET_FPU && TARGET_ARCH64 && ! TARGET_HARD_QUAD"
+  "
+{
+  rtx slot0;
+
+  if (GET_CODE (operands[1]) != MEM)
+    {
+      slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+      emit_insn (gen_rtx_SET (VOIDmode, slot0, operands[1]));
+    }
+  else
+    slot0 = operands[1];
+
+  emit_library_call_value (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_qtoui\"),
+                           operands[0], 0, SImode, 1,
+                           XEXP (slot0, 0), Pmode);
+  DONE;
+}")
+
 ;; Now the same, for V9 targets
 
 (define_insn "fix_truncsfdi2"
@@ -4660,13 +5050,61 @@
   [(set_attr "type" "fp")
    (set_attr "length" "1")])
 
-(define_insn "fix_trunctfdi2"
+(define_expand "fix_trunctfdi2"
   [(set (match_operand:DI 0 "register_operand" "=e")
+        (fix:DI (fix:TF (match_operand:TF 1 "register_operand" "e"))))]
+  "TARGET_V9 && TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
+  "
+{
+  if (! TARGET_HARD_QUAD)
+    {
+      rtx slot0;
+
+      if (GET_CODE (operands[1]) != MEM)
+        {
+          slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+          emit_insn (gen_rtx_SET (VOIDmode, slot0, operands[1]));
+        }
+      else
+        slot0 = operands[1];
+
+      emit_library_call_value (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_qtox\"),
+                               operands[0], 0, DImode, 1,
+                               XEXP (slot0, 0), Pmode);
+      DONE;
+    }
+}")
+
+(define_insn "*fix_trunctfdi2_hq"
+  [(set (match_operand:DI 0 "register_operand" "=e")
 	(fix:DI (fix:TF (match_operand:TF 1 "register_operand" "e"))))]
   "TARGET_V9 && TARGET_FPU && TARGET_HARD_QUAD"
   "fqtox\\t%1, %0"
   [(set_attr "type" "fp")
    (set_attr "length" "1")])
+
+(define_expand "fixuns_trunctfdi2"
+  [(set (match_operand:DI 0 "register_operand" "=f")
+        (unsigned_fix:DI (fix:TF (match_operand:TF 1 "register_operand" "e"))))]
+  "TARGET_FPU && TARGET_ARCH64 && ! TARGET_HARD_QUAD"
+  "
+{
+  rtx slot0;
+
+  if (GET_CODE (operands[1]) != MEM)
+    {
+      slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+      emit_insn (gen_rtx_SET (VOIDmode, slot0, operands[1]));
+    }
+  else
+    slot0 = operands[1];
+
+  emit_library_call_value (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_qtoux\"),
+                           operands[0], 0, DImode, 1,
+                           XEXP (slot0, 0), Pmode);
+  DONE;
+}")
+
 
 ;;- arithmetic instructions
 
@@ -6591,8 +7029,50 @@
    (set_attr "length" "1")])
 
 ;; Floating point arithmetic instructions.
+
+(define_expand "addtf3"
+  [(set (match_operand:TF 0 "nonimmediate_operand" "")
+        (plus:TF (match_operand:TF 1 "general_operand" "")
+                 (match_operand:TF 2 "general_operand" "")))]
+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
+  "
+{
+  if (! TARGET_HARD_QUAD)
+    {
+      rtx slot0, slot1, slot2;
+
+      if (GET_CODE (operands[0]) != MEM)
+        slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+      else
+        slot0 = operands[0];
+      if (GET_CODE (operands[1]) != MEM)
+        {
+          slot1 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+          emit_insn (gen_rtx_SET (VOIDmode, slot1, operands[1]));
+        }
+      else
+        slot1 = operands[1];
+      if (GET_CODE (operands[2]) != MEM)
+        {
+          slot2 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+          emit_insn (gen_rtx_SET (VOIDmode, slot2, operands[2]));
+        }
+      else
+        slot2 = operands[2];
 
-(define_insn "addtf3"
+      emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_add\"), 0,
+                         VOIDmode, 3,
+                         XEXP (slot0, 0), Pmode,
+                         XEXP (slot1, 0), Pmode,
+                         XEXP (slot2, 0), Pmode);
+
+      if (GET_CODE (operands[0]) != MEM)
+        emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
+      DONE;
+    }
+}")
+
+(define_insn "*addtf3_hq"
   [(set (match_operand:TF 0 "register_operand" "=e")
 	(plus:TF (match_operand:TF 1 "register_operand" "e")
 		 (match_operand:TF 2 "register_operand" "e")))]
@@ -6618,8 +7098,50 @@
   "fadds\\t%1, %2, %0"
   [(set_attr "type" "fp")
    (set_attr "length" "1")])
+
+(define_expand "subtf3"
+  [(set (match_operand:TF 0 "nonimmediate_operand" "")
+        (minus:TF (match_operand:TF 1 "general_operand" "")
+                  (match_operand:TF 2 "general_operand" "")))]
+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
+  "
+{
+  if (! TARGET_HARD_QUAD)
+    {
+      rtx slot0, slot1, slot2;
 
-(define_insn "subtf3"
+      if (GET_CODE (operands[0]) != MEM)
+        slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+      else
+        slot0 = operands[0];
+      if (GET_CODE (operands[1]) != MEM)
+        {
+          slot1 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+          emit_insn (gen_rtx_SET (VOIDmode, slot1, operands[1]));
+        }
+      else
+        slot1 = operands[1];
+      if (GET_CODE (operands[2]) != MEM)
+        {
+          slot2 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+          emit_insn (gen_rtx_SET (VOIDmode, slot2, operands[2]));
+        }
+      else
+        slot2 = operands[2];
+
+      emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_sub\"), 0,
+                         VOIDmode, 3,
+                         XEXP (slot0, 0), Pmode,
+                         XEXP (slot1, 0), Pmode,
+                         XEXP (slot2, 0), Pmode);
+
+      if (GET_CODE (operands[0]) != MEM)
+        emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
+      DONE;
+    }
+}")
+
+(define_insn "*subtf3_hq"
   [(set (match_operand:TF 0 "register_operand" "=e")
 	(minus:TF (match_operand:TF 1 "register_operand" "e")
 		  (match_operand:TF 2 "register_operand" "e")))]
@@ -6646,7 +7168,49 @@
   [(set_attr "type" "fp")
    (set_attr "length" "1")])
 
-(define_insn "multf3"
+(define_expand "multf3"
+  [(set (match_operand:TF 0 "nonimmediate_operand" "")
+        (mult:TF (match_operand:TF 1 "general_operand" "")
+                 (match_operand:TF 2 "general_operand" "")))]
+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
+  "
+{
+  if (! TARGET_HARD_QUAD)
+    {
+      rtx slot0, slot1, slot2;
+
+      if (GET_CODE (operands[0]) != MEM)
+        slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+      else
+        slot0 = operands[0];
+      if (GET_CODE (operands[1]) != MEM)
+        {
+          slot1 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+          emit_insn (gen_rtx_SET (VOIDmode, slot1, operands[1]));
+        }
+      else
+        slot1 = operands[1];
+      if (GET_CODE (operands[2]) != MEM)
+        {
+          slot2 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+          emit_insn (gen_rtx_SET (VOIDmode, slot2, operands[2]));
+        }
+      else
+        slot2 = operands[2];
+
+      emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_mul\"), 0,
+                         VOIDmode, 3,
+                         XEXP (slot0, 0), Pmode,
+                         XEXP (slot1, 0), Pmode,
+                         XEXP (slot2, 0), Pmode);
+
+      if (GET_CODE (operands[0]) != MEM)
+        emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
+      DONE;
+    }
+}")
+
+(define_insn "*multf3_hq"
   [(set (match_operand:TF 0 "register_operand" "=e")
 	(mult:TF (match_operand:TF 1 "register_operand" "e")
 		 (match_operand:TF 2 "register_operand" "e")))]
@@ -6691,8 +7255,50 @@
   [(set_attr "type" "fpmul")
    (set_attr "length" "1")])
 
+(define_expand "divtf3"
+  [(set (match_operand:TF 0 "nonimmediate_operand" "")
+        (div:TF (match_operand:TF 1 "general_operand" "")
+                (match_operand:TF 2 "general_operand" "")))]
+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
+  "
+{
+  if (! TARGET_HARD_QUAD)
+    {
+      rtx slot0, slot1, slot2;
+
+      if (GET_CODE (operands[0]) != MEM)
+        slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+      else
+        slot0 = operands[0];
+      if (GET_CODE (operands[1]) != MEM)
+        {
+          slot1 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+          emit_insn (gen_rtx_SET (VOIDmode, slot1, operands[1]));
+        }
+      else
+        slot1 = operands[1];
+      if (GET_CODE (operands[2]) != MEM)
+        {
+          slot2 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+          emit_insn (gen_rtx_SET (VOIDmode, slot2, operands[2]));
+        }
+      else
+        slot2 = operands[2];
+
+      emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_div\"), 0,
+                         VOIDmode, 3,
+                         XEXP (slot0, 0), Pmode,
+                         XEXP (slot1, 0), Pmode,
+                         XEXP (slot2, 0), Pmode);
+
+      if (GET_CODE (operands[0]) != MEM)
+        emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
+      DONE;
+    }
+}")
+
 ;; don't have timing for quad-prec. divide.
-(define_insn "divtf3"
+(define_insn "*divtf3_hq"
   [(set (match_operand:TF 0 "register_operand" "=e")
 	(div:TF (match_operand:TF 1 "register_operand" "e")
 		(match_operand:TF 2 "register_operand" "e")))]
@@ -6962,8 +7568,41 @@
   "fabss\\t%1, %0"
   [(set_attr "type" "fpmove")
    (set_attr "length" "1")])
+
+(define_expand "sqrttf2"
+  [(set (match_operand:TF 0 "register_operand" "=e")
+        (sqrt:TF (match_operand:TF 1 "register_operand" "e")))]
+  "TARGET_FPU && (TARGET_HARD_QUAD || TARGET_ARCH64)"
+  "
+{
+  if (! TARGET_HARD_QUAD)
+    {
+      rtx slot0, slot1;
+
+      if (GET_CODE (operands[0]) != MEM)
+        slot0 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+      else
+        slot0 = operands[0];
+      if (GET_CODE (operands[1]) != MEM)
+        {
+          slot1 = assign_stack_temp (TFmode, GET_MODE_SIZE(TFmode), 0);
+          emit_insn (gen_rtx_SET (VOIDmode, slot1, operands[1]));
+        }
+      else
+        slot1 = operands[1];
+
+      emit_library_call (gen_rtx (SYMBOL_REF, Pmode, \"_Qp_sqrt\"), 0,
+                         VOIDmode, 2,
+                         XEXP (slot0, 0), Pmode,
+                         XEXP (slot1, 0), Pmode);
+
+      if (GET_CODE (operands[0]) != MEM)
+        emit_insn (gen_rtx_SET (VOIDmode, operands[0], slot0));
+      DONE;
+    }
+}")
 
-(define_insn "sqrttf2"
+(define_insn "*sqrttf2_hq"
   [(set (match_operand:TF 0 "register_operand" "=e")
 	(sqrt:TF (match_operand:TF 1 "register_operand" "e")))]
   "TARGET_FPU && TARGET_HARD_QUAD"