[PATCH] target/i386: fix phminposuw in-place operation


Joseph Myers
The SSE4.1 phminposuw instruction finds the minimum 16-bit element in
the source vector, putting the value of that element in the low 16
bits of the destination vector, the index of that element in the next
three bits and zeroing the rest of the destination.  The helper for
this operation fills the destination from high to low, meaning that
when the source and destination are the same register, the minimum
source element can be overwritten before it is copied to the
destination.  This patch fixes it to fill the destination from low to
high instead, so the minimum source element is always copied first.
This fixes one gcc test failure in my GCC 6-based testing (and so
concludes the present sequence of patches, as I don't have any further
gcc test failures left in that testing that I attribute to QEMU bugs).

Signed-off-by: Joseph Myers <[hidden email]>

---

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 16509d0..ed05989 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -1707,10 +1710,10 @@ void glue(helper_phminposuw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
         idx = 7;
     }
 
-    d->Q(1) = 0;
-    d->L(1) = 0;
-    d->W(1) = idx;
     d->W(0) = s->W(idx);
+    d->W(1) = idx;
+    d->L(1) = 0;
+    d->Q(1) = 0;
 }
 
 void glue(helper_roundps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,

--
Joseph S. Myers
[hidden email]
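
The ordering problem described in the commit message can be reproduced outside QEMU with a simplified model. The following is only a sketch: `DemoReg`, `demo_min_index`, and the two `demo_phminposuw_*` functions are hypothetical names, not QEMU code. The register is modelled as eight 16-bit lanes with lane 0 lowest, so `W(0)` and `W(1)` correspond to `w[0]` and `w[1]`, `L(1)` to `w[2..3]`, and `Q(1)` to `w[4..7]`. Ties are resolved toward the lower index in this sketch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Reg: eight 16-bit lanes, lane 0 lowest. */
typedef struct {
    uint16_t w[8];
} DemoReg;

/* Index of the minimum unsigned 16-bit lane (lowest index wins on ties). */
static int demo_min_index(const DemoReg *s)
{
    int idx = 0;
    for (int i = 1; i < 8; i++) {
        if (s->w[i] < s->w[idx]) {
            idx = i;
        }
    }
    return idx;
}

/* Old store order: high to low. Unsafe when d and s are the same register,
 * because s->w[idx] may be clobbered before it is read. */
static void demo_phminposuw_high_to_low(DemoReg *d, const DemoReg *s)
{
    int idx = demo_min_index(s);

    memset(&d->w[4], 0, 4 * sizeof(uint16_t)); /* models d->Q(1) = 0 */
    memset(&d->w[2], 0, 2 * sizeof(uint16_t)); /* models d->L(1) = 0 */
    d->w[1] = (uint16_t)idx;                   /* models d->W(1) = idx */
    d->w[0] = s->w[idx];                       /* models d->W(0) = s->W(idx) */
}

/* New store order: low to high. The minimum is read from the source
 * before any other lane of the destination is written. */
static void demo_phminposuw_low_to_high(DemoReg *d, const DemoReg *s)
{
    int idx = demo_min_index(s);

    d->w[0] = s->w[idx];                       /* models d->W(0) = s->W(idx) */
    d->w[1] = (uint16_t)idx;                   /* models d->W(1) = idx */
    memset(&d->w[2], 0, 2 * sizeof(uint16_t)); /* models d->L(1) = 0 */
    memset(&d->w[4], 0, 4 * sizeof(uint16_t)); /* models d->Q(1) = 0 */
}
```

With lanes `{9, 8, 7, 6, 5, 1, 4, 3}` the minimum is 1 at index 5. Passing the same register as both `d` and `s`, the high-to-low variant zeroes lane 5 before reading it and leaves 0 in lane 0, while the low-to-high variant leaves the expected 1. With distinct registers the two variants agree.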


Re: [PATCH] target/i386: fix phminposuw in-place operation

Paolo Bonzini
On 11/08/2017 16:23, Joseph Myers wrote:

> The SSE4.1 phminposuw instruction finds the minimum 16-bit element in
> the source vector, putting the value of that element in the low 16
> bits of the destination vector, the index of that element in the next
> three bits and zeroing the rest of the destination.  The helper for
> this operation fills the destination from high to low, meaning that
> when the source and destination are the same register, the minimum
> source element can be overwritten before it is copied to the
> destination.  This patch fixes it to fill the destination from low to
> high instead, so the minimum source element is always copied first.
> This fixes one gcc test failure in my GCC 6-based testing (and so
> concludes the present sequence of patches, as I don't have any further
> gcc test failures left in that testing that I attribute to QEMU bugs).
>
> Signed-off-by: Joseph Myers <[hidden email]>

Nice, thanks for the patches!  Queued too.

Paolo
