|
|
Created:
4 years, 1 month ago by brucedawson Modified:
4 years ago CC:
chromium-reviews, wwcv, fgalligan1 Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionRemove /arch:AVX and /arch:AVX2 from libvpx
Mixing translation units compiled with /arch:AVX and /arch:AVX2 with
those compiled 'normally' can lead to illegal instruction crashes.
The problem is that if an inline function (such as ceilf or floorf)
is called from /arch:AVX* code but not inlined then AVX code may be
generated and this version of the function may be selected by the
linker.
It's an ODR violation, essentially.
The problem of inline functions in /arch:AVX* translation units can
be solved by marking the functions as static or putting them in an
anonymous namespace, or it can be hacked around with __forceinline,
but none of these work well for CRT inline functions.
clang-cl doesn't allow using AVX intrinsics without /arch:AVX which
is why /arch:AVX is retained in clang-cl.
BUG=666707
Patch Set 1 #Patch Set 2 : Fixed line length #Patch Set 3 : Reenable /arch:AVX* for clang-cl, and mandatory gn format??? #Messages
Total messages: 21 (11 generated)
The CQ bit was checked by brucedawson@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was checked by brucedawson@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
brucedawson@chromium.org changed reviewers: + tomfinegan@chromium.org
I'm not thrilled about having to do this, but without this change we can't guarantee that Chrome will run on machines that have AVX unavailable or disabled. PTAL
The CQ bit was checked by brucedawson@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Description was changed from ========== Remove /arch:AVX and /arch:AVX2 from libvpx Mixing translation units compiled with /arch:AVX and /arch:AVX2 with those compiled 'normally' can lead to illegal instruction crashes. The problem is that if an inline function (such as ceilf or floorf) is called from /arch:AVX* code but not inlined then AVX code may be generated and this version of the function may be selected by the linker. It's an ODR violation, essentially. The problem of inline functions in /arch:AVX* translation units can be solved by marking the functions as static or putting them in an anonymous namespace, or it can be hacked around with __forceinline, but none of these work well for CRT inline functions. BUG=666707 ========== to ========== Remove /arch:AVX and /arch:AVX2 from libvpx Mixing translation units compiled with /arch:AVX and /arch:AVX2 with those compiled 'normally' can lead to illegal instruction crashes. The problem is that if an inline function (such as ceilf or floorf) is called from /arch:AVX* code but not inlined then AVX code may be generated and this version of the function may be selected by the linker. It's an ODR violation, essentially. The problem of inline functions in /arch:AVX* translation units can be solved by marking the functions as static or putting them in an anonymous namespace, or it can be hacked around with __forceinline, but none of these work well for CRT inline functions. clang-cl doesn't allow using AVX intrinsics without /arch:AVX which is why /arch:AVX is retained in clang-cl. BUG=666707 ==========
On 2016/11/18 20:03:59, brucedawson wrote: > I'm not thrilled about having to do this, but without this change we can't > guarantee that Chrome will run on machines that have AVX unavailable or > disabled. > The intrinsics will still get built, what get emitted when the flag is left off? If avx2 is disabled we'll see ~3-5% drop in decode performance at 1080p. Would auditing the CRT usage in these files be enough to reenable the flag here?
On 2016/11/18 22:27:00, jzern wrote: > On 2016/11/18 20:03:59, brucedawson wrote: > > I'm not thrilled about having to do this, but without this change we can't > > guarantee that Chrome will run on machines that have AVX unavailable or > > disabled. > > > > The intrinsics will still get built, what get emitted when the flag is left off? > If avx2 is disabled we'll see ~3-5% drop in decode performance at 1080p. Would > auditing the CRT usage in these files be enough to reenable the flag here? The intrinsics will still be used, but the compiler will not generate AVX instructions on its own. It is hard to estimate the performance impact. Auditing CRT usage in these files should be sufficient. Ideally we will need to make this robust against future breakage, perhaps by checking at compile time for #defines that we know indicate the inclusion of CRT header files.
On 2016/11/18 22:51:17, brucedawson wrote: > On 2016/11/18 22:27:00, jzern wrote: > > On 2016/11/18 20:03:59, brucedawson wrote: > > > I'm not thrilled about having to do this, but without this change we can't > > > guarantee that Chrome will run on machines that have AVX unavailable or > > > disabled. > > > > > > > The intrinsics will still get built, what get emitted when the flag is left > off? > > If avx2 is disabled we'll see ~3-5% drop in decode performance at 1080p. Would > > auditing the CRT usage in these files be enough to reenable the flag here? > > The intrinsics will still be used, but the compiler will not generate AVX > instructions on its own. It is hard to estimate the performance impact. > > Auditing CRT usage in these files should be sufficient. Ideally we will need to > make this robust against future breakage, perhaps by checking at compile time > for #defines that we know indicate the inclusion of CRT header files. lgtm
brucedawson@chromium.org changed reviewers: + mtklein@chromium.org, thakis@chromium.org
thakis@ and mtklein@ - this CL may be an important part of getting canary stable again for non-AVX users but it sounds like this solution is not popular. Thoughts? I will be offline for some hours so please commit this change if it is acceptable as a temporary solution. The long-term fix will be, as in skia, to avoid CRT header files, but a short-term fix seems justifiable.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_chromeos_ozone_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
On 2016/11/18 at 23:05:02, brucedawson wrote: > thakis@ and mtklein@ - this CL may be an important part of getting canary stable again for non-AVX users but it sounds like this solution is not popular. Thoughts? > > I will be offline for some hours so please commit this change if it is acceptable as a temporary solution. The long-term fix will be, as in skia, to avoid CRT header files, but a short-term fix seems justifiable. I think I've found a likely proximal cause of the problem we were seeing: - SkOpts_hsw.cpp includes SkRasterPipeline_opts.h - SkRasterPipeline_opts.h includes SkColorSpaceXformPriv.h - SkColorSpaceXformPriv.h uses sk_float_floor2int() and sk_float_ceil2int(), which boil down to floorf() and ceilf(). This all came together recently, last week or so. So I think there's probably no rush to fix things outside Skia. It feels pretty likely that if I get the Skia bits turned off, the current crashes will stop. The underlying problem is very real. Just may not be a panic.
I agree with that. We've had this code here for a long time without problems. On Fri, Nov 18, 2016 at 6:58 PM, <mtklein@chromium.org> wrote: > On 2016/11/18 at 23:05:02, brucedawson wrote: > > thakis@ and mtklein@ - this CL may be an important part of getting > canary > stable again for non-AVX users but it sounds like this solution is not > popular. > Thoughts? > > > > I will be offline for some hours so please commit this change if it is > acceptable as a temporary solution. The long-term fix will be, as in skia, > to > avoid CRT header files, but a short-term fix seems justifiable. > > I think I've found a likely proximal cause of the problem we were seeing: > - SkOpts_hsw.cpp includes SkRasterPipeline_opts.h > - SkRasterPipeline_opts.h includes SkColorSpaceXformPriv.h > - SkColorSpaceXformPriv.h uses sk_float_floor2int() and > sk_float_ceil2int(), > which boil down to floorf() and ceilf(). > This all came together recently, last week or so. > > So I think there's probably no rush to fix things outside Skia. It feels > pretty > likely that if I get the Skia bits turned off, the current crashes will > stop. > > The underlying problem is very real. Just may not be a panic. > > https://codereview.chromium.org/2516703002/ > -- You received this message because you are subscribed to the Google Groups "Chromium-reviews" group. To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
(without apparent problems, that is) On Fri, Nov 18, 2016 at 7:18 PM, Nico Weber <thakis@chromium.org> wrote: > I agree with that. We've had this code here for a long time without > problems. > > On Fri, Nov 18, 2016 at 6:58 PM, <mtklein@chromium.org> wrote: > >> On 2016/11/18 at 23:05:02, brucedawson wrote: >> > thakis@ and mtklein@ - this CL may be an important part of getting >> canary >> stable again for non-AVX users but it sounds like this solution is not >> popular. >> Thoughts? >> > >> > I will be offline for some hours so please commit this change if it is >> acceptable as a temporary solution. The long-term fix will be, as in >> skia, to >> avoid CRT header files, but a short-term fix seems justifiable. >> >> I think I've found a likely proximal cause of the problem we were seeing: >> - SkOpts_hsw.cpp includes SkRasterPipeline_opts.h >> - SkRasterPipeline_opts.h includes SkColorSpaceXformPriv.h >> - SkColorSpaceXformPriv.h uses sk_float_floor2int() and >> sk_float_ceil2int(), >> which boil down to floorf() and ceilf(). >> This all came together recently, last week or so. >> >> So I think there's probably no rush to fix things outside Skia. It feels >> pretty >> likely that if I get the Skia bits turned off, the current crashes will >> stop. >> >> The underlying problem is very real. Just may not be a panic. >> >> https://codereview.chromium.org/2516703002/ >> > > -- You received this message because you are subscribed to the Google Groups "Chromium-reviews" group. To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
> >> I think I've found a likely proximal cause of the problem we were seeing: > >> - SkOpts_hsw.cpp includes SkRasterPipeline_opts.h > >> - SkRasterPipeline_opts.h includes SkColorSpaceXformPriv.h > >> - SkColorSpaceXformPriv.h uses sk_float_floor2int() and > >> sk_float_ceil2int(), > >> which boil down to floorf() and ceilf(). > >> This all came together recently, last week or so. > >> > >> So I think there's probably no rush to fix things outside Skia. It feels > >> pretty > >> likely that if I get the Skia bits turned off, the current crashes will > >> stop. > >> > >> The underlying problem is very real. Just may not be a panic. That sounds like it maps to the problem, so that's good. Given the unpredictability of this bug the only real test is to see whether the crashes go away. Looking at the bug I'm seeing mixed reports about whether the crashes have gone away. One report says yes, but fracas (and my quick manual inspection of the crash data) shows that they are continuing.
On 2016/11/20 at 15:38:55, brucedawson wrote: > > >> I think I've found a likely proximal cause of the problem we were seeing: > > >> - SkOpts_hsw.cpp includes SkRasterPipeline_opts.h > > >> - SkRasterPipeline_opts.h includes SkColorSpaceXformPriv.h > > >> - SkColorSpaceXformPriv.h uses sk_float_floor2int() and > > >> sk_float_ceil2int(), > > >> which boil down to floorf() and ceilf(). > > >> This all came together recently, last week or so. > > >> > > >> So I think there's probably no rush to fix things outside Skia. It feels > > >> pretty > > >> likely that if I get the Skia bits turned off, the current crashes will > > >> stop. > > >> > > >> The underlying problem is very real. Just may not be a panic. > > That sounds like it maps to the problem, so that's good. Given the unpredictability of this bug the only real test is to see whether the crashes go away. Looking at the bug I'm seeing mixed reports about whether the crashes have gone away. One report says yes, but fracas (and my quick manual inspection of the crash data) shows that they are continuing. Still think it's not time to panic. The builds fracas is mentioning are pre-fix. The first build with "Stub out Skia AVX and HSW opts." in it is 57.0.2926.0. |