|
|
DescriptionIndexedDB: Measure the total database open time.
This change records time form the receipt (in the browser process) of the
request to open an IndexedDB database, until one of the "next" database
event handlers is called - one of: onupgradeneeded, onsuccess, onerror, or
onblocked.
BUG=397552, 395799
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=290419
Patch Set 1 #
Total comments: 16
Patch Set 2 : db -> connection #Patch Set 3 : One histogram per event type #
Total comments: 1
Patch Set 4 : conn_open_start_time_ -> connection_open_start_time_ #
Messages
Total messages: 19 (0 generated)
dgrogam for full review isherman for review of histograms.xml
Histograms LGTM % nits. https://codereview.chromium.org/470373002/diff/1/content/browser/indexed_db/i... File content/browser/indexed_db/indexed_db_callbacks.cc (right): https://codereview.chromium.org/470373002/diff/1/content/browser/indexed_db/i... content/browser/indexed_db/indexed_db_callbacks.cc:40: static const char* const kDatabaseOpenTime = "WebCore.IndexedDB.OpenTime"; nit: "const char kDatabaseOpenTime[]" is slightly preferred. https://codereview.chromium.org/470373002/diff/1/content/browser/indexed_db/i... content/browser/indexed_db/indexed_db_callbacks.cc:40: static const char* const kDatabaseOpenTime = "WebCore.IndexedDB.OpenTime"; nit: I'd include "histogram", and probably also "name", somewhere in the name of the constant.
https://codereview.chromium.org/470373002/diff/1/content/browser/indexed_db/i... File content/browser/indexed_db/indexed_db_callbacks.cc (right): https://codereview.chromium.org/470373002/diff/1/content/browser/indexed_db/i... content/browser/indexed_db/indexed_db_callbacks.cc:609: void IndexedDBCallbacks::SetDatbaseOpenStartTime( Typo (but at least it compiles!) https://codereview.chromium.org/470373002/diff/1/content/browser/indexed_db/i... File content/browser/indexed_db/indexed_db_callbacks.h (right): https://codereview.chromium.org/470373002/diff/1/content/browser/indexed_db/i... content/browser/indexed_db/indexed_db_callbacks.h:109: void SetDatbaseOpenStartTime(const base::TimeTicks& start_time); Typo: 'Datbase' https://codereview.chromium.org/470373002/diff/1/content/browser/indexed_db/i... content/browser/indexed_db/indexed_db_callbacks.h:140: base::TimeTicks db_open_start_time_; Nit: Maybe 'connection' instead of 'db'? That would apply to other names as well. (e.g. Database -> Connection) https://codereview.chromium.org/470373002/diff/1/content/browser/indexed_db/i... content/browser/indexed_db/indexed_db_callbacks.h:140: base::TimeTicks db_open_start_time_; I just noticed that these were base::TimeTicks instead of base::Time - does that play well with the UMA stats? (It looks like operators return TimeTicks rather than TimeDeltas) https://codereview.chromium.org/470373002/diff/1/content/browser/indexed_db/i... File content/browser/indexed_db/indexed_db_dispatcher_host.cc (right): https://codereview.chromium.org/470373002/diff/1/content/browser/indexed_db/i... content/browser/indexed_db/indexed_db_dispatcher_host.cc:294: base::TimeTicks begin_time = base::TimeTicks::Now(); Why a temporary here rather than just passing in Now() below? (it doesn't look like we do a measurable amount of work in between) https://codereview.chromium.org/470373002/diff/1/tools/metrics/histograms/his... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/470373002/diff/1/tools/metrics/histograms/his... tools/metrics/histograms/histograms.xml:35866: +<histogram name="WebCore.IndexedDB.OpenTime" units="milliseconds"> I'm not sure we should measure all of these using the same histogram. It would be hard to determine what a "typical" open looks like if some unknown fraction fail very quickly. The common cases seem to me to be: open -> upgrade -> success (new or updated database) open -> success (existing database, or another tab is upgrading) open -> blocked -> upgrade -> success (upgrade, but other connections busy) open -> upgrade -> error (upgrade aborts: rare!) open -> error (some internal error like quota: rare!) ... or at least reflect that what we're measuring is 'time to first event' in the histogram name, although it's not necessarily being opened.
https://codereview.chromium.org/470373002/diff/1/content/browser/indexed_db/i... File content/browser/indexed_db/indexed_db_callbacks.h (right): https://codereview.chromium.org/470373002/diff/1/content/browser/indexed_db/i... content/browser/indexed_db/indexed_db_callbacks.h:140: base::TimeTicks db_open_start_time_; On 2014/08/14 22:58:06, jsbell wrote: > I just noticed that these were base::TimeTicks instead of base::Time - does that > play well with the UMA stats? (It looks like operators return TimeTicks rather > than TimeDeltas) TimeTicks is usually preferred, as it's at least guaranteed to be monotonically non-decreasing. TimeDelta is just a difference between time values -- it works with both Time and TimeTicks.
https://codereview.chromium.org/470373002/diff/1/tools/metrics/histograms/his... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/470373002/diff/1/tools/metrics/histograms/his... tools/metrics/histograms/histograms.xml:35866: +<histogram name="WebCore.IndexedDB.OpenTime" units="milliseconds"> On 2014/08/14 22:58:06, jsbell wrote: > I'm not sure we should measure all of these using the same histogram. It would > be hard to determine what a "typical" open looks like if some unknown fraction > fail very quickly. I had similar thoughts. There's a good chance this data won't lead to any insights because it's measuring such disparate code paths. Open->success should be nearly instantaneous when the backing store is already open and the requested version matches the current version. Open->success will be measuring a bunch of disk activity if the backing store is not already open. Open->upgradeneeded could include time used by other tasks on the IDB thread before VersionChangeOperation runs in addition to disk activity. Open->blocked will measure a bunch of IPCs etc After thinking about it a bit I think you want one histogram per event type, as Josh alluded to. Thoughts?
https://codereview.chromium.org/470373002/diff/1/content/browser/indexed_db/i... File content/browser/indexed_db/indexed_db_callbacks.h (right): https://codereview.chromium.org/470373002/diff/1/content/browser/indexed_db/i... content/browser/indexed_db/indexed_db_callbacks.h:140: base::TimeTicks db_open_start_time_; On 2014/08/14 22:58:05, jsbell wrote: > Nit: Maybe 'connection' instead of 'db'? > > That would apply to other names as well. (e.g. Database -> Connection) Done. https://codereview.chromium.org/470373002/diff/1/content/browser/indexed_db/i... File content/browser/indexed_db/indexed_db_dispatcher_host.cc (right): https://codereview.chromium.org/470373002/diff/1/content/browser/indexed_db/i... content/browser/indexed_db/indexed_db_dispatcher_host.cc:294: base::TimeTicks begin_time = base::TimeTicks::Now(); On 2014/08/14 22:58:06, jsbell wrote: > Why a temporary here rather than just passing in Now() below? > > (it doesn't look like we do a measurable amount of work in between) Just being cautious. The stack value is super small and I wanted this to be a bit more "future proof" so that if somebody puts in something that does take time then it'll get measured. If you want it removed let me know - no problem. https://codereview.chromium.org/470373002/diff/1/tools/metrics/histograms/his... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/470373002/diff/1/tools/metrics/histograms/his... tools/metrics/histograms/histograms.xml:35866: +<histogram name="WebCore.IndexedDB.OpenTime" units="milliseconds"> No problem. I figured this change would require a different approach so separated it from the previous CL for that reason. Do you two think I should support all five cases, or maybe just the first three "success" cases? On 2014/08/14 22:58:06, jsbell wrote: > I'm not sure we should measure all of these using the same histogram. It would > be hard to determine what a "typical" open looks like if some unknown fraction > fail very quickly. > > The common cases seem to me to be: > > open -> upgrade -> success (new or updated database) > open -> success (existing database, or another tab is upgrading) > open -> blocked -> upgrade -> success (upgrade, but other connections busy) > open -> upgrade -> error (upgrade aborts: rare!) > open -> error (some internal error like quota: rare!) > > ... or at least reflect that what we're measuring is 'time to first event' in > the histogram name, although it's not necessarily being opened.
https://codereview.chromium.org/470373002/diff/1/content/browser/indexed_db/i... File content/browser/indexed_db/indexed_db_dispatcher_host.cc (right): https://codereview.chromium.org/470373002/diff/1/content/browser/indexed_db/i... content/browser/indexed_db/indexed_db_dispatcher_host.cc:294: base::TimeTicks begin_time = base::TimeTicks::Now(); On 2014/08/14 23:53:03, cmumford wrote: > On 2014/08/14 22:58:06, jsbell wrote: > > Why a temporary here rather than just passing in Now() below? > > > > (it doesn't look like we do a measurable amount of work in between) > > Just being cautious. The stack value is super small and I wanted this to be a > bit more "future proof" so that if somebody puts in something that does take > time then it'll get measured. > > If you want it removed let me know - no problem. It's fine as is, and I agree with your logic. :) https://codereview.chromium.org/470373002/diff/1/tools/metrics/histograms/his... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/470373002/diff/1/tools/metrics/histograms/his... tools/metrics/histograms/histograms.xml:35866: +<histogram name="WebCore.IndexedDB.OpenTime" units="milliseconds"> I think the first three is probably enough. Understanding these numbers is probably going to require some thought. open->success is easy to reason about. open->upgrade->success is dependent on how much work the site's upgrade code does. open->blocked->upgrade->success is dependent on that *plus* how long any other connections linger. A poorly implemented site could cause the latter number to spike even if Chrome didn't change (we had a bug report like that recently where a Chrome app started blocking its own upgrades!) But we shouldn't overthink things here, as long as we're distinguish these cases. The important thing is being able to replicate reports from $CUSTOMERS who say some fraction of opens are very slow. Let's measure what's easy and go from there.
https://codereview.chromium.org/470373002/diff/1/tools/metrics/histograms/his... File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/470373002/diff/1/tools/metrics/histograms/his... tools/metrics/histograms/histograms.xml:35866: +<histogram name="WebCore.IndexedDB.OpenTime" units="milliseconds"> On 2014/08/15 00:05:16, jsbell wrote: > I think the first three is probably enough. > > Understanding these numbers is probably going to require some thought. > open->success is easy to reason about. open->upgrade->success is dependent on > how much work the site's upgrade code does. I thought we were just going to measure the time until the backend decides which event it will fire _first_. In which case the site's upgrade code is irrelevant. That's the data we're getting from customers, right? The time to _first_ event? Though open->success will still be bimodal, one peak for the already-opened case and another (very broad) peak for the not-yet-opened case.
On 2014/08/15 02:09:30, dgrogan wrote: > https://codereview.chromium.org/470373002/diff/1/tools/metrics/histograms/his... > File tools/metrics/histograms/histograms.xml (right): > > https://codereview.chromium.org/470373002/diff/1/tools/metrics/histograms/his... > tools/metrics/histograms/histograms.xml:35866: +<histogram > name="WebCore.IndexedDB.OpenTime" units="milliseconds"> > On 2014/08/15 00:05:16, jsbell wrote: > > I think the first three is probably enough. > > > > Understanding these numbers is probably going to require some thought. > > open->success is easy to reason about. open->upgrade->success is dependent on > > how much work the site's upgrade code does. > > I thought we were just going to measure the time until the backend decides which > event it will fire _first_. In which case the site's upgrade code is irrelevant. > That's the data we're getting from customers, right? The time to _first_ event? > > Though open->success will still be bimodal, one peak for the already-opened case > and another (very broad) peak for the not-yet-opened case. Yeah, I'm fine with that (simple, and we can reason about it pretty easily).
On 2014/08/15 02:09:30, dgrogan wrote: > https://codereview.chromium.org/470373002/diff/1/tools/metrics/histograms/his... > File tools/metrics/histograms/histograms.xml (right): > > https://codereview.chromium.org/470373002/diff/1/tools/metrics/histograms/his... > tools/metrics/histograms/histograms.xml:35866: +<histogram > name="WebCore.IndexedDB.OpenTime" units="milliseconds"> > On 2014/08/15 00:05:16, jsbell wrote: > > I think the first three is probably enough. > > > > Understanding these numbers is probably going to require some thought. > > open->success is easy to reason about. open->upgrade->success is dependent on > > how much work the site's upgrade code does. > > I thought we were just going to measure the time until the backend decides which > event it will fire _first_. In which case the site's upgrade code is irrelevant. > That's the data we're getting from customers, right? The time to _first_ event? > > Though open->success will still be bimodal, one peak for the already-opened case > and another (very broad) peak for the not-yet-opened case. That's exactly what this CL is - or did I miss something? Aside from the histogram value name, which is currently "WebCore.IndexedDB.OpenTime" it measures the time from the request to open, until the first event message being sent back. Maybe "WebCore.IndexedDB.OpenFirstEventTime" would be a better name?
On 2014/08/18 16:32:23, cmumford wrote: > On 2014/08/15 02:09:30, dgrogan wrote: > > > https://codereview.chromium.org/470373002/diff/1/tools/metrics/histograms/his... > > File tools/metrics/histograms/histograms.xml (right): > > > > > https://codereview.chromium.org/470373002/diff/1/tools/metrics/histograms/his... > > tools/metrics/histograms/histograms.xml:35866: +<histogram > > name="WebCore.IndexedDB.OpenTime" units="milliseconds"> > > On 2014/08/15 00:05:16, jsbell wrote: > > > I think the first three is probably enough. > > > > > > Understanding these numbers is probably going to require some thought. > > > open->success is easy to reason about. open->upgrade->success is dependent > on > > > how much work the site's upgrade code does. > > > > I thought we were just going to measure the time until the backend decides > which > > event it will fire _first_. In which case the site's upgrade code is > irrelevant. > > That's the data we're getting from customers, right? The time to _first_ > event? > > > > Though open->success will still be bimodal, one peak for the already-opened > case > > and another (very broad) peak for the not-yet-opened case. > > That's exactly what this CL is - or did I miss something? I think dgrogan and I are suggesting we have 4 different histograms based on the first event time, i.e. one for success, one for blocked, one for upgradeneeded, one for error.
On 2014/08/18 16:34:00, jsbell wrote: > On 2014/08/18 16:32:23, cmumford wrote: > > On 2014/08/15 02:09:30, dgrogan wrote: > > > > > > https://codereview.chromium.org/470373002/diff/1/tools/metrics/histograms/his... > > > File tools/metrics/histograms/histograms.xml (right): > > > > > > > > > https://codereview.chromium.org/470373002/diff/1/tools/metrics/histograms/his... > > > tools/metrics/histograms/histograms.xml:35866: +<histogram > > > name="WebCore.IndexedDB.OpenTime" units="milliseconds"> > > > On 2014/08/15 00:05:16, jsbell wrote: > > > > I think the first three is probably enough. > > > > > > > > Understanding these numbers is probably going to require some thought. > > > > open->success is easy to reason about. open->upgrade->success is dependent > > on > > > > how much work the site's upgrade code does. > > > > > > I thought we were just going to measure the time until the backend decides > > which > > > event it will fire _first_. In which case the site's upgrade code is > > irrelevant. > > > That's the data we're getting from customers, right? The time to _first_ > > event? > > > > > > Though open->success will still be bimodal, one peak for the already-opened > > case > > > and another (very broad) peak for the not-yet-opened case. > > > > That's exactly what this CL is - or did I miss something? > > I think dgrogan and I are suggesting we have 4 different histograms based on the > first event time, i.e. one for success, one for blocked, one for upgradeneeded, > one for error. No problem, I thought that from an earlier comment, but somehow thought the request was reversed.
Again with four histograms.
lgtm with nit https://codereview.chromium.org/470373002/diff/40001/content/browser/indexed_... File content/browser/indexed_db/indexed_db_callbacks.h (right): https://codereview.chromium.org/470373002/diff/40001/content/browser/indexed_... content/browser/indexed_db/indexed_db_callbacks.h:140: base::TimeTicks conn_open_start_time_; nit: Avoid uncommon abbreviations in names (sorry, makes it very long)
lgtm
The CQ bit was checked by cmumford@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/cmumford@chromium.org/470373002/60001
FYI, CQ is re-trying this CL (attempt #1). The failing builders are: android_clang_dbg on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/android_clang_d...) android_dbg on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/android_dbg/bui...)
Message was sent while issue was closed.
Committed patchset #4 (60001) as 290419 |