Perfetto: Stop splitting _COMPLETE events

The current Perfetto backend splits each _COMPLETE trace event into a
separate _BEGIN/_END pair, because an event cannot practically be
modified once it has been written into the shared memory buffer.

This breaks assumptions the trace-viewer makes about the ordering of
begin/end events relative to async events. It also bloats trace sizes
and adds overhead for the perf infra.

Instead, we now keep pending _COMPLETE events on an internal stack in
TLS and only emit each one once its duration is known.
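
A minimal sketch of the stack-based approach, for illustration only.
The names CompleteEvent, ThreadLocalEventSink and
GetSinkForCurrentThread are hypothetical and are not the types touched
by this change; the real code writes into the Perfetto shared memory
buffer, whereas this sketch appends to a plain vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event record; not the actual Perfetto/Chromium type.
struct CompleteEvent {
  std::string name;
  int64_t start_us;
  int64_t duration_us;  // -1 while the event is still pending.
};

// Hypothetical per-thread sink. Pending events live on a stack and are
// emitted as a single record only once their duration is known.
class ThreadLocalEventSink {
 public:
  // Called when a _COMPLETE event starts: nothing is emitted yet.
  void BeginComplete(std::string name, int64_t now_us) {
    pending_.push_back(CompleteEvent{std::move(name), now_us, -1});
  }

  // Called when the innermost pending event ends: fill in the duration
  // and emit one record instead of a _BEGIN/_END pair.
  void EndComplete(int64_t now_us) {
    if (pending_.empty())
      return;  // Unmatched end; ignored in this sketch.
    CompleteEvent event = std::move(pending_.back());
    pending_.pop_back();
    event.duration_us = now_us - event.start_us;
    emitted_.push_back(std::move(event));
  }

  const std::vector<CompleteEvent>& emitted() const { return emitted_; }
  std::size_t pending_count() const { return pending_.size(); }

 private:
  std::vector<CompleteEvent> pending_;  // Stack of open events.
  std::vector<CompleteEvent> emitted_;  // Stand-in for the trace buffer.
};

// One sink per thread, so no locking is needed on the stack.
ThreadLocalEventSink& GetSinkForCurrentThread() {
  thread_local ThreadLocalEventSink sink;
  return sink;
}
```

In this sketch, nested events are emitted in the order they end, so an
inner event reaches the buffer before the outer event that encloses it.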

R=eseckler@chromium.org,skyostil@chromium.org

Bug: 909728,888558
Change-Id: I80e37264de66d8bbcb6c9095d21047957fd6eb9f
Reviewed-on: https://chromium-review.googlesource.com/c/1354503
Commit-Queue: oysteine <oysteine@chromium.org>
Reviewed-by: Eric Seckler <eseckler@chromium.org>
Reviewed-by: Sami Kyöstilä <skyostil@chromium.org>
Cr-Commit-Position: refs/heads/master@{#612360}
6 files changed