This is to avoid value metrics skipping buckets due to
DUMP_REPORT_REQUESTED, since if the metric needs to include the current
bucket under time constraints and needs to pull, it will drop the bucket
since it cannot pull.
Test: atest statsd_test
Bug: 158879346
Change-Id: Ia61e69530456ce2b3530add03ec6e068ffb25fb5
Test: m
Test: manually verified that DeviceIdleModeStateChanged atom gets logged
before boot completes (using ag/11529814 to log)
Bug: 156913221
Change-Id: I3dbf154083f1cbe660625066dc50b6a8ffd60d7c
If a pull happens at the same event time, we should reuse the existing
data, regardless of whether or not the cool down has been met. For
example, if an app upgrade happens at time t, and two metrics need to
pull atom a, if metric one pulls at time t, but metric two initiates the
pull at time t+2, we should still reuse the pull from time t since that
is when the app upgrade happened.
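
The reuse rule above can be sketched as follows. This is a minimal
illustration, not the statsd implementation: PullCache, CachedPull, and
their fields are hypothetical names, and the pulled data is reduced to a
vector of ints.

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical cache entry; not the real statsd type.
struct CachedPull {
    int64_t eventTimeNs;  // event time the pull was taken for
    int64_t pullTimeNs;   // time the pull actually ran
    std::vector<int> data;
};

// Hypothetical cache keyed by atom id.
class PullCache {
public:
    explicit PullCache(int64_t coolDownNs) : mCoolDownNs(coolDownNs) {}

    // Returns the cached data if it may be reused, otherwise nullptr.
    const std::vector<int>* lookup(int atomId, int64_t eventTimeNs,
                                   int64_t nowNs) const {
        auto it = mCache.find(atomId);
        if (it == mCache.end()) return nullptr;
        const CachedPull& cached = it->second;
        // Same event time: always reuse, even if the cool down has passed.
        if (cached.eventTimeNs == eventTimeNs) return &cached.data;
        // Different event time: reuse only while still inside the cool down.
        if (nowNs - cached.pullTimeNs < mCoolDownNs) return &cached.data;
        return nullptr;
    }

    void store(int atomId, int64_t eventTimeNs, int64_t nowNs,
               std::vector<int> data) {
        mCache[atomId] = CachedPull{eventTimeNs, nowNs, std::move(data)};
    }

private:
    int64_t mCoolDownNs;
    std::map<int, CachedPull> mCache;
};
```

In the example from the text, metric two calls lookup() with event time t
at wall time t+2 and still gets the data that metric one stored at time t.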
Bug: 156294650
Test: atest statsd_test
Change-Id: I4efc49545093f6683bf6dd89ed68c5dfa5b44d8f
This CL aims to fix two race conditions:
1. When statsd restarts after a crash, the ordering of sayHiToStatsd and
binderDied is not guaranteed. However, previously, we assumed that
binderDied would get called first and reset sStatsd to null. To solve
this, we don't assume a function ordering and don't throw an error in
sayHiToStatsd if sStatsd is not null.
2. When statsd was linked to death, the death recipient was not informed
about all broadcast receivers. Thus, the death recipient might have
known only a partial list of receivers when #binderDied was triggered. To
solve this, we make sure that the death recipient knows about all
receivers before we link to death.
Test: atest statsd_test
Test: atest CtsStatsdHostTestCases
Bug: 154275510
Change-Id: I11be65ca2135cde200ab8ecb611a363d8f7c2eb6
Also clean up some of the code for splitting on app upgrades.
Piggy-backed off the app upgrade tests, adding parameterized tests to
also test boot complete event.
Refactored some value metric test code to increase code reuse and
assertions.
Fixed a broken value metric test that had assertions commented out.
Refactored NamedLatch into MultiConditionTrigger to avoid creating a
thread before necessary.
Test: atest statsd_test
Test: push a simple test config, reboot, wait, get data. Made sure
the bucket was split
Bug: 144099206
Bug: 154511974
Change-Id: I73858b5db08e8cda762bd8091b30da8738d1fd88
Also remove sendAppBreadcrumb binder api because it's no longer used.
Bug: 154264326
Test: atest com.google.android.statsd.gts.StatsdHostTestCases
Change-Id: Ic51a057bb01a89a24337521a49c54a52e2073cd1
This is a synchronizing primitive that is similar to a latch, but a
thread must count down with an identifier. It will be used to make sure
the boot complete, uid map, and all pullers signals are received before
triggering the bucket split.
The latch's countDown operation takes in a string identifier, so that if
the same operation happens twice, it is only counted once.
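
A minimal sketch of such a primitive, assuming a fixed set of expected
identifiers known at construction time. The class name NamedLatchSketch
and its methods are illustrative only and are not taken from the statsd
sources.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <set>
#include <string>
#include <utility>

// Hypothetical latch that counts down by name instead of by number.
class NamedLatchSketch {
public:
    explicit NamedLatchSketch(std::set<std::string> names)
        : mRemaining(std::move(names)) {}

    // Counting down the same name twice has no additional effect,
    // because the name is erased from the set the first time.
    void countDown(const std::string& name) {
        bool done;
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mRemaining.erase(name);
            done = mRemaining.empty();
        }
        if (done) mCv.notify_all();
    }

    // Blocks until every expected name has been counted down.
    void wait() {
        std::unique_lock<std::mutex> lock(mMutex);
        mCv.wait(lock, [this] { return mRemaining.empty(); });
    }

    std::size_t remaining() {
        std::lock_guard<std::mutex> lock(mMutex);
        return mRemaining.size();
    }

private:
    std::mutex mMutex;
    std::condition_variable mCv;
    std::set<std::string> mRemaining;
};
```

Storing the outstanding names in a set, rather than keeping an integer
count, is what makes a repeated countDown for the same signal harmless.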
Bug: 144099206
Test: atest statsd_test
Change-Id: I261a3e50eabbc4998ca30ddf2d67a9a1e788911e
Overall flow of implementation:
1. Parse the config in MetricsManager to store the uids per atom. This
follows the mAllowedLogSources logic very closely.
2. MetricsManager registers itself as a PullUidProvider with the
PullerManager.
3. Metrics pass the config key when pulling (for both registering
receivers and normal pulls), and the puller manager gets
the allowed uids from the PullUidProvider for that config.
4. PullerManager keys receivers by <atomId, configKey> so that it can
look up the uids for that atom using the PullUidProvider as well.
5. Added shell subscriber support. Hardcode a default of AID_SYSTEM for
them and also allow packages per atom. This involved adding a second
interface to Pull that simply accepts the uids, since I didn't want to
make the ShellSubscriber a PullUidProvider as well.
6. Change adb shell cmd stats pull-source to allow users to specify a
package. Default to AID_SYSTEM as well.
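
Steps 1 through 3 can be sketched roughly as below. This is a simplified
illustration under stated assumptions: ConfigKey is reduced to an int,
the method names (getPullAtomUids, registerPullUidProvider, allowedUids)
and the *Sketch classes are hypothetical, and receivers and the actual
pull are omitted.

```cpp
#include <map>
#include <memory>
#include <utility>
#include <vector>

using ConfigKey = int;  // stand-in for the real config key type

// Interface the puller manager uses to ask a config for its allowed uids.
class PullUidProvider {
public:
    virtual ~PullUidProvider() = default;
    virtual std::vector<int> getPullAtomUids(int atomId) = 0;
};

// Hypothetical metrics manager: stores uids per atom parsed from a config.
class MetricsManagerSketch : public PullUidProvider {
public:
    explicit MetricsManagerSketch(std::map<int, std::vector<int>> uidsPerAtom)
        : mUidsPerAtom(std::move(uidsPerAtom)) {}

    std::vector<int> getPullAtomUids(int atomId) override {
        auto it = mUidsPerAtom.find(atomId);
        return it == mUidsPerAtom.end() ? std::vector<int>() : it->second;
    }

private:
    std::map<int, std::vector<int>> mUidsPerAtom;
};

// Hypothetical puller manager: looks up the provider by config key.
class PullerManagerSketch {
public:
    void registerPullUidProvider(ConfigKey key,
                                 std::weak_ptr<PullUidProvider> provider) {
        mProviders[key] = std::move(provider);
    }

    // Returns the uids allowed to serve atomId for the given config, or an
    // empty list if the config is unknown or its provider is gone.
    std::vector<int> allowedUids(int atomId, ConfigKey key) const {
        auto it = mProviders.find(key);
        if (it == mProviders.end()) return {};
        std::shared_ptr<PullUidProvider> provider = it->second.lock();
        if (!provider) return {};
        return provider->getPullAtomUids(atomId);
    }

private:
    std::map<ConfigKey, std::weak_ptr<PullUidProvider>> mProviders;
};
```

The weak_ptr here is a design choice of the sketch, so that the puller
manager does not keep a removed config's provider alive.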
Notes:
The feature is flagged off right now, since configs do not pass in the
desired package. Another approach could be to hardcode in the current
mapping, but that doesn't work for OEM pulled atoms.
Test: m statsd
Test: bit statsd_test:* with useUids = false
Test: bit statsd_test:* with useUids = true
Bug: 144099783
Bug: 151978258
Change-Id: I4a7481d7402a52b9beb4ea28b102803f9e50e79f
Update statsd to take in times in milliseconds instead of nanoseconds.
Also make appropriate updates for graphics stats, odpm, subsystem sleep
state, and LibStatsPullTests
Test: atest LibStatsPullTests
Test: bit statsd_test:*
Bug: 150788562
Change-Id: I593552d6c50bb4dcb89ca9cc1c737781653e7cc5
1. Rename registerPullAtomCallback to setPullAtomCallback
2. Rename unregisterPullAtomCallback to clearPullAtomCallback
3. Add getters to PullAtomMetadata
4. Change Ns to Millis (when I tried to make it Nanos, I received a
build-time error saying to prefer millis unless we need the precision.
We do not need the precision, so I changed it).
5. Fix out of order params.
I did not change usePooledBuffer to setPooledBuffer because I think use
is more appropriate for our use case.
Test: make
Test: atest PullAtomMetadataTest
Test: atest GtsStatsdHostTestCases
Bug: 149475498
Change-Id: Ib07aa57a6e02c77917fe0e65a3d4a77c00ce8565
Major changes include:
- Removing unused permission checks within StatsService. These
include ENFORCE_DUMP_AND_USAGE_STATS, checkDumpAndUsageStats,
kOpUsage, and kPermissionUsage.
- Converting from sp to shared_ptr
- Using libbinder_ndk functions instead of libbinder functions
(e.g. for installing death recipients, getting calling uids, etc.)
- New death recipients were added in StatsService,
ConfigManager, and SubscriberReporter.
- Using a unique token (timestamp) to identify shell subscribers
instead of IResultReceiver because IResultReceiver is not exposed by
libbinder_ndk. Currently, statsd cannot detect if perfd dies; we
will fix that later.
Bug: 145232107
Bug: 148609603
Test: m statsd
Test: m statsd_test
Test: bit statsd_test:*
Test: atest GtsStatsdHostTestCases
Change-Id: Ia1fda7280c22320bc4ebc8371acaadbe8eabcbd2
Also changed StatsLog to call write() instead of the hard coded function
in StatsService.
Test: gts-tradefed run gts-dev --module GtsStatsdHostTestCases
Change-Id: I26171fa4cfc877e1e179b74ec8076d964aff8548
The checkCallingPermission function is not supported by libbinder_ndk.
To circumvent this issue, statsd will now ask StatsCompanionService
(SCS) to do that check using a synchronous Binder call. Once
libbinder_ndk does support checkCallingPermission, this workaround will
be unnecessary.
Test: m -j
Test: atest GtsStatsdHostTestCases
Bug: 145566180
Change-Id: I11f0e82f88aa5921cf531fd041b0a18d3a26a0a0
Change #setBroadcastSubscriber and #unsetBroadcastSubscriber
to avoid using IntentSender.
Bug: 146074295
Test: Ran GTS Tests
Change-Id: I1510e44bcdf49b579fd49f51081c6a40618039fa
Avoid using IntentSender in #sendActiveConfigsChangedBroadcast
and #removeActiveConfigsChangedBroadcast.
Bug: 146074295
Test: Ran GTS Tests
Change-Id: I9313299ea0bc89f092b1c62fbfc34e06a127eaa9
Adds an API to unregister pullAtomCallbacks.
Fixes a bug where the StatsEventParcels are null.
Makes onPullAtom a oneway call.
Makes onPullAtom return an int code instead of a boolean.
Makes the APIs throw RuntimeExceptions.
Test: make, boots
Test: atest GtsStatsdHostTestCases
Bug: 146385842
Bug: 146385173
Bug: 144373250
Change-Id: I107a705a9024240c5c9f9e276293de8410e2b6f3
To help with monitoring Mainline releases, log the reason
for a watchdog-initiated rollback. This may be due to
native crashes, app crashes, ANRs or explicit health check
failures.
Add a mapping from PackageWatchdog failure reason to the
new metrics.
Bug: 138782888
Test: atest PackageWatchdogTest
Test: atest StatsdHostTestCases
Change-Id: Ia3e73d955508297004591eac762555665c557b8a
As part of becoming a Mainline module, we need to understand basic
information about the health of our module in the wild. For the
MediaProvider module, we're interested in identifying these cases:
-- When scan operations result in an unexpectedly large number of
inserts, updates, or deletes in proportion to the total number of
files indexed. This typically indicates user data loss or a missing
database upgrade step.
-- When the overall duration of scan operations becomes significantly
slower in relation to the number of files indexed. This typically
indicates an indexing performance regression.
-- When a scan operation skips over an entire directory tree. This
can indicate an app placing ".nomedia" files in unexpected locations.
-- When a large number of media files are deleted by a specific
app. This can help identify data loss bugs caused by MediaProvider
directly, or attribute data loss bugs in other apps.
-- When database upgrade/downgrade operations take a substantial
amount of time in proportion to the total number of files. This
typically indicates a performance regression.
-- When performing idle maintenance, a large number of stale or
expiring media can indicate an invalidation bug.
Bug: 143723019
Test: manual
Change-Id: I89c5b5b51a843a67348a7bb4b8e6ac01fb2b15b9
Internal implementation of the puller API. Registers pullers by putting
them in the kAllPullAtomInfo map. Implements the actual pull,
with condition variables to timeout.
Lastly, keys the kAllPullAtom info by a PullerKey, which is a uid and
atom id. However, the uid is just set to a default of -1 for now. I will
work on the security implementation in a follow-up CL.
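
The two ideas above, a map key made of uid and atom id and a pull that
times out on a condition variable, can be sketched as follows. PullerKey
mirrors the name in the text, but its fields, the comparison operator, and
the PullResultWaiter class are assumptions made for illustration.

```cpp
#include <chrono>
#include <condition_variable>
#include <map>
#include <mutex>
#include <tuple>

// Key for the puller map: a uid and an atom id. The text notes the uid
// is a default of -1 for now.
struct PullerKey {
    int uid;
    int atomId;
    bool operator<(const PullerKey& other) const {
        return std::tie(uid, atomId) < std::tie(other.uid, other.atomId);
    }
};

// Hypothetical helper: the pulling thread waits here until the callback
// delivers a result or the timeout expires.
class PullResultWaiter {
public:
    // Called from the puller's callback when data arrives.
    void deliver(int value) {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mValue = value;
            mReady = true;
        }
        mCv.notify_one();
    }

    // Returns true and fills *out if a result arrived before the timeout.
    bool waitFor(std::chrono::milliseconds timeout, int* out) {
        std::unique_lock<std::mutex> lock(mMutex);
        if (!mCv.wait_for(lock, timeout, [this] { return mReady; })) {
            return false;  // timed out; the pull is treated as failed
        }
        *out = mValue;
        return true;
    }

private:
    std::mutex mMutex;
    std::condition_variable mCv;
    bool mReady = false;
    int mValue = 0;
};
```

Waiting with a predicate guards against spurious wakeups and against a
result that was delivered before the wait started.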
Test: builds, boots
Test: I will write unit tests in the future. It's very difficult to
write any without StatsEvent being completed.
Change-Id: Id602dd297b6ba7df811e2d5ab2e77efc0684e418
Currently, it is possible for two threads in statsd to concurrently
access/modify memory in ConditionTrackers since they do not have locks.
This happens when one thread is processing LogEvents (lock on
StatsLogProcessor mutex), while the other thread receives uidmap updates
and locks on the mutex in the MetricProducer. This CL changes uidmap
updates to also go through the mutex in StatsLogProcessor.
Test: bit statsd_test:*
Test: atest CtsStatsdHostTestCases
Test: local test (ag/9725088) that forced the race condition now passes
Bug: 144373785
Change-Id: I04ae2f7ed025f5ce8bc4fdeb7f10717e20d76282
This creates a java API for registering pullers. Will implement the
statsd side in a follow up CL.
Test: builds, boots
Change-Id: Ib6735984297ce3148839a6370a3c15b2a585baf5
ShellSubscriber is lazily initialized, and multiple threads can attempt
to write the same pointer since it is not initialized in threadsafe
code. Additionally, there is an NPE that crashes statsd when a null
ResultReceiver is passed in, which allows an attacker to repeatedly
crash statsd until the race condition occurs. More details, including a
proof of concept attack, are in the bug.
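
A minimal sketch of the two fixes described above, assuming a mutex guards
the lazy initialization and a null receiver is rejected up front. The
*Sketch types and startSubscription are hypothetical stand-ins, not the
real StatsService code.

```cpp
#include <memory>
#include <mutex>

// Hypothetical stand-ins for the real types.
struct ResultReceiverSketch {};
struct ShellSubscriberSketch {
    int subscriptions = 0;
};

class StatsServiceSketch {
public:
    // Returns false instead of dereferencing a null receiver.
    bool startSubscription(const ResultReceiverSketch* receiver) {
        if (receiver == nullptr) return false;
        // The lock makes check-then-create atomic, so two threads cannot
        // both observe a null pointer and both write it.
        std::lock_guard<std::mutex> lock(mShellSubscriberMutex);
        if (mShellSubscriber == nullptr) {
            mShellSubscriber.reset(new ShellSubscriberSketch());
        }
        mShellSubscriber->subscriptions++;
        return true;
    }

    bool initialized() {
        std::lock_guard<std::mutex> lock(mShellSubscriberMutex);
        return mShellSubscriber != nullptr;
    }

private:
    std::mutex mShellSubscriberMutex;
    std::unique_ptr<ShellSubscriberSketch> mShellSubscriber;
};
```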
Bug: 141243101
Test: repro steps in bug no longer crash statsd
Test: with only the lock on initializing mShellSubscriber, statsd
still crashed but after ~7 minutes, no race condition occurred.
Change-Id: Ib56f888620497fb41d1627c07867693eb251738e