53 Commits

Author SHA1 Message Date
Tej Singh
756541526a Mark bucket as condition_unknown properly
Previously, if the condition was unknown for the entire bucket, the
bucket would be annotated with NO_DATA rather than being dropped
due to CONDITION_UNKNOWN. This change drops all such buckets with
CONDITION_UNKNOWN as the reason. Note that this will make the condition
unknown problem appear worse, when in reality it makes our bucket drop
reasons more correct and should not have any impact on the true
CONDITION_UNKNOWN.

Also fix an issue where a test was failing due to a previous test
leaving data persisted to disk. This fix is a hack for now, and should
be properly fixed later.

Bug: 158897179
Test: atest statsd_test
Change-Id: Ic703600d37aa1873ec67cb1a279dcf349d48fc9f
2020-06-13 03:50:59 -07:00
Jeffrey Huang
0017b441a8 Merge "Avoid timestamp update when data is kept on dump" into rvc-dev 2020-06-11 18:27:49 +00:00
Jeffrey Huang
289eae6dbd Avoid timestamp update when data is kept on dump
Bug: 158703584
Test: atest statsd_test
Change-Id: Ia6814c2cdb67dde2fd790ddc18fc785b1bba062c
2020-06-10 19:01:35 -07:00
Tej Singh
5a4727d78b Remove invalid configs from memory
Right now if we receive an invalid config update, we keep the existing
config in memory. This makes us remove the existing config so that we
don't have any config for the invalid config key.

Test: atest statsd_test
Bug: 158617758
Change-Id: I9daecb1c96e3a63fea3a45b07d1295f3b4ba452a
2020-06-09 18:50:25 -07:00
Muhammad Qureshi
dff78d6240 Use ASSERT_EQ for size assertions.
This fixes tests halting when accessing invalid indices.

Fixes: 156373877
Test: statsd_test
Change-Id: Ia2a9d7c71228e84467607e1485dbec33e8e6a094
2020-05-12 09:56:37 -07:00
Muhammad Qureshi
b635b3a2eb Add tests for mapIsolatedUidToHostUid methods
- Remove unneeded dependency on atoms.proto
- Remove extractIntoVector function. Assert on actual output directly.
- Use a helper method for creating UidMap mock for all tests.
- Add AttributionChain variants of existing tests in puller_util_tests.
- Add tests in StatsLogProcessor_test to test mapIsolatedUidToHostUid()
    in StatsLogProcessor.

Bug: 154285085
Test: bit statsd_test:*

Change-Id: I6d0822be9f0fbca23bd2f441dc63b97110567307
2020-04-29 09:58:03 -07:00
Tej Singh
3eb9cede0b Fix PullUidProvider unregistering on config update
Previously, MetricsManagers would unregister themselves as a
PullUidProvider for a given ConfigKey in the destructor. This caused all
pulls to fail after a config update because the new MetricsManager would
register itself before the old MetricsManager was destructed and
unregistered. This resulted in the old MetricsManager removing the new
config since they shared the same config key. The fix is for the
PullerManager to check that the PullUidProviders are equal in the
unregister function before actually erasing it.
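
A minimal sketch of the equality check described above. `ProviderRegistry`, `Provider`, and the method names are hypothetical stand-ins for illustration, not statsd's actual classes or signatures.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a PullUidProvider implementation.
struct Provider {};

// Hypothetical registry keyed by a config key (a plain string here).
class ProviderRegistry {
public:
    // A later registration for the same key replaces the earlier one.
    void registerProvider(const std::string& key, const Provider* provider) {
        mProviders[key] = provider;
    }

    // Erase the entry only if the caller is still the registered provider.
    // A stale owner calling this after being replaced leaves the entry alone.
    void unregisterProvider(const std::string& key, const Provider* provider) {
        auto it = mProviders.find(key);
        if (it != mProviders.end() && it->second == provider) {
            mProviders.erase(it);
        }
    }

    bool hasProvider(const std::string& key) const {
        return mProviders.count(key) > 0;
    }

private:
    std::map<std::string, const Provider*> mProviders;
};
```

With this check, the order "new registers, then old unregisters" leaves the new registration intact.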

Test: bit statsd_test:* (wrote a failing test that now passes).
Test: statsd_localdrive to manually update a config, ensured pulls still
worked.
Bug: 154544328

Change-Id: Id7af3b3b407e24bee74fc34bd1c2b9e0575e9c9e
2020-04-21 20:24:26 -07:00
Tej Singh
5d823b30fa Statsd test mapping
Makes a test mapping for statsd so that unit tests run on presubmit.

Changes to make the tests pass:
1. Require root. This is needed to write to disk, since the tests don't
run as statsd's uid

2. Remove AndroidTest.xml file in favor of the autogenerated one.

3. Remove a check in StatsService.test for getUidFromArgs. The test
checked a failure case where we passed a number bigger than INT32_MAX.
However, on a 32 bit device, strtol will return INT32_MAX when an
overflow happens, since it returns a 32 bit number on a 32 bit device.

4. Refactor a lot of e2e tests to sort dimensions, ensuring that the
dimensions are always in order, instead of relying on implicit ordering
of hashing, which can change.

5. Change a long to an int64 in TestActivationsPersistAcrossSystemServerRestart
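
The overflow behavior in item 3 can be avoided by parsing into a type that is at least 64 bits wide and range-checking explicitly. This is a sketch only: `parseInt32` is a hypothetical helper, not the code in getUidFromArgs.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: parses a decimal string into an int32_t and reports
// failure instead of silently clamping on overflow.
bool parseInt32(const char* text, int32_t* out) {
    errno = 0;
    char* end = nullptr;
    // long long is at least 64 bits on every platform, so values just above
    // INT32_MAX are still representable and can be rejected explicitly.
    long long value = std::strtoll(text, &end, 10);
    if (end == text || *end != '\0') {
        return false;  // no digits, or trailing characters
    }
    if (errno == ERANGE) {
        return false;  // outside the range of long long
    }
    if (value < INT32_MIN || value > INT32_MAX) {
        return false;  // outside the range of int32_t
    }
    *out = static_cast<int32_t>(value);
    return true;
}
```

Because the range check is done on a 64-bit value, the result does not depend on the width of `long` on the device.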

Test: statsd_test
Bug: 129613474
Change-Id: I80dfa3bfd50ebe6d2c8c0c3ba201f3ad06b68910
2020-04-09 22:20:59 -07:00
tsaichristine
a3d2ed8cb2 (Part 3) Use new socket schema with statsd tests
Update last set of statsd tests to use new socket schema

Test: bit statsd_test:*
Bug: 149590301
Change-Id: I0fe2c219ad75813db54ff0cfbad50f55e29cb626
2020-03-23 12:23:26 -07:00
Jeffrey Huang
3eb84d4da9 Migrate statsd_test to use libstatslog_statsdtest
This is part of the migration to remove libstatslog from statsd

Since we still depend on libstatsmetadata, we cannot fully mock out
all the atoms. So we will create a new libstatslog_statsdtest to house
atoms used only in the tests.

Bug: 150976524
Test: bit statsd_test:*
Change-Id: I6368305eb89b2c35e670e42907a308afd922e604
2020-03-17 11:59:01 -07:00
Jeffrey Huang
1e4368aa43 Comment out Statsd tests
Added todos to make them use the new schema.

Bug: 145923087
Test: m -j && bit statsd_test:*
Change-Id: I0749760eb3123407b78b9ace9a93967bac727bf5
2020-02-18 18:36:02 -08:00
Ruchir Rastogi
e449b0c185 Move statsd (and tests) to libbinder_ndk
Major changes include:
    - Removing unused permission checks within StatsService. These
      include ENFORCE_DUMP_AND_USAGE_STATS, checkDumpAndUsageStats,
      kOpUsage, and kPermissionUsage.
    - Converting from sp to shared_ptr
    - Using libbinder_ndk functions instead of libbinder functions
      (e.g. for installing death recipients, getting calling uids, etc.)
        - New death recipients were added in StatsService,
          ConfigManager, and SubscriberReporter.
    - Using a unique token (timestamp) to identify shell subscribers
      instead of IResultReceiver because IResultReceiver is not exposed by
      libbinder_ndk. Currently, statsd cannot detect if perfd dies; we
      will fix that later.

Bug: 145232107
Bug: 148609603
Test: m statsd
Test: m statsd_test
Test: bit statsd_test:*
Test: atest GtsStatsdHostTestCases
Change-Id: Ia1fda7280c22320bc4ebc8371acaadbe8eabcbd2
2020-02-14 18:07:37 -08:00
Tej Singh
480392f55e Data/activation broadcasts use elapsed realtime
The data and activation broadcasts were guardrailed using
elapsedRealtime of the *LogEvent*. However, it's possible to use
incorrect timestamps, and it's also possible that we could process
events that are old, which would result in the broadcast getting sent
too frequently. To fix this, we should use the current elapsedRealtime
instead of the LogEvent's elapsedRealtime.
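
A minimal sketch of a rate limiter keyed on the caller's current clock reading rather than on an event timestamp. `BroadcastGuardrail` and its parameters are hypothetical names for illustration; the real guardrail interval and clock source are not shown in this log.

```cpp
#include <cstdint>

// Hypothetical guardrail: allows at most one broadcast per minimum interval.
class BroadcastGuardrail {
public:
    explicit BroadcastGuardrail(int64_t minIntervalNs)
        : mMinIntervalNs(minIntervalNs) {}

    // nowElapsedNs must be the current elapsed realtime read by the caller,
    // not the timestamp carried by the event being processed.
    bool shouldSend(int64_t nowElapsedNs) {
        if (mHasSent && nowElapsedNs - mLastSentNs < mMinIntervalNs) {
            return false;
        }
        mHasSent = true;
        mLastSentNs = nowElapsedNs;
        return true;
    }

private:
    int64_t mMinIntervalNs;
    int64_t mLastSentNs = 0;
    bool mHasSent = false;
};
```

Since the decision depends only on when the check runs, old or incorrectly stamped events cannot influence how often the broadcast is sent.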

I can remove the config activation broadcast if we think we should hold
off on it.

Test: bit statsd_test:*
Bug: 143155387
Change-Id: I4c58d2558d6ba3b4fd15a4a619d6f80a7bd7113f
2019-10-29 14:19:38 -07:00
Ruchir Rastogi
2b1d4febae Cleaned up StatsLogProcessor Test
Test: bit statsd_test:*
Bug: 131658651
Change-Id: I9c70bb493d0d7cb668f2105b9b723d8f58e272ac
2019-10-03 17:11:44 -07:00
Tej Singh
3ce441f63c Statsd unit tests: clear data on disk after tests
A couple of tests had data stored to disk, but were not reading the
data. This meant that the data would persist across tests (and between
test runs and forever). This causes issues when combined with the fact
that in tests, we use ConfigKey without initializing it. This means that
the uid/id of the config key will just be set to some random stack
space. In Christine's case, adding State to the StatsdConfig.proto
caused the "random" stack space to be the same as the one in
ConfigTtl_e2e_test, which also wrote data to disk. Then, when the
duration metric test called onDumpReport, it would grab the data on disk
(from ConfigTtl, which used a count metric), in addition to the actual
data we cared about. Adding the logic to convert the output to base64
string likely changed which stack space was allocated, meaning the
config ttl data would not be pulled, thereby destroying our debugging
efforts.

Test: unit tests pass, nothing is written to /data/misc/stats-data after
running unit tests.
Test: unit tests pass with Christine's proto change to add State to
StatsdConfig

Change-Id: Ica987b7a1c089dae6c45d0500bf6557fa7402191
2019-09-24 19:19:55 -07:00
Tej Singh
f53d445cef Persist active metric status across system server
Previously, all metrics/configs would deactivate on system server death.
Now, active status is restored.

Bug: 129717537
Test: bit statsd_test:*
Test: libprotoutil_test:ProtoOutputStream*
Change-Id: Idf372457f60a931a2d00176a5eab58c534a25e41
2019-05-13 15:37:19 -07:00
Muhammad Qureshi
15f8da95f1 Add ActivationType to EventActivation
This allows setting one metric with both IMMEDIATE and ACTIVATE_ON_BOOT
EventActivations

Also, if an on-boot Activation that is already active gets another
activation signal, ignore it.

Bug: 128880263
Fixes: 128880263
Test: statsd_test
Change-Id: I8d483882836c9abc184230b4a70d4734d49d93c3
2019-04-30 07:35:44 -07:00
Muhammad Qureshi
844694bc5d Save EventActivations to disk
Also:
- rename time_to_live to ttl
- rename activation_ns to start_ns

Bug: 129719662
Fixes: 129719662
Test: statsd_test

Change-Id: I4069f85d0c1f5bd0885a9588d8a9157d94b2c587
2019-04-30 07:35:24 -07:00
Muhammad Qureshi
3a5ebf589e Cancel Metric activations
Cancel Metric activations triggered by atom matchers

Bug: 128218061
Test: statsd_test
Test: statsd_localdrive
Change-Id: I90a705d74725c2aa04025e18e1fa77ec4fefc522
2019-04-03 06:03:48 -07:00
Olivier Gaillard
6c75ecdcef Introduces an option to set a dump latency requirement.
We are currently dumping invalid data for pulled metrics. Pulled metrics
require a new pull when flushing a bucket. We should either do another
pull or invalidate the previous bucket.

There are cases where we cannot afford to do another pull, e.g. statsd
being killed. If we do not have enough time, we'll just invalidate the
bucket to make sure we have correct data.

Bug: 123866830
Test: atest statsd_test
Change-Id: I090127cace3b7265032ebb2c9bddae976c883771
2019-02-20 17:59:50 +00:00
Tej Singh
d823aebd2f Unit tests for active config change broadcast
Modifies statsd unit tests to also assert the active configs changed
broadcast is called at the right time with the right configs. One case
has one config with multiple activation signals, while the other has
multiple configs that are activated at the same time with one signal.

Test: this is the test
Change-Id: I6e1cd19a8a0a8dbf9745152d4ad4980854be0cbd
2019-02-13 16:43:49 -08:00
Tej Singh
6ede28bcb9 Statsd sends active config broadcasts
Statsd now sends active configs changed broadcasts when needed per uid.
Also made an adb command to help debug.

More gts tests and unit tests required, will follow.
Test: GTS in topic
Bug: 123372077
Change-Id: Ib079018ded85d002581ffc2ba1240138ce7a54e7
2019-02-12 19:28:26 -08:00
Chenjie Yu
a9a310ec54 metric activation on boot
Bug: 123038368
Test: unit test
Change-Id: Id374bdfd8d15264ada0e7bac0388080be308ac8f
2019-02-07 10:54:55 -08:00
Chenjie Yu
c7939cba3d Persist active metrics to disk and read back
For metric activation that spans across boots, we need to persist active
metrics onto disk upon shutdown and load them on boot.

Bug: 123904359
Test: unit test
Change-Id: I5a4142a42595c8c132175fb574c3aa2ad30dcac0
2019-02-06 09:54:38 -08:00
Bookatz
ea20bffedc Fix statsd_test TestOnDumpReportEraseData
After erasing the statsd data, allow statsd to have an empty report, or
even a report with a non-empty metrics wrapper, as long as it doesn't
have any of the former count metrics in it.

Bug: 77909781
Test: make -j8 statsd_test && adb sync data && adb shell data/nativetest64/statsd_test/statsd_test
Change-Id: I525c61aea97a185df8916e4c1b4c4118493ed780
2018-12-18 10:07:56 -08:00
Bookatz
3e90658294 statsd local tool
Adds a tool for local usage of statsd. The tool can:
-upload a config from a file
-get the report data from statsd
Both the config and the report can be either in binary or human-readable
format, as specified.

Usage:
make statsd_localdrive
./out/host/linux-x86/bin/statsd_localdrive

Also, adds the ability to specify whether dump-report should also erase
the data when it returns it. A test for this is added.

Test: make -j8 statsd_test && adb sync data && adb shell data/nativetest64/statsd_test/statsd_test
Test: make statsd_localdrive && ./out/host/linux-x86/bin/statsd_localdrive <commands>
Bug: 77909781
Change-Id: I9a38964988e90c4158a555f41879534267aadd32
2018-12-12 11:42:39 -08:00
TreeHugger Robot
b1c6ba026d Merge "Statsd uidmap includes vers string and installer" 2018-11-15 08:56:28 +00:00
Bookatz
ff71cadecc Statsd can dump data as proto to bugreport
* Creates an incident section for statsd data.
* Allows dump to output statsd data, in proto format.
* Hooks up two statsd outputs to bugreports:
  -statsd report data in proto format
  -statsd metadata (statsdstats) in text format

The incident section does not import stats_log.proto because that turns
out to be extremely difficult: stats_log.proto imports atoms.proto,
which imports more things and is enormous and causes all sorts of
problems. atoms.proto was purposefully never compiled in AOSP, so to
retain that feature, the incident section uses 'bytes' instead of an
actual message. Since this isn't ever read in AOSP (other than testing),
this should be fine.

Bug: 115678461
Test: take a bug report and confirm valid proto
Test: cts-tradefed run cts-dev -m CtsStatsdHostTestCases -t android.cts.statsd.atom.HostAtomTests#testDumpsysStats

Change-Id: I1c370af7678d1dc7440ce299ea5ea4da6d33832b
2018-11-05 18:52:49 -08:00
dwchen
730403e757 Statsd uidmap includes vers string and installer
Each config can choose to include version strings and installer with
each metrics report. This data may be useful in the cloud to filter
the app-specific data.

BUG: 115626330
Change-Id: I3972ff2a94e7f0347ac0cc8a443cf328c1731e13
Test: Modified unit-tests, verified on marlin-eng
2018-10-30 16:24:25 -07:00
Yao Chen
3ff3a490e4 Remove the obsolete code for logd and add statsd socket log loss detection.
+ Remove dead code
+ Add a simple log loss detection as a starter to see if there is any log loss
  detected at all.

TODO: If we do see log loss, we can add more sophisticated logging and reset mechanism.

Bug: 80538532
Test: statsd_test
Change-Id: Iff150c9d8f9f936dbd4586161a3484bef7035c28
2018-08-06 16:24:49 -07:00
Chenjie Yu
e22192071d StatsPullerManager no longer uses a singleton
This is to be consistent with other patterns such as UidMap.
This also makes unit tests simpler.

Change-Id: I1558cd609e470481f269ecf2ae616277a95cfbf0
Bug: 72722120
Test: unit test
2018-06-14 15:46:54 -07:00
David Chen
56ae0d9a48 Fixes statsd reports missing strings and SCS.
Reports written to disk don't contain the strings used, which will
make this report unusable if there are strings that don't show up
again. We should always include the strings, so this option is
removed entirely.

Also, we hard-coded the wrong number of fields when pulling
ModemActivityInfo. There are actually 10 fields, not 6.

Bug: 79601503
Test: Tested unit-tests pass on marlin-eng.
Change-Id: I6834b096ced77418a9cc2ddd79b08d1c9c447fae
2018-05-11 17:04:56 -07:00
David Chen
9e6dbbdadf Fix statsd returning uidmap with empty reports.
We notice devices uploading a bunch of bytes for the uidmap even if
the device is running an empty config, so there are no actual metrics
to report. This hardcodes some logic to skip the inclusion of the
uidmap if there are exactly 0 metrics.

Bug: 79381210
Test: Tested unit-tests on marlin-eng
Change-Id: I96348235341a7faf15ff57d4d1eccac635a3a999
2018-05-07 18:07:19 -07:00
David Chen
48944901f7 Fixes statsd returning too much data at once.
We observe a single ConfigMetricsReportList can be greater than the
safe size for the binder transaction buffer since we only check the
size of the current metrics in progress, but we also return the
previous reports stored on disk.

This change will attempt to send another ConfigMetricsReportList
as soon as possible if there's already a report on disk.

Also fixes a bug when trying to trigger data fetch before the client
has registered the corresponding dataFetchOperation.

Bug: 79201869
Test: Tested manually on marlin-eng
Change-Id: I2d3677162804a27e7a7a95d482d80c46bd994a67
2018-05-04 17:09:16 -07:00
Yangster-mac
9def8e3995 Reduce statsd log data size.
1. Hash the strings in metric dimensions.
2. Optimize the timestamp encoding in bucket.
   Use bucket num for full bucket and millis for
   partial bucket.
3. Encode the dimension path per metric and avoid
   deduping it across dimensions.

Test: statsd test
Change-Id: I18f69654de85edb21a9c835c73edead756295e05
BUG: b/77813755
2018-04-26 04:30:18 -07:00
Chenjie Yu
e36018b272 add dump report reason to reports
+ also change uidmapping version numbers to int64_t

Bug: 78132855
Change-Id: Iac7ea93e4bf651bd65bd03383e7ab4971af4fc29
Fix: 78132855
Test: gts test
2018-04-18 20:19:21 +00:00
Yao Chen
163d2602db Handle logd reconnect.
When statsd reconnects to logd, statsd will read all logs from buffer again. To prevent us from
reprocessing old events, we do the following:

1. At any given moment, record the largest timestamp (T_max) and the last timestamp (check point, CP) that
   we've seen before.
2. When reconnection happens, we look for the check point until we see a new log with a timestamp
   larger than T_max.
   -> If we found the CP, resume after the CP. Success
   -> If we can't find CP, there is definitely log loss. We reset all configs.

Note:
1. Logd has an API to read logs after a certain timestamp. But this API is vulnerable to
time changes from Settings. So we cannot rely on it.

2. If logd inserts a new log (with older timestamp) before CP, we cannot detect it. It's not
   possible to detect it without recording all timestamps we have seen.
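
The reconnect steps above can be sketched as a scan over the replayed timestamps. This is an illustration under assumptions: `findResumePoint` and its types are hypothetical, the check point is identified by timestamp alone, and a buffer that ends without containing the check point is treated as log loss.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum class ReconnectResult { kResumed, kLogLoss };

struct ResumePoint {
    ReconnectResult result;
    std::size_t nextIndex;  // first replayed entry to process when resumed
};

// Scans replayed log timestamps for the check point. The scan stops at the
// first timestamp larger than maxSeen (T_max), because everything after that
// is new and the check point can no longer appear.
ResumePoint findResumePoint(const std::vector<int64_t>& replayed,
                            int64_t checkpoint, int64_t maxSeen) {
    for (std::size_t i = 0; i < replayed.size(); ++i) {
        if (replayed[i] == checkpoint) {
            return {ReconnectResult::kResumed, i + 1};
        }
        if (replayed[i] > maxSeen) {
            break;
        }
    }
    // Check point not found: assume log loss, so the caller resets configs.
    return {ReconnectResult::kLogLoss, 0};
}
```

On `kResumed` the caller skips everything before `nextIndex`; on `kLogLoss` it would reset all configs, as the message describes.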

Test: statsd_test
Bug: 77813113

Change-Id: Ic3fdb47230807606ab11dc994cb162194adb8448
2018-04-10 22:06:03 -07:00
Yangster-mac
e68f3a5811 Flush the partial bucket when statsd shuts down or the config is updated.
Test: statsd test

BUG: b/77556036
Change-Id: Ie4a04ace55e07c4529cdff5906ba874f8815f620
2018-04-05 18:05:57 -07:00
David Chen
bd12527c90 Fix uid map to be simpler and fix partial bucket.
The previous scheme captured periodic snapshots for each config with
complex logic that's unnecessary and wasted memory. We actually don't
need to store any snapshots since we just convert the current state
into a snapshot and also include the deltas (change events) since the
previous report until now.

To make the system more robust, we also include up to 100 of the
deleted apps in the uid map.

Also, fix the wiring of the partial buckets so the metric producers
form partial buckets on both app upgrade and removal, but not on
installation of a new app.

Also, we update StatsCompanionService to also include disabled apps.

Bug: 77607583
Test: Verified unit-tests pass and added new e2e tests.
Change-Id: I98e1f544d6e6571545ae1581c4cebab807596f51
2018-04-05 16:15:01 -07:00
Yangster-mac
b142cc8add Statsd config TTL
Roughly check the config every hour to see whether the ttl expired.
If so, read the config from disk and recreate the metric manager.

Test: statsd test

BUG: b/77274363

Change-Id: I16838afe5bbe966c3a0f638869751f9b59a5a259
2018-04-04 15:59:43 +00:00
David Chen
faa1af535b Includes annotations with statsd reports.
It's tricky to determine the source of the metrics on a device
currently since we can take the union of multiple configs and send
only one giant statsd_config into statsd. We will use the int64 field
to track the sub config id's and the int32 field to track the version
for each sub config, but the fields are named more generically as
annotations.

The annotations are available in both the reports and metadata.

Test: Check that all unit-tests pass on marlin-eng
Bug: 77327261
Change-Id: Ic37c549c8b2991676f69948c515156765c9f5108
2018-04-03 18:20:40 -07:00
Yangster-mac
c04feba805 Move forward the alarm timestamp when config is added to statsd.
Test: statsd test
BUG: b/77344187

Change-Id: Ieacffaa29422829b8956f2b3fcb2c647c8c3eed9
2018-04-02 18:12:36 -07:00
David Chen
35045cbc34 Fix uidmap in statsd.
We previously tried an optimization that resulted in corrupted proto
output. This changes to a safer approach of storing the snapshot data
in memory and only converting to proto output when the
ProtoOutputStream is provided.

Also fixes a security issue when trying to invoke triggerUidSnapshot
since we forgot to use SCS' permissions.

Test: Added a unit-test to verify output of StatsLogProcessor.
Bug: 76231867
Change-Id: Id410ce3505fda9d71caa71942ef3068b55872c66
2018-03-24 15:01:04 -07:00
Yao Chen
06dba5d79c Add API to let metrics directly drop data without writing to an output.
+ Metrics will do flushIfNeeded() to correctly move the clock and inform
  AnomalyTracker of the past bucket info, and then clear past buckets.

+ We will still keep the current bucket data for the validity of the future metrics.

Bug: 70571383
Test: statsd_test
Change-Id: Ib13c45574974e7b4e82bd8f305091dc93bda76f5
2018-03-01 15:22:55 -08:00
Yangster-mac
932ececa16 Alarm: wakes up statsd and notifies the subscribers.
Test: manually tested it.
Change-Id: Id796a68976aeb1611183023ba4e9c6a8b8c44bb8
2018-02-27 13:30:48 -08:00
Yao Chen
8a8d16ceea Statsd CPU optimization.
The key change is to revamp how we parse/store/match a log event, especially how we match repeated
fields and attribution nodes, and how we construct dimensions and compare them.

+ We use an integer to encode the field of a log element. We also encode the FieldMatcher into an
integer and a bit mask. The log matching becomes 2 integer operations.

+ Dimension is stored as encoded field and value pair. Checking if 2 dimensions are equal then
  becomes checking if the underlying integers are equal. The integers are stored contiguously
  in memory, so it's much faster than the previous tree structure.
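
A minimal sketch of matching an encoded field against an integer plus bit mask. The bit layout (8 bits per nesting level, three levels) and the names `encodeField`, `Matcher`, and `matches` are assumptions made for illustration; the actual encoding lives in FieldValue.h and is not reproduced here.

```cpp
#include <cstdint>

// Hypothetical layout: one byte per nesting level, outermost level highest.
constexpr uint32_t encodeField(uint32_t level1, uint32_t level2, uint32_t level3) {
    return ((level1 & 0xffu) << 16) | ((level2 & 0xffu) << 8) | (level3 & 0xffu);
}

// A matcher is an expected value plus a mask selecting the bits that matter.
// Clearing a byte in the mask means "any position" at that level.
struct Matcher {
    uint32_t value;
    uint32_t mask;
};

// Matching is two integer operations: one AND and one comparison.
constexpr bool matches(uint32_t field, Matcher matcher) {
    return (field & matcher.mask) == matcher.value;
}
```

For example, a matcher with value `encodeField(1, 0, 2)` and mask `0xff00ff` accepts field 1, any position in the repeated level, sub-field 2.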

Start review from FieldValue.h

Test: statsd_test + new unit tests

Bug: 72659059

Change-Id: Iec8daeacdd3f39ab297c10ab9cd7b710a9c42e86
2018-02-12 10:38:45 -08:00
Yangster-mac
b0d0628a29 Thread-safety at log processor level.
Test: statsd unit test passed.

Change-Id: Ibe8c8d3cc8297875b16ee385c077b71c87353147
2018-01-08 14:59:42 -08:00
Yangster-mac
94e197cceb 1/ Change all "name" to id in statsD.
2/ Handle Subscription for alert.
3/ Support no_report_metric

Bug: 69522276
Test: all statsd unit tests passed.
Change-Id: I851b235f2d149b8602b0cad632d5bf541962f40a
2018-01-03 15:34:00 -08:00
Yangster-mac
2087716f2b 1/ Support nested message and repeated fields in statsd.
2/ Filter gauge fields by FieldMatcher.
3/ Wire up wakelock attribution chain.
4/ e2e test: wakelock duration metric with aggregated predicate dimensions.
5/ e2e test: count metric with multiple metric condition links for 2 predicates and 1 non-sliced predicate.

Test: statsd unit test passed.

Change-Id: I89db31cb068184a54e0a892fad710966d3127bc9
2018-01-01 10:01:36 -08:00
Yao Chen
d10f7b1c7b Add log source filtering in statsd to filter out spam.
+ Add log source whitelist in StatsdConfig
+ Some changes in UidMap API. Listener needs to be wp instead of sp.
+ Update dogfood app config to have log source
+ Increase the stats service thread pool size to 10 (9+1).

TODO: add unit tests (b/70805664). This unit test takes some time to write.

Test: statsd_test & manual

Change-Id: I129b1cc13db5114db7417580962bd7cc4438519d
2017-12-20 18:45:43 -08:00