Skip to content
Snippets Groups Projects
  1. Jan 01, 2023
  2. Oct 01, 2022
  3. Sep 30, 2022
  4. Sep 29, 2022
  5. Sep 28, 2022
    • Robert Carr's avatar
      BlastBufferQueue: Fake release if not received by complete · 405e2f68
      Robert Carr authored
      This is a cherry-pick from of a CL from sc-v2 adding a safety mechanism
      for lost buffers. The original CL probably should have been merged in to master,
      and I can't relate to my thought process adding DO NOT MERGE on the
      original CL. Recently we have started seeing some of these ANRs again.
      Well actually we have seen them continuously, but most have been due to
      GPU hangs. Recently we started seeing some which aren't due to GPU
      hangs. We found one trace, which showed the sort of situation described
      below, which lead me to realizing this code was never merged in to
      master. It shoud be relatively safe since we released it on sc-v2.
      
      Perfetto telemetry shows a cluster where a transactionComplete is
      arriving but an earlier releaseBufferCallback has never arrived.
      This leads to blocking and ANRs. We try to work around this by
      generating a fake release buffer callback for previous buffers when we
      receive a TransactionCompleteCallback. How can we know this is safe? The
      criteria to safely release buffers are as follows:
      	Condition 1. SurfaceFlinger does not plan to use the buffer
      	Condition 2. Have the acquire fence (or a later signalling fence)if
      	dropped, have the HWC release fence if presented.
      Now lets break down cases where the transaction complete callback
      arrives before the release buffer callback. All release buffer callbacks
      fall in to two categories:
      	Category 1. In band with the complete callback. In which case the client
      	library in SurfaceComposerClient always sends release callbacks
      	first.
      	Category 2. Out of band, e.g. from setBuffer or merge when dropping buffers
      
      In category 2 there are two scenarios:
      	Scenario 1: Release callback was never going to arrive (bug)
      	Scenario 2. Release callback descheduled, e.g.
      		a. Transaction is filled with buffer and held in VRI/WM/SysUI
      		b. A second transaction with buffer is merged in, the release
      		buffer callback is emitted but not scheduled (async binder)
      		c. The merged transaction is applied
      		d. SurfaceFlinger processes a frame and emits the transaction
      		complete callback
      		e. The transaction complete callback is scheduled before the
      		release callback is ever scheduled (since the 1 way binders are
      		from different processes).
      In both scenarios, we can satisfy Conditions 1 and 2, because the HWC
      present fence is a later signalling fence than the acquire fence which
      the out of band callback will pass. While we may block extra on this
      fence. We will be safe by condition 2. Receiving the transaction
      complete callback should indicate condition 1 is satisfied.
      
      In category 1 there should only be a single scenario, the release buffer
      callback for frame N-1 is emitted immediately before the transaction
      callback for frame N, and so if it hasn't come at that time (and isn't
      out of band/scheduled e.g. category 2) then it will never come. We can
      further break this down in to two scenarios:
      	1. stat.previousReleaseFence is set correctly. This is the
      	expected case. In this case, this is the same fence we would
      	receive from the release buffer callback, and so we can satisfy
      	condition 1 and 2 and safely release.
      	2. stat.previousReleaseFence is not set correctly. There is no
      	known reason for this to happen, but there is also no known
      	reason why we would receive a complete callback without the
      	corresponding release, and so we have to explore the option as
      	a potential risk/behavior change from the change.
      Hypothetically in category 1 case 2 we could experience some tearing.
      In category 1 we are currently experiencing ANR, and so the tradeoff
      is a risk of tearing vs a certainty of ANR. To tear the following would
      have to happen:
      	1. Enter the case where we previously ANRed
      	2. Enter the subcase where the release fence is not set
      	correctly
      	3. Be scheduled very fast on the client side, in practicality
      	HWC present has already returned and the fence should be firing
      	very soon
      	4. The previous frame not be GL comp, in which case we were
      	already done with the buffer and don't need the fence anyway.
      Conditions 2 and 3 should already make the occurence of tearing much
      lower than the occurence of ANRs. Furthermore 100% of observed ANRs
      were during GL comp (rotation animation) and so the telemetry indicates
      the risk of introducing tearing is very low.
      
      To review, we have shown that we can break down the side effects from the change in
      to two buckets (category 1, and category 2). In category 2, the possible negative side
      effect is to use too late of a fence. However this is still safe, and should
      be exceedingly rare. In category 1 we have shown that there are two scenarios,
      and in one of these scenarios we can experience tearing. We have furthermore argued
      that the risk of tearing should be exceedingly low even in this scenario,
      while the risk of ANR in this scenario was nearly 100%.
      
      Bug: 247246160
      Bug: 244218818
      Test: Existing tests pass
      Change-Id: I757ed132ac076aa811df816e04a68f57b38e47e7
      405e2f68
  6. Sep 27, 2022
    • Alec Mouri's avatar
      Fix use-after-free in SurfaceFlinger::doDump · 3b3e5918
      Alec Mouri authored
      SurfaceFlinger::doDump previously contained two layer traversals, one on
      the main thread and one off the main thread. During concurrent dumpsys
      commands, the layer traversals may race with each other, which causes
      shared ownership of the underlying storage of a SortedVector containing
      the z-ordered list of layers. Because the implementation of
      SortedVector's STL iterators assumes that the underlying storage may be
      edited, this can cause the storage to be copied whenever SortedVector::begin
      and SortedVector::end are called, which means that SortedVector::begin()
      + SortedVector::size() == SortedVector::end() is not always true, which
      causes invalid iteration.
      
      In general, this use-after-free can happen as long as the off-main
      thread traversal exists in doDump(), because the traversal can run in
      parallel with any workload on the main thread that executes a layer
      traversal. So, this patch moves the traversal for dumping out the list
      of composition layers into the main thread.
      
      A future patch could explore either fixing SortedVector to fix judicious
      iterator invalidation, or building LayerVector on top of std::set, but
      either option is an invasive data structure change.
      
      Bug: 237291506
      Test: Test script that calls dumpsys SurfaceFlinger on many threads
      Change-Id: I0748396519c924dc1b84113d44259f22d0d7ebd6
      (cherry picked from commit 6761733a)
      Merged-In: I0748396519c924dc1b84113d44259f22d0d7ebd6
      3b3e5918
  7. Sep 20, 2022
  8. Sep 19, 2022
  9. Sep 14, 2022
  10. Sep 13, 2022
  11. Sep 12, 2022
  12. Sep 09, 2022
    • Rachel Lee's avatar
      Fix getLatestVsyncEventData deadline. · 4bce2e19
      Rachel Lee authored
      Bug: 239775097
      Test: atest libsurfaceflinger_unittest
      Test: pixel 4xl camera preview test (see bug)
      Change-Id: Ib92849291458f59c2ef2238d3586211b87174c7f
      (cherry picked from commit 0679cf20)
      Merged-In: Ib92849291458f59c2ef2238d3586211b87174c7f
      4bce2e19
    • joenchen's avatar
      CE: flush the staged brightness to HWC before power off · e72ba5e2
      joenchen authored
      After SurfaceFlinger receives the brightness commands, the commands are
      sent to HWC until the next frame presents. However, no following frames
      are present after the power mode is set to off, and HWC cannot receive
      the last brightness commands.
      
      This change forces SurfaceFlinger to flush the staged brightness to HWC
      prior to execution of power off.
      
      Bug: 239908529
      Test: suspend and resume repeatedly
      Change-Id: Ia8fb04399e757de16e06e6516d86bdfcfabd5a2e
      e72ba5e2
  13. Sep 01, 2022
    • Android Build Coastguard Worker's avatar
      Snap for 9018797 from 7fa3483d to tm-qpr1-release · 47224b75
      Android Build Coastguard Worker authored
      Change-Id: I65871b1a5ac8cef1e7d9113e718b9a1768c9ec3f
      47224b75
    • Prabir Pradhan's avatar
      Merge changes I111361f7,Ic4979b91,Ic289d772 into tm-qpr-dev · 7fa3483d
      Prabir Pradhan authored
      * changes:
        Reset the touch state when the active viewport is disabled
        Fix issues with InputMapper tests
        Fix spot not disappear when display id changed
      7fa3483d
    • Prabir Pradhan's avatar
      Reset the touch state when the active viewport is disabled · f670dad1
      Prabir Pradhan authored
      When an active viewport becomes inactive, cancel any ongoing gestures
      immediately. When the viewport becomes active again, make sure state is
      reset again so that we eliminate any race conditions between the
      viewport becoming active and first touches coming in from the device.
      
      This is a preventative measure to defend against unexpected touch
      behavior around viewports being activated and deactivated.
      
      Bug: 234662773
      Test: atest inputflinger_tests
      Change-Id: I111361f7470fdad39b493b516e8a8f167e0c681c
      Merged-In: I111361f7470fdad39b493b516e8a8f167e0c681c
      (cherry picked from commit c0bdeefd)
      f670dad1
    • Prabir Pradhan's avatar
      Fix issues with InputMapper tests · 36690419
      Prabir Pradhan authored
      - When an input device is added, a device reset notification is sent.
        This should be consumed when the device is added so that it does
        not need to be consumed in every test case.
      - The above change exposed an issue with a CursorInputMapper test case,
        where it was consuming the wrong device reset notification.
      - When a device is configured, it may produce device reset
        notifications. After configuring, we should loop the input reader so
        that the queued input listener is flushed so that these args can be
        sent out.
      
      Bug: 234662773
      Test: atest inputflinger_tests
      Change-Id: Ic4979b91207a6abf4c4ac65fd3db30307cb53729
      Merged-In: Ic4979b91207a6abf4c4ac65fd3db30307cb53729
      (cherry picked from commit b5174de1)
      36690419
  14. Aug 31, 2022
  15. Aug 30, 2022
  16. Aug 29, 2022
  17. Aug 27, 2022
    • Brian Duddie's avatar
      Fix double-close on direct channel registration · 83dde25b
      Brian Duddie authored
      In the AIDL sensor HAL wrapper, file descriptors associated with a
      direct channel were being wrapped in the AIDL NativeHandle type using
      makeToAidl(), which ends up taking ownership of the fds, and
      unintentionally closing them when the object goes out of scope (via
      ndk::ScopedFileDescriptor), so the same fds would be closed at a later
      point when the original native_handle_t is closed.
      
      Switch to dupToAidl() which does not take ownership of the input file
      handles.
      
      Bug: 234456046
      Test: apply fdsan protection (in different CL), confirm via
            test-sensorservice that the file descriptor is not closed twice
      Change-Id: I51c0ba0f31b43c56bf055d186a599b289ca0065f
      83dde25b
  18. Aug 26, 2022
    • linpeter's avatar
      Allow first power mode as off state · e8059323
      linpeter authored
      Bug: 237745199
      test: atest libsurfaceflinger_unittest:OnInitializeDisplaysTest
            atest libsurfaceflinger_unittest:SetPowerModeInternalTest
      Merged-In: I3a2eae4efc4a5c6113700a9ca9e9b261e364a878
      Change-Id: I3a2eae4efc4a5c6113700a9ca9e9b261e364a878
      e8059323
Loading