Xcode 7.2b2 (7C46t): [Swift] Overflow check & loop & closure combination prevents ARC optimization

Originator:janoschhildebrand
Number:rdar://23412050 Date Originated:05-Nov-2015
Status:Open Resolved:
Product:Developer Tools Product Version:Xcode 7.2b2
Classification:Performance Reproducible:Always
 
Summary:
Take the following type Wrapper:

struct Wrapper {
    private var buffer = Array<Int>()
    
    func slow() {
        buffer.withUnsafeBufferPointer {
            var index = $0.count
            while index > 0 {
                index = index - 1
            }
        }
    }
    
    func fast() {
        buffer.withUnsafeBufferPointer {
            var index = $0.count
            while index > 0 {
                index = index &- 1
            }
        }
    }
}

The two methods slow() and fast() are almost identical; the only difference is the use of the overflow-checking '-' in slow() versus the wrapping '&-' in fast().

However, this small change leads to a (relatively) large change in performance as the overflow checking variant appears to prevent an ARC optimization.

The compiler's output for fast() is essentially what you'd expect, but for slow() the Array's backing store is retained before executing the closure contents and released afterwards.

Since the containing type already owns buffer, this should not be necessary, and for fast() the compiler is able to optimize the retain/release calls away.
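To make the difference between the two operators concrete, here is a minimal sketch of their semantics. It assumes Swift 4 or later (subtractingReportingOverflow(_:) did not exist under that name in Swift 2.1), and the constant names are only illustrative. It says nothing about the optimizer; it only shows what the overflow check in slow() guards against.

```swift
// '&-' wraps around on overflow and never traps.
let wrapped = Int.min &- 1

// The check that '-' performs, made explicit: the standard library reports
// the overflow here, whereas '-' stops the program when the result does not fit.
let checked = Int.min.subtractingReportingOverflow(1)

print(wrapped == Int.max)   // true
print(checked.overflow)     // true

// Inside the loops above, 'index > 0' holds whenever the subtraction runs,
// so the checked form can never actually overflow there.
let index = 5
let inLoop = index.subtractingReportingOverflow(1)
print(inLoop.partialValue, inLoop.overflow)   // 4 false
```

Because the loop condition already rules out overflow, both methods compute the same values; only the presence of the check differs.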


An example project is attached that includes some basic performance measurement code to demonstrate the issue.

A few notes:

* I tried to reduce the example code as much as possible. The following parts seem to be necessary for the issue to occur (although there might well be other combinations that trigger it):
    - The Wrapper type. If I put the contents of slow() at the call site, the compiler is able to remove the redundant retain/release.
    - The loop. An equivalent for loop triggers the issue as well. I'm not sure a loop is actually necessary for the problem to occur, but I haven't found another way to trigger it.
    - The overflow check. Obviously the issue doesn't occur without the overflow check.

* The retain/release calls are present in the (optimized) SIL so I assume the issue is with the Swift optimizer (and not LLVM related).

* Also, both the SIL and the assembly output for slow() are quite a bit more complex than those for fast(). I haven't analyzed them in detail, but perhaps this is a side effect of the same optimization blocker.

* This issue is not Array-specific (I originally noticed it with ManagedBuffer). It perhaps affects all class methods that take closure arguments, but I haven't tested this further.

* Unsurprisingly, disabling overflow checks (SWIFT_DISABLE_SAFETY_CHECKS) 'fixes' this issue.

* Not directly related, but I found it interesting: When fast() is inlined, the loop in the closure body is essentially still present in the assembly. However, the standalone method implementation for fast() is empty (other than stack setup/teardown). So the compiler seems to be able to determine that the standalone function can be optimized away entirely, but not the inlined copy :-)
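
For the ManagedBuffer note above, a minimal sketch of what an equivalent reproduction might look like. This is a hypothetical reconstruction: the report does not include the original ManagedBuffer code, and the names ManagedWrapper, storage and checkedLoop() are made up for illustration. It assumes the Swift 3 or later spelling of the API (create(minimumCapacity:makingHeaderWith:)), and it does not establish whether any particular compiler version emits the retain/release here.

```swift
// Hypothetical reconstruction; not taken from the attached project.
struct ManagedWrapper {
    // The header holds a count to loop over; the elements are never touched.
    private let storage = ManagedBuffer<Int, Int>.create(minimumCapacity: 16) { _ in 16 }

    // Same shape as slow(): a closure passed to a class method, containing a
    // loop with an overflow-checked subtraction. The final index is returned
    // only so that the result can be observed.
    func checkedLoop() -> Int {
        return storage.withUnsafeMutablePointers { header, _ -> Int in
            var index = header.pointee
            while index > 0 {
                index = index - 1
            }
            return index
        }
    }
}

let managed = ManagedWrapper()
print(managed.checkedLoop())   // prints 0
```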

Steps to Reproduce:
1) Open the attached project
2) Build & Run (will build release config)
3) The program will output the runtimes for 100000000 calls each to slow() and fast().
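
For readers without the attached project, the measurement in the steps above could be sketched roughly as follows. This is not the code from the attachment: the measure helper and the constant names are hypothetical, the syntax assumes a current Swift toolchain rather than Swift 2.1, and wall-clock timing via Foundation's Date is only a coarse stand-in for whatever the project uses. The numbers are meaningful only in an optimized (release) build.

```swift
import Foundation

struct Wrapper {
    private var buffer = Array<Int>()

    // Overflow-checked subtraction inside the closure.
    func slow() {
        buffer.withUnsafeBufferPointer { pointer in
            var index = pointer.count
            while index > 0 {
                index = index - 1
            }
        }
    }

    // Wrapping subtraction inside the closure.
    func fast() {
        buffer.withUnsafeBufferPointer { pointer in
            var index = pointer.count
            while index > 0 {
                index = index &- 1
            }
        }
    }
}

// Hypothetical helper: wall-clock seconds taken by `iterations` calls to `body`.
func measure(iterations: Int, _ body: () -> Void) -> TimeInterval {
    let start = Date()
    for _ in 0..<iterations {
        body()
    }
    return Date().timeIntervalSince(start)
}

let wrapper = Wrapper()
// The report used 100000000 iterations; a smaller count keeps unoptimized runs short.
let iterations = 1_000_000
let slowSeconds = measure(iterations: iterations) { wrapper.slow() }
let fastSeconds = measure(iterations: iterations) { wrapper.fast() }
print(slowSeconds)
print(fastSeconds)
```

Since both methods do no observable work, an optimizer is free to remove either call entirely, so results from this sketch may not match the figures in the report.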

Expected Results:
The two functions should have very similar performance, as both do no work in this case.

Also the assembly for slow() should not contain any retain/release calls for the array backing store.

Actual Results:
On my machine I get the following results (slow() first, then fast()):
1.21469098329544
0.033428966999054 

slow() retains and releases the array backing store, and profiling shows that, as expected, this dominates the execution time.

Version:
Xcode 7.2b2 (7C46t)
Apple Swift version 2.1.1 (swiftlang-700.1.101.11 clang-700.1.79)
OS X 10.11.1 (15B42)
