Deadlock in bound stream pair during NSURLTask upload (10.11.4 regression?)

Originator:Jens.Alfke
Number:rdar://25197186 Date Originated:3/16/16
Status:Open Resolved:
Product:OS X Product Version:10.11.4 Beta (15E64a)
Classification:Crash/hang/data loss Reproducible:Always
 
Deadlock in the mutex of a bound CFStream pair, while the pair is being used for the HTTPBodyStream of an NSURLTask that's uploading data to an HTTP server. The NSURLConnectionLoader thread has two re-entrant calls into RequestBodyStreamProvider::readBodyStream, apparently reading the same stream, so I believe the inner call is deadlocking on the mutex acquired by the outer call.

This does not occur in 10.11.3 but does in 10.11.4 betas.

My thread hangs like this:

    frame #0: 0x00007fff88e8cde6 libsystem_kernel.dylib`__psynch_mutexwait + 10
    frame #1: 0x00007fff965b7e4a libsystem_pthread.dylib`_pthread_mutex_lock_wait + 89
    frame #2: 0x00007fff9528e274 CoreFoundation`boundPairWrite + 52
    frame #3: 0x00007fff951aafa5 CoreFoundation`CFWriteStreamWrite + 437
  * frame #4: 0x0000000107116b5d CouchbaseLite`-[CBLMultiStreamWriter writeToOutput](self=0x000060c00004f780, _cmd="writeToOutput") + 429 at CBLMultiStreamWriter.m:254
    frame #5: 0x0000000107117712 CouchbaseLite`-[CBLMultiStreamWriter stream:handleEvent:](self=0x000060c00004f780, _cmd="stream:handleEvent:", stream=0x000060d000056170, event=NSStreamEventHasSpaceAvailable) + 1506 at CBLMultiStreamWriter.m:288
    frame #6: 0x00007fff95200094 CoreFoundation`_signalEventSync + 180
    frame #7: 0x00007fff9521d29e CoreFoundation`_cfstream_shared_signalEventSync + 478
    frame #8: 0x00007fff951de881 CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 17
...

The CFNetwork thread deadlocks like this:

* thread #7: tid = 0x5393e, 0x00007fff88e8cde6 libsystem_kernel.dylib`__psynch_mutexwait + 10, name = 'com.apple.NSURLConnectionLoader'
  * frame #0: 0x00007fff88e8cde6 libsystem_kernel.dylib`__psynch_mutexwait + 10
    frame #1: 0x00007fff965b7e4a libsystem_pthread.dylib`_pthread_mutex_lock_wait + 89
    frame #2: 0x00007fff9528dc57 CoreFoundation`boundPairRead + 151
    frame #3: 0x00007fff951b49c5 CoreFoundation`CFReadStreamRead + 389
    frame #4: 0x00007fff96c22b1e CFNetwork`RequestBodyStreamProvider::readBodyStream(bool) + 308
    frame #5: 0x00007fff95200094 CoreFoundation`_signalEventSync + 180
    frame #6: 0x00007fff9521c67b CoreFoundation`_cfstream_solo_signalEventSync + 251
    frame #7: 0x00007fff951fd8fc CoreFoundation`_CFStreamSignalEvent + 476
    frame #8: 0x00007fff9528dec8 CoreFoundation`boundPairRead + 776
    frame #9: 0x00007fff951b49c5 CoreFoundation`CFReadStreamRead + 389
    frame #10: 0x00007fff96c22b1e CFNetwork`RequestBodyStreamProvider::readBodyStream(bool) + 308
    frame #11: 0x00007fff95200094 CoreFoundation`_signalEventSync + 180
    frame #12: 0x00007fff9521d29e CoreFoundation`_cfstream_shared_signalEventSync + 478
    frame #13: 0x00007fff951de881 CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 17
...

Steps to Reproduce:
1. Call CFStreamCreateBoundPair
2. Set the input stream as the HTTPBodyStream of an NSURLTask.
3. Schedule the output stream with the runloop
4. Start the NSURLTask
5. Whenever notified that the output stream has space available, write data to the stream; total data size is ~30MB

Expected Results:
NSURLTask completes and all the data written to the output stream is sent in the HTTP body to the server.

Actual Results:
Deadlock inside CFWriteStreamWrite at some point after writing a few megabytes of data. (See backtraces for details.)

Version:
10.11.4 Beta (15E64a)

Notes:
The stream related code of ours has been stable for 2 years. The HTTP level was recently switched from NSURLConnection to NSURLSession, but was working fine for a while. I ran into this deadlock three days ago, but can reproduce it with revisions of our code going back several weeks (revisions that passed unit tests at the time.)

Moreover, this deadlock does not occur on a different Mac running 10.11.3. Therefore I suspect this is a regression in the latest two betas of 10.11.4.

Configuration:
MacBook Pro (15" retina, late 2013) 16GB RAM

Comments


Please note: Reports posted here will not necessarily be seen by Apple. All problems should be submitted at bugreport.apple.com before they are posted here. Please only post information for Radars that you have filed yourself, and please do not include Apple confidential information in your posts. Thank you!