kextcache updates fail silently for so many reasons
| Originator: | marco |
| Number: | rdar://12274888 | Date Originated: | 11-Sep-2012 11:28 AM |
| Status: | Open | Resolved: | No |
| Product: | Mac OS X | Product Version: | 10.8.1 (12B19) |
| Classification: | Serious Bug | Reproducible: | Sometimes |
==========================
Summary:
==========================
I'm one of the developers of an app that installs several apps, daemons, agents and one kernel extension. Our custom installer app puts everything in place and triggers a kext cache update simply by placing the kext in /System/Library/Extensions, as recommended in the kextcache(8) man page. If the update fails, an older version of the kernel extension is loaded (when updating from an older version). Because we rely heavily on interprocess communication between our daemon and our kernel extension, their versions must match. If they don't, our app reports an error to the user and refuses to work.
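A minimal sketch of the kind of version check described above, assuming the loaded version is read from `kextstat` output and that each line has the layout "... <bundle id> (<version>) <dependencies>". The bundle ID `com.example.driver`, the version `2.0.1` and both function names are placeholders for illustration, not what our app actually uses.

```shell
#!/bin/sh
# Print the version of a loaded kext, given kextstat-style output on stdin.
# Assumption: the version follows the bundle ID in parentheses.
loaded_kext_version() {
    awk -v id="$1" '{
        for (i = 1; i < NF; i++) {
            if ($i == id) {
                v = $(i + 1)
                gsub(/[()]/, "", v)   # strip the surrounding parentheses
                print v
                exit
            }
        }
    }'
}

# Succeed only if a loaded version was found and equals the expected one.
versions_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Usage on OS X (placeholder bundle ID and version):
#   loaded=$(kextstat | loaded_kext_version com.example.driver)
#   versions_match "$loaded" "2.0.1" || echo "kext/daemon version mismatch" >&2
```

If the cache update failed silently, a check like this would report the older version that is still loaded.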
The kextcache update fails for some of our users for a variety of reasons, none of which we are responsible for. With the 10.8.1 update, which delivers some kernel extensions of its own, you (Apple) are experiencing this problem too, as indicated by system log messages such as these (from a user's system log during startup):
<date> <host> kernel[0]: Refusing new kext com.apple.kpi.bsd, v12.1: already have loaded v12.0.
<date> <host> kernel[0]: Refusing new kext com.apple.kpi.iokit, v12.1: already have loaded v12.0.
<date> <host> kernel[0]: Refusing new kext com.apple.kpi.libkern, v12.1: already have loaded v12.0.
<date> <host> kernel[0]: Refusing new kext com.apple.kpi.mach, v12.1: already have loaded v12.0.
<date> <host> kernel[0]: Refusing new kext com.apple.kpi.private, v12.1: already have loaded v12.0.
<date> <host> kernel[0]: Refusing new kext com.apple.kpi.unsupported, v12.1: already have loaded v12.0.
Unfortunately, the user doesn't get notified about any errors if the kextcache update fails. I found one exception: changing the startup disk in System Preferences no longer works if FileVault 2 is enabled. Instead, the user simply gets a message that the boot caches could not be updated.
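Since nothing is shown to the user, one way to spot the problem is to scan the system log for the "Refusing new kext" messages quoted above. The following is a rough sketch that relies only on the message format shown in this report; the function name `refused_kexts` is made up for illustration.

```shell
#!/bin/sh
# Read log lines on stdin and print "<bundle id> <refused version> <loaded version>"
# for every "Refusing new kext" message. All other lines are ignored.
refused_kexts() {
    sed -n -E 's/.*Refusing new kext ([^,]+), v([^:]+): already have loaded v(.+)\.$/\1 \2 \3/p'
}

# Usage on OS X:
#   refused_kexts < /var/log/system.log
```

Any output at all means that at least one kext on disk is newer than the copy the kernel loaded from its cache.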
==========================
Steps to Reproduce:
==========================
I'm testing the success or failure of the kextcache update by issuing the following command, which I'll call the "Update Command" from here on:
sudo touch /System/Library/Extensions/ && sudo kextcache -v 6 -update-volume /
The "-update-volume" option is documented in the kextcache(8) man page, but only on OS X 10.8 and 10.8.1, not on earlier OS X versions or in the online version of the man page [link1]. It nevertheless seems to work the same on 10.6, 10.7 and 10.8 (the versions supported by our app).
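Because the failure is silent, an installer has to capture the exit status and the verbose output itself. Below is a small sketch of such a wrapper; `run_logged` and the log path in the usage comment are hypothetical names, and the wrapper makes no assumption about what kextcache prints.

```shell
#!/bin/sh
# Run a command, store its combined stdout/stderr in a log file,
# print the exit status, and return that same status to the caller.
run_logged() {
    log_file=$1
    shift
    status=0
    "$@" >"$log_file" 2>&1 || status=$?
    echo "exit status: $status"
    return "$status"
}

# Usage on OS X, as root (hypothetical log path):
#   run_logged /tmp/kextcache.log sh -c \
#       'touch /System/Library/Extensions/ && kextcache -v 6 -update-volume /'
```

Keeping the log makes it possible to ask affected users for the exact output.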
I found different reasons for kextcache to fail:
- Return code 0 (success): not really a failure, but kextcache doesn't do anything if /mach_kernel doesn't exist. One user had FileVault 2 enabled (thus starting from Recovery HD and not using /mach_kernel) and that file was missing (I don't know how the user managed to do that).
Can be reproduced simply by renaming /mach_kernel temporarily and then issuing the Update Command.
Fixed by telling user to reinstall 10.8.1, which puts /mach_kernel there again.
- Return code 28: FileVault 2 is enabled and Recovery HD is full. I've seen this a couple of times, and in all cases the shadow file for network booting (/Volumes/Recovery HD/.com.apple.NetBootX/shadowfile) was using up almost all of the disk space. In one case, the user reported that he had booted from a network volume only once: to install OS X.
Can be reproduced simply by putting a large file on the Recovery HD (adapt count accordingly) and then issuing the Update Command:
sudo dd if=/dev/zero of=/Volumes/Recovery\ HD/big_file count=117760 bs=1024
Fixed by deleting the shadow file.
- Return code 71: Problem with a 3rd party kernel extension.
Can be reproduced by installing VodafoneMobileBroadBand4.07.07.00.dmg from [link2]. Do not download the newest one (it works fine on 10.8.1). After installing, issue the Update Command and you'll get an error.
Fixed by deleting/updating offending kernel extensions.
- Another code 71: Unknown problem with a kernel extension.
I could not reproduce this, but I attached a log by a user with the output of the Update Command [file1] and the list of his installed kernel extensions [file2]. The error message from the log:
setting load address of AppleAHCIPort.kext to 0xffffff7f824ce000
kxld[com.apple.driver.AppleAHCIPort]: The vtable for AppleAHCIPortPolledAdapter was
not patched because its parent, the vtable for IOAHCIPollerInterface, was not found.
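The cases listed above can be summarized in a small helper. This is only a sketch: the meanings attached to codes 0, 28 and 71 are the ones observed in this report, not documented kextcache semantics, and the function names are made up for illustration.

```shell
#!/bin/sh
# Map an exit status of the Update Command to the causes seen in this report.
# $2 is the kernel path to check and defaults to /mach_kernel.
explain_kextcache_status() {
    status=$1
    kernel=${2:-/mach_kernel}
    case "$status" in
        0)
            if [ -e "$kernel" ]; then
                echo "success"
            else
                echo "no-op: $kernel is missing, caches were not updated"
            fi
            ;;
        28) echo "helper partition (Recovery HD) is probably full" ;;
        71) echo "problem with a kernel extension, check the verbose output" ;;
        *)  echo "unrecognized status $status" ;;
    esac
}

# Free space in KiB on the volume that holds the given path.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Usage on OS X, after running the Update Command:
#   explain_kextcache_status "$?"
#   free_kb "/Volumes/Recovery HD"
```

A status of 0 is checked against the kernel file because, as described above, kextcache reports success without doing anything when /mach_kernel is missing.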
==========================
Expected Results:
==========================
1. Some kind of user feedback. Anything. Tell the user that something is wrong, or better yet: fix it.
2. The step "Removing unnecessary bloat." that is logged while updating the helper partition should actually delete files that are not needed, like the "/Volumes/Recovery HD/big_file" we created earlier. This applies especially to the shadow file for network boots, which is a real problem.
==========================
Actual Results:
==========================
Old versions of kernel extensions remain in use after a system update or software update, which can result in unexpected behavior. The user is not informed of this and has no way of knowing how to resolve the issue.
==========================
Regression:
==========================
We have a couple of support cases where users run into this issue. All of them are on OS X 10.8.1. Because our app is currently at the end of an open beta phase (release candidate is out), most of our users are probably rather tech-savvy and use the latest software versions. Maybe that's the reason that all of them run OS X 10.8.1, or maybe there's something in that OS release that causes these problems more often.
Most of our users who reported this issue use FileVault 2, but I don't know if all of them do. kextcache seems to update the helper partition (Recovery HD) only if FileVault 2 is enabled, and because one of the causes of the issue at hand is a full Recovery HD, the statistics may be skewed in that direction.
==========================
Notes:
==========================
[link1] http://developer.apple.com/library/mac/#documentation/Darwin/Reference/Manpages/man8/kextcache.8.html
[link2] http://www.business.vodafone.com/site/bus/public/enuk/support/10_productsupport/laptop_connectivity/40_software/software/10_latest/p_software.jsp
[file1] result_code_71_kextcache.log
[file2] result_code_71_installed_kexts.log