Please forgive the following rant, but I get tired of seeing technology articles making click-bait mountains out of development molehills without someone pushing back.
The Information reported recently that Apple’s chip design group is in disarray after suffering a design failure it claims was “unprecedented in the group’s history.” What was the nature of this horrible failure on Apple’s design team? The Information’s description of this terrible event was (and I quote):
“The iPhone 14 Pro models, which went on sale in September, showed only small gains in graphics performance compared to the leaps prior generations of iPhones had made over their predecessors, according to testing by independent chip-benchmarking firms.”
Other publications such as MacWorld joined the drumbeat of schadenfreude based on The Information’s article claiming:
“…”unprecedented” missteps caused Apple to scrap the biggest advancements in the A16 chip, including ray-tracing, the high-end effect that allows the GPU “to mimic how the human eye processes light and shadows in real-time.” Ray tracing is a major feature of high-end games and graphics cards and would have been one of the first phones to support it.”
OK, let’s do a sanity check on this claim. Quick, name all the iPhone apps that need real-time (not pre-rendered) ray-tracing. Take your time — I’ll wait.
If you answered “None”, you are correct.
Now I’m sure some readers said something along the lines of “Call of Duty: Zombie War Meets Game of Thrones would be awesome with ray-tracing.” To those folks, I’ll give you partial credit, but I will also challenge you to see the difference between an ordinary GPU rendering and ray-tracing on a 6-inch screen without squinting.
Now I’m sure The Information did very thorough research with the Apple engineering team, and I have no doubt that there was an A16 prototype that drew too much power. However, was it really an “unprecedented failure”?
I don’t have any inside information about what really went on, but my take is based on two things I believe from analyzing Apple over the last 35 years or so:
Apple employs talented and thoughtful people. I probably met more creative, super-star level developers and managers at Apple than at any other company I covered.
Apple manages risk very carefully in product releases. While Apple encourages development creativity, everyone also knows that its $2+ trillion market cap depends on it delivering great and profitable products on time and on budget.
So if you accept those two axioms, answer this question:
Do you think Apple would bet its entire $190-billion-per-year iPhone business on a single chip design?
The answer, of course, is no. From what I know of Apple’s design process, it always develops multiple designs for its products. That’s so the folks who are charged with getting products to market can make tradeoffs among cost, schedule, risk, supply chain, and other factors beyond the design itself. I’d be willing to bet that the Apple chip design team had at least three different designs under consideration for the A16 name:
A good, solid, no-risk design. This chip would likely be an A15-derivative with the minimal number of changes necessary to power iPhone 14’s feature set. It could even be an actual A15 fabricated with the mature 5nm process from Taiwan Semiconductor Manufacturing Company (TSMC), with no changes except those necessary for the phone’s new display and cameras. While marketing would hate using last year’s chip, it would have the advantage of being easily manufactured, since it’s already being produced in volume at low cost.
A better but low-risk design. This chip would likely be an A15-derivative with a few new process tweaks or module changes (increased cache and memory are always popular changes because they are so straightforward to build), but the vast majority of it would be the same as the A15. Such a chip would likely be built with TSMC’s newer 4nm process, which would give it somewhat higher performance, even with the same processor designs. While this choice would carry more risk than using last year’s chip, Apple would still have high confidence that it could sell millions of iPhone 14s with this chip.
A “this is the best chip we can think of” design. This is the design that the engineering team was undoubtedly most excited about and probably spent most of their time on. It allows them to put in dramatic new features like ray-tracing that marketing can sell the heck out of. Further, it’s the type of design challenge that looks particularly impressive on designer resumes. It’s also the most risky and likely to fail of all the choices because the design team is taking chances.
Assuming Apple was using something akin to this good, better, best chip design competition, The Information’s story makes a ton more sense. What likely happened is as follows:
All three designs were laid out and fabricated. This is not a trivial undertaking. Each design requires millions of dollars of investment to produce one prototype, but having three designs, all of which could work in an iPhone 14, gave Apple choices when decisions had to be made to hit launch dates.
The best design (choice #3 above) exceeded its power budget. Every chip module such as a CPU or GPU has a budget for both power and area, and those budgets are “hard”: no one can exceed them without putting the entire design at risk of not working. While trying to build the best possible GPU, the GPU team apparently added new features to the existing designs. Unfortunately, more features almost always mean more transistors, more area, and more power. That makes the chips bigger, reduces chip manufacturing yield, and increases chip costs (at least on the same manufacturing process). The team undoubtedly had simulated just how much power was supposed to be consumed, but simulators aren’t perfect, especially when they are simulating new manufacturing processes such as TSMC’s 4nm one.
With new display and camera features already consuming additional power, management decided to go with the lower-risk “better” processor choice. The iPhone 14 was already designed with some major improvements over the iPhone 13, including an always-on display and even better cameras (iPhone cameras have gotten amazingly good). However, when it came to which processor it should use, Apple had a choice between a power-hungry “best” design that would compromise battery life and customer experience for a ray-tracing feature no one uses yet, or a design that would yield a small improvement over the A15 but that it was reasonably certain it could ship in quantities of millions. Needless to say, Apple chose the latter.
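For readers who like to see the logic spelled out, here is a minimal sketch in Python of the good, better, best decision described above. Every number in it is invented purely for illustration; I have no idea what Apple’s real power budgets or measurements were, and the names ChipDesign and pick_design are hypothetical. The only point is that a hard power budget acts as a filter first, and performance is compared only among the designs that pass it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChipDesign:
    name: str
    gpu_power_watts: float    # hypothetical measured GPU power draw
    relative_gpu_perf: float  # hypothetical GPU performance, A15 = 1.0


def pick_design(candidates, gpu_power_budget_watts):
    """Return the fastest candidate that stays within the hard power budget."""
    within_budget = [
        d for d in candidates if d.gpu_power_watts <= gpu_power_budget_watts
    ]
    if not within_budget:
        raise ValueError("no candidate fits the power budget")
    return max(within_budget, key=lambda d: d.relative_gpu_perf)


# Made-up figures for the three designs; not real Apple data.
candidates = [
    ChipDesign("good", gpu_power_watts=4.0, relative_gpu_perf=1.00),
    ChipDesign("better", gpu_power_watts=4.4, relative_gpu_perf=1.10),
    ChipDesign("best", gpu_power_watts=6.0, relative_gpu_perf=1.50),
]

chosen = pick_design(candidates, gpu_power_budget_watts=4.5)
print(chosen.name)
```

With these made-up figures the “best” design is the fastest but blows through the budget, so the “better” design is the one that ships. Loosen the budget enough and the same function picks “best” instead, which is the whole argument for keeping the ambitious design around for a later process.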
But what about the design team’s failure? Didn’t building a GPU that exceeded its power budget indicate that the team has lost its mojo and is in disarray?
I’m sorry, but anyone who believes that one design mistake means that an engineering team is broken has never worked in engineering.
Great engineering designs fail all the time in the lab. That’s because the only way you get better-than-average results is to take chances on new ideas and new techniques. Some of those new ideas and techniques don’t work when you build them with real components. That’s the nature of engineering: it’s built on testing, failure, and iteration, not on single “Eureka!” moments.
What’s remarkable about Apple is that despite engineering’s unpredictability, it manages to deliver new product iterations that its customers love and buy like clockwork, year after year. It does that by allowing designers to swing for the fences while simultaneously funding more conservative designs it can use if those don’t pan out. While critics may view the chip powering the iPhone 14 as an “unprecedented engineering failure”, that so-called failure is going to generate roughly $200 billion in Apple’s current fiscal year because it could be built on time and within budget.
And what about those ray-tracing features the team designed? Given that TSMC just announced that its 3nm process, which it says makes chips roughly 33% more power efficient, will be available in 2023, the engineering team will probably have the power budget to add them to the iPhone 15. So the design effort wasn’t wasted; like the baseball folks say, there’s always next year.