int foo();
int fiver(int num) {
  for(int j = 0; j < 5; j++)
    num = num + foo();
  return num;
}


The compiler can either compile this as a loop (compact code, but it pays for the loop counter and branch on every iteration) or unroll it into five back-to-back calls (no loop overhead, at the cost of larger code).
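Unrolled, the function would look roughly like this (fiver_unrolled is an illustrative name, not from the original):

int fiver_unrolled(int num) {
  // Five straight-line calls: no loop counter, no branches.
  num = num + foo();
  num = num + foo();
  num = num + foo();
  num = num + foo();
  num = num + foo();
  return num;
}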


But we can do more (ThinLTO)

That's a good start, but we can do better. Let's look at inlining - the compiler takes the code of a called function and inserts all of it at the callsite.


inline int foo() { return 3; }
int fiver_inline(int num) {
  for(int j = 0; j < 5; j++)
    num = num + foo();
  return num;
}


When the compiler inlines foo(), it turns into


int fiver_inline(int num) {
  for(int j = 0; j < 5; j++)
    num = num + 3;
  return num;
}


Not bad - saves us the function call and all the setup that goes with having a function. But the compiler can in fact do even better, because now all the information is in one place. It can deduce that fiver_inline() adds the number three, and does so 5 times - fifteen in total - and so the entire function folds down to:
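int fiver_inline(int num) {
  return num + 15;
}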

Now we're back to compiling individual source files. Distributed/cached compilation works again, small changes don't cause a full rebuild, and since ThinLTO only inlines a select few functions across file boundaries, it adds relatively little overhead.

Of course, the question of "which functions should ThinLTO inline?" still needs to be answered. And the answer is still "the ones that are small and called a lot". Hey, we know those already - from the profiles we generated for Profile Guided Optimization (PGO). Talk about lucky coincidences!
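As a rough sketch, here is what feeding PGO profiles into a ThinLTO build can look like with clang and lld (the flags are real clang/llvm options; the file names are hypothetical):

# 1. Build with instrumentation and run a representative workload.
clang++ -O2 -fprofile-generate -c fiver.cc -o fiver.o
clang++ -fprofile-generate fiver.o -o fiver && ./fiver
# 2. Merge the raw profiles into a single file.
llvm-profdata merge -output=fiver.profdata default_*.profraw
# 3. Rebuild with ThinLTO, feeding the profile back in.
clang++ -O2 -flto=thin -fprofile-use=fiver.profdata -c fiver.cc -o fiver.o
clang++ -flto=thin -fprofile-use=fiver.profdata -fuse-ld=lld fiver.o -o fiver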

But wait, there's more! (Callgraph Sorting)

We've done a lot for inlined function calls. Is there anything we can do to speed up functions that haven't been inlined, too? Turns out there is.

One important factor is that the CPU doesn't fetch data byte by byte, but in fixed-size chunks (cache lines). So if we can ensure that a chunk doesn't just contain the function we need right now, but ideally also the ones we'll need next, we have to go out and fetch new chunks less often.

In other words, we want functions that are called right after one another to also live next to each other in memory ("code locality"). And we already know which functions are called close together - because we ran our profiling and stored performance profiles for PGO.

We can then use that information at link time to place the right functions next to each other. For example:

g.c
  extern int f1();
  extern int f2();
  extern int f3();

  void g() {
    f1();
    for (int i = 0; i < 100; i++) {  // loop bound is illustrative
      f3();
    }
    f1();
    f2();
  }


could be interpreted as "g() calls f3() a lot - so keep that one really close. f1() is called twice, so… somewhat close. And if we can squeeze in f2, even better". The calling sequence is a "call graph", and so this sorting process is called "call graph sorting".
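With lld, for instance, that ordering can be handed to the linker as a symbol ordering file (--symbol-ordering-file is a real lld option; the contents below are a hypothetical order derived from the example above):

# order.txt - hottest symbols first, so g() and f3() end up
# sharing cache lines, with f1() and f2() close behind.
g
f3
f1
f2

Linking then becomes something like clang -fuse-ld=lld -Wl,--symbol-ordering-file=order.txt g.o f.o -o app (having compiled with -ffunction-sections, so each function sits in its own, movable section).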

Just changing the order of functions in memory might not sound like a lot, but it leads to measurable speedups, because the hot code now stays in cache.

And since this work requires changes to compilers and linkers, it would have meant changing the build - and testing it - across 5 compilers and 4 linkers. But, thankfully, we've simplified our toolchain (Simplicity - another one of the 4S's!). To be able to do this, we worked with the LLVM community to make clang a great compiler on all the platforms Chrome supports (with further improvements still being worked on).

Sometimes, the best way to get more speed is not to change the code you wrote, but to change the way you build the software.

Posted by Rachel Blum, Engineering Director, Chrome Desktop

Data source for all statistics: Speedometer 2.0.



Isolated Splits to the Rescue

A few days of spelunking in the Android source code led us to a solution, and Chrome shipped with isolated splits in M89. We now have several months of data from the field, and are pleased to share significant improvements in memory usage, startup time, page load speed, and stability for all Chrome on Android users running Android Oreo or later:
  • Median total memory usage improved by 5.2%
  • Median renderer process memory usage improved by 7.9%
  • Median GPU process memory usage improved by 7.6%
  • Median browser process memory usage improved by 1.2%
  • 95th percentile startup time improved by 7.6%
  • 95th percentile page load speed improved by 2.3%
  • Large improvements in both browser crash rate and renderer hang rate
Posted by Clark Duvall, Chrome Software Engineer

Data source for all statistics: Real-world data anonymously aggregated from Chrome clients.

While this kind of progress is exciting, optimizing for Core Web Vitals can still be challenging. That's why we've been improving our tools to help developers better monitor, measure, and understand site performance. Some of these changes include:

  • Updates in PageSpeed Insights that make the distinction between "field data" from user experiences and "lab data" from the Lighthouse report clearer.
  • Capabilities for measuring an entire user flow by loading additional pages and simulating scrolls and link clicks.
  • Support for user flows, such as a checkout flow, in the Recorder panel, which can export a recorded user journey as a Puppeteer script.

We're also experimenting with two new performance metrics: overall input responsiveness, and scrolling and animation smoothness. We'd love to get your feedback, so take them for a spin at web.dev/smoothness.

Expanding the Toolbox for Digital Interfaces

We've got developers and designers covered with tons of changes coming down the pipeline for UI styling and DevTools, including updates to responsive design. Developers can now customize user experiences in a component-driven architecture model, and we're calling this The New Responsive:



With the new container queries spec—available for testing behind a flag in Chrome Canary—developers can access a parent element's width to make styling decisions for its children, nest container queries, and create named queries for easier access and organization.
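A rough sketch of what that looks like (the class names are illustrative, and the exact syntax may shift while the spec is behind a flag):

/* The parent opts in as a named query container. */
.card-container {
  container-type: inline-size;
  container-name: cards;
}

/* Children style themselves against the container's width,
   not the viewport's. */
@container cards (min-width: 400px) {
  .card {
    display: grid;
    grid-template-columns: 2fr 1fr;
  }
}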

This is a huge shift for component-based development, so we've been providing new DevTools for debugging, styling, and visualizing CSS layouts. To make creating interfaces even easier, we also launched a collection of off-the-shelf UI patterns.

Developers who want to learn more can dive into free resources such as the Learn CSS course. There are also a few exciting CSS APIs in their first public working drafts, including:

  • Scroll-timeline for animating an element as people scroll (available via the experimental web platform features flag in Chrome Canary).
  • Size-adjust property for typography (available in Chromium and Firefox stable).
  • Accent-color for giving form controls a theme color (available in Chromium and Firefox stable; size-adjust and accent-color are sketched below).
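A hedged sketch of those last two (the property names are from the drafts; the selectors and values are illustrative):

/* accent-color: give a checkbox the site's theme color. */
input[type="checkbox"] {
  accent-color: rebeccapurple;
}

/* size-adjust: scale a fallback font so swapped text reflows less. */
@font-face {
  font-family: "body-fallback";
  src: local("Arial");
  size-adjust: 105%;
}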

One feature we're really excited to build on is Dark Mode, especially because we found indications that dark themes use 11% less battery power than light themes on OLED screens. Stay tuned for a machine-learning-aided, auto-dark algorithm feature in an upcoming version of Chrome.

Buckling Down for the Road Ahead

Part of what makes the web so special is that it's an open, decentralized ecosystem. We encourage everyone to make the most of this by getting involved in shaping the web's future in places such as:

We can't wait to see what the web looks like by next year's summit. Until then, check out our library of learning resources on web.dev and subscribe to the web.dev newsletter.
