The NavBoost Squashing Function: How Google Normalizes Click Data

The squashing function is NavBoost's mathematical normalization mechanism. It compresses raw click signals so that no single spike in click volume can disproportionately influence rankings. This article explains what squashing is, why Google uses it, and what it means for both legitimate SEO and click manipulation.

What Is the Squashing Function?

In mathematical and engineering contexts, a "squashing function" is any function that compresses a wide range of input values into a narrower range of output values. Common examples include logarithmic functions, sigmoid functions, and hyperbolic tangent functions. These functions share a key property: as the input grows larger, the output grows proportionally slower.
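To make the compression property concrete, here is a minimal numerical sketch (plain Python, using textbook functions and an invented scale parameter; none of this is Google's actual implementation):

    import math

    def log_squash(x):
        # Logarithmic squash: unbounded, but growth slows sharply as x rises.
        return math.log(1 + x)

    def tanh_squash(x, scale=1000.0):
        # Hyperbolic tangent squash: output is bounded by 1.0 no matter how large x gets.
        return math.tanh(x / scale)

    for clicks in (10, 100, 1_000, 10_000, 100_000):
        print(f"{clicks:>7}  log={log_squash(clicks):6.2f}  tanh={tanh_squash(clicks):.4f}")

Each 10x increase in input produces a far smaller than 10x increase in output, which is the defining behavior of any squashing function.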

In the context of NavBoost, the squashing function is applied to raw click data before that data is used in ranking calculations. The 2024 Google API leak revealed two distinct click data fields — unsquashedClicks and squashedClicks — confirming that Google stores both the raw and the processed versions of click signals.

The exact mathematical formula Google uses for its squashing function has not been disclosed. However, the concept and its effects can be understood through its general properties and the behavioral patterns it produces in the ranking system.

A Simple Illustration

Consider two search results competing for the same query:

  • Result A receives 500 clicks per month, with a consistent pattern over 13 months.
  • Result B receives 50 clicks per month for 12 months, then 5,000 clicks in a single month.

Without squashing, Result B's total click count (50 x 12 + 5,000 = 5,600) would come close to Result A's (500 x 13 = 6,500) even though nearly all of it arrived in a single anomalous month, and that one month's spike would register as a massive signal. With squashing, the 5,000-click spike is compressed — perhaps down to the equivalent impact of 500-800 clicks — while Result A's consistent pattern produces a comparatively stronger normalized signal: each month's 500 clicks is squashed too, but it remains meaningful across all 13 months.

The precise compression ratio is unknown, but the principle is clear: consistency of signal matters more than peak magnitude.
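The arithmetic behind this illustration is easy to reproduce. Assuming, purely for illustration, a log(1 + x) squash applied to each month before aggregation:

    import math

    def monthly_signal(clicks):
        # Illustrative per-month squash; Google's actual function is undisclosed.
        return math.log(1 + clicks)

    result_a = [500] * 13              # consistent pattern over 13 months
    result_b = [50] * 12 + [5_000]     # flat baseline, then a one-month spike

    signal_a = sum(monthly_signal(m) for m in result_a)
    signal_b = sum(monthly_signal(m) for m in result_b)

    print(f"Result A: {sum(result_a):,} raw clicks -> signal {signal_a:.1f}")   # ~80.8
    print(f"Result B: {sum(result_b):,} raw clicks -> signal {signal_b:.1f}")   # ~55.7

Under this toy squash, Result A's steady 6,500 clicks yield a clearly stronger aggregate signal than Result B's 5,600, because B's spike contributes only one compressed term.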

Why Google Uses Squashing

Google has strong incentives to normalize click data rather than using it raw. The squashing function serves multiple purposes that are critical to the integrity of the ranking system.

1. Preventing Click Manipulation

The most frequently cited reason for squashing is manipulation prevention. If raw click volume directly translated into ranking influence, any actor with sufficient resources could boost rankings by generating artificial clicks — through bots, click farms, or other means. The squashing function breaks this linear relationship. Doubling the number of clicks does not double the ranking signal. This means the return on manipulation scales sub-linearly: each additional artificial click produces less marginal benefit than the last.

This economic deterrent is fundamental. Even if Google's active click manipulation detection systems fail to flag a particular campaign, the squashing function ensures that the campaign's impact is inherently limited.

2. Normalizing Across Query Volumes

Google processes queries across an enormous range of search volumes. A head term like "weather" might receive millions of searches per day, while a long-tail query like "best budget turntable for jazz vinyl under $300" might receive a few hundred per month. Without normalization, head-term results would always have overwhelmingly stronger click signals than long-tail results, even when the long-tail results are equally or more satisfying to their respective audiences.

The squashing function levels this playing field. By compressing the absolute magnitude of click signals, it ensures that what matters is the pattern of clicks (the ratio of good to bad, the consistency over time) rather than the raw volume. This allows NavBoost to function meaningfully across the full spectrum of query popularity.
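One way to picture this, as a sketch of the idea rather than a documented formula, is a score that depends only on the proportion of good clicks, so identical patterns score identically across wildly different volumes:

    def pattern_signal(good_clicks, bad_clicks):
        # Hypothetical pattern-based score: driven by the good/bad ratio,
        # not by absolute click volume.
        total = good_clicks + bad_clicks
        return good_clicks / total if total else 0.0

    # A head-term result and a long-tail result with the same click pattern
    # score the same despite a 1,000x difference in volume.
    print(pattern_signal(800_000, 200_000))   # 0.8
    print(pattern_signal(800, 200))           # 0.8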

3. Reducing Noise and Volatility

Real-world click patterns are inherently noisy. A trending news event can temporarily spike clicks to certain results. A viral social media post might drive thousands of people to search for something they would not normally search for, generating atypical click behavior. Seasonal patterns create regular fluctuations.

Without squashing, these normal fluctuations would cause rankings to oscillate unpredictably. By compressing extreme values, the squashing function dampens noise and produces a more stable ranking signal. Rankings change when there is a genuine, sustained shift in user satisfaction — not when there is a temporary blip.

4. Protecting Against Competitive Sabotage

Squashing also protects against negative click manipulation — attempts to harm a competitor's rankings by generating artificial badClicks (pogo-sticking). Just as positive click spikes are compressed, negative click spikes are similarly dampened. A sudden burst of artificial badClicks against a competitor would be squashed in the same way a burst of artificial goodClicks would be, limiting the damage any such campaign could inflict.

How Squashing Works with the 13-Month Window

The squashing function and the 13-month rolling window are complementary mechanisms that, together, create a robust defense against manipulation and noise. Understanding how they interact is essential.

Temporal Smoothing + Signal Compression

The 13-month window provides temporal smoothing: any single month's data represents only approximately 7.7% of the total signal. The squashing function provides signal compression: extreme values within any single month are reduced before they enter the aggregation.

The combined effect is multiplicative. A manipulation attempt faces two sequential barriers:

  1. First, the squashing function compresses the raw signal. A spike of 10,000 artificial clicks might be compressed to the equivalent of 1,000 or fewer in terms of ranking impact.
  2. Then, the 13-month window dilutes the squashed signal. That compressed value represents only one-thirteenth of the total data used in the ranking calculation.

The net effect is that a single month of heavy manipulation adds only a small increment to the aggregate ranking signal, far less than what consistent, genuine click activity builds across the same 13-month window.
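A toy model makes the two-barrier effect visible. Both the log squash and the equal-weight 13-month average below are assumptions chosen for illustration:

    import math

    def squash(x):
        return math.log(1 + x)   # illustrative squash; the real function is undisclosed

    def windowed_signal(monthly_clicks):
        # Average per-month squashed values over a 13-month window,
        # so each month contributes roughly 1/13 of the aggregate.
        window = monthly_clicks[-13:]
        return sum(squash(m) for m in window) / len(window)

    steady = [200] * 13                  # consistent genuine activity
    spiked = [200] * 12 + [10_000]       # one month of heavy manipulation

    print(f"steady: {windowed_signal(steady):.3f}")   # ~5.303
    print(f"spiked: {windowed_signal(spiked):.3f}")   # ~5.604

A 50x one-month spike lifts the aggregate signal by only a few percent: the spike is compressed first, then diluted.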

Sustained Patterns vs. Short Bursts

This design means that sustained patterns of user behavior carry far more weight than short-term spikes. A result that consistently receives a healthy ratio of goodClicks over 13 months builds a durable NavBoost signal. A result that receives a burst of clicks in a single month — whether artificial or genuine — sees that burst compressed by squashing and diluted by the rolling window.

The practical implication is clear: lasting ranking improvements through NavBoost require lasting changes in how users interact with a result. A one-time spike — from a viral post, a media mention, or a manipulation campaign — will have a temporary and muted effect.

unsquashedClicks vs. squashedClicks: Evidence from the API Leak

The 2024 Google API leak provided direct evidence of the squashing function through the existence of two distinct API fields: unsquashedClicks and squashedClicks. The analysis of these fields by researchers — notably Rand Fishkin (SparkToro) and Mike King (iPullRank) — yielded several important observations.

Dual Storage Implies Active Comparison

The fact that Google stores both the raw and processed versions of click data suggests that the comparison between them has analytical value. If the raw data were only needed as an input to the squashing function and then discarded, there would be no need for a separate unsquashedClicks field in the API. The retention of both values implies that Google uses the relationship between them — perhaps to identify anomalies.

For example, if a URL's unsquashedClicks value is abnormally high relative to its squashedClicks value, this could indicate that a disproportionate number of clicks came from a concentrated source (a sign of potential manipulation). Legitimate organic traffic tends to produce a more predictable unsquashed-to-squashed ratio.
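One speculative way such a comparison could be operationalized (the field names come from the leak, but the logic here is guesswork) is to flag months whose raw-to-squashed ratio departs sharply from the URL's own history:

    def ratio_anomaly(unsquashed, squashed, historical_ratio, tolerance=3.0):
        # Flag a month whose unsquashed-to-squashed ratio far exceeds what this
        # URL has historically produced (threshold and values are hypothetical).
        if squashed == 0:
            return unsquashed > 0
        return (unsquashed / squashed) > historical_ratio * tolerance

    print(ratio_anomaly(unsquashed=1_200, squashed=7.1, historical_ratio=170))     # False: typical month
    print(ratio_anomaly(unsquashed=120_000, squashed=9.4, historical_ratio=170))   # True: concentrated burst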

Processing-State Fields vs. Behavioral Fields

As detailed in the NavBoost Click Types analysis, the five click-related fields in the API leak fall into two categories:

  • Behavioral classifications (goodClicks, badClicks, lastLongestClicks): These describe the quality of the user interaction.
  • Processing states (unsquashedClicks, squashedClicks): These describe the data before and after normalization.

These categories are not mutually exclusive. Each behavioral click type (good, bad, lastLongest) likely has both an unsquashed and a squashed representation. The total unsquashedClicks field may aggregate across all behavioral types, or there may be per-type squashing — the leaked documentation does not provide enough detail to determine this conclusively.
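The field layout suggested by the leak can be sketched as a simple record. Whether per-type squashed values exist is precisely the open question above, so this shape is one plausible reading, not the documented schema:

    from dataclasses import dataclass

    @dataclass
    class ClickSignals:
        # One plausible per-URL, per-query record built from the five leaked
        # field names; the pairing of raw and squashed aggregates is confirmed,
        # per-type squashing is not.
        goodClicks: float
        badClicks: float
        lastLongestClicks: float
        unsquashedClicks: float   # raw aggregate, pre-normalization
        squashedClicks: float     # normalized aggregate used in ranking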

Implications for Legitimate SEO

The squashing function has significant implications for how publishers and SEO practitioners should think about organic search performance.

Gradual Improvements Matter More Than Quick Wins

Because squashing compresses large signals and the 13-month window smooths temporal variation, the most effective approach to building positive NavBoost signals is gradual, sustained improvement in content quality and user experience. Specifically:

  • Improving page speed reduces the likelihood of users abandoning before engaging, which reduces badClicks and increases goodClicks.
  • Improving content comprehensiveness increases dwell time and the probability of receiving lastLongestClicks.
  • Aligning title tags and meta descriptions with actual content reduces the click-and-bounce pattern that generates badClicks.
  • Regularly updating content to keep it current ensures that the positive click signals do not erode as information becomes outdated.

These improvements compound over time. Each month of better user behavior adds to the 13-month window, and after a full cycle, the URL's NavBoost profile reflects a consistently positive pattern that is resistant to displacement by competitors.

Viral Traffic Spikes Are Not a Ranking Strategy

A common misconception is that driving a large volume of traffic to a page (through social media, email blasts, or PR campaigns) will boost its rankings because of the click signal. While such traffic may produce genuine clicks, the squashing function ensures that the ranking impact of a short-term traffic spike is heavily moderated. If the spike is temporary and the traffic does not continue, the signal will be diluted across the 13-month window and will have minimal lasting effect.

This does not mean viral traffic is valueless for SEO — it can generate backlinks, brand awareness, and other indirect benefits. But the direct NavBoost impact of a one-time traffic spike is inherently limited by design.

Consistency Is Structurally Rewarded

The combination of squashing and the 13-month window creates a system that structurally rewards consistency. A page that delivers strong user satisfaction every month for a year builds a NavBoost profile that is extremely difficult for a competitor to displace quickly. Conversely, a page that has inconsistent user satisfaction — strong some months, weak others — will have a weaker aggregate signal even if its peak months are very strong.

This structural advantage for consistent quality is one of the reasons why established, authoritative websites tend to maintain stable rankings over time, while newer or lower-quality sites struggle to break through even with occasional traffic spikes.

Implications for Click Manipulation

The squashing function is one of several mechanisms Google uses to limit the effectiveness of artificial click manipulation. Understanding its role helps explain why click manipulation campaigns produce the results they do.

Short-Term Spikes Get Averaged Out

The most common click manipulation approach involves generating a burst of artificial clicks over a period of days or weeks. The squashing function compresses this burst, and the 13-month window dilutes it. The result, as reported by many practitioners, is a modest and temporary ranking improvement that disappears when the campaign stops.

Some practitioners report that sustained click manipulation over many months can produce more durable results. This is consistent with the 13-month window model: if artificial clicks are generated consistently for 6-12 months, they occupy a significant portion of the window and may produce a visible signal even after squashing. However, sustaining a manipulation campaign for that long significantly increases both cost and detection risk.

Volume Has Diminishing Returns

The non-linear nature of the squashing function means that generating more artificial clicks yields progressively smaller ranking benefits. The first 100 artificial clicks might produce a measurable (if small) effect. The next 1,000 might only double or triple that effect, not multiply it by 10. And the next 10,000 might add only a marginal increment beyond that.

This is a critical point for understanding the economics of click manipulation. The cost of generating clicks scales linearly (more clicks = more cost), but the ranking benefit scales sub-linearly (more clicks = diminishing returns). At some point, the marginal cost of additional clicks exceeds the marginal ranking benefit, making the campaign economically irrational.
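The break-even point can be made concrete. With an invented cost per click, an invented value per unit of squashed signal, and a log-style squash, the marginal benefit of each additional batch quickly falls below its marginal cost:

    import math

    COST_PER_CLICK = 0.05            # hypothetical cost of one artificial click
    VALUE_PER_SIGNAL_UNIT = 25.0     # hypothetical value of one unit of squashed signal

    def signal(clicks):
        return math.log(1 + clicks)  # illustrative squash

    prev = 0
    for clicks in (100, 1_000, 10_000, 100_000):
        marginal_benefit = (signal(clicks) - signal(prev)) * VALUE_PER_SIGNAL_UNIT
        marginal_cost = (clicks - prev) * COST_PER_CLICK
        print(f"{prev:>7} -> {clicks:>7}: benefit {marginal_benefit:7.2f}, cost {marginal_cost:9.2f}")
        prev = clicks

In this toy model the first 100 clicks are profitable, the next 900 are roughly break-even, and everything beyond that costs far more than it returns.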

Bot Patterns Are Easier to Squash Than Human Patterns

This is perhaps the most important practical distinction. Bot-generated clicks tend to have uniform characteristics: consistent timing, similar dwell times, predictable navigation patterns, identical device fingerprints. These patterns make it straightforward for Google's detection systems to identify and filter them — effectively squashing them to zero rather than merely compressing them.

Real human clicks from diverse sources — different devices, different locations, different times of day, different engagement patterns — are inherently harder to distinguish from genuine organic behavior. This is because they are genuine human behavior, even if the motivation to search and click was artificially induced. The squashing function still compresses their aggregate signal, but they are less likely to be flagged and filtered by active detection mechanisms.

This distinction explains why sophisticated click manipulation services that use real human clickers report better results than bot-based services. The humans produce behaviorally diverse click patterns that survive Google's filtering layer and are only moderated (not eliminated) by the squashing function.

Mathematical Context

While the exact squashing function Google uses is unknown, several common mathematical functions could serve as the basis. Understanding these provides useful intuition about the function's behavior.

Logarithmic Compression

A simple logarithmic function — f(x) = ln(1 + x), using the natural logarithm — compresses large values significantly. Under this function:

  Raw Clicks (x)    ln(1 + x) Output    Relative Signal
  10                2.40                1.0x (baseline)
  100               4.62                1.9x
  1,000             6.91                2.9x
  10,000            9.21                3.8x
  100,000           11.51               4.8x

Table 1: Illustrative logarithmic compression. A 10,000x increase in raw clicks (from 10 to 100,000) yields less than a 5x increase in the output signal. The actual function Google uses is unknown but likely produces similar compression behavior.

Under logarithmic compression, a result with 100,000 clicks has less than 5 times the signal of a result with 10 clicks — not 10,000 times. This extreme compression makes volume-based manipulation fundamentally inefficient.
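Table 1 can be reproduced in a few lines (natural log, matching the values above):

    import math

    baseline = math.log(1 + 10)
    for x in (10, 100, 1_000, 10_000, 100_000):
        out = math.log(1 + x)
        print(f"{x:>7}  {out:6.2f}  {out / baseline:.1f}x")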

Sigmoid Functions

Another possibility is a sigmoid-style function, which produces an S-shaped curve. Sigmoid functions have the additional property of bounding the output — no matter how large the input, the output approaches but never exceeds a maximum value. This would mean that there is effectively a ceiling on how much any click signal can contribute to ranking, regardless of volume.

A sigmoid-based squashing function would be particularly effective at preventing manipulation because it imposes a hard ceiling: generating more and more clicks eventually produces virtually zero additional ranking benefit.
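A minimal sketch of that ceiling, using a generic logistic curve rescaled so that zero clicks yield zero signal (the scale parameter is invented):

    import math

    def sigmoid_squash(clicks, scale=2_000.0):
        # Rescaled logistic: 0 clicks -> 0.0, and the output approaches but
        # never reaches 1.0, however many clicks arrive.
        return 2 / (1 + math.exp(-clicks / scale)) - 1

    for clicks in (100, 1_000, 10_000, 100_000, 1_000_000):
        print(f"{clicks:>9}: {sigmoid_squash(clicks):.4f}")

Going from 10,000 to 1,000,000 clicks moves the output from roughly 0.987 to 1.000: a hundredfold increase in input buys almost nothing.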

Likely a Custom Function

Google's actual implementation is almost certainly a custom function rather than a textbook logarithm or sigmoid. It likely incorporates query-volume normalization, time-decay weighting within the 13-month window, and other adjustments that standard mathematical functions do not capture. The key insight is not the specific formula but the general behavior: large inputs are compressed, consistency is rewarded over spikes, and there are diminishing returns to volume.
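As a final sketch, here is one way those pieces could compose. Every component (the log squash, the query-volume normalization, the linear recency weights) is an assumption chosen to illustrate the described behavior, not a reconstruction of Google's code:

    import math

    def custom_squash(monthly_clicks, query_volume):
        # Illustrative composite: normalize each month's clicks by query volume,
        # squash the normalized rate, and weight recent months more heavily
        # across a 13-month window. The 1_000 scale factor is invented.
        window = monthly_clicks[-13:]
        n = len(window)
        weights = [(i + 1) / n for i in range(n)]    # oldest -> newest ramp
        squashed = [math.log(1 + 1_000 * c / query_volume) for c in window]
        return sum(w * s for w, s in zip(weights, squashed)) / sum(weights)

    # The same click pattern relative to query volume scores identically
    # for a head term and a long-tail term.
    print(f"{custom_squash([50_000] * 13, query_volume=1_000_000):.3f}")   # 3.932
    print(f"{custom_squash([500] * 13, query_volume=10_000):.3f}")         # 3.932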

Sources and Further Reading

The information in this article is based on:

  • 2024 Google API Leak: The unsquashedClicks and squashedClicks fields in the leaked NavBoost API documentation. See full analysis of the leak.
  • Rand Fishkin (SparkToro): Initial reporting on the API leak and analysis of NavBoost fields, May 2024.
  • Mike King (iPullRank): Technical analysis of the API documentation, including the normalization implications of the dual click fields, 2024.
  • RESONEO: "Google Leak Part 5: Click-data, NavBoost, Glue, and Beyond" — analysis of the squashing mechanism and its role in click signal processing.
  • Hobo Web: "NavBoost: How Google Uses Large-Scale User Interaction Data to Rank Websites" — contextual analysis of NavBoost's data processing pipeline.
  • U.S. v. Google Antitrust Trial (2023): Testimony from Pandu Nayak confirming NavBoost's role and the use of click data normalization.

For related topics, see What is NavBoost? for foundational context, How NavBoost Works for the complete technical architecture, NavBoost Click Types for detailed click classification analysis, The 13-Month Window for the temporal aggregation mechanism, and Click Manipulation Detection for Google's active defense mechanisms.
