
Deconstructing the App Store Rankings Formula with 5 Mad Science Experiments

After seeing Rand's "Mad Science Experiments in SEO" presented at last year's MozCon, I was inspired to put on the lab coat and goggles and do a few experiments of my own--not in SEO, but in SEO's up-and-coming younger sister, ASO (app store optimization).

Working with Apptentive to guide enterprise apps and small startup apps alike to increase their discoverability in the app stores, I've learned a thing or two about app store optimization and what goes into an app's ranking. It's been my personal goal for some time now to pull back the curtains on Google and Apple. Yet, the deeper into the rabbit hole I go, the more untested assumptions I leave in my way.

Hence, I thought it was due time to put some longstanding hypotheses through the gauntlet.

As SEOs, we know how much of an impact a single ranking can have on a SERP. One tiny rank up or down can make all the difference when it comes to your website's traffic--and revenue.

In the world of apps, ranking is just as important when it comes to standing out in a sea of more than 1.3 million apps. Apptentive's recent mobile consumer survey shed a little more light on this claim, revealing that nearly half of all mobile app users identified browsing the app store charts and search results (the placement on either of which depends on rankings) as a preferred method for finding new apps in the app stores. Simply put, better rankings mean more downloads and easier discovery.

Like Google and Bing, the two leading app stores (the Apple App Store and Google Play) have complex and highly guarded algorithms for determining rankings for both keyword-based app store searches and composite top charts.

Unlike SEO, however, very little research and theory has been conducted around what goes into these rankings.

Until now, that is.

Over the course of five experiments analyzing various publicly available data points for a cross-section of the top 500 iOS (U.S. Apple App Store) and the top 500 Android (U.S. Google Play) apps, I'll attempt to set the record straight with a little myth-busting around ASO. In the process, I hope to assess and quantify any perceived correlations between app store ranks, ranking volatility, and a few of the factors commonly thought of as influential to an app's ranking.

But first, a little context


Image credit: Josh Tuininga, Apptentive

Both the Apple App Store and Google Play have roughly 1.3 million apps each, and both stores feature a similar breakdown by app category. Apps ranking in the two stores should, theoretically, be on a fairly level playing field in terms of search volume and competition.

Of these apps, nearly two-thirds have not received a single rating and 99% are considered unprofitable. These experiments, therefore, single out the rare exceptions to the rule--the top 500 ranked apps in each store.

While neither Apple nor Google has revealed specifics about how they calculate search rankings, it is generally accepted that both app store algorithms factor in:

Average app store rating

Rating/review volume

Download and install counts

Uninstalls (what retention and churn look like for the app)

App usage statistics (how engaged an app's users are and how frequently they launch the app)

Growth trends weighted toward recency (how daily download counts changed over time and how today's ratings compare to last week's)

Keyword density of the app's landing page (Ian did a great job covering this factor in a previous Moz post)

I've simplified this formula to a function highlighting the four elements with sufficient data (or at least proxy data) for our experimentation:


Ranking = fn(Rating, Rating Count, Installs, Trends)

Of course, right now, this generalized function doesn't say much. Over the next five experiments, however, we'll revisit this function before ultimately attempting to compare the weights of each of these four variables on app store rankings.

(For the purpose of brevity, I'll stop here with the assumptions, but I've gone into far greater depth on how I reached these conclusions in a 55-page report on app store rankings.)

Now, for the Mad Science.

Experiment #1: App-les to app-les app store ranking volatility

The first and most straightforward of the five experiments involves tracking daily movement in app store rankings across iOS and Android versions of the same apps to determine any trends or differences in ranking volatility between the two stores.

I went with a small sample of five apps for this experiment, the only criteria for which were that:

They were all apps I actively use (a criterion for coming up with the five apps but not one that influences rank in the U.S. app stores)

They were ranked in the top 500 (but not the top 25, as I assumed app store rankings would be stickier at the top--an assumption I'll test in experiment #2)

They had an almost identical version of the app in both Google Play and the App Store, meaning they should (theoretically) rank similarly

They covered a spectrum of app categories

The apps I ultimately chose were Lyft, Venmo, Duolingo, Chase Mobile, and LinkedIn. These five apps represent the travel, finance, education, banking, and social networking categories.

Hypothesis

Going into this experiment, I predicted slightly more volatility in Apple App Store rankings, based on two statistics:

Both of these assumptions will be tested in later experiments.

Results


Among these five apps, Google Play rankings were, indeed, significantly less volatile than App Store rankings. Among the 35 data points recorded, rankings within Google Play moved by as much as 23 positions/ranks per day while App Store rankings moved up to 89 positions/ranks. The standard deviation of ranking volatility in the App Store was, furthermore, 4.45 times greater than that of Google Play.

Of course, the same apps varied fairly dramatically in their rankings in the two app stores, so I then standardized the ranking volatility in terms of percent change to control for the effect of numeric rank on volatility. When cast in this light, App Store rankings changed by as much as 72% within a 24-hour period while Google Play rankings changed by no more than 9%.
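The standardization described above can be sketched in a few lines of Python. This is only an illustration, not the analysis behind the numbers in this post: the function names and the sample rank series are hypothetical, and "percent change" is assumed to mean the absolute day-over-day move divided by the previous day's rank.

```python
from statistics import pstdev

def daily_moves(ranks):
    """Absolute number of positions moved between consecutive days."""
    return [abs(today - prev) for prev, today in zip(ranks, ranks[1:])]

def daily_percent_moves(ranks):
    """Day-over-day move as a percentage of the previous day's rank (assumed definition)."""
    return [abs(today - prev) / prev * 100 for prev, today in zip(ranks, ranks[1:])]

# Hypothetical week of daily ranks for one app in each store (not the article's data).
app_store_ranks = [120, 95, 140, 133, 180, 150, 161]
google_play_ranks = [60, 62, 59, 61, 63, 60, 58]

for store, ranks in [("App Store", app_store_ranks), ("Google Play", google_play_ranks)]:
    moves = daily_moves(ranks)
    pct = daily_percent_moves(ranks)
    print(store,
          "| max move:", max(moves),
          "| max % move:", round(max(pct), 1),
          "| std dev of moves:", round(pstdev(moves), 2))
```

Dividing by the previous day's rank is what keeps a 10-position move at rank 400 from looking as dramatic as a 10-position move at rank 20.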

Also of note, daily rankings tended to move in the same direction across the two app stores approximately two-thirds of the time, suggesting that the two stores, and their customers, may have more in common than we think.
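One way to arrive at a "same direction" share like the one above is to compare the sign of each day's move in the two stores. The sketch below assumes that definition; the helper names and the sample ranks are hypothetical.

```python
def direction(prev, today):
    """Return 1 if the rank number went up, -1 if it went down, 0 if unchanged."""
    return (today > prev) - (today < prev)

def same_direction_share(ranks_a, ranks_b):
    """Fraction of day-over-day moves that point the same way in both series."""
    dirs_a = [direction(p, t) for p, t in zip(ranks_a, ranks_a[1:])]
    dirs_b = [direction(p, t) for p, t in zip(ranks_b, ranks_b[1:])]
    matches = sum(1 for a, b in zip(dirs_a, dirs_b) if a == b)
    return matches / len(dirs_a)

# Hypothetical daily ranks for the same app in each store (not the article's data).
app_store_ranks = [10, 12, 11, 15]
google_play_ranks = [50, 55, 56, 60]
print(round(same_direction_share(app_store_ranks, google_play_ranks), 2))
```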

Experiment #2: App store ranking volatility across the top charts


Testing the assumption implicit in standardizing the data in experiment No. 1, this experiment was designed to see if app store ranking volatility is correlated with an app's current rank. The sample for this experiment consisted of the top 500 ranked apps in both Google Play and the App Store, with special attention given to those on both ends of the spectrum (ranks 1-100 and 401-500).

Hypothesis

I anticipated rankings to be more volatile the lower an app sits on the chart (that is, the larger its rank number)--meaning an app ranked No. 450 should be able to move more ranks in any given day than an app ranked No. 50. This hypothesis is based on the assumption that higher ranked apps have more installs, active users, and ratings, and that it would take a large margin to produce a noticeable shift in any of these factors.

Results

One look at the chart above shows that apps in both stores have increasingly more volatile rankings (based on how many ranks they moved in the last 24 hours) the lower on the list they're ranked.

This is particularly true when comparing either end of the spectrum--with a seemingly straight volatility line among Google Play's Top 100 apps and very few blips within the App Store's Top 100. Compare this section to the lower end, ranks 401-500, where both stores experience much more turbulence in their rankings. Across the gamut, I found a 24% correlation between rank and ranking volatility in the Play Store and a 28% correlation in the App Store.

To put this into perspective, the average app in Google Play's 401-500 ranks moved 12.1 ranks in the last 24 hours while the average app in the Top 100 moved a mere 1.4 ranks. For the App Store, these numbers were 64.28 and 11.26, making slightly lower-ranked apps more than five times as volatile as the highest ranked apps. (I say slightly as these "lower-ranked" apps are still ranked higher than 99.96% of all apps.)


The relationship between rank and volatility is pretty consistent across the App Store charts, while rank has a much greater impact on volatility at the top of the Google Play charts (ranks 1-100 have a 35% correlation) than it does further down (ranks 401-500 have a 1% correlation).
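For readers who want to run the same kind of check on their own data, here is a rough sketch of both calculations used in this experiment. The post doesn't say which correlation measure was used, so the Pearson coefficient below is an assumption, and the `(rank, ranks moved)` pairs are made up for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def band_average(rows, low, high):
    """Mean 24-hour movement for apps ranked between low and high, inclusive."""
    moves = [move for rank, move in rows if low <= rank <= high]
    return sum(moves) / len(moves)

# Hypothetical (rank, ranks moved in the last 24 hours) pairs, not the article's data.
rows = [(10, 1), (50, 2), (90, 1), (410, 9), (450, 14), (490, 12)]
ranks = [rank for rank, _ in rows]
moves = [move for _, move in rows]

print("correlation:", round(pearson(ranks, moves), 2))
print("top 100 average move:", round(band_average(rows, 1, 100), 2))
print("401-500 average move:", round(band_average(rows, 401, 500), 2))
```

A positive coefficient here means that a larger rank number (a lower chart position) goes hand in hand with more movement.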

Experiment #3: App store rankings across the stars

The next experiment looks at the relationship between rank and star ratings to determine any trends that set the top chart apps apart from the rest and explore any ties to app store ranking volatility.

Hypothesis

Ranking = fn(Rating, Rating Count, Installs, Trends)

As discussed in the introduction, this experiment relates directly to one of the factors commonly accepted as influential to app store rankings: average rating.

Going into the experiment, I hypothesized that higher ranks generally correspond to higher ratings, cementing the role of star ratings in the ranking algorithm.

As far as volatility goes, I did not anticipate average rating to play a role in app store ranking volatility, as I saw no reason for higher rated apps to be less volatile than lower rated apps, or vice versa. Instead, I believed volatility to be tied to rating volume (as we'll explore in our last experiment).

Results


The chart above plots the top 100 ranked apps in either store with their average rating (both historic and current, for App Store apps). If it looks a little chaotic, it's just one indicator of the complexity of the ranking algorithms in Google Play and the App Store.

If our hypothesis were correct, we'd see a downward trend in ratings. We'd expect to see the No. 1 ranked app with a significantly higher rating than the No. 100 ranked app. Yet, in neither store is this the case. Instead, we get a seemingly random plot with no obvious trends that jump off the chart.

A closer examination, in tandem with what we already know about the app stores, reveals two other interesting points:

The average star rating of the top 100 apps is significantly higher than that of the average app. Across the top charts, the average rating of a top 100 Android app was 4.319 and that of the average top iOS app was 3.935. These ratings are 0.32 and 0.27 points, respectively, above the average rating of all rated apps in either store. The averages across apps in the 401-500 ranks approximately split the difference between the ratings of the top ranked apps and the ratings of the average app.

The rating distribution of top apps in Google Play was considerably more compact than the distribution of top iOS apps. The standard deviation of ratings in the Apple App Store top chart was over 2.5 times greater than that of the Google Play top chart, likely meaning that ratings are more heavily weighted in Google Play's algorithm.


Looking next at the relationship between ratings and app store ranking volatility reveals a -15% correlation that is consistent across both app stores, meaning the higher an app is rated, the less its rank is likely to move in a 24-hour period. The exception to this rule is the Apple App Store's calculation of an app's current rating, for which I did not find a statistically significant correlation.

Experiment #4: App store rankings across versions

This next experiment looks at the relationship between the age of an app's current version, its rank and its ranking volatility.

Hypothesis

Ranking = fn(Rating, Rating Count, Installs, Trends)

In an alteration of the above function, I'm using the age of an app's current version as a proxy (albeit not a very good one) for trends in app store ratings and app quality over time.

Making the assumptions that (a) apps that are updated more frequently are of higher quality and (b) each new update inspires a new wave of installs and ratings, I'm hypothesizing that the older the age of an app's current version, the lower it will be ranked and the less volatile its rank will be.

Results


The first and possibly most important finding of this experiment is that apps across the top charts in both Google Play and the App Store are updated remarkably often as compared to the average app.

At the time of conducting the experiment, the current version of the average iOS app on the top chart was only 28 days old; the current version of the average Android app was 38 days old.
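The "age of the current version" is simply the number of days between an app's latest release date and the day the data was collected. A minimal sketch, with hypothetical function names and made-up dates rather than the dates used in this experiment:

```python
from datetime import date

def version_age_days(released, as_of):
    """Days elapsed between the current version's release and the observation date."""
    return (as_of - released).days

def average_age(release_dates, as_of):
    """Mean version age, in days, across a list of release dates."""
    ages = [version_age_days(released, as_of) for released in release_dates]
    return sum(ages) / len(ages)

# Hypothetical release dates for three top-chart apps (not the article's data).
observed_on = date(2015, 1, 29)
releases = [date(2015, 1, 1), date(2015, 1, 15), date(2014, 12, 20)]
print(average_age(releases, observed_on))
```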

As hypothesized, the age of the current version is negatively correlated with the app's rank, with a 13% correlation in Google Play and a 10% correlation in the App Store.


The next part of the experiment maps the age of the current app version to its app store ranking volatility, finding that recently updated Android apps have less volatile rankings (correlation: 8.7%) while recently updated iOS apps have more volatile rankings (correlation: -3%).

Experiment #5: App store rankings across monthly active users

In the final experiment, I wanted to examine the role of an app's popularity on its ranking. In an ideal world, popularity would be measured by an app's monthly active users (MAUs), but since few mobile app developers have released this information, I've settled for two publicly available proxies: Rating Count and Installs.

Hypothesis

Ranking = fn(Rating, Rating Count, Installs, Trends)

For the same reasons indicated in the second experiment, I anticipated that more popular apps (e.g., apps with more ratings and more installs) would be higher ranked and less volatile in rank. This, again, takes into consideration that it takes more of a shift to produce a noticeable impact in average rating or any of the other commonly accepted influencers of an app's ranking.

Results

The first finding leaps straight off of the chart above: Android apps have been rated more times than iOS apps, 15.8x more, in fact.


The average app in Google Play's Top 100 had a whopping 3.1 million ratings while the average app in the Apple App Store's Top 100 had 196,000 ratings. In contrast, apps in the 401-500 ranks (still tremendously successful apps in the 99.96th percentile of all apps) tended to have between one-tenth (Android) and one-fifth (iOS) of the rating count of apps in the top 100 ranks.

Considering that almost two-thirds of apps don't have a single rating, reaching rating counts this high is a huge feat, and a very strong indicator of the influence of rating count in the app store ranking algorithms.

To even out the playing field a bit and help us visualize any correlation between ratings and rankings (and to give more credit to the still-staggering 196k ratings for the average top ranked iOS app), I've applied a logarithmic scale to the chart above:

From this chart, we can see a correlation between ratings and rankings, such that apps with more ratings tend to rank higher. This equates to a 29% correlation in the App Store and a 40% correlation in Google Play.
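To see why a logarithmic scale evens out the playing field, it helps to run the two averages quoted in this experiment through a base-10 logarithm. The snippet below uses only those two figures; the variable and function names are my own.

```python
from math import log10

# Average rating counts for the Top 100 in each store, as quoted in this experiment.
google_play_avg_ratings = 3_100_000
app_store_avg_ratings = 196_000

def log_scale(count):
    """Position of a rating count on a base-10 logarithmic axis."""
    return log10(count)

ratio = google_play_avg_ratings / app_store_avg_ratings
print(round(ratio, 1))                                # 15.8
print(round(log_scale(google_play_avg_ratings), 2))   # 6.49
print(round(log_scale(app_store_avg_ratings), 2))     # 5.29
```

A 15.8x gap on a linear axis shrinks to a difference of about 1.2 units on the log axis, which is what lets both stores share one readable chart.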


Next up, I looked at how ratings count influenced app store ranking volatility, finding that apps with more ratings had less volatile rankings in the Apple App Store (correlation: 17%). No conclusive evidence was found within the Top 100 Google Play apps.

And last but not least, I looked at install counts as an additional proxy for MAUs. (Sadly, this is a statistic only listed in Google Play, so any resulting conclusions are applicable only to Android apps.)

Among the top 100 Android apps, this last experiment found that installs were heavily correlated with ranks (correlation: -35.5%), meaning that apps with more installs are likely to rank higher in Google Play. Android apps with more installs also tended to have less volatile app store rankings, with a correlation of -16.5%.

Unfortunately, these numbers are slightly skewed as Google Play only provides install counts in broad ranges (e.g., 500k-1M). For each app, I took the low end of the range, meaning we can likely expect the correlation to be a little stronger since the low end was further away from the midpoint for apps with more installs.
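The skew is easy to see with a little arithmetic. The sketch below parses an install range, takes the low end as described above, and measures how far that low end sits from the midpoint. The label format (`500k-1M`), the suffix table, and the function names are assumptions for illustration, not Google Play's actual data format.

```python
# Assumed suffixes for abbreviated install counts.
SUFFIXES = {"k": 1_000, "M": 1_000_000, "B": 1_000_000_000}

def parse_count(text):
    """Convert a label such as '500k' or '1M' into an integer."""
    text = text.strip()
    if text[-1] in SUFFIXES:
        return int(float(text[:-1]) * SUFFIXES[text[-1]])
    return int(text.replace(",", ""))

def parse_range(label):
    """Split a label such as '500k-1M' into (low, high) integers."""
    low, high = label.split("-")
    return parse_count(low), parse_count(high)

def low_end_gap(label):
    """Distance between the low end of the range and its midpoint."""
    low, high = parse_range(label)
    return (low + high) / 2 - low

for label in ["500k-1M", "10M-50M"]:
    low, _ = parse_range(label)
    print(label, "| low end:", low, "| gap to midpoint:", low_end_gap(label))
```

The wider the range, the more the low end understates the likely install count, and the widest ranges belong to the apps with the most installs.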

Summary

To make a long post ever so slightly shorter, here are the nuts and bolts unearthed in these five mad science experiments in app store optimization:

Across the top charts, Apple App Store rankings are 4.45x more volatile than those of Google Play

Rankings become increasingly volatile the lower an app is ranked. This is particularly true across the Apple App Store's top charts.

In both stores, higher ranked apps tend to have an app store ratings count that far exceeds that of the average app

Ratings appear to matter more to the Google Play algorithm, especially as the Apple App Store top charts experience a much wider ratings distribution than that of Google Play's top charts

The higher an app is rated, the less volatile its rankings are

The 100 highest ranked apps in either store are updated much more frequently than the average app, and apps with older current versions are correlated with lower rankings

An app's update frequency is negatively correlated with Google Play's ranking volatility but positively correlated with ranking volatility in the App Store. This is likely due to how Apple weighs an app's most recent ratings and reviews.

The highest ranked Google Play apps receive, on average, 15.8x more ratings than the highest ranked App Store apps

In both stores, apps that fall under the 401-500 ranks receive, on average, 10-20% of the rating volume seen by apps in the top 100

Rating volume and, by extension, installs or MAUs, is perhaps the best indicator of ranks, with a 29-40% correlation between the two

Revisiting our first (albeit oversimplified) guess at the app stores' ranking algorithm gives us this loosely defined function:

Ranking = fn(Rating, Rating Count, Installs, Trends)

I'd now re-write the function into a formula by weighting each of these four factors, where a, b, c, d are unknown multipliers, or weights:


Ranking = (Rating * a) + (Rating Count * b) + (Installs * c) + (Trends * d)

These five experiments on ASO shed a little more light on these multipliers, showing Rating Count to have the strongest correlation with rank, followed closely by Installs, in either app store.

It's with the other two factors--rating and trends--that the two stores show the greatest discrepancy. I'd hazard a guess to say that the App Store prioritizes growth trends over ratings, given the importance it places on an app's current version and the wide distribution of ratings across the top charts. Google Play, on the other hand, seems to favor ratings, with an unwritten rule that apps just about have to have at least four stars to make the top 100 ranks.

Thus, we conclude our mad science with this final glimpse into what it takes to make the top charts in either store:

Weight of factors in the Apple App Store ranking algorithm

Rating Count > Installs > Trends > Rating

Weight of factors in the Google Play ranking algorithm

Rating Count > Installs > Rating > Trends
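To make the weighted formula concrete, here is a toy scoring function. Everything numeric in it is hypothetical: the weights only preserve the relative order guessed at in this post (the real multipliers a, b, c, d remain unknown), and each factor is assumed to have been normalized to a 0-1 scale beforehand so the four terms are comparable. A higher score is assumed to mean a better chart position.

```python
# Hypothetical weights that only preserve the relative order suggested in this post.
APP_STORE_WEIGHTS = {"rating": 0.1, "rating_count": 0.4, "installs": 0.3, "trends": 0.2}
GOOGLE_PLAY_WEIGHTS = {"rating": 0.2, "rating_count": 0.4, "installs": 0.3, "trends": 0.1}

def ranking_score(factors, weights):
    """Weighted sum: (Rating * a) + (Rating Count * b) + (Installs * c) + (Trends * d)."""
    return sum(factors[name] * weight for name, weight in weights.items())

# A made-up app with a high rating but weak recent growth, all factors scaled to 0-1.
app = {"rating": 0.9, "rating_count": 0.5, "installs": 0.5, "trends": 0.2}

print("App Store score:", round(ranking_score(app, APP_STORE_WEIGHTS), 2))
print("Google Play score:", round(ranking_score(app, GOOGLE_PLAY_WEIGHTS), 2))
```

Under these made-up weights, the well-rated but slow-growing app scores better in Google Play than in the App Store, which is the discrepancy between the two stores described above.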

Again, we're oversimplifying for the sake of keeping this post to a mere 3,000 words, but additional factors including keyword density and in-app engagement statistics continue to be strong indicators of ranks. They simply lie outside the scope of these experiments.

I hope you found this deep-dive both helpful and interesting. Moving forward, I also hope to see ASOs conducting the same experiments that have brought SEO to the center stage, and encourage you to enhance or refute these findings with your own ASO mad science experiments.

Please share your thoughts in the comments below, and let's deconstruct the ranking formula together, one experiment at a time.

https://moz.com/ugc/app-store-rankings-formula-deconstructed-in-5-mad-science-experiments