Getting app Keyword Rankings from Google Play
How to get app keyword rankings from Google Play using R, Rselenium and rvest.
This is the first part of a series of articles on App Store Optimization (ASO).
App Store Optimization (ASO from now on) can be described as the process of improving an app's positioning and visibility within the app stores.
There are two major app stores: Google Play and the Apple App Store. In this series of articles I will focus on exploring ASO for Google Play.
I won’t get into the ASO basics in this piece since my goal is not to explain why ASO is important (if you are interested in what is ASO and why it is important, you may check this other post).
In this article, I'd rather focus on how to get app keyword rankings for any given localized search on Google Play.
Say you have a number of search terms (tracked keywords) and you'd like to know the first thirty apps ranking for each of them on Google Play.
Docker and Rselenium to land on Google Play
Using R and RStudio, you can run a Docker container with a Chrome image to browse Google Play, search your keywords, and get the app rankings for each of them.
There are many tutorials on how to run a Docker container, but I suggest the following settings (on Linux):
library(RSelenium)
system('sudo docker run --shm-size=256m -d -p 4445:4444 selenium/standalone-chrome')

After that, you should create a remote driver to run your Chrome browser with RSelenium:
remDr <- remoteDriver(port = 4445L, browserName = "chrome")
print('Remote Driver is up')
Sys.sleep(10)
remDr$open(silent = TRUE)
print('Remote Driver is open')

Building your tracked keyword URLs
Now you're ready to navigate to Google Play, building a URL for each keyword search on your list. Note that in this case I go one step further and localize my searches to the Spanish version of the apps on Google Play.
It's good practice, especially if you have a long list of keyword searches, to add some randomized waiting time between searches. From 1 up to 3 seconds seems to work fine.
url <- paste0("https://play.google.com/store/search?q=", app_store_KW_search, "&c=apps&hl=es")
remDr$setWindowSize(2000, 5000)
remDr$navigate(url)
Sys.sleep(runif(1, 1, 3))

Retrieving app rankings
At this point, you could get the app elements one by one for each search using RSelenium functions. However, in order to speed up the retrieval, I recommend reading the HTML with rvest and then parsing it using nodes and classes.
library(rvest)
page <- rvest::read_html(remDr$getPageSource()[[1]])
app_names_ranked <- html_nodes(page, "[class='DdYX5']") %>% html_text()
app_id <- html_nodes(page, "[class='Si6A0c Gy4nib']") %>% html_attr("href")

Beware that these HTML classes might change at Google's whim, so you should always keep an eye on your output.
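Once you have the two vectors above, you can assemble them into a ranked data frame. Here is a minimal sketch; the sample values stand in for the output of the rvest calls, and the column names are illustrative:

```r
# Stand-in values for illustration; in the real script these come from the
# html_text() and html_attr() calls above.
app_store_KW_search <- "fitness"
app_names_ranked <- c("Example App One", "Example App Two")
app_id <- c("/store/apps/details?id=com.example.one",
            "/store/apps/details?id=com.example.two")

rankings <- data.frame(
  keyword     = app_store_KW_search,
  rank        = seq_along(app_names_ranked),  # position in the results page
  app_name    = app_names_ranked,
  app_id      = sub(".*id=", "", app_id),     # keep only the package name
  search_date = Sys.Date(),
  stringsAsFactors = FALSE
)
```

The rank column simply follows the order in which rvest returned the nodes, which matches the order of the results on the page.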
Some tips
Let me share some tips for building your utility script to get app keyword rankings:
Wrap the core script within a for loop to go over your Keyword search list.
Run timing tests to understand how long it takes your machine and settings to retrieve the app rankings for a single search. Then plan it accordingly within your machine's schedule.
Save the rankings, along with the keyword search, app name, app ID, and search date, in a data frame, CSV, database, etc. Whatever best suits your knowledge and needs.
Schedule the whole script as an (early-morning) daily cron job to track your app(s) positioning over time.
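Putting the tips above together, the core script could be sketched as follows. This is only an illustrative outline, not a runnable example: it assumes the Docker container is running and remDr is the open remote driver created earlier, and the keyword list, CSS classes, and file name are placeholders you should adapt:

```r
library(rvest)

# Placeholder keyword list; replace with your tracked keywords.
tracked_keywords <- c("fitness", "meditacion", "recetas")
all_rankings <- list()

for (kw in tracked_keywords) {
  # Build the localized search URL (Spanish, as in the example above).
  url <- paste0("https://play.google.com/store/search?q=",
                URLencode(kw), "&c=apps&hl=es")
  remDr$navigate(url)
  Sys.sleep(runif(1, 1, 3))  # randomized wait between searches

  # Read the rendered page once and parse it with rvest.
  page <- read_html(remDr$getPageSource()[[1]])
  app_names <- html_nodes(page, "[class='DdYX5']") %>% html_text()
  app_hrefs <- html_nodes(page, "[class='Si6A0c Gy4nib']") %>% html_attr("href")

  all_rankings[[kw]] <- data.frame(
    keyword     = kw,
    rank        = seq_along(app_names),
    app_name    = app_names,
    app_id      = sub(".*id=", "", app_hrefs),
    search_date = Sys.Date(),
    stringsAsFactors = FALSE
  )
}

# Persist everything in one CSV (file name is illustrative).
write.csv(do.call(rbind, all_rankings),
          "app_keyword_rankings.csv", row.names = FALSE)
```

A daily cron job pointing at this script (via Rscript) would then accumulate one dated snapshot of rankings per keyword per day.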
If you'd like to visualize this keyword search data to get actionable insights, I will write a piece on that soon. Subscribe and stay tuned.
Should you have any comment or request, do not hesitate to get in contact with me by email at datadventures@substack.com
