SEARCH ENGINE OPTIMIZATION – THE FULL GUIDE
By: Siddharth Gupta
UC Berkeley 2020
International Finance & Data Analytics
Step 1: Identifying Site Bottlenecks
Search Engine Optimization (SEO) has historically been regarded as a qualitative process, where
employees with UI/UX backgrounds try to use their best judgement to understand a user’s flow
through a website and see ways to maximize a website’s Key Performance Indicators (KPIs).
Additionally, SEO was always considered an afterthought, usually a process implemented after
the full development of a website.
Now, SEO is a deeply integrated process that needs to be proactively implemented to ensure a
more efficient and quantitatively driven website development. With the advent of Google’s
ever-changing keyword ranking algorithm, it is necessary that businesses that drive significant
e-business growth optimize continually.
There are 4 key aspects to well-constructed SEO management: identifying site bottlenecks (e.g.
slow loading speeds, lack of site interconnectivity, overloaded backends, etc.) through
extensive analysis, implementing a round of changes/edits, quantitatively measuring the
change in UX & SEO, and repeating the testing process.
Regardless of the website building platform used, there will always be a way to measure
engagements and find data on the happenings of your website. Identifying where the site is
losing performance and/or user engagement is the first key step to creating targeted solutions.
1. GOOGLE ANALYTICS
Google Analytics can be directly connected to a WordPress-based website, giving a site
administrator unparalleled access to data on audience breakdown, user behaviors, and
conversion pipelines. With Google Analytics, a site administrator can find data on which pages
are causing site-wide problems, such as high bounce rates, high site drop-off rates, and slow site
loading speeds. It is key to prioritize the data over any qualitative theses because when a
search engine, like Google, places websites in an order of relevance to searchers, it will
utilize various quantitative algorithms to create the ranking. Consequently, even if a site
maintains a high aesthetic value, if its aesthetic elements are a hindrance to site optimization,
then those elements need to be reevaluated.
2. GOOGLE CHROME DEVTOOLS
For those with more HTML/JS or coding experience, another key tool to utilize is the Google
Chrome DevTools. To open the DevTools, right-click on a webpage and select “Inspect”. This
will open a number of developer tools. The key tools are the “Network”, “Performance”, &
“Audits” tabs.
a. Network Tab
The “Network” tool allows you to see where the bottlenecks lie in the connectivity to site
servers. For example, the “Waiting (TTFB)” entry highlights the time the browser has to wait
before receiving its first byte of data from the server. The “Network” tool is excellent for
highlighting network latency/responsiveness issues (the time it takes to send the request to the
server, for the server to process it, and for the response to reach the client).
b. Performance Tab
The “Performance” tab is key to understanding how all the elements of the page load in, how
long each task takes, and which tasks run concurrently. Additionally, this tab also highlights the
unused bytes that were loaded from various URLs. For example, if 89% of the bytes your
webpage loads are unused, then there is an opportunity to reduce that share and, with it, the
server and browser loading time.
c. Audits Tab
The Audits tab is where a webpage can be “graded” based on a variety of tests to see where
specific pain points are. The Audits tab also highlights which tests were only satisfactory or
did not pass, giving a much deeper insight into targeting certain bottlenecks. There are 4 key
sections that the tool audits: Performance, Accessibility, Best Practices & SEO.
i. Performance
The performance score measures how quickly a site is able to load the first byte, and also
how quickly the site becomes responsive for the user. For optimizations that are centered
around increasing loading speeds, this score is crucial.
ii. Accessibility
The accessibility score measures how easily readable and engaging the webpage is. For
example, if a button is red and the background is also red, the accessibility score will be poor
because not all the features on the webpage are easily accessible. Additionally, text that is too
small to read also reflects poorly on this score because it is not easily accessible to users
that have reading issues. Consequently, the accessibility tab is important for
iii. Best Practices
The best practices score measures a number of auxiliary metrics that are vital to a webpage’s
health, such as cybersecurity, display image sizing, script vulnerabilities, browser errors, etc.
The strength of the best practices tab lies in its ability to measure the security and
vulnerabilities of the site. For example, this tab will highlight all elements on the webpage that
are not loaded over an “https” URL, which leaves them vulnerable to external monitoring. For
sites that handle a lot of private user data, the best practices score should be a very high priority.
iv. SEO
The SEO score measures how optimized the webpage is for search engine results ranking. The
factors include keywords, meta descriptions, link descriptions, etc. For example, if your
webpage is not ranking high on the Google search engine, then a low SEO score is a likely
contributor. Thus, for websites that want higher page views and market exposure, the SEO
score is critical.
With the plethora of tools accessible to web developers, site administrators, and SEO
specialists, identifying site bottlenecks is a mix of deep analytical research from a user
standpoint (e.g. Google Analytics) and simulated test situations (e.g. Google Chrome DevTools).
After identifying the key issues within the webpage/website, the next step is to start
implementing solutions and testing to see which configurations create the best results.
Step 2: Solution Delivery
Identifying site bottlenecks and conducting deep internal research on pain points is the first
step to ensuring time-efficient and optimal website development and SEO. Once a list of pain
points is made, use the following list to roll out your implementations.
1. RESEARCH SOLUTION CHANNELS
There are a number of methods that can be used to solve the highlighted bottlenecks your
website is facing, and the first step is to conduct deep research to understand which method
is the most effective.
a. 3rd party software/consultations and/or plugins
There is a vast pool of resources online that claim they can automate the optimization
process. For websites hosted on WordPress, there are plugins that are capable of website
optimizations. For example, if a website is inundated with large images and there is no
practical manual method to compress/replace them, Smush Image Compression &
Optimization is an exceptional plugin that can compress images automatically. Another plugin
for WordPress websites that has a lot of versatility is Autoptimize. Users can have the plugin
SOMAmetrics website, Autoptimize was able to increase the Performance and Accessibility
scores by over 40%. Consequently, I would highly recommend Autoptimize to any WordPress
b. Alternative Website Elements/Designs
Although plugins may be an effective alternative given the current status of your website,
sometimes the only way to find a more optimized website is to change certain elements of the
website. For example, Google’s Accessibility tab is very critical of graphical elements, such as
elements that have barely distinguishable colors or unorganized layering of pictures. Thus, it is
necessary to have a person specialized in graphic design to support the rolling out of graphic
changes. Another reason website designs may have to change is due to a lack of interfacing in
c. Manual Code Reconfiguration
The last resort for a website that cannot be optimized via plugins or alternative website
design changes is a manual code reconfiguration. For example, a specialized website
more optimized solution. I suggest this as a last resort, as this is a very technical task, and
requires someone very accustomed to working in website development to create the elements
and features you would want from an interactive, yet lean, website.
2. CATEGORIZE & PRIORITIZE THE BOTTLENECKS
Once the solution(s) are identified for each problem, it is necessary to prioritize which
bottlenecks are critical versus which are non-critical and can be addressed at a later time. The
best way to categorize the bottlenecks is by identifying which pages the bottlenecks exist on
and sorting them by which pages have the highest traffic. This strategy is based on the
philosophy that higher traffic pages should be fixed first. A different strategy can focus on
fixing foundational problems first, and then fixing aesthetic issues later. Regardless of the
strategy, I highly suggest that after creating these categories, you develop a timeline on which
these “waves” of changes can be rolled out to ensure consistent positive edits to the website.
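The two prioritization strategies described above can be sketched in a few lines of Python. This is only an illustration: the Bottleneck fields, the traffic figures, and the wave size are hypothetical and would come from your own Step 1 research.

```python
from dataclasses import dataclass


@dataclass
class Bottleneck:
    page: str            # URL path where the issue occurs
    issue: str           # short description of the problem
    monthly_views: int   # traffic for that page (hypothetical figures below)
    foundational: bool   # True for structural problems, False for aesthetic ones


def prioritize(bottlenecks, foundational_first=False):
    """Return the bottlenecks in the order they should be fixed.

    By default, pages with the most traffic come first. With
    foundational_first=True, foundational problems are grouped ahead of
    aesthetic ones, and traffic breaks ties within each group.
    """
    if foundational_first:
        return sorted(bottlenecks, key=lambda b: (not b.foundational, -b.monthly_views))
    return sorted(bottlenecks, key=lambda b: -b.monthly_views)


def into_waves(ordered, wave_size):
    """Split an ordered list into consecutive waves of at most wave_size fixes."""
    return [ordered[i:i + wave_size] for i in range(0, len(ordered), wave_size)]


# Hypothetical backlog for illustration only.
backlog = [
    Bottleneck("/blog", "uncompressed images", 1200, False),
    Bottleneck("/", "slow server response", 9000, True),
    Bottleneck("/pricing", "low-contrast buttons", 4000, False),
    Bottleneck("/contact", "render-blocking scripts", 800, True),
]

waves = into_waves(prioritize(backlog), wave_size=2)
for number, wave in enumerate(waves, start=1):
    print(f"Wave {number}: " + ", ".join(f"{b.page} ({b.issue})" for b in wave))
```

Passing foundational_first=True switches to the second strategy without changing how the waves are cut.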
3. COMMUNICATING WITH OTHER BUSINESS UNITS
It is key that throughout the website editing timeline, there are no confounding variables
that will hinder the measurement of the effectiveness of website changes. For example, if a
wave of changes is made, but then the marketing team rolls out a “buy one get one free”
discount on a webpage, then the data will show a significant skew for that one page.
Consequently, during the analysis phase, it will be very difficult to understand which positive
results were driven by the discount versus the website changes. Thus, it is critical that the
website development team work with other internal business units to ensure no significant
actions are taking place at the same time as the changes to the website. At larger companies,
this will be significantly harder to avoid, so I suggest at least being aware of the
marketing/auxiliary events and their impact in order to understand the possible skews in the
data.
4. ROLL OUT THE 1ST WAVE OF WEBSITE SOLUTIONS
Once the preceding steps have been concluded, your website is ready to see the first set of
changes. It is important that all members of the website team are aware of what changes need
to be made, and for what reasons. Once these changes have been implemented, check out our
next article: SEO – Analyzing Next Steps!
Step 3: Measurement & Reiterative Testing
Once you have identified site bottlenecks and completed an in-depth analysis of the solution
delivery, you are in a position to begin collecting data to see how the solutions are
impacting the site and, consequently, meeting business performance goals. Use the
following guide to get a better understanding of all the steps needed to effectively collect,
analyze and act on all the data you have.
There is a tendency for people to believe that “if the website looks better, then of course it
must be better”, which is a fallacy that is not based on quantitative evidence. It is necessary that
the effectiveness of site changes is measured against the ultimate goal: driving the
organization’s KPIs, which can only be seen by digging very deeply into the data. Additionally, when it comes to
understanding whether more changes need to be made, the data points collected from the site
will be a very strong indicator of whether a change yielded positive or negative results.
There are a number of different ways to extract data points for analysis. The simplest, and most
universal, interface for data is Google Analytics. It has the most comprehensive breakdown of
data that is collected, and provides a number of graphs, charts and visuals that make it very
easy to understand the dataset provided. Many website-building companies, such as GoDaddy
or WordPress, have their own simple data analytics portals. The caveat is that these data
analytics tools are very simple and do not have the versatility that Google Analytics carries.
Additionally, the intuition holds that a website optimized for Google should utilize Google Analytics.
2. ACHIEVING STATISTICAL SIGNIFICANCE
Statistical significance is an important concept to fully grasp prior to analyzing data from
changes in your website. According to Investopedia,
“statistical significance refers to the claim that a result from data generated by testing or
experimentation is not likely to occur randomly or by chance but is instead likely to be
attributable to a specific cause.”
Essentially, a website change is statistically significant if the positive/negative change in the KPI
can be strongly attributed to the changes that you made rather than to random chance or
uncontrollable events. For example, a statistically insignificant scenario would be one where a
webpage that highlights new innovations in genome editing sees a drastic increase in page visits
just as there is a very public death caused by a genome editing procedure. In this scenario, any
positive or negative outcome caused by the changes to the website is muddied by the
overwhelming influx of viewers caused by an external variable.
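The “random chance” half of this definition can be checked numerically. Below is a minimal Python sketch that compares a conversion rate before and after a website change using a two-proportion z-test. Note that this test is a standard statistical technique that I am bringing in for illustration, not something built into Google Analytics, and that the function name and all figures are hypothetical. It also cannot detect external events like the one in the genome editing example; those still have to be ruled out by hand.

```python
import math


def two_proportion_z_test(conv_before, n_before, conv_after, n_after):
    """Two-sided z-test for a difference between two conversion rates.

    conv_* are conversion counts and n_* are session counts for the
    periods before and after the change. Returns (z, p_value).
    """
    p_before = conv_before / n_before
    p_after = conv_after / n_after
    # Pooled rate under the assumption that the change had no effect.
    pooled = (conv_before + conv_after) / (n_before + n_after)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_before + 1 / n_after))
    if std_err == 0:
        raise ValueError("rates are all 0% or all 100%; the test is undefined")
    z = (p_after - p_before) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical high-traffic page: 200 of 5,000 sessions converted before
# the change, 260 of 5,000 after it.
z_big, p_big = two_proportion_z_test(200, 5000, 260, 5000)
print(f"high traffic: z = {z_big:.2f}, p = {p_big:.4f}")

# Hypothetical low-traffic page: 2 of 50 before, 3 of 50 after.
z_small, p_small = two_proportion_z_test(2, 50, 3, 50)
print(f"low traffic:  z = {z_small:.2f}, p = {p_small:.4f}")
```

A common convention is to treat a p-value below 0.05 as significant. With these made-up numbers the high-traffic page clears that bar while the low-traffic page does not, even though its conversion rate rose by a larger relative amount.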
There are 2 ways to help ensure that the results you find are statistically significant. The first
way, as mentioned in the Step 2: Solution Delivery article, is to work with other internal
business units to ensure there are no other major marketing/sales plans scheduled for the
same time as your website updates. The second way is to be critically cognizant of real-world
news from around your key target markets. If you, for the prior example, read in the news that
a genome editing procedure is happening for the first time, it may not be a good time to roll out
3. DATA COLLECTION
Let us look a little deeper into what data points are important to measure successful and
statistically significant results for website changes. For our analytical purposes, we will only be
looking at the Audience Tab of Google Analytics.
a. Overview Tab
The Overview Tab highlights all the data associated with the people that are interacting with
your webpage. I think it is important to set the time frames (top right corner) to a comparison
of time frames (after website changes vs. before website changes). On this page, you will find
the following pieces of information:
i. Users/New Users
The number of users (returning + new) that have visited any webpage on your website. This
metric is important for businesses to better understand if they have a significantly loyal
userbase, or if there is a significant amount of user turnover. Businesses that are B2B will tend
to have high returning customer values, while B2C websites will have a larger “new visitor”
ii. Sessions
This is the number of times someone has opened a browser and opened one or more
webpages on your website. This also means that if 1 person opens your website, leaves, and
then returns after the session has expired (by default, after 30 minutes of inactivity), that
counts as 2 different sessions.
iii. Number of Sessions per User
This is quite self-explanatory, as it is the average number of sessions per user. If the number
is close to 1 session per user, then either users are finding their information within their first
visit to your website and do not have any further need, or they did not find the necessary
information and exited the page. In either scenario, the website did not do anything to cause
the user to come back after their first use.
iv. Pageviews
This metric measures how many webpages were opened across all sessions and all users.
Generally speaking, this number does not carry much meaning on its own, but the next metric
is where more meaning is derived.
v. Pages / Session
Similar to number of sessions per user, this measures, on average, how many webpages are
opened per session. This metric is important for a number of reasons. If the value is close to
1, then this means people are either finding the information they need immediately and not
being persuaded to see more content on other webpages, or they are entering the first page
and realizing that this was not the content they were looking for, and are immediately
vi. Average Session Duration
Similar to Pages / Session, Average Session Duration is critical to understanding a user’s
sustained engagement with your site. Determining what is an acceptable amount of time on
your website is dependent on the type and length of content that you have listed. One
circumstance that is a key giveaway that users are immediately moving off the website without
engaging with your content is if the session duration value is under 20 seconds. Otherwise, the
most effective bootstrapped method to measure the general time to engage with your content
is to actually assume the role of a user and time how long it takes to engage with each
webpage. If it takes 10 minutes to read your website top-to-bottom, and you are seeing that
the average session duration is 5 minutes, then you can infer that generally people are only
seeing 50% of the content that you have made available.
vii. Bounce Rate
The bounce rate is the percentage of sessions in which the user viewed only a single page on
your website. The bounce rate is one of the most subjective statistics on Google Analytics,
because sometimes a high bounce rate is not necessarily a bad outcome. For websites that have
a home page that grants access to all the other content-intensive pages, having a high bounce
rate on that page means that people are leaving without ever accessing your content. But
websites that are designed to have a minimal variety of webpages and are tailored towards a
single scroll (e.g. blogs, teaser sites, etc.) will have a high bounce rate, and that is perfectly
normal.
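To make the relationships between the Overview metrics concrete, here is a minimal Python sketch that derives them from a list of raw session records. The Session structure and the sample data are hypothetical, and the arithmetic is a simplification for illustration; Google Analytics applies its own collection rules that are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Session:
    user_id: str              # hypothetical identifier for the visitor
    pageviews: int            # pages opened during this session
    duration_seconds: float   # time spent in this session


def audience_overview(sessions):
    """Compute the Overview metrics from a non-empty list of sessions."""
    if not sessions:
        raise ValueError("at least one session is required")
    users = {s.user_id for s in sessions}
    total_sessions = len(sessions)
    total_pageviews = sum(s.pageviews for s in sessions)
    # A bounce is a session in which only a single page was viewed.
    bounces = sum(1 for s in sessions if s.pageviews == 1)
    return {
        "users": len(users),
        "sessions": total_sessions,
        "sessions_per_user": total_sessions / len(users),
        "pageviews": total_pageviews,
        "pages_per_session": total_pageviews / total_sessions,
        "avg_session_duration": sum(s.duration_seconds for s in sessions) / total_sessions,
        "bounce_rate": bounces / total_sessions,
    }


# Hypothetical data: user "a" visits twice, users "b" and "c" once each.
sample = [
    Session("a", 1, 0),
    Session("a", 4, 300),
    Session("b", 1, 0),
    Session("c", 2, 60),
]

overview = audience_overview(sample)
for name, value in overview.items():
    print(f"{name}: {value}")
```

Running the same function on the sessions from before and after a website change gives the side-by-side comparison that the time frame selector provides in the Overview tab.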
b. Demographics Tab
There are 3 key breakdowns that offer great insights for marketing teams, but not much data
that supports website performance. The Demographics tab highlights the age ranges and
genders that are engaging with your site.
c. Interests Tab
Similar to the Demographics tab, the Interests tab is more insightful for marketing teams, as it
highlights the type of people and their interests that are engaging with the site. For example,
Google creates “affinity groups”, such as “Shopper/Value Shoppers”, based on their analysis of
a user’s data. In terms of web performance, this may be interesting to look into, as this might
provide a better understanding on how to develop the aesthetic and graphics to engage with
the correct segments.
d. Geo Tab
Similar to the last 2 tabs, the Geo tab has more data on users, specifically which language
they have set as their default and where in the world your users are searching from. In my
opinion, the Geo tab is more relevant than the Demographics & Interests tabs because language
and location are key to understanding how to optimize. For example, if a website is written
entirely in English, but a strong majority of readers have their default language set as Arabic,
then there is a critical need to adapt the language, and thus the format, of the website.
Additionally, if there is a significant part of the website’s traffic coming from audiences abroad,
like India, then more research needs to be done to ensure that the website is optimized for
e. Behavior Tab
This tab measures various metrics comparing new users versus returning users. The data
points in this tab may be important when considering which specific market segment is
engaging with the website. For example, if you own a website where returning customers are
key (e.g. an ecommerce platform for clothing), and the data shows that there are more “single
session” users than “multiple session” users, then changes must be made. Similarly, this tab
also gives insight into the average session duration in more detail, breaking it down into
segments. Thus, if you have measured that reading the entire website takes approx. 10 minutes,
and session durations are skewed towards the “less than 30 seconds” categories, then it is
clear that your users are not sticking around to consume the content you have made available.
f. Technology Tab
This tab is very important for website developers, as it gives insight into the various applications
that users are using to access your site. This tab is especially important for website developers
that are not using a 3rd party website maker, such as WordPress or Squarespace, as these
platforms tend to have some form of basic mobile site adaptive process. It is key that someone
with extensive experience designing and creating sites that are adaptive and dynamic to both
desktop and mobile platforms is analyzing this data.
Google Analytics provides an endless supply of data points, and every website, based on the
problems they identified and the solutions that have been rolled out, only requires certain data
points to understand the changes in website effectiveness.
With a better understanding of statistical significance and data collection methods, website
developers can connect with industry experts if more support is needed. Our team here at
SOMAmetrics is deeply engaging with clients that are looking for better and more effective
website performance, and we are proud to provide that with the highest level of customer
Step 4: Data Analysis
After Identifying Site Bottlenecks, Delivering Solutions, and understanding Measurement tools,
we are now in a position to begin the deeper data analysis and provide tangible meaning to
the data presented.
To understand whether a metric is under- or overperforming, benchmarks must first be set. A
benchmark is a carefully calculated value per metric that is used as a standard, or a “what we
achieve on a regular day without influence”. Benchmarks are key to understanding how
successful a solution delivery is, as they give us a standard to work with. Benchmarks also ensure
that we are comparing apples to apples when looking at data. For example, a deficient
benchmark would be to use toy sales in the last week before Christmas to compare against
full-year sales: the last week before Christmas has the highest volume of sales for toys, and to
compare the rest of the year to that benchmark would make it seem that the rest of the year
“does not perform to that same level”, which would be an unnecessarily negative conclusion.
There are a lot of ways to create benchmarks depending on what we are measuring. For our
purposes, we are looking for performance improvements for a site element, a webpage or the
entire site, measured through key business KPIs such as page views, clicks, average session
duration, conversions, etc. Consequently, we need to look over a period of time to create an
average or a median metric benchmark that will reflect a business’s prior success. For
businesses that have seen significant growth in a short period of time, it may be difficult to look
at, for example, a year-long web report and take a rough average. So, we will break down
benchmark creation based on two methods: long-run historical benchmarking, and forecasted
growth rate benchmarking.
1. Long-run Historical Benchmarking
Long-run historical benchmarking is a method of benchmarking that looks at a lot of past web
performance data to create benchmarks. This method tends to work with websites that have
existed with web data capabilities for a reasonable period of time, websites that have seen
steady (not growing or declining) web performance, or sites that do not directly drive the
majority of business growth/sales. To create an annual performance benchmark (as one of the
simplest benchmarks), a company can compare a number of years’ data and create an
average/median benchmark for web metrics. It is key for an annual benchmark to be compared
against other years to account for any external variables that affected web performance. For
example, if in one year a certain month received an unusual/unplanned amount of page views
or conversions, then looking at longer time frames ensures that such discrepancies do not skew
the benchmark too heavily. Long-run historical benchmarking is also very useful for creating
month-specific benchmarks. For example, if you wish to see what your benchmark page
views are for the month of December, then, using the long-run historical benchmarking
method, you will compare December 2019’s total page views with December 2018’s,
December 2017’s, and so on for as many years of data as are available. This is the easiest
method to create benchmarks for web performance because it does not require intensive
calculations or models, and these benchmarks can easily be created even just by looking at
graphs.
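As a small illustration of the December example, the following Python sketch computes both an average and a median benchmark from several years of page views. The figures are hypothetical, with 2019 deliberately set as an unusual year, to show why the choice between average and median matters.

```python
from statistics import mean, median

# Hypothetical December page views; 2019 includes an unplanned spike.
december_pageviews = {2017: 41000, 2018: 43500, 2019: 88000}


def historical_benchmark(values_by_year):
    """Return the average and median of the same metric across years."""
    values = list(values_by_year.values())
    return {"mean": mean(values), "median": median(values)}


benchmark = historical_benchmark(december_pageviews)
print(f"average benchmark: {benchmark['mean']}")
print(f"median benchmark:  {benchmark['median']}")
```

With these made-up figures the single unusual year pulls the average well above the other two Decembers, while the median stays close to them. More years of data reduce the influence of any one outlier on either measure.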
But there are some caveats to this technique. First, the amount of usable data to create a
benchmark is highly constrained by the growth of the business’ website. If a business is focused
on e-commerce, or the website is a major channel of business growth, then the business will
make constant efforts to increase web performance through marketing. Thus, creating a flat
benchmark using yearly data will not be accurate. The first main constraint, then, is the type of
product/service that the business provides. This benchmarking model works best for B2B
businesses, landing pages exclusively designed to redirect customers to physical POS locations, etc.
The second caveat for this model is that if the first round of solution deliveries produces a
positive change in web performance, then the first version of the benchmark will no longer be
relevant, and another benchmark will need to be developed. But because the “new
performance” values are recent, there may not be enough data to measure across an entire
year, or possibly even a month. Thus, if there is a significant change in performance after the
creation of the first benchmark, new data will need to be collected and analyzed to create a
new benchmark. The benefit of this model is that it is a very easy and non-technical process to
create new benchmarks.
2. Forecasted Growth Rate Benchmarking
Forecasted growth rate benchmarking is another method to create benchmarks for web
performance. It is vastly different from long-run historical benchmarking in that this model does
not require very much prior data, is a highly technical and quantitative calculation process,
and is suited to a very different set of sites. For example, companies with constant site
performance growth, B2C businesses and sites that drive the majority of business growth/sales
will benefit more from using this model. Forecasted growth rate benchmarking uses external
business KPIs (e.g. sales volume, revenue, engaged users, etc.) to create a forecasted growth rate
for future site KPIs, which then acts as a benchmark.
The reasoning behind using external business KPIs as a rough proxy for website performance
depends on the type of business (as stipulated above): if an e-commerce site is seeing an
increase in volume being sold, then web performance will be higher as well. Which specific KPI
shows a high correlation with changes in website performance is a matter of business
intelligence analysis, and businesses will need to invest their own time into finding the most
relevant KPI.
Once a KPI is selected, we need to take a look at as much past data as exists for this
KPI. For our study, we will look at revenue as the business KPI. Follow the steps below to
develop a model and to calculate a KPI growth rate:
1. Acquire as much historical data as possible for the selected KPI
2. Draft a table in Microsoft Excel with all the organized datapoints
3. Use the “Create graph” function to create a scatter plot with the KPI on the y-axis
4. Insert a trendline with the following conditions:
a. The trendline cannot be a “logarithmic” or a “moving average” function
b. The trendline cannot be a polynomial function with an order greater than 2
c. The R² must be greater than 0.95
d. The trendline equation must be viewable
Please note that although this process permits the use of exponential, polynomial, and a
number of other different types of functions, I highly recommend only using a linear or a
quadratic function, because the calculations will be significantly simpler and business KPI
graphs tend not to look like very complicated functions.
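For readers who prefer to work outside of Excel, the trendline steps above can be approximated with a short Python sketch. It fits only a linear trendline using ordinary least squares, reports the R² so that the 0.95 threshold can be checked, and returns the slope, which for a linear trendline is the KPI growth rate. The function name and the revenue figures are hypothetical, and this is a sketch rather than a replacement for Excel’s trendline options.

```python
def fit_linear_trendline(times, values):
    """Least-squares fit of values = slope * time + intercept.

    Returns (slope, intercept, r_squared). Requires at least two
    distinct time points and values that are not all identical.
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    s_tt = sum((t - mean_t) ** 2 for t in times)
    s_tv = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = s_tv / s_tt
    intercept = mean_v - slope * mean_t
    # R² = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((v - (slope * t + intercept)) ** 2 for t, v in zip(times, values))
    ss_tot = sum((v - mean_v) ** 2 for v in values)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical monthly revenue that roughly follows Revenue = 2*(Time) + 100.
months = [1, 2, 3, 4, 5, 6]
revenue = [102.4, 103.7, 106.2, 107.9, 110.3, 111.8]

slope, intercept, r_squared = fit_linear_trendline(months, revenue)
print(f"Revenue = {slope:.2f}*(Time) + {intercept:.2f}")
print(f"R^2 = {r_squared:.4f}, passes 0.95 threshold: {r_squared > 0.95}")
print(f"Revenue growth rate per month: {slope:.2f}")
```

A quadratic trendline would need a polynomial fit instead, and its growth rate would depend on time rather than being a single number.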
Now, with a trendline that closely mirrors the empirical data, we can use the trendline
equation to compute a KPI growth rate. By taking the function’s derivative with respect to
time, we can find the KPI growth rate. For example, if the function for my revenue model is:
Revenue = 2*(Time) + 100, then: Revenue growth rate = 2. For functions that have an order
greater than one (e.g. a quadratic), you will notice that the independent variable (i.e. time)
will still exist in the KPI growth rate, and that is
With this KPI growth rate, we have developed a benchmark that your solution deliveries should
outperform. There are some major caveats to this model. First, assuming that a business KPI
and web performance are correlated is a very large assumption, and there is very little research
to prove this assumption can be made soundly. Second, this model only works for companies
that are B2C and depend on online sales to drive growth.
Now that we have understood the two foremost methods of creating benchmarks, we can
try to draw meaning from the solution delivery data. If we can see that, after the solutions
were implemented, there was a clear positive change in performance, then it can be concluded
that the solution delivery works. But in many cases, there are 2 drawbacks to trying to analyze
website performance data.
First, it is very rare to see a clear change in site performance simply by optimizing a website’s
front and back end. For a user to click into a website, they must intrinsically be interested in
what the website has to offer. So, even with a highly optimized site, a developer cannot
fully control the amount of incoming traffic from search engines like Google.
Second, it is very difficult to see clear changes in performance if the website has very low page
views in the first place. For example, if a webpage has a benchmark of 5 views a day, and 2 days
after the introduction of a few website solutions there are 15 page views, that is 300% of the
benchmark. Arguably, that is a very significant increase in performance. But, with such low page
views, 15 views could easily have been an anomaly or an error. On the other hand, if the
benchmark was 5,000 page views a week, and the next week there were 15,000 page views,
then clearly something is working well.
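One simple way to guard against reading too much into a single good day is to require that the improvement holds up over a longer window. The Python sketch below illustrates that idea. The 14-day minimum and the 80% share of days above the benchmark are hypothetical thresholds chosen purely for illustration, not established standards, and should be adapted to your own traffic levels.

```python
def is_sustained_improvement(daily_views, benchmark, min_days=14, min_share=0.8):
    """Check whether daily page views beat the benchmark consistently.

    Returns False when fewer than min_days of data are available, since
    a short window cannot distinguish a real change from an anomaly.
    Otherwise returns True when at least min_share of the observed days
    are above the benchmark.
    """
    if len(daily_views) < min_days:
        return False
    days_above = sum(1 for views in daily_views if views > benchmark)
    return days_above / len(daily_views) >= min_share


# Hypothetical page with a benchmark of 5 views a day.
two_days_after = [6, 15]                  # one spike, far too little data
consistent_lift = [15] * 12 + [4, 5]      # 12 of 14 days above benchmark
one_off_spike = [15] + [5] * 13           # a single good day in two weeks

print(is_sustained_improvement(two_days_after, benchmark=5))
print(is_sustained_improvement(consistent_lift, benchmark=5))
print(is_sustained_improvement(one_off_spike, benchmark=5))
```

Only the second series passes: the first has too few days to judge, and the third shows a spike that was not sustained.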
The key to understanding the data is to ensure that any changes in performance are sustained
and consistent over a reasonable period of time. If you have any questions on any of the
concepts or ideas referred to in these articles, please feel free to reach out to Siddharth Gupta at