
eCommerce Metrics eBook

How to prioritize metrics
as an e-commerce CTO
DANNY MILES CTO, Dollar Shave Club
ABOUT THE AUTHOR
Danny Miles joined Dollar Shave Club as the company’s first Chief
Technology Officer in 2017. Previously, Danny served as the Vice President
of Direct-To-Consumer Technology at Nike, and has more than twenty years
of experience in architecting, implementing, and launching leading-edge technical solutions.
As CTO, I am accountable for the reliability and performance
of Dollar Shave Club’s site. When I get to the office in the morning, I open up several tabs that display my metric dashboards.
This morning ritual usually takes me about 10
minutes. I do this not to find problems but to
get situational awareness to start my day. Site
reliability and performance is just one of my responsibilities, but it’s a foundational area that the
whole business depends on. The graphic below
illustrates why site reliability and performance is
so important.
HOW TO PRIORITIZE METRICS AS AN E-COMMERCE CTO
1. SITE RELIABILITY AND PERFORMANCE
2. ON-SITE USER BEHAVIOR METRICS (AKA WEB ANALYTICS)
3. UPPER MARKETING FUNNEL METRICS
4. BUSINESS PERFORMANCE METRICS
If the site isn’t doing well, users won’t be moving
through the on-site sales funnel and converting
(impacting the metrics in ring 2). Then, big drivers of traffic to our site like Google and Facebook
will see that we aren’t converting, because we
report that information back to them (impacting
the metrics in ring 3). In addition, any ad spend
will have been wasted. Then, traffic will dry up
and business metrics (ring 4) will take a big hit.
That’s why site reliability and performance is such
a critical area to the business: any issues get significantly amplified beyond just the loss of a sale/
conversion.
Incidentally, I’m also accountable for the metrics
in the order shown in the graphic. Therefore, I
check my metrics in this order during my morning
routine. Sometimes I’ll see anomalies and trends
that might not get picked up by alerts. If I do see
something that doesn’t look right or something I
don’t understand, I’ll ping a member of the team.
But, generally, first thing in the morning is not the
time to be discovering performance issues: this is
a level-setting exercise to begin my day.
In the sections below, I dive into each of my metric areas and describe the specific metrics I look
at in each ring during my morning routine.
Top site reliability and performance metrics
I could get quickly bogged down in system performance data but, as CTO, that’s not my best
use of time. Because I sit between the engineering team and other business functions, I need
metrics that tell me whether we’re meeting our
obligations to the business and hitting our goals.
Over the years, I’ve zeroed in on some key metrics
in the site reliability and performance space that
matter to me most. Below, I describe the metrics I
check regularly.
01 APPLICATION-LEVEL METRICS
I want to see some key application-level metrics. It’s important to go through the basics of the e-commerce application and make sure it’s serving users (and users are behaving on it) as expected.
Active sessions
We care a lot about this metric. This is the stateful
part of the site where people are actually logged
in and creating carts. If active sessions drop off,
you know something is wrong.
Carts created
If you’re not seeing people create carts or orders
for a period of time, that’s a signal that something’s amiss.
Orders processed
This tells me we’re able to make money and also
tells me at what rate we’re making money.
I consider security to be a part of site reliability and performance engineering. While the following metrics
provide indications of site functionality, I also use them to assess security issues.
Logins per day
Both failed and successful. This number should be
fairly predictable. Unexpected changes may indicate a security threat or breach.
Payment declines
If I get a spike in the number of payment declines, it
could be an outage in our payment processing service, a spike in fraud activity, or a code or API break.
Changes to account
This is a big one from a fraud perspective. Account
takeover threats have become fairly common in the
direct-to-consumer e-commerce space. Members
don't frequently change primary account fields like
their email, password, address, or payment information. This is especially the case on a subscription
site where members configure most of their settings
at sign-up. If account changes spike above normal,
you may have a malicious actor on your hands.
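One way to make "spike above normal" concrete is to compare today's count against a recent baseline. The sketch below is an illustration only: the function name `is_spike`, the three-standard-deviation threshold, and the sample counts are assumptions, not Dollar Shave Club's actual alerting logic.

```python
from statistics import mean, stdev

def is_spike(history, today, threshold=3.0):
    """Return True if today's count sits more than `threshold`
    standard deviations above the mean of the historical counts."""
    if len(history) < 2:
        return False  # not enough data to estimate a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Perfectly flat history: any increase counts as a spike.
        return today > baseline
    return (today - baseline) / spread > threshold

# Made-up daily counts of account changes for the past week.
daily_account_changes = [120, 131, 118, 125, 122, 129, 124]

print(is_spike(daily_account_changes, 127))  # close to the baseline
print(is_spike(daily_account_changes, 400))  # far above the baseline
```

The same check could be applied to logins per day or payment declines, since each of those is also described as a number that should be fairly predictable.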
While these metrics are important to the business they’re also important to the tech team because they
can signal technical issues. Next, I’ll go a level deeper into system-level metrics.
02 SYSTEM-LEVEL METRICS
Uptime
We have a “Sunglasses Man” who pops up every
once in a while telling customers the site’s not
available. Our goal is to never see Sunglasses
Man. We should be getting alerts about problems
way before Sunglasses Man needs to show up.
My view on site downtime is a bit different than
what I typically hear, but I think it reflects the
reality of our digital business. For me, downtime
isn’t as much about losing new orders and conversions, although it does certainly affect those
metrics. In my experience, the biggest impact of
downtime on an e-commerce site is the lost marketing dollars and wasted effort driving traffic to
the site. Marketing campaigns drive our business.
The worst thing that can happen is an outage
during a big campaign we’re running around
March Madness or, when I was at Nike, during the
Olympics. If our site isn’t 100 percent available and performant
during those times, the impact can
ripple through partners, search engines, and ad
servers. Not only have you wasted your ad spend,
but you are also projecting a poor image of your brand
while your digital presence is down.
The impact of downtime through the big ad platforms that drive traffic to our site, like Facebook
and Google, lasts beyond just the time you are
offline. We report back in real time to Facebook,
and their algorithms get tuned based on whether
people are landing on our site and completing our
checkout. If we have an outage on our site or if we
stop reporting back to our ad partners, their algorithms may stop serving impressions and it will
take days to re-tune and scale the traffic back up.
Outside of our programmatic buying platforms,
the lead times for TV and radio spots are not
something you can halt during an outage so you
are literally throwing money away at that point.
So, downtime has a lot of ripple effects. It’s not
just losing new customers and orders—it’s all the
investment you’re putting into marketing, search,
media, and social.
Requests & Traffic
I want to understand the load on the infrastructure and whether it matches up with expectations. I discussed several traffic metrics above. There are a few more technical traffic metrics that I look at:
Rate limits
I want to see if we’re triggering any of our rate
limits on key APIs. For example, if a login endpoint
is receiving more than 1,000 requests per second from a specific network or client, that’s potentially a sign of
malicious activity or an application error.
Unique users on the site
How many unique users are on the site and does
it match up with expectations? I have week-over-week and year-over-year comparisons for this
metric, as well as a predicted range.
Requests served from edge
versus origin (cache hits
versus cache misses)
I look a lot at traffic served at the edge, via our
CDN, versus our core infrastructure. I
want to make sure requests are hitting the
cache. If too many requests are coming
all the way to your origin web server for
images and other content, then something may
not be right and your end users could be experiencing slow performance. A lot of activity should be
caught at the edge via CDN, which helps if you’re
experiencing a DDoS attack.
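Two of the checks in this section, rate limits and edge versus origin traffic, reduce to simple arithmetic over request counts. The following is a minimal sketch under stated assumptions: the log format (one `(client_id, unix_second)` tuple per request), the function names, and the sample numbers are hypothetical, and the 1,000 requests per second figure simply mirrors the login example above.

```python
from collections import Counter

def clients_over_limit(request_log, limit_per_second=1000):
    """request_log: iterable of (client_id, unix_second), one entry per request.
    Returns the sorted client ids that exceeded the limit in any single second."""
    per_client_second = Counter(request_log)
    offenders = {
        client
        for (client, _second), count in per_client_second.items()
        if count > limit_per_second
    }
    return sorted(offenders)

def cache_hit_ratio(edge_hits, origin_requests):
    """Share of requests answered at the edge (CDN) rather than the origin."""
    total = edge_hits + origin_requests
    if total == 0:
        return None  # no traffic, so the ratio is undefined
    return edge_hits / total

# Made-up traffic: client "a" hammers the login endpoint within one second.
log = [("a", 1)] * 1001 + [("b", 1)] * 10 + [("c", 1)] * 600 + [("c", 2)] * 600
print(clients_over_limit(log))
print(cache_hit_ratio(edge_hits=950, origin_requests=50))
```

Client "c" sends more requests in total than "a" but never exceeds the limit within a single second, which is why the count is keyed on both client and second.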
Latency
The site needs to be fast, otherwise users will become frustrated and leave the site before they
complete their purchases. I primarily focus on two performance metrics:
Response times
For any API or any endpoint, I want to know what
the response times are. Response times can directly correlate to your conversion rates. In my experience, when response times start to deteriorate,
they don't usually improve without intervention.
Before you get to a point where your website’s
not up, you’re usually going to see a slowdown
somewhere. It’s very rare that the site just goes
‘click’ and it’s off. More often than not, things start
to slow down and time out. And then it ultimately
results in not being able to load the application.
Page load time
I focus a lot on the time it takes for pages to
actually be useful to the user, which is the time it
takes to load and render a piece of content and
for the page to be interactive. Page load time is a
measure of speed from the user’s vantage point,
and this information is captured by monitoring
the client side. The time it takes to transact or
get through an API is more likely to degrade after
we’ve pushed changes into production. This
would in turn have business impact. I recommend
having an “always on” offense when it comes to
speed, as the continuous deployment of new code,
content, pixels, and tracking can quickly shift
your site from a 2-3 second load time to a 7-8
second load time where you’ll see your conversion
rate go off a cliff.
I like to further break down page load time by device and by browser to ensure that users are having
similar experiences. I’ve definitely experienced situations where certain device types and browsers have
prolonged load times and we’ve had to engineer device-specific fixes, especially now that most brands'
traffic is coming from mobile devices that are less performant than desktops and tablets.
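Breaking page load time down by device or browser can be sketched as a grouped percentile calculation. This is an assumption-laden illustration: the sample format `(segment, load_seconds)`, the function name, the choice of the 75th percentile, and the numbers are all hypothetical.

```python
from collections import defaultdict
from statistics import quantiles

def p75_by_segment(samples):
    """samples: iterable of (segment, load_seconds) pairs, where segment is
    a device type or browser name. Returns {segment: 75th percentile}."""
    grouped = defaultdict(list)
    for segment, seconds in samples:
        grouped[segment].append(seconds)
    result = {}
    for segment, values in grouped.items():
        if len(values) < 2:
            result[segment] = values[0]  # a single sample is its own percentile
        else:
            # quantiles(..., n=4) returns the three quartile cut points.
            result[segment] = quantiles(values, n=4)[2]
    return result

# Made-up client-side measurements in seconds.
samples = [
    ("desktop", 1.8), ("desktop", 2.1), ("desktop", 2.4), ("desktop", 2.0),
    ("mobile", 3.9), ("mobile", 6.5), ("mobile", 7.2), ("mobile", 4.4),
]
for segment, p75 in sorted(p75_by_segment(samples).items()):
    print(segment, round(p75, 2))
```

A percentile is used here instead of an average so that a handful of very slow sessions on one device type is not hidden by many fast ones.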
Errors
Errors will always occur on the site, but I really
need to be aware of any systemic flaws that are
blocking users from viewing content, using features, and making purchases. Are we getting malformed requests or behaviors in the request that
don’t make sense? I check 8-10 error rates which
capture key user requests and Tier 1 services.
Then I check error message patterns to
see any underlying issues.
I just described the site reliability metrics I pay
the most attention to as CTO. Next, I describe
some of the “outer ring” metrics I pay the most
attention to.
Metrics the engineering team
supports but does not own
While platform reliability and performance is
my team’s responsibility, there are several other
metric areas where my team has some degree of
shared accountability with other teams. I check
these metrics next. This will vary slightly from
one e-commerce firm to another and from one
CTO to another. Here’s a peek at what matters to
me at Dollar Shave Club.
CTO + Digital Product: On-site user behavior metrics
(Web Analytics)
As Dollar Shave Club’s CTO, I manage our software development lifecycle. I spend a lot of time
dealing with outstanding tickets, defects, and
building out new features and functionality.
However, the Digital Team, headed up by a peer
of mine, establishes the roadmap and vision for
the content and functionality of the site. My team
implements that vision. So, the look, feel, and
functionality of the site is a partnership between
my team and the Digital Team.
Our collective goal is to move users seamlessly
through the site towards a purchase. This is
all about pulling the right levers through tests,
design, content, and features to increase AOV
[average order value] and conversion rates. Many
people associate this layer of monitoring with
web analytics. Here are some of the metrics that
matter to me in this layer. If any of these metrics
deviate from expectations there may be a technical issue (but not necessarily).
Time on site
This is active browsing time on the site.
Bounce rate
This is the percentage of visitors who land on a
page and leave the site without proceeding to another page on the site.
Number of pages visited
This is the number of pages visited per session.
Product/content views
This tracks what users are looking at and how often.
Cart adds/removals
This tracks products added and removed from
carts across experiences.
Time to conversion
This is the amount of time it took for the user to
make a purchase after landing on the site.
Abandonment rate
This is the percentage of site visitors who left the
site without making a purchase. Also important to
ask is: Are people abandoning the site in particular
places? This will help us reduce leakage and keep
users moving towards a purchase.
A/B Testing Groups
We are typically running multiple feature and content experiments across the site. Comparing the
metrics above by testing group is key.
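As a rough illustration of comparing these metrics by testing group, the sketch below computes bounce rate and abandonment rate per group using the definitions given above. The session record fields (`group`, `pages_viewed`, `purchased`) and the sample data are assumptions made for the example.

```python
def rates_by_group(sessions):
    """sessions: iterable of dicts with keys 'group', 'pages_viewed', 'purchased'.
    Returns {group: {'bounce_rate': ..., 'abandonment_rate': ...}}."""
    totals = {}
    for s in sessions:
        t = totals.setdefault(s["group"], {"sessions": 0, "bounces": 0, "abandoned": 0})
        t["sessions"] += 1
        if s["pages_viewed"] <= 1:
            t["bounces"] += 1      # landed on a page and left without visiting another
        if not s["purchased"]:
            t["abandoned"] += 1    # left the site without making a purchase
    return {
        group: {
            "bounce_rate": t["bounces"] / t["sessions"],
            "abandonment_rate": t["abandoned"] / t["sessions"],
        }
        for group, t in totals.items()
    }

# Made-up sessions for two experiment groups.
sessions = [
    {"group": "A", "pages_viewed": 1, "purchased": False},
    {"group": "A", "pages_viewed": 4, "purchased": False},
    {"group": "A", "pages_viewed": 3, "purchased": False},
    {"group": "A", "pages_viewed": 5, "purchased": True},
    {"group": "B", "pages_viewed": 1, "purchased": False},
    {"group": "B", "pages_viewed": 6, "purchased": True},
]
print(rates_by_group(sessions))
```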
All of these metrics are a level up from the purely
technical because they involve UX, content, and
design. We’re constantly adding new site instrumentation to capture user behavior data. Every
time the company comes up with a new theme or
campaign there’s work to do. Also, new features
that my team is responsible for developing constantly impact these metrics.
CTO + Marketing Partnership: Upper marketing funnel metrics
The upper marketing funnel takes place largely
off the site, and is primarily owned by the
Marketing Team. However, the Technology Team
supports the Marketing Team in two main ways:
setting up robust instrumentation to capture data,
and supporting marketing analytics.
SETTING UP INSTRUMENTATION AND MAINTAINING DATA PIPELINES
As I explained above, it’s critical to ensure our
data pipelines back to referrers like Google,
Facebook, and other large drivers of site traffic
are operational. Any number of things can disrupt
reporting back to ad platforms. The two most
common issues are:
Misfiring pixels
The JavaScript we load in the browser from a site
like Facebook isn’t working or running correctly.
Misconfigured pages
Deployed pages, for example new landing pages,
on our site aren’t reporting accurately.
The way to find problems with reporting back to
referrers is to set up robust alerts in your web
analytics platform. For example, if I’m serving ad
impressions, but not converting at normal or expected rates, there’s likely something wrong with
the site or the instrumentation. I would immediately want to stop spending money on ads until
the problem is resolved. As a last resort, Google
and Facebook will stop serving ads and driving
traffic to the site, but I’d prefer to know well before that happens.
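The alert described here, impressions being served while conversions fall below the expected rate, could be expressed roughly as follows. The function name, the 50 percent tolerance, and the minimum-traffic guard are assumptions for illustration; a real web analytics platform would have its own alerting configuration.

```python
def conversion_alert(impressions, conversions, expected_rate,
                     tolerance=0.5, min_impressions=1000):
    """Return True when the observed conversion rate falls below
    `tolerance` times the expected rate, given enough traffic to judge."""
    if impressions < min_impressions:
        return False  # too little traffic for a meaningful rate
    observed = conversions / impressions
    return observed < expected_rate * tolerance

# Made-up numbers: the expected rate is 0.2 percent of impressions.
print(conversion_alert(10_000, 5, expected_rate=0.002))   # well under expectations
print(conversion_alert(10_000, 19, expected_rate=0.002))  # close to expectations
```

A firing alert does not say whether the site or the instrumentation is at fault, which matches the ambiguity described above; it only signals that ad spend may need to be paused while someone investigates.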
ANALYTICS: MULTI-TOUCH ATTRIBUTION
Assuming everything is reporting back as designed, we still have a lot of work to do to help
optimize our ad spend through analytics. One of
the tough things to capture and the most elusive
in the e-commerce space is “multi-touch attribution” (MTA). This is about understanding what it
costs to get someone to our site. It’s the holy grail
for marketing when we can truly understand what
impressions visitors saw on Facebook, organic
search, and paid search and how they got to the
site. Did this person see a billboard, TV commercial, or Facebook ad? Where did they come
from and how much did we spend to get them to
our site? This is a complicated instrumentation
and analytics exercise. It is the responsibility
of the digital and marketing teams to drive this
work, but my team supports them technically to
stitch all the vendor data together with our onsite
analytics.
Referrer tracking
I use this to closely monitor where traffic is coming from, such as paid ads, organic (direct) search,
or our social and email campaigns.
CTO + Operations Partnership: Post-funnel metrics
My job isn’t over when the customer converts on
the site. I also need to make sure we’re performing after the order is taken and doing a proper
handoff to fulfillment. That means making sure
the orders are processed and flowed to the
warehouse and that credit cards were billed
properly and on time. For a subscription business,
we process orders for customers automatically
on a recurring basis through large batch jobs that
happen every night. Here are the primary metrics
I track in this area:
Orders sent to warehouse
(expected and actual)
I pay close attention to the number of orders I expect to send to the warehouse, and the number actually received by the warehouse. If we miss a day,
that’s bad because people are waiting idle at the
warehouse one day and have a huge backlog the
next day (sometimes requiring extra resources to
get everything shipped out).
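Comparing expected and actual orders is essentially a set reconciliation. The sketch below assumes each order has a unique id and that both the nightly batch job and the warehouse can report the ids they handled; the function name and sample ids are hypothetical.

```python
def reconcile(expected_ids, received_ids):
    """Compare the order ids we expected to send with the ids the
    warehouse reports having received."""
    expected, received = set(expected_ids), set(received_ids)
    return {
        "expected": len(expected),
        "received": len(received),
        "missing": sorted(expected - received),     # sent but never received
        "unexpected": sorted(received - expected),  # received but not expected
    }

# Made-up batch: one order never reached the warehouse.
report = reconcile(
    expected_ids=["o-1001", "o-1002", "o-1003"],
    received_ids=["o-1001", "o-1003"],
)
print(report)
```

The same comparison applies to payments, where the "missing" list would be the payments that failed to process and need to be investigated.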
Payments processed
(expected and actual)
Because we are a subscription business, there are
some unique things I pay attention to. The worst
day of the year for us is February 28 because
we need to process payments and orders for all
subscriptions for the 28th, 29th, 30th, and 31st.
As a subscription business, people expect to be
charged and receive email confirmations consistently each month. If we start moving around the
day, that creates a poor customer experience. I
pay close attention to the number of payments I
expect to process, the number actually processed,
and the number that failed to process. Then we
need to dig into the number that failed to process
and find out why.
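The February 28 effect follows from how monthly billing days map onto short months. The sketch below assumes a simple rule, that a subscription anchored on a day the month does not have is billed on the last day of that month; this rule and the function names are assumptions for illustration, not a description of Dollar Shave Club's billing system.

```python
import calendar
from datetime import date

def billing_date(anchor_day, year, month):
    """Billing date for a subscription anchored on `anchor_day` (1-31),
    clamped to the last day of the month when the month is shorter."""
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(anchor_day, last_day))

def anchor_days_billed_on(day):
    """All anchor days whose billing lands on the given calendar date."""
    return [d for d in range(1, 32)
            if billing_date(d, day.year, day.month) == day]

print(anchor_days_billed_on(date(2023, 2, 28)))  # [28, 29, 30, 31]
print(anchor_days_billed_on(date(2023, 3, 15)))  # [15]
```

Under this rule, the last day of a non-leap February carries four days' worth of subscriptions, which is consistent with the load described above.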
Similarly, holidays also add complexity to our
operations. Any issues with order or payment processing have ripple effects for fulfillment, operations, and customer service.
Business performance metrics where technology has an impact
The technology team plays a key role supporting
business performance metrics, but so does virtually every other team. These metrics are lagging
indicators of technology issues: I do not want
to see technology issues reflected in business
metrics. Technology issues should be detected
well before they materially affect the business. If
we do our part on the areas I described above, the
business performance metrics will follow. Here
are the top business metrics the tech team ultimately supports:
Average order value
(AOV)
The average dollar value for orders in a given time
period. Numerous technology issues can affect
AOV, from site availability to site speed to errors
to payment processing.
Order rate/Ship rate
(orders per hour/day)
AOV tells us that we’re making money. Orders per
hour/day tells us if we’re making money at the
rate we should be. Again, numerous technology
problems can slow down our order rate.
Conversion rate
The percentage of users who complete a certain
action (usually a purchase). In my experience,
conversion rate is tricky to talk about because it
depends on the team and the context. A marketer
might care about the conversion rate of a single
email. Others might care about the conversion
rate of all site visitors (and even then there are
different methods: some people include bounces,
others do not). The key is to get on the same page
quickly about what version of conversion rate
you’re talking about and what assumptions you’re
making.
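A small calculation shows how much the choice of definition matters. In this sketch the function name, its parameters, and the traffic numbers are made up for illustration; the only point is that including or excluding bounces changes the denominator.

```python
def conversion_rate(purchases, sessions, bounces=0, exclude_bounces=False):
    """Purchases divided by sessions, optionally leaving bounced
    sessions out of the denominator."""
    denominator = sessions - bounces if exclude_bounces else sessions
    if denominator <= 0:
        return None  # no qualifying sessions, so the rate is undefined
    return purchases / denominator

# Made-up day of traffic: 1,000 sessions, 400 bounces, 30 purchases.
with_bounces = conversion_rate(30, 1000, bounces=400)
without_bounces = conversion_rate(30, 1000, bounces=400, exclude_bounces=True)
print(f"{with_bounces:.1%} including bounces")
print(f"{without_bounces:.1%} excluding bounces")
```

With these numbers the same day of traffic reads as 3.0% or 5.0% depending on the definition, which is why agreeing on the version up front matters.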
Churn/cancellation rate
We closely monitor the rate of customer cancellations and try to correlate cancellations to root
causes. Technology issues leading to poor customer experiences could very well be a cause of
churn if your recurring processes and systems are
not reliable and predictable.
Return rate
We closely monitor the order return rate and
try to correlate returns to root causes.
Technology issues can sometimes interfere with
getting an order right.
Technology problems create friction that will drive these business metrics in the wrong direction. This is
why I’m obsessive about my platform reliability and performance metrics.
Conclusion
CTOs from e-commerce and retail firms swim
in a sea of business and system health data.
These leaders are ultimately responsible for very
granular technical performance metrics (like
disk space), but they also play a vital role supporting key business performance metrics (like
conversion rate). That’s why I wrote this article:
to offer what I think is an efficient framework for
thinking about and prioritizing metrics, so that
e-commerce CTOs can check what matters in 10
minutes, and then go about their days.
Datadog empowers organizations
to easily monitor and secure their
cloud-scale infrastructure and
applications. Our SaaS-based
observability platform provides
real-time visibility into high-scale,
dynamic IT environments. With 600+
out-of-the-box integrations, Dev,
Sec, and Ops teams can simplify and
automate their cloud operations,
securely deliver software, and ensure
exceptional digital experiences.
Join the thousands of companies
worldwide who trust Datadog to
accelerate their go-to-market efforts
and stay ahead of the competition.
START YOUR FREE TRIAL TODAY