How Historical Data Should Drive Capacity Planning
Here’s a question: How did you decide how many session hosts your Azure Virtual Desktop (AVD) environment needs?
If you’re like most organisations, the answer involves some combination of:
- Vendor recommendations — “Microsoft suggests X users per host”
- Initial estimates — “We have 500 users, so let’s start with 50 hosts”
- Buffer for safety — “Better add 20% just in case”
- Trial and error — “Users complained, so we added more”
This approach isn’t wrong, exactly. It gets you to a working deployment. But it almost certainly leaves you overpaying for capacity you don’t need.
There’s a better way: let your historical data tell you what you actually need.
The Problem with Estimation
Estimation-based capacity planning has a fundamental flaw: it’s based on assumptions, not reality.
Common assumptions that lead to overspending:
- All users work the same hours (they don’t)
- Peak usage is consistent day-to-day (it isn’t)
- Everyone needs resources simultaneously (they don’t)
- Usage patterns are static over time (they change)
When you provision based on assumptions, you inevitably provision for worst-case scenarios. And worst-case scenarios rarely happen.
The result: Session hosts sitting idle, consuming budget while delivering no value.
What Historical Data Reveals
Once you start collecting and analysing actual usage data, patterns emerge that assumptions could never predict.
Daily Patterns
Real usage data typically shows:
- 6am - 8am: 5-10% of peak capacity
- 8am - 9am: Rapid ramp to 60-70%
- 9am - 11am: Peak usage (100%)
- 11am - 1pm: Slight dip (80-90%), lunch breaks
- 1pm - 3pm: Return to peak (95-100%)
- 3pm - 5pm: Gradual decline (70-50%)
- 5pm - 8pm: Tail off (30-10%)
- 8pm - 6am: Minimal usage (5-10%)
If you’re provisioning for 100% capacity 24/7, you’re paying around the clock for resources that run at or near full capacity for perhaps five or six hours a day.
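You can surface this profile in your own environment with a few lines of pandas. A minimal sketch, assuming you’ve exported sampled session counts to a CSV with timestamp and active_sessions columns (both hypothetical names; adapt them to however your metrics land):

```python
# Minimal sketch: hourly utilisation profile from sampled session counts.
# 'session_samples.csv' with 'timestamp' and 'active_sessions' columns is
# a hypothetical export; rename to match your own data.
import pandas as pd

df = pd.read_csv("session_samples.csv", parse_dates=["timestamp"])

# Average concurrency for each hour of the day, as a percentage of peak.
hourly = df.groupby(df["timestamp"].dt.hour)["active_sessions"].mean()
profile = hourly / hourly.max() * 100

for hour, pct in profile.items():
    print(f"{hour:02d}:00  {pct:5.1f}% of peak")
```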
Weekly Patterns
Dig deeper and weekly patterns emerge:
- Monday: Higher than average — catching up after the weekend
- Tuesday-Thursday: Consistent peak usage
- Friday: Often lower — early finishes, remote working
- Weekend: Minimal (unless you have shift workers)
A static scaling schedule treats every weekday the same. Historical data shows they’re not.
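The same hypothetical export makes the weekday differences visible in a couple of lines:

```python
# Average concurrency by day of week, from the same hypothetical export.
import pandas as pd

df = pd.read_csv("session_samples.csv", parse_dates=["timestamp"])
by_day = df.groupby(df["timestamp"].dt.day_name())["active_sessions"].mean()

order = ["Monday", "Tuesday", "Wednesday", "Thursday",
         "Friday", "Saturday", "Sunday"]
print(by_day.reindex(order).round(1))
```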
Monthly Patterns
Look at month-level data and you’ll find:
- Month-end: Finance teams work extended hours
- Quarter-end: Even more pronounced
- Holiday periods: Significantly reduced usage
- Summer months: Often lower overall
Seasonal Patterns
Zoom out further:
- Back-to-school period: Education sectors see spikes
- Retail peak seasons: November-December changes everything
- Summer holidays: Extended reduced capacity needs
- Industry-specific cycles: Tax season, audit periods, etc.
From Data to Decisions
Collecting data is one thing. Turning it into actionable capacity decisions is another.
Step 1: Establish Baselines
Before you can optimise, you need to understand your current state:
- What’s your actual peak usage? Not theoretical — actual
- When does peak usage occur? Time of day, day of week
- How much variation exists? Is peak consistent or sporadic?
- What’s your minimum usage? Nights, weekends, holidays
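A minimal sketch of these baselines, again against the hypothetical timestamp/active_sessions export used above:

```python
# Baseline statistics from the hypothetical 'timestamp'/'active_sessions'
# export used in the earlier sketches.
import pandas as pd

df = pd.read_csv("session_samples.csv", parse_dates=["timestamp"])
s = df.set_index("timestamp")["active_sessions"]

print("Peak concurrent sessions:   ", s.max())
print("Peak occurred at:           ", s.idxmax())
print("Average concurrent sessions:", round(s.mean(), 1))
print("Minimum concurrent sessions:", s.min())

# Spread of the daily peaks: a low figure means peak demand is consistent
# and safe to schedule for; a high one means it's sporadic.
daily_peaks = s.resample("D").max().dropna()
print("Std deviation of daily peaks:", round(daily_peaks.std(), 1))
```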
Step 2: Identify Patterns
Look for recurring patterns in your data:
- Predictable spikes — Same time every day/week
- Gradual trends — Usage growing or declining over time
- Anomalies — One-off events vs regular occurrences
- Correlations — Usage linked to external factors
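One workable approach (certainly not the only one) is to bucket samples by weekday and hour: a bucket with a high mean and low spread is a predictable spike you can schedule for, while individual samples far outside their bucket are anomaly candidates. A sketch, using the same hypothetical columns:

```python
# Bucket samples by (weekday, hour): high mean + low spread marks a
# predictable spike; samples far from their bucket's mean are anomaly
# candidates. Same hypothetical columns as the earlier sketches.
import pandas as pd

df = pd.read_csv("session_samples.csv", parse_dates=["timestamp"])
df["weekday"] = df["timestamp"].dt.day_name()
df["hour"] = df["timestamp"].dt.hour

stats = (df.groupby(["weekday", "hour"])["active_sessions"]
           .agg(["mean", "std"])
           .reset_index())

# Flag samples more than three standard deviations from their bucket.
merged = df.merge(stats, on=["weekday", "hour"])
merged["zscore"] = (merged["active_sessions"] - merged["mean"]) / merged["std"]
print(merged[merged["zscore"].abs() > 3][["timestamp", "active_sessions"]])
```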
Step 3: Model Capacity Requirements
With patterns identified, model what capacity you actually need:
Scenario: 100-person finance team
Traditional approach:
- 100 users ÷ 10 users/host = 10 hosts
- Running 24/7 = 7,200 host-hours/month (10 hosts × 24 hours × 30 days)
- Cost: £X per month
Data-driven approach:
- Peak concurrent usage: 85 users (8.5 hosts, so 9 in practice)
- Average usage: 45 users (4.5 hosts needed)
- Off-hours usage: 5 users (1 host needed)
- Weighted average: ~4,000 host-hours/month
- Cost: ~55% of traditional approach
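The mechanics behind that weighted average are simple enough to sketch. The hours assigned to each band below are invented for illustration and produce an aggressively low total (~36%); real schedules scale less eagerly and carry headroom, which is how you land nearer the ~55% quoted above:

```python
# Mechanics of the weighted host-hours model. The hours per band are
# invented for illustration; substitute the splits your own data shows.
import math

USERS_PER_HOST = 10

# (concurrent users, hours per month spent in this band)
bands = [
    (85, 130),  # peak: business hours on ~21 working days
    (45, 200),  # shoulder: ramp-up, lunch dip, ramp-down
    (5, 390),   # nights, weekends, holidays
]

data_driven = sum(math.ceil(users / USERS_PER_HOST) * hours
                  for users, hours in bands)
always_on = math.ceil(100 / USERS_PER_HOST) * 720  # 10 hosts, 24/7

print(f"Data-driven: {data_driven} host-hours/month")  # 2560 here
print(f"Always-on:   {always_on} host-hours/month")    # 7200
print(f"Ratio:       {data_driven / always_on:.0%}")
```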
Step 4: Implement Dynamic Scaling
Historical data enables intelligent scaling rules:
Rules derived from 6 months of usage data:
Weekdays:
- 6am: Scale to 20% capacity (early arrivals)
- 8am: Scale to 80% capacity (main arrival)
- 9am: Scale to 100% capacity (peak start)
- 3pm: Begin gradual scale-down
- 6pm: Scale to 30% capacity
- 10pm: Scale to minimum
Fridays:
- Apply weekday rules but cap at 80% peak
- Begin scale-down at 2pm instead of 3pm
Month-end (last 3 working days):
- Extend peak hours to 8pm
- Maintain 50% capacity until 10pm
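Expressed as code, rules like these become a straightforward lookup. In this sketch the 10% minimum floor and the 60% scale-down band are assumed values, and is_month_end is a hypothetical flag you’d derive from your business calendar:

```python
# The scaling rules above as a lookup. The 10% minimum floor and the 60%
# scale-down band are assumptions; 'is_month_end' is a hypothetical flag
# derived from your business calendar.
from datetime import datetime

def target_capacity_pct(now: datetime, is_month_end: bool = False) -> int:
    """Target capacity as a percentage of peak for a point in time."""
    weekday, hour = now.weekday(), now.hour  # Monday == 0

    if weekday >= 5:           # weekend: minimal floor
        return 10
    if is_month_end and 15 <= hour < 20:
        return 100             # extended peak on the last working days
    if is_month_end and 20 <= hour < 22:
        return 50              # maintained until 10pm

    scale_down = 14 if weekday == 4 else 15   # Fridays wind down at 2pm
    peak = 80 if weekday == 4 else 100        # Fridays cap at 80%

    if hour < 6:
        return 10
    if hour < 8:
        return 20              # early arrivals
    if hour < 9:
        return 80              # main arrival
    if hour < scale_down:
        return peak
    if hour < 18:
        return 60              # gradual scale-down band
    if hour < 22:
        return 30
    return 10

print(target_capacity_pct(datetime(2025, 3, 7, 16)))  # a Friday 4pm -> 60
```

A side benefit of expressing the schedule this way: it lives in version control rather than in a portal form, so changes are reviewable.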
Step 5: Continuously Refine
Historical analysis isn’t a one-time exercise. Patterns change:
- New hires shift usage patterns
- Remote working policies evolve
- Seasonal variations differ year-to-year
- Business changes affect when people work
Continuous monitoring keeps your capacity planning accurate.
The Metrics That Matter
Not all data is equally valuable. Focus on these metrics:
Concurrent Sessions
The number of active sessions at any given time. This directly determines host requirements.
What to track:
- Peak concurrent sessions (daily, weekly, monthly)
- Average concurrent sessions
- Minimum concurrent sessions
- Rate of change (how quickly sessions ramp up/down)
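Most platforms log connects and disconnects rather than concurrency itself, so you usually have to derive this metric. The classic sweep technique, sketched here with invented session data:

```python
# Deriving concurrency from connect/disconnect events with a sweep:
# +1 at each connect, -1 at each disconnect, track the running total.
# The session data here is invented for illustration.
from datetime import datetime

sessions = [
    (datetime(2025, 3, 3, 8, 55), datetime(2025, 3, 3, 12, 10)),
    (datetime(2025, 3, 3, 9, 5), datetime(2025, 3, 3, 17, 30)),
    (datetime(2025, 3, 3, 9, 20), datetime(2025, 3, 3, 11, 45)),
]

events = sorted([(start, 1) for start, _ in sessions] +
                [(end, -1) for _, end in sessions])

current = peak = 0
for _, delta in events:
    current += delta
    peak = max(peak, current)

print("Peak concurrent sessions:", peak)  # 3: all overlap mid-morning
```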
Session Duration
How long users stay connected matters as much as how many connect: many short sessions and a few marathon ones produce very different concurrency profiles.
What to track:
- Average session length
- Distribution of session lengths
- Abandoned sessions (connected but inactive)
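Percentiles tell you more than the mean here, because a few marathon sessions can hide a majority of short ones. A sketch, assuming a hypothetical sessions.csv export with start and end columns:

```python
# Session-length distribution from a hypothetical 'sessions.csv' export
# with 'start' and 'end' columns.
import pandas as pd

df = pd.read_csv("sessions.csv", parse_dates=["start", "end"])
hours = (df["end"] - df["start"]).dt.total_seconds() / 3600

print(hours.describe(percentiles=[0.5, 0.9, 0.99]).round(2))

# Sessions connected far longer than anyone plausibly works are candidates
# for the abandoned bucket; the 18-hour threshold is an assumption.
print("Possible abandoned sessions:", (hours > 18).sum())
```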
Host Utilisation
How efficiently are your hosts being used?
What to track:
- CPU utilisation across hosts
- Memory utilisation
- Sessions per host vs capacity
- Idle hosts (running but serving no sessions)
User Behaviour
Understanding how users work helps predict future needs.
What to track:
- Login times by user/department
- Application usage patterns
- Geographic distribution
- Device types
Common Pitfalls
Pitfall 1: Insufficient Data History
A week of data isn’t enough. You need:
- At least a month for daily patterns
- A quarter for monthly patterns
- A year for seasonal patterns
Pitfall 2: Ignoring Outliers
That day when usage spiked to 150%? Don’t ignore it. Understand why it happened:
- Was it a one-off event? (Ignore for planning)
- Does it recur predictably? (Plan for it)
- Could it happen unexpectedly again? (Build in buffer)
Pitfall 3: Over-Optimising
Data-driven doesn’t mean cutting to the bone. Leave appropriate headroom:
- User experience matters
- Unexpected demand happens
- Scaling takes time
A 10-15% buffer above predicted peak is reasonable.
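In code, that buffer is a single multiplication before you round up. A sketch with assumed numbers:

```python
# Headroom is one multiplication before rounding up. Numbers assumed.
import math

predicted_peak_users = 85   # from your historical analysis
users_per_host = 10         # assumed session density
buffer = 0.15               # headroom above predicted peak

hosts = math.ceil(predicted_peak_users * (1 + buffer) / users_per_host)
print(hosts)  # 10 hosts rather than the bare-minimum 9
```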
Pitfall 4: Set and Forget
Historical patterns change. What was true 6 months ago might not be true today. Regular review is essential.
How The Smart Scaler Uses Historical Data
We’re building The Smart Scaler to be genuinely data-driven:
- Automatic data collection — We gather usage metrics continuously
- Pattern recognition — Our algorithms identify recurring patterns
- Predictive modelling — We anticipate demand before it hits
- Continuous learning — The system gets smarter over time
Coming soon, we’ll be introducing features that let you visualise your historical patterns, understand where you’re over-provisioning, and automatically adjust scaling rules based on actual usage.
Getting Started
You don’t need sophisticated tools to start using historical data:
- Enable Azure diagnostics — Start collecting usage metrics
- Export to Log Analytics — Central location for analysis
- Build basic dashboards — Visualise patterns
- Review weekly — Look for patterns and anomalies
- Adjust scaling rules — Apply what you learn
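If you’d rather script the export than click through the portal, the azure-monitor-query SDK can pull the data programmatically. A sketch that assumes the WVDConnections table AVD Insights populates; verify the schema and column names in your own workspace:

```python
# Pulling AVD connection data out of Log Analytics with the
# azure-monitor-query SDK. The KQL assumes the WVDConnections table that
# AVD Insights populates; check the schema in your own workspace.
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

client = LogsQueryClient(DefaultAzureCredential())

query = """
WVDConnections
| where State == "Connected"
| summarize sessions = dcount(CorrelationId) by bin(TimeGenerated, 1h)
| order by TimeGenerated asc
"""

response = client.query_workspace(
    workspace_id="<your-workspace-id>",  # placeholder
    query=query,
    timespan=timedelta(days=30),
)

for row in response.tables[0].rows:
    print(row)
```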
Or let The Smart Scaler do the heavy lifting for you.
Ready to move from guesswork to data-driven capacity planning? Start your free trial and see what your usage data reveals.