How Historical Data Should Drive Capacity Planning
Here’s a question: How did you decide how many session hosts your Azure Virtual Desktop (AVD) environment needs?
If you’re like most organisations, the answer involves some combination of:
- Vendor recommendations — “Microsoft suggests X users per host”
- Initial estimates — “We have 500 users, so let’s start with 50 hosts”
- Buffer for safety — “Better add 20% just in case”
- Trial and error — “Users complained, so we added more”
This approach isn’t wrong, exactly. It gets you to a working deployment. But it almost certainly leaves you overpaying for capacity you don’t need.
There’s a better way: let your historical data tell you what you actually need.
The Problem with Estimation
Estimation-based capacity planning has a fundamental flaw: it’s based on assumptions, not reality.
Common assumptions that lead to overspending:
- All users work the same hours (they don’t)
- Peak usage is consistent day-to-day (it isn’t)
- Everyone needs resources simultaneously (they don’t)
- Usage patterns are static over time (they change)
When you provision based on assumptions, you inevitably provision for worst-case scenarios. And worst-case scenarios rarely happen.
The result: Session hosts sitting idle, consuming budget while delivering no value.
What Historical Data Reveals
Once you start collecting and analysing actual usage data, patterns emerge that assumptions could never predict.
Daily Patterns
Real usage data typically shows:
- 6am - 8am: 5-10% of peak capacity
- 8am - 9am: Rapid ramp to 60-70%
- 9am - 11am: Peak usage (100%)
- 11am - 1pm: Slight dip (80-90%), lunch breaks
- 1pm - 3pm: Return to peak (95-100%)
- 3pm - 5pm: Gradual decline (70-50%)
- 5pm - 8pm: Tail off (30-10%)
- 8pm - 6am: Minimal usage (5-10%)
If you’re provisioning for 100% capacity 24/7, you’re paying around the clock for resources that run at or near full capacity for perhaps five or six hours a day.
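You can surface this profile in your own environment with a few lines of pandas. A minimal sketch, assuming you’ve exported sampled session counts to a CSV with timestamp and active_sessions columns (both hypothetical names; adapt them to however your metrics land):

```python
# Minimal sketch: hourly utilisation profile from sampled session counts.
# 'session_samples.csv' with 'timestamp' and 'active_sessions' columns is
# a hypothetical export; rename to match your own data.
import pandas as pd

df = pd.read_csv("session_samples.csv", parse_dates=["timestamp"])

# Average concurrency for each hour of the day, as a percentage of peak.
hourly = df.groupby(df["timestamp"].dt.hour)["active_sessions"].mean()
profile = hourly / hourly.max() * 100

for hour, pct in profile.items():
    print(f"{hour:02d}:00  {pct:5.1f}% of peak")
```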
Weekly Patterns
Dig deeper and weekly patterns emerge:
- Monday: Higher than average — catching up after the weekend
- Tuesday-Thursday: Consistent peak usage
- Friday: Often lower — early finishes, remote working
- Weekend: Minimal (unless you have shift workers)
A static scaling schedule treats every weekday the same. Historical data shows they’re not.
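The same hypothetical export makes the weekday differences visible in a couple of lines:

```python
# Average concurrency by day of week, from the same hypothetical export.
import pandas as pd

df = pd.read_csv("session_samples.csv", parse_dates=["timestamp"])
by_day = df.groupby(df["timestamp"].dt.day_name())["active_sessions"].mean()

order = ["Monday", "Tuesday", "Wednesday", "Thursday",
         "Friday", "Saturday", "Sunday"]
print(by_day.reindex(order).round(1))
```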
Monthly Patterns
Look at month-level data and you’ll find:
- Month-end: Finance teams work extended hours
- Quarter-end: Even more pronounced
- Holiday periods: Significantly reduced usage
- Summer months: Often lower overall
Seasonal Patterns
Zoom out further:
- Back-to-school period: Education sectors see spikes
- Retail peak seasons: November-December changes everything
- Summer holidays: Extended reduced capacity needs
- Industry-specific cycles: Tax season, audit periods, etc.
From Data to Decisions
Collecting data is one thing. Turning it into actionable capacity decisions is another.
Step 1: Establish Baselines
Before you can optimise, you need to understand your current state:
- What’s your actual peak usage? Not theoretical — actual
- When does peak usage occur? Time of day, day of week
- How much variation exists? Is peak consistent or sporadic?
- What’s your minimum usage? Nights, weekends, holidays
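A minimal sketch of these baselines, again against the hypothetical timestamp/active_sessions export used above:

```python
# Baseline statistics from the hypothetical 'timestamp'/'active_sessions'
# export used in the earlier sketches.
import pandas as pd

df = pd.read_csv("session_samples.csv", parse_dates=["timestamp"])
s = df.set_index("timestamp")["active_sessions"]

print("Peak concurrent sessions:   ", s.max())
print("Peak occurred at:           ", s.idxmax())
print("Average concurrent sessions:", round(s.mean(), 1))
print("Minimum concurrent sessions:", s.min())

# Spread of the daily peaks: a low figure means peak demand is consistent
# and safe to schedule for; a high one means it's sporadic.
daily_peaks = s.resample("D").max().dropna()
print("Std deviation of daily peaks:", round(daily_peaks.std(), 1))
```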
Step 2: Identify Patterns
Look for recurring patterns in your data:
- Predictable spikes — Same time every day/week
- Gradual trends — Usage growing or declining over time
- Anomalies — One-off events vs regular occurrences
- Correlations — Usage linked to external factors
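One workable approach (certainly not the only one) is to bucket samples by weekday and hour: a bucket with a high mean and low spread is a predictable spike you can schedule for, while individual samples far outside their bucket are anomaly candidates. A sketch, using the same hypothetical columns:

```python
# Bucket samples by (weekday, hour): high mean + low spread marks a
# predictable spike; samples far from their bucket's mean are anomaly
# candidates. Same hypothetical columns as the earlier sketches.
import pandas as pd

df = pd.read_csv("session_samples.csv", parse_dates=["timestamp"])
df["weekday"] = df["timestamp"].dt.day_name()
df["hour"] = df["timestamp"].dt.hour

stats = (df.groupby(["weekday", "hour"])["active_sessions"]
           .agg(["mean", "std"])
           .reset_index())

# Flag samples more than three standard deviations from their bucket.
merged = df.merge(stats, on=["weekday", "hour"])
merged["zscore"] = (merged["active_sessions"] - merged["mean"]) / merged["std"]
print(merged[merged["zscore"].abs() > 3][["timestamp", "active_sessions"]])
```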
Step 3: Model Capacity Requirements
With patterns identified, model what capacity you actually need:
Scenario: 100-person finance team
Traditional approach:
- 100 users ÷ 10 users/host = 10 hosts
- Running 24/7 = 7,200 host-hours/month (10 hosts × 24 hours × 30 days)
- Cost: £X per month
Data-driven approach:
- Peak concurrent usage: 85 users (8.5 hosts, so 9 in practice)
- Average usage: 45 users (4.5 hosts needed)
- Off-hours usage: 5 users (1 host needed)
- Weighted average: ~4,000 host-hours/month
- Cost: ~55% of traditional approach
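The mechanics behind that weighted average are simple enough to sketch. The hours assigned to each band below are invented for illustration and produce an aggressively low total (~36%); real schedules scale less eagerly and carry headroom, which is how you land nearer the ~55% quoted above:

```python
# Mechanics of the weighted host-hours model. The hours per band are
# invented for illustration; substitute the splits your own data shows.
import math

USERS_PER_HOST = 10

# (concurrent users, hours per month spent in this band)
bands = [
    (85, 130),  # peak: business hours on ~21 working days
    (45, 200),  # shoulder: ramp-up, lunch dip, ramp-down
    (5, 390),   # nights, weekends, holidays
]

data_driven = sum(math.ceil(users / USERS_PER_HOST) * hours
                  for users, hours in bands)
always_on = math.ceil(100 / USERS_PER_HOST) * 720  # 10 hosts, 24/7

print(f"Data-driven: {data_driven} host-hours/month")  # 2560 here
print(f"Always-on:   {always_on} host-hours/month")    # 7200
print(f"Ratio:       {data_driven / always_on:.0%}")
```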
Step 4: Implement Dynamic Scaling
Historical data enables intelligent scaling rules:
Rules derived from 6 months of usage data:
Weekdays:
- 6am: Scale to 20% capacity (early arrivals)
- 8am: Scale to 80% capacity (main arrival)
- 9am: Scale to 100% capacity (peak start)
- 3pm: Begin gradual scale-down
- 6pm: Scale to 30% capacity
- 10pm: Scale to minimum
Fridays:
- Apply weekday rules but cap at 80% peak
- Begin scale-down at 2pm instead of 3pm
Month-end (last 3 working days):
- Extend peak hours to 8pm
- Maintain 50% capacity until 10pm
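Expressed as code, rules like these become a straightforward lookup. In this sketch the 10% minimum floor and the 60% scale-down band are assumed values, and is_month_end is a hypothetical flag you’d derive from your business calendar:

```python
# The scaling rules above as a lookup. The 10% minimum floor and the 60%
# scale-down band are assumptions; 'is_month_end' is a hypothetical flag
# derived from your business calendar.
from datetime import datetime

def target_capacity_pct(now: datetime, is_month_end: bool = False) -> int:
    """Target capacity as a percentage of peak for a point in time."""
    weekday, hour = now.weekday(), now.hour  # Monday == 0

    if weekday >= 5:           # weekend: minimal floor
        return 10
    if is_month_end and 15 <= hour < 20:
        return 100             # extended peak on the last working days
    if is_month_end and 20 <= hour < 22:
        return 50              # maintained until 10pm

    scale_down = 14 if weekday == 4 else 15   # Fridays wind down at 2pm
    peak = 80 if weekday == 4 else 100        # Fridays cap at 80%

    if hour < 6:
        return 10
    if hour < 8:
        return 20              # early arrivals
    if hour < 9:
        return 80              # main arrival
    if hour < scale_down:
        return peak
    if hour < 18:
        return 60              # gradual scale-down band
    if hour < 22:
        return 30
    return 10

print(target_capacity_pct(datetime(2025, 3, 7, 16)))  # a Friday 4pm -> 60
```

A side benefit of expressing the schedule this way: it lives in version control rather than in a portal form, so changes are reviewable.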
Step 5: Continuously Refine
Historical analysis isn’t a one-time exercise. Patterns change:
- New hires shift usage patterns
- Remote working policies evolve
- Seasonal variations differ year-to-year
- Business changes affect when people work
Continuous monitoring keeps your capacity planning accurate.
The Metrics That Matter
Not all data is equally valuable. Focus on these metrics:
Concurrent Sessions
The number of active sessions at any given time. This directly determines host requirements.
What to track:
- Peak concurrent sessions (daily, weekly, monthly)
- Average concurrent sessions
- Minimum concurrent sessions
- Rate of change (how quickly sessions ramp up/down)
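Most platforms log connects and disconnects rather than concurrency itself, so you usually have to derive this metric. The classic sweep technique, sketched here with invented session data:

```python
# Deriving concurrency from connect/disconnect events with a sweep:
# +1 at each connect, -1 at each disconnect, track the running total.
# The session data here is invented for illustration.
from datetime import datetime

sessions = [
    (datetime(2025, 3, 3, 8, 55), datetime(2025, 3, 3, 12, 10)),
    (datetime(2025, 3, 3, 9, 5), datetime(2025, 3, 3, 17, 30)),
    (datetime(2025, 3, 3, 9, 20), datetime(2025, 3, 3, 11, 45)),
]

events = sorted([(start, 1) for start, _ in sessions] +
                [(end, -1) for _, end in sessions])

current = peak = 0
for _, delta in events:
    current += delta
    peak = max(peak, current)

print("Peak concurrent sessions:", peak)  # 3: all overlap mid-morning
```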
Session Duration
How long users stay connected matters as much as how many connect: many short sessions and a few marathon ones produce very different concurrency profiles.
What to track:
- Average session length
- Distribution of session lengths
- Abandoned sessions (connected but inactive)
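Percentiles tell you more than the mean here, because a few marathon sessions can hide a majority of short ones. A sketch, assuming a hypothetical sessions.csv export with start and end columns:

```python
# Session-length distribution from a hypothetical 'sessions.csv' export
# with 'start' and 'end' columns.
import pandas as pd

df = pd.read_csv("sessions.csv", parse_dates=["start", "end"])
hours = (df["end"] - df["start"]).dt.total_seconds() / 3600

print(hours.describe(percentiles=[0.5, 0.9, 0.99]).round(2))

# Sessions connected far longer than anyone plausibly works are candidates
# for the abandoned bucket; the 18-hour threshold is an assumption.
print("Possible abandoned sessions:", (hours > 18).sum())
```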
Host Utilisation
How efficiently are your hosts being used?
What to track:
- CPU utilisation across hosts
- Memory utilisation
- Sessions per host vs capacity
- Idle hosts (running but serving no sessions)
User Behaviour
Understanding how users work helps predict future needs.
What to track:
- Login times by user/department
- Application usage patterns
- Geographic distribution
- Device types
Common Pitfalls
Pitfall 1: Insufficient Data History
A week of data isn’t enough. You need:
- At least a month for daily patterns
- A quarter for monthly patterns
- A year for seasonal patterns
Pitfall 2: Ignoring Outliers
That day when usage spiked to 150%? Don’t ignore it. Understand why it happened:
- Was it a one-off event? (Ignore for planning)
- Does it recur predictably? (Plan for it)
- Could it happen unexpectedly again? (Build in buffer)
Pitfall 3: Over-Optimising
Data-driven doesn’t mean cutting to the bone. Leave appropriate headroom:
- User experience matters
- Unexpected demand happens
- Scaling takes time
A 10-15% buffer above predicted peak is reasonable.
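In code, that buffer is a single multiplication before you round up. A sketch with assumed numbers:

```python
# Headroom is one multiplication before rounding up. Numbers assumed.
import math

predicted_peak_users = 85   # from your historical analysis
users_per_host = 10         # assumed session density
buffer = 0.15               # headroom above predicted peak

hosts = math.ceil(predicted_peak_users * (1 + buffer) / users_per_host)
print(hosts)  # 10 hosts rather than the bare-minimum 9
```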
Pitfall 4: Set and Forget
Historical patterns change. What was true 6 months ago might not be true today. Regular review is essential.
How The Smart Scaler Uses Historical Data
We’re building The Smart Scaler to be genuinely data-driven:
- Automatic data collection — We gather usage metrics continuously
- Pattern recognition — Our algorithms identify recurring patterns
- Predictive modelling — We anticipate demand before it hits
- Continuous learning — The system gets smarter over time
Coming soon, we’ll be introducing features that let you visualise your historical patterns, understand where you’re over-provisioning, and automatically adjust scaling rules based on actual usage.
Getting Started
You don’t need sophisticated tools to start using historical data:
- Enable Azure diagnostics — Start collecting usage metrics
- Export to Log Analytics — Central location for analysis
- Build basic dashboards — Visualise patterns
- Review weekly — Look for patterns and anomalies
- Adjust scaling rules — Apply what you learn
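If you’d rather script the export than click through the portal, the azure-monitor-query SDK can pull the data programmatically. A sketch that assumes the WVDConnections table AVD Insights populates; verify the schema and column names in your own workspace:

```python
# Pulling AVD connection data out of Log Analytics with the
# azure-monitor-query SDK. The KQL assumes the WVDConnections table that
# AVD Insights populates; check the schema in your own workspace.
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

client = LogsQueryClient(DefaultAzureCredential())

query = """
WVDConnections
| where State == "Connected"
| summarize sessions = dcount(CorrelationId) by bin(TimeGenerated, 1h)
| order by TimeGenerated asc
"""

response = client.query_workspace(
    workspace_id="<your-workspace-id>",  # placeholder
    query=query,
    timespan=timedelta(days=30),
)

for row in response.tables[0].rows:
    print(row)
```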
Or let The Smart Scaler do the heavy lifting for you.
Ready to move from guesswork to data-driven capacity planning? Start your free trial and see what your usage data reveals.