Speech Privacy Index (SPI): How to Measure and Achieve Acoustic Privacy

Acoustic privacy is the ability to have a conversation without being overheard by unintended listeners. In buildings, this is not a subjective quality — it is a measurable, quantifiable performance metric. The Speech Privacy Index (SPI) provides the framework for measuring it, and understanding SPI is essential for designing offices, healthcare facilities, legal practices, and any space where confidential communication occurs.

This guide covers what SPI is, how it is calculated, the three physical elements that determine it, how to classify privacy performance, and the design strategies that achieve each privacy class.

What the Speech Privacy Index Measures

The Speech Privacy Index (SPI) is defined in ASTM E1130, "Standard Test Method for Objective Measurement of Speech Privacy in Open Plan Spaces Using Articulation Index." Despite the name referencing open plan spaces, the underlying methodology applies to any scenario where speech intelligibility between a talker and a listener determines privacy performance.

SPI quantifies the degree to which speech from a source location is unintelligible at a receiver location. It is expressed as a percentage from 0% to 100%:

SPI = 0%: Speech is perfectly intelligible. No privacy whatsoever.
SPI = 100%: Speech is completely unintelligible. Perfect privacy.

The relationship between SPI and the Articulation Index (AI) is straightforward:

SPI = (1 - AI) x 100

Where AI is the Articulation Index, a value from 0.00 to 1.00 that quantifies the proportion of speech information that reaches the listener in a usable form. An AI of 0.00 means no speech information reaches the listener (perfect privacy); an AI of 1.00 means all speech information reaches the listener (zero privacy).
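The conversion between AI and SPI is simple enough to express directly. The following is a minimal sketch; the function names are illustrative and do not come from ASTM E1130.

```python
def spi_from_ai(ai: float) -> float:
    """Convert an Articulation Index (0.00 to 1.00) to SPI in percent."""
    if not 0.0 <= ai <= 1.0:
        raise ValueError("AI must lie between 0.00 and 1.00")
    return (1.0 - ai) * 100.0


def ai_from_spi(spi: float) -> float:
    """Invert the relationship: SPI in percent back to AI."""
    if not 0.0 <= spi <= 100.0:
        raise ValueError("SPI must lie between 0 and 100")
    return 1.0 - spi / 100.0
```

An AI of 0.00 maps to an SPI of 100% and an AI of 1.00 maps to an SPI of 0%, matching the two end points described above.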

Relationship to STI

The Speech Transmission Index (STI), defined in IEC 60268-16, is a related but distinct metric. Both AI and STI quantify speech intelligibility, but they use different calculation methods:

AI sums the signal-to-noise ratio contributions across the 15 one-third octave bands from 200 Hz to 5000 Hz, weighted by their importance to speech intelligibility. It was developed by French and Steinberg at Bell Laboratories in 1947.
STI uses the modulation transfer function (MTF) approach, evaluating how well the temporal envelope of speech is preserved through the acoustic path. STI accounts for both noise and reverberation effects.

For most practical room acoustics applications, AI and STI give comparable results. The approximate relationship is:

STI = 0.10 corresponds to AI = 0.05 (essentially unintelligible)
STI = 0.50 corresponds to AI = 0.30 (poor intelligibility, partial privacy)
STI = 0.75 corresponds to AI = 0.65 (good intelligibility, poor privacy)

SPI is specifically derived from AI, not STI. However, because ASTM E1130 is focused on open plan spaces where reverberation effects are relatively small, the practical difference between AI-based and STI-based privacy assessment is minor.

The Three Elements of Speech Privacy

Speech privacy is determined by the interaction of three independent physical quantities. Changing any one of them changes the SPI. Understanding this three-element model is the key to designing for privacy.

Element 1: Source Level (Signal)

The source level is the sound pressure level of speech produced by the talker, conventionally referenced to a distance of 1 metre. Typical levels by vocal effort are approximately:

Casual speech: 54 dBA at 1 metre
Normal speech: 60 dBA at 1 metre
Raised speech: 66 dBA at 1 metre
Loud speech: 72 dBA at 1 metre
Shouting: 78+ dBA at 1 metre

The source level sets the starting point. Louder speech requires more path loss and/or higher background noise to achieve the same privacy.

In practice, the source level is the element the designer has least control over. People speak at the level their environment demands. In a very quiet room (NC-20), people naturally lower their voices — the Lombard effect works in reverse. In a noisy room (NC-45), people raise their voices to be heard, partially defeating the masking benefit of the higher background noise.

ASTM E1130 specifies a standard source spectrum for privacy calculations, representing normal-effort male speech. This standardised source allows objective comparison between different acoustic environments.

Element 2: Path Loss (Attenuation)

Path loss is the total reduction in sound level between the talker and the listener. It includes:

Geometric spreading: Sound level decreases by 6 dB per doubling of distance in free field conditions (inverse square law). In a room, reflections from surfaces partially compensate for geometric spreading, reducing the effective attenuation to 3-5 dB per distance doubling.
Partition transmission loss: If a wall, door, or ceiling separates the talker and listener, the partition's Sound Transmission Class (STC) or weighted sound reduction index (Rw) determines how much sound is blocked. A typical office partition provides 35-55 dB of transmission loss depending on construction.
Screen and furniture attenuation: In open plan offices, workstation screens (1.2-1.8 m high) provide 5-15 dB of attenuation at speech frequencies, depending on height, material, and the presence of an absorptive ceiling.
Ceiling attenuation: Sound travelling over the top of a partition through a shared ceiling plenum is attenuated by the ceiling system's Ceiling Attenuation Class (CAC). A standard lay-in mineral fibre ceiling tile has a CAC of 25-35. A high-performance ceiling can achieve CAC 40+.

The total path loss determines the speech level at the listener's position. For a listener 4 metres from the talker in an open plan office with absorptive ceiling and 1.4 m screens, the path loss might be 18-22 dB, yielding a received speech level of 38-42 dBA for normal-effort speech.
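The path loss arithmetic can be sketched as follows. This is a simplified illustration, not a prediction method: it assumes a constant decay rate per distance doubling referenced to 1 metre, and it treats screen attenuation as a single figure added in decibels. The function names and the 4 dB per doubling and 12 dB screen figures in the usage note are hypothetical values chosen from within the ranges quoted above.

```python
import math


def spreading_loss_db(distance_m: float, db_per_doubling: float = 6.0) -> float:
    """Attenuation relative to the 1 m reference for a given decay per doubling.

    6 dB per doubling corresponds to free field (inverse square law);
    rooms typically show a lower rate because of reflections.
    """
    if distance_m <= 0.0:
        raise ValueError("distance must be positive")
    return db_per_doubling * math.log2(distance_m)


def received_level_dba(source_dba_at_1m: float, distance_m: float,
                       db_per_doubling: float,
                       extra_attenuation_db: float = 0.0) -> float:
    """Speech level at the listener: source level minus total path loss."""
    path_loss = spreading_loss_db(distance_m, db_per_doubling) + extra_attenuation_db
    return source_dba_at_1m - path_loss
```

For example, received_level_dba(60.0, 4.0, 4.0, 12.0) combines 8 dB of spreading loss with 12 dB of screen attenuation for 20 dB of path loss, giving 40 dBA at the listener, which sits in the middle of the 38-42 dBA range above.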

Element 3: Background Noise Level (Masking)

The background noise level at the listener's position determines the signal-to-noise ratio and therefore the intelligibility of the received speech. Background noise sources include:

HVAC system noise: Continuous broadband noise from air handling units, diffusers, and fan coil units. Typically the primary masking source in mechanically ventilated buildings.
Electronic sound masking: Loudspeaker systems that generate controlled, spatially uniform broadband noise tuned to mask speech frequencies. The masking spectrum is typically shaped to follow the NC-35 to NC-40 contour with enhanced energy in the 500-2000 Hz range, which carries most of the speech intelligibility information.
Activity noise: Noise from other occupants, equipment, and external sources. This is not controlled by the designer and varies with occupancy patterns.

The critical insight is that privacy improves as the signal-to-noise ratio at the listener decreases. With the source level largely outside the designer's control, there are only two practical ways to reduce the signal-to-noise ratio: reduce the signal (increase path loss) or increase the noise (add masking). Decibel for decibel, both contribute equally to the SPI calculation.

The Privacy Equation

Combining the three elements:

Received signal level = Source level - Path loss

Signal-to-noise ratio = Received signal level - Background noise level

AI = f(S/N across frequency bands)

SPI = (1 - AI) x 100

A worked example illustrates the interaction:

Parameter	Value
Source level (normal speech at 1 m)	60 dBA
Path loss (4 m in open plan, screens, absorptive ceiling)	20 dB
Received speech level at listener	40 dBA
Background noise level (HVAC + masking)	42 dBA
Signal-to-noise ratio	-2 dB
Approximate AI	0.08
SPI	92%

In this example, the received speech level (40 dBA) is actually below the background noise level (42 dBA), giving a negative signal-to-noise ratio. At a -2 dB S/N ratio, very little speech information is usable by the listener, resulting in an SPI of 92% — normal privacy. If the background noise were reduced to 35 dBA (removing the masking system), the S/N ratio would become +5 dB, the AI would increase to approximately 0.25, and the SPI would drop to 75% — poor privacy.
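The arithmetic of the worked example can be reproduced with a short sketch. The function name is illustrative. The approximate AI values (0.08 and 0.25) are taken directly from the text and passed in as inputs, because deriving AI properly requires band-by-band spectra rather than single dBA figures.

```python
def privacy_budget(source_dba: float, path_loss_db: float,
                   background_dba: float, approx_ai: float):
    """Return (received level, signal-to-noise ratio, SPI) for one scenario.

    approx_ai is supplied by the caller; it is NOT computed from the
    broadband signal-to-noise ratio.
    """
    received = source_dba - path_loss_db
    snr = received - background_dba
    spi = (1.0 - approx_ai) * 100.0
    return received, snr, spi


# With masking: 42 dBA background, AI of about 0.08 (from the table above).
with_masking = privacy_budget(60.0, 20.0, 42.0, approx_ai=0.08)

# Masking removed: 35 dBA background, AI of about 0.25 (from the text above).
without_masking = privacy_budget(60.0, 20.0, 35.0, approx_ai=0.25)
```

The two calls give a signal-to-noise ratio of -2 dB with an SPI of 92%, and +5 dB with an SPI of 75%, in line with the figures quoted.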

Privacy Classification

ASTM E1130 and the related ASTM E2638 (Standard Test Method for Objective Measurement of the Speech Privacy Provided by a Closed Room) define three privacy classes based on the SPI value.

Confidential Privacy (SPI greater than 95%)

At SPI values above 95%, speech from the source location is virtually unintelligible at the receiver location. An eavesdropper cannot reconstruct the content of a conversation even with sustained effort. This level is required for:

Executive offices where strategic business discussions occur
Legal consultation rooms where attorney-client privilege must be maintained
Healthcare examination rooms where HIPAA-protected health information is discussed
HR interview rooms where personnel matters are discussed
Government secure areas (SCIFs) where classified information is handled

Achieving confidential privacy typically requires a combination of high-STC partitions (STC 50+), sealed door assemblies (STC 35+), no shared ceiling plenum (full-height partitions or high-CAC ceilings), and electronic sound masking at the receiver location.

In practice, the weakest link in a confidential privacy installation is almost always the door. A standard hollow-core office door has an STC of 20-25. Replacing it with a solid-core door with acoustic seals and an automatic door bottom raises the STC to 35-40. This single change can improve the SPI by 10-15 percentage points.
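The weakest-link effect can be illustrated with the standard area-weighted composite transmission loss calculation. This is a rough sketch under stated assumptions: it treats each single-number STC rating as if it were a transmission loss in decibels, ignores leaks and flanking, and uses hypothetical areas (10 m2 of wall, 2 m2 of door). It does not convert the result into SPI points.

```python
import math


def composite_tl_db(elements):
    """Area-weighted composite transmission loss of a wall with inserts.

    elements: iterable of (area_m2, transmission_loss_db) pairs.
    """
    elements = list(elements)
    total_area = sum(area for area, _ in elements)
    if total_area <= 0.0:
        raise ValueError("total area must be positive")
    # Each element transmits a fraction 10^(-TL/10) of the incident energy.
    transmitted = sum(area * 10.0 ** (-tl / 10.0) for area, tl in elements)
    return -10.0 * math.log10(transmitted / total_area)


# Hypothetical partition rated 50 with a hollow-core door rated 22.
hollow_core = composite_tl_db([(10.0, 50.0), (2.0, 22.0)])

# Same partition with a sealed solid-core door rated 37.
solid_core = composite_tl_db([(10.0, 50.0), (2.0, 37.0)])
```

In this sketch the composite value with the hollow-core door falls far below the wall rating, and upgrading the door recovers a large part of the difference, which is consistent with the door being the controlling element.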

Normal Privacy (SPI 80% to 95%)

At SPI values between 80% and 95%, occasional words may be intelligible, but the listener cannot follow the thread of a conversation. This is acceptable for most office environments where complete confidentiality is not required but where conversations should not be a sustained distraction.

Normal privacy is the target for:

Standard private offices and small meeting rooms
Open plan offices with proper acoustic treatment
Classrooms adjacent to other classrooms
Hotel rooms adjacent to other hotel rooms

This level is achievable in open plan offices with absorptive ceilings (NRC 0.85+), 1.4 m workstation screens with absorptive surfaces, and background noise at NC-35 to NC-40 (either from HVAC systems or electronic masking).

Poor Privacy (SPI less than 80%)

Below SPI 80%, conversations are partially to fully intelligible. The listener can follow the general content of a conversation, identify speakers, and potentially comprehend sensitive information. This level is unacceptable for any space where speech privacy is a design requirement.

Poor privacy is the default condition in most open plan offices that lack acoustic treatment. A typical open plan office with a hard ceiling (NRC 0.10), no workstation screens, and low background noise (NC-25) will have an SPI of 40-60% — meaning that conversations are intelligible at distances of 10+ metres and the entire floor plate functions as a single acoustic zone.
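The three classes can be expressed as a small lookup. This is a minimal sketch using the thresholds given in this section; the function name is illustrative, and values of exactly 80% and 95% are assigned to the normal class, following the "greater than 95%" and "less than 80%" wording above.

```python
def classify_privacy(spi: float) -> str:
    """Map an SPI percentage to the privacy class used in this guide."""
    if not 0.0 <= spi <= 100.0:
        raise ValueError("SPI must lie between 0 and 100")
    if spi > 95.0:
        return "confidential"
    if spi >= 80.0:
        return "normal"
    return "poor"
```

For instance, the worked example earlier in this guide (SPI 92%) classifies as normal, and the same space without masking (SPI 75%) classifies as poor.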

Sound Masking: The Privacy Multiplier

Electronic sound masking is the most cost-effective tool for improving speech privacy in existing buildings. A masking system consists of loudspeakers installed above the ceiling (or integrated into furniture) that produce a controlled, spatially uniform noise field tuned to mask speech frequencies.

How Masking Works

Sound masking does not cancel speech. It adds background noise that reduces the signal-to-noise ratio at the listener's position. The masking signal is carefully shaped:

Spectrum: The masking noise follows a contour similar to NC-35 or NC-40 but with slightly elevated levels in the 500-2000 Hz range, which corresponds to the frequency range that carries the most speech intelligibility information.
Level: Typical masking levels range from 42 to 48 dBA, depending on the privacy target and the existing background noise level.
Uniformity: The masking must be spatially uniform, with no more than plus or minus 2 dB variation across the floor plate. If there are quiet spots, they become zones where conversations can be overheard.
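A uniformity check on commissioning readings might look like the sketch below. It assumes the plus or minus 2 dB tolerance is taken relative to the target masking level (it could equally be taken relative to the measured average), and the position labels and readings in the usage note are hypothetical.

```python
def out_of_tolerance(readings_dba: dict, target_dba: float,
                     tolerance_db: float = 2.0) -> dict:
    """Return the measurement positions whose level deviates from the
    target by more than the tolerance, with their measured levels."""
    return {
        position: level
        for position, level in readings_dba.items()
        if abs(level - target_dba) > tolerance_db
    }
```

With hypothetical readings of 45.0 dBA at desk A, 46.5 dBA at desk B, and 42.0 dBA at desk C against a 45 dBA target, only desk C is flagged, identifying it as a quiet spot where conversations could be overheard.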

Masking System Types

Plenum-based systems use loudspeakers installed in the ceiling plenum, facing upward toward the structural deck. The sound reflects off the deck and transmits through the ceiling tiles, creating a diffuse, even sound field. This is the most common approach in commercial offices and provides the best uniformity.

Direct-field systems use loudspeakers mounted on or below the ceiling, facing downward into the occupied space. These systems are used when there is no ceiling plenum (exposed structure) or when the ceiling tiles have very high sound transmission loss (limiting plenum-based system effectiveness).

Furniture-integrated systems place masking emitters in workstation partitions or desk accessories. These provide localised masking and are useful in retrofit situations where ceiling access is limited.

Masking Level Selection

The masking level must be high enough to provide the target SPI improvement but low enough that occupants do not find the masking noise itself annoying or fatiguing. The following guidelines apply:

Existing Background Noise	Masking Target Level	Expected SPI Improvement
NC-25 (very quiet)	44-46 dBA	+15 to +25 SPI points
NC-30 (quiet)	44-46 dBA	+10 to +20 SPI points
NC-35 (moderate)	45-47 dBA	+5 to +15 SPI points
NC-40 (moderately loud)	Not recommended	Marginal benefit, increased annoyance

The diminishing returns at higher background noise levels are important. In a room that is already NC-40, adding masking on top of the existing HVAC noise pushes the total background above 48 dBA, which many occupants find uncomfortably loud for sustained work. The better strategy in an NC-40 environment is to increase path loss (better screens, absorptive ceiling) rather than adding more noise.
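The reason stacking noise sources raises the total level can be shown with the usual energy summation of decibel values. This sketch assumes the sources are incoherent (uncorrelated), which is the normal assumption for HVAC noise and masking noise. The dBA inputs in the usage note are hypothetical, since an NC rating does not map to a single dBA value.

```python
import math


def combine_levels_dba(*levels_dba: float) -> float:
    """Energy-sum several incoherent sound levels given in decibels."""
    if not levels_dba:
        raise ValueError("at least one level is required")
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in levels_dba))
```

Two sources of equal level combine to about 3 dB above either one, so combine_levels_dba(45.0, 45.0) is roughly 48 dBA. By contrast, combine_levels_dba(35.0, 45.0) stays close to 45 dBA, which is why masking dominates the total in a quiet room but pushes the total up noticeably in a room that is already noisy.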

Masking Cost

Electronic sound masking systems typically cost:

Hardware: £5-12 per square metre of floor area (loudspeakers, zoning modules, control units)
Installation: £3-8 per square metre (cabling, commissioning, tuning)
Total: £8-20 per square metre

For a 1,000 m2 open plan office, a sound masking system costs approximately £8,000-20,000 installed and commissioned. Compared to the cost of acoustic ceiling upgrades (£30-75 per m2) or full-height partition construction (£150-300 per linear metre), masking is often the most economical privacy improvement.
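The budget range can be reproduced from the per-square-metre rates above. This is a simple sketch using those quoted rates as constants; the function name is illustrative, and real quotations will depend on the system and the site.

```python
# Per-square-metre cost ranges in GBP, as quoted in this section.
HARDWARE_GBP_PER_M2 = (5.0, 12.0)
INSTALLATION_GBP_PER_M2 = (3.0, 8.0)


def masking_cost_range_gbp(floor_area_m2: float):
    """Return (low, high) installed cost estimates for a masking system."""
    if floor_area_m2 < 0.0:
        raise ValueError("floor area cannot be negative")
    low = floor_area_m2 * (HARDWARE_GBP_PER_M2[0] + INSTALLATION_GBP_PER_M2[0])
    high = floor_area_m2 * (HARDWARE_GBP_PER_M2[1] + INSTALLATION_GBP_PER_M2[1])
    return low, high
```

Calling masking_cost_range_gbp(1000.0) returns the £8,000 to £20,000 range given for the 1,000 m2 office.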

Healthcare: HIPAA and Acoustic Privacy

The Health Insurance Portability and Accountability Act (HIPAA) in the United States requires that covered entities implement safeguards to protect the privacy of protected health information (PHI). While HIPAA does not specify acoustic performance metrics directly, the "reasonable safeguards" requirement has been interpreted by the Department of Health and Human Services (HHS) to include measures that prevent inadvertent disclosure of PHI through overheard speech.

Healthcare Privacy Requirements

The Facility Guidelines Institute (FGI) Guidelines for Design and Construction of Hospitals and Outpatient Facilities specify acoustic requirements for healthcare spaces:

Space Type	STC Requirement	SPI Target
Examination room to examination room	STC 45	80-90% (normal)
Examination room to corridor	STC 40	80-90% (normal)
Examination room to waiting area	STC 50	90-95% (normal, upper range)
Psychiatric consultation	STC 50	>95% (confidential)
Pharmacy consultation	STC 45	80-90% (normal)

The most common HIPAA privacy failure in healthcare facilities is at the check-in and registration desk. Patients state their name, date of birth, reason for visit, and insurance information at a counter that is often less than 2 metres from the waiting area. Without acoustic countermeasures, every waiting patient hears this information clearly.

Solutions include:

Physical barriers: Glass or acrylic partitions between check-in positions and the waiting area, with speech ports or intercom systems for communication.
Sound masking: Ceiling-mounted masking in the waiting area, tuned to 44-46 dBA, which raises the background noise enough to render speech at 2-3 metres unintelligible.
Layout: Orienting the check-in counter so that the patient faces away from the waiting area, directing speech energy toward the wall rather than toward waiting patients.
White noise machines: Desktop white noise generators at each check-in position, producing localised masking at the source. Less effective than ceiling masking but simpler to implement.

Measuring SPI in the Field

Equipment Required

Class 1 or Class 2 sound level meter with octave band or one-third octave band analysis (ANSI S1.4 or IEC 61672)
Omnidirectional loudspeaker (calibrated, with known directivity) to simulate the talker, or a real talker speaking at a controlled effort level
Pink noise or speech-shaped noise source for the loudspeaker
Acoustic calibrator (94 dB or 114 dB reference level) or pistonphone

Measurement Procedure (ASTM E1130)

Set up the source: Place the omnidirectional loudspeaker at the talker's position, at head height (1.2 m seated, 1.5 m standing), driven with pink noise at a level calibrated to produce 60 dBA at 1 metre (simulating normal speech effort).

Measure the signal: At the listener's position, measure the one-third octave band spectrum from 200 Hz to 5000 Hz (or the octave band spectrum from 125 Hz to 8000 Hz if only octave band analysis is available) with the source operating. This gives the received signal level in each frequency band.

Measure the background noise: Turn off the test source and measure the background noise (HVAC + masking) in the same bands at the same listener position. Ensure the measurement captures a representative sample (at least 30 seconds of stable noise).

Calculate the signal-to-noise ratio: In each band, subtract the background noise level from the received signal level to obtain the S/N ratio.

Calculate AI: Apply the AI weighting factors to each one-third octave band S/N ratio (or, if using octave bands, apply the appropriate conversion). Sum the weighted contributions to obtain the AI value.

Calculate SPI: SPI = (1 - AI) x 100.
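The calculation steps above can be sketched as follows. This is an illustration under stated assumptions, not an implementation of ASTM E1130. It uses the classic AI formulation, in which each band contributes in proportion to its signal-to-noise ratio plus 12 dB, limited to a 30 dB range. The equal band weights are placeholders only and must be replaced with the weighting factors published in the standard.

```python
# One-third octave band centres from 200 Hz to 5000 Hz.
BANDS_HZ = [200, 250, 315, 400, 500, 630, 800, 1000,
            1250, 1600, 2000, 2500, 3150, 4000, 5000]

# PLACEHOLDER weights: equal importance per band. These are NOT the
# standard's weighting factors; substitute the published values.
PLACEHOLDER_WEIGHTS = [1.0] * len(BANDS_HZ)


def articulation_index(signal_db, noise_db, weights):
    """Weighted sum of per-band contributions, each between 0 and 1."""
    if not (len(signal_db) == len(noise_db) == len(weights)):
        raise ValueError("signal, noise and weights must have equal length")
    total_weight = sum(weights)
    ai = 0.0
    for s, n, w in zip(signal_db, noise_db, weights):
        # Assumed formulation: S/N + 12 dB, clipped to the range 0-30 dB.
        usable_db = min(max(s - n + 12.0, 0.0), 30.0)
        ai += (w / total_weight) * (usable_db / 30.0)
    return ai


def spi_percent(signal_db, noise_db, weights):
    """SPI = (1 - AI) x 100, computed from measured band levels."""
    return (1.0 - articulation_index(signal_db, noise_db, weights)) * 100.0
```

The inputs are the band levels measured with the source on (signal) and off (noise) at the listener position. Band-by-band ratios generally differ from the difference between two broadband dBA readings, because speech and background noise have different spectra, so the result should not be expected to match a single-number estimate.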

Common Measurement Pitfalls

Measuring during occupied hours: Occupant noise (conversation, footfall) inflates the background noise measurement, artificially improving the apparent SPI. Measurements should be taken outside occupied hours with only HVAC and masking systems operating.
Incorrect source level calibration: If the test source produces 65 dBA at 1 metre instead of 60 dBA, the measured SPI will be approximately 5-8 points lower than the true normal-speech SPI.
Ignoring flanking paths: Assessing privacy across a partition from its rated performance alone, without accounting for flanking through the ceiling plenum, under the door, or through back-to-back electrical outlets, will overestimate the privacy actually achieved. Measure at the listener positions closest to likely flanking paths so that real leakage is captured.

Design Strategies by Privacy Class

Achieving Confidential Privacy (SPI greater than 95%)

The following combination is typically required:

Partition: STC 50+ (e.g., double stud gypsum wall with resilient channels and mineral wool insulation)
Door: STC 35+ with acoustic seals on all four edges and automatic door bottom
Ceiling: Full-height partition to structure, or ceiling with CAC 40+ if partition terminates at ceiling line
Background noise: NC-35 minimum, preferably with electronic masking at 44-46 dBA
Penetrations: All electrical outlets, data boxes, and service penetrations acoustically sealed
Perimeter: Acoustic sealant at all partition perimeter joints (floor, walls, ceiling)

Achieving Normal Privacy (SPI 80-95%)

Partition: STC 40-45 (standard stud wall with gypsum board both sides and mineral wool insulation)
Door: STC 28-33 (solid-core door with perimeter seals but no automatic door bottom)
Ceiling: CAC 30+ if partition terminates at ceiling line
Background noise: NC-30 to NC-35, masking optional but beneficial
Open plan alternative: NRC 0.85+ ceiling, 1.4 m absorptive screens, masking at 44-46 dBA

Improving Poor Privacy (SPI less than 80%)

If an existing space has poor privacy, the most effective interventions in order of cost-effectiveness are:

Add sound masking (£8-20/m2) — typically the cheapest single intervention that provides the largest SPI improvement
Replace ceiling tiles with NRC 0.85+ product (£30-60/m2) — reduces reverberant field contribution to received speech level
Add workstation screens at 1.4-1.8 m height with absorptive faces (£100-200 per screen)
Seal doors — add acoustic perimeter seals and automatic door bottoms to existing doors (£50-150 per door)
Upgrade partitions — add a layer of gypsum board to existing walls (£15-25/m2) or install resilient channels and additional insulation

The order matters. Masking and ceiling upgrades provide the best return on investment because they improve privacy across the entire floor plate simultaneously, while partition and door upgrades only improve privacy for individual rooms.

Summary

Speech privacy is not an abstract quality — it is a measurable performance metric defined by the Speech Privacy Index. The three elements that determine SPI (source level, path loss, and background noise) are all under the designer's control to varying degrees. The key principles are:

Path loss and masking work together: You need both high attenuation and adequate background noise. One without the other is insufficient.
Sound masking is the most cost-effective privacy tool in existing buildings, typically providing 5-25 SPI points of improvement, depending on the existing background noise, at a fraction of the cost of construction upgrades.
The weakest link controls the result: A £5,000 partition is worthless if the door has a 10 mm air gap at the bottom. Privacy is a system performance metric — every element in the transmission path matters.
Privacy requirements vary by use: Confidential (SPI greater than 95%) for healthcare and legal spaces, Normal (SPI 80-95%) for standard offices. Even poor privacy (SPI less than 80%) can be acceptable in social spaces such as cafeterias and lobbies.
Measure to verify: Design calculations predict performance; field measurements confirm it. ASTM E1130 provides the standardised methodology.