3.3 Cost-Benefit Analysis and Best Practices
Time
and money are finite. After you complete your
risk assessment, you will have a long list of risks—far more
than you can possibly address or defend against. You now need a way
of ranking these risks to decide which you need to mitigate through
technical means, which you will insure against, and which you will
simply accept. Traditionally, the decision of which risks to address
and which to accept was made using a cost-benefit
analysis: assign a cost to each possible loss, determine the cost of
defending against it, estimate the probability that the loss will
occur, and then determine whether the cost of defense outweighs the
expected benefit. (See
Cost-Benefit Examples
sidebar for some examples.)
Risk assessment and cost-benefit analyses generate a lot of numbers,
making the process seem quite scientific and mathematical. In
practice, however, putting together these numbers can be a
time-consuming and expensive process, and the result is numbers that
are frequently soft or inaccurate. That's why the
approach of defining best practices has become
increasingly popular, as we'll discuss in a later
section.
3.3.1 The Cost of Loss
Determining the cost of loss can be very
difficult. A simple cost calculation considers the cost of repairing
or replacing a particular item. A more sophisticated cost calculation
can consider the cost of out-of-service equipment, the cost of added
training, the cost of additional procedures resulting from a loss,
the cost to a company's reputation, and even the
cost to a company's clients. Generally speaking,
including more factors in your cost calculation will increase your
effort, but will also increase the accuracy of your calculations.
For most purposes, you do not need to assign an exact value to each
possible risk. Normally, assigning a cost range to each item is
sufficient. For instance, the loss of a dozen blank diskettes may be
classed as "under $500," while a
destructive fire in your computer room might be classed as
"over $1,000,000." Some items may
actually fall into the category
"irreparable/irreplaceable"; these
could include loss of your entire accounts-due database or the death
of a key employee.
You may want to assign these costs based on a finer scale of loss
than simply "lost/not lost." For
instance, you might want to assign separate costs for each of the
following categories (these are not in any order):
Non-availability over a short term (< 7-10 days)
Non-availability over a medium term (1-2 weeks)
Non-availability over a long term (more than 2 weeks)
Permanent loss or destruction
Accidental partial loss or damage
Deliberate partial loss or damage
Unauthorized disclosure within the organization
Unauthorized disclosure to some outsiders
Unauthorized full disclosure to outsiders, competitors, and the press
Replacement or recovery cost
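
If you track these estimates electronically, even a very simple table
is enough. The following Python sketch shows one way to record cost
ranges by asset and loss category; the asset names and dollar figures
are invented placeholders for illustration, not recommendations.

    # A minimal sketch of recording cost ranges per asset and loss category.
    # Asset names and dollar ranges below are hypothetical examples.

    LOSS_CATEGORIES = [
        "short-term outage", "medium-term outage", "long-term outage",
        "permanent loss", "accidental damage", "deliberate damage",
        "internal disclosure", "partial external disclosure",
        "full external disclosure", "replacement/recovery",
    ]

    # Cost ranges are (low, high) estimates in dollars; None means
    # "irreparable/irreplaceable" -- no dollar figure is meaningful.
    loss_estimates = {
        "accounts-due database": {
            "long-term outage": (100_000, 500_000),
            "permanent loss": None,              # irreplaceable
        },
        "public web server": {
            "short-term outage": (0, 5_000),
            "deliberate damage": (5_000, 50_000),
        },
    }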
3.3.2 The Probability of a Loss
After you have identified the
threats, you need to estimate the likelihood of each occurring. These
threats may be easiest to estimate on a year-by-year basis.
Quantifying the likelihood of a risk is hard work. You can obtain some
estimates from third parties, such as
insurance companies. If the event
happens on a regular basis, you can estimate it based on your
records. Industry organizations may have collected statistics or
published reports. You can also base your estimates on educated
guesses extrapolated from past experience. For instance:
Your power company can provide an official estimate of the likelihood
that your building would suffer a power outage during the next year.
They may also be able to quantify the risk of an outage lasting a few
seconds versus the risk of an outage lasting minutes or hours.
Your insurance carrier can
provide you with actuarial data on the probability of death of key
personnel based on age, health, smoker/nonsmoker status, weight,
height, and other issues.
Your personnel records can be used to
estimate the probability of key computing employees quitting.
Past experience and best guess can be used to estimate the
probability of a serious bug being discovered in your software during
the next year (100% for some software platforms).
If you expect something to happen more than once per year, then
record the number of times that you expect it to happen. Thus, you
may expect a serious earthquake only once every 100 years (for a
per-year probability of 1% in your list), but you may expect three
serious bugs in Microsoft's Internet Information
Server (IIS) to be discovered during the next month (an annualized
rate of 36 occurrences, or 3,600%).
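
Because some events are expected less than once a year and others many
times a year, it helps to normalize everything to a per-year basis. A
small Python sketch of that conversion, using the earthquake and IIS
figures above:

    # Normalize loss frequencies to a per-year basis. Events expected less
    # than once a year get a probability; events expected more often get an
    # annualized rate (which can exceed 100%).

    def annualized_rate(expected_occurrences, period_years):
        """Expected number of occurrences per year."""
        return expected_occurrences / period_years

    earthquake = annualized_rate(1, 100)        # 0.01 -> 1% per year
    iis_bugs   = annualized_rate(3, 1 / 12)     # 3 per month -> 36 per year

    print(f"Serious earthquake: {earthquake:.0%} per year")
    print(f"Serious IIS bugs: {iis_bugs:.0f} expected per year ({iis_bugs:.0%})")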
3.3.3 The Cost of Prevention
Finally, you need to calculate the
cost of preventing each kind of loss.
For instance, the cost to recover from a momentary power failure is
probably only that of personnel
"downtime" and the time necessary
to reboot. However, the cost of prevention may be that of buying and
installing a UPS system.
Costs need to be amortized over the expected lifetime of your
approaches, as appropriate. Deriving these costs may reveal secondary
costs and credits that should also be factored in. For instance,
installing a better fire-suppression system may result in a yearly
decrease in your fire insurance premiums and give you a tax benefit
for capital depreciation. But spending money on a fire-suppression
system means that the money is not available for other purposes, such
as increased employee training or even investments.
Suppose you have a 0.5% chance of a single power outage lasting more
than a few seconds in any given year. The expected loss as a result
of personnel not being able to work is $25,000, and the cost of
recovery (handling reboots and disk checks) is expected to be another
$10,000 in downtime and personnel costs. Thus, the expected loss and
recovery cost per year is (25,000 + 10,000) x .005 = $175.
If the cost of a UPS system that can handle all your needs is
$150,000, and it has an expected lifetime of 10 years, then the cost
of avoidance is $15,000 per year. Clearly, investing in a UPS system
at this location is not cost-effective. On the other hand, reducing
the time required for disk checking by switching to a journaling
filesystem might well be worth the time required to make the change.
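
The arithmetic in this example is easy to do by hand, but if you are
evaluating many defenses it is convenient to script it. A short Python
sketch of the UPS comparison, using the figures above:

    # The UPS example worked in code. The dollar figures come from the text;
    # the calculation is the standard annualized loss-expectancy comparison.

    probability_per_year = 0.005            # 0.5% chance of a long outage
    loss_from_downtime   = 25_000           # personnel unable to work
    recovery_cost        = 10_000           # reboots, disk checks, downtime

    expected_annual_loss = (loss_from_downtime + recovery_cost) * probability_per_year
    print(f"Expected annual loss: ${expected_annual_loss:,.0f}")        # $175

    ups_purchase_cost  = 150_000
    ups_lifetime_years = 10
    annual_avoidance_cost = ups_purchase_cost / ups_lifetime_years
    print(f"Annual cost of avoidance: ${annual_avoidance_cost:,.0f}")   # $15,000

    # $15,000 a year to avoid an expected $175 annual loss: not cost-effective.
    print("Cost-effective?", annual_avoidance_cost < expected_annual_loss)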
As another example, suppose that the compromise of a password by any
employee could result in an
outsider gaining access to trade secret information worth $1,000,000.
There is no recovery possible, because the trade secret status would
be compromised, and once lost, it cannot be regained. You have 50
employees who access your network while traveling, and the
probability of any one of them accidentally disclosing the password
(for example, having it "sniffed"
over the Internet; see Chapter 11) is 2% per year. Thus, the
probability of at least one password being disclosed during the year
is 1 - (0.98)^50, or 63.6%. The expected loss is (1,000,000
+ 0) x .636 = $636,000. If the cost of avoidance is buying
a $75 one-time password card for each user (see Chapter 8), plus a
$20,000 software cost, and the system is good for five years, then
the avoidance cost is (50 x 75 + 20,000) / 5 = $4,750 per
year. Buying such a system would clearly be cost-effective.
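
The same kind of comparison, worked in Python with the figures above:

    # The password-sniffing example worked in code, using the text's figures.

    employees = 50
    p_disclosure_per_employee = 0.02        # 2% chance per traveler per year
    trade_secret_value = 1_000_000
    recovery_cost = 0                       # no recovery is possible

    # Probability that at least one of the 50 passwords is disclosed.
    p_at_least_one = 1 - (1 - p_disclosure_per_employee) ** employees
    expected_annual_loss = (trade_secret_value + recovery_cost) * p_at_least_one
    print(f"P(at least one disclosure): {p_at_least_one:.1%}")      # 63.6%
    print(f"Expected annual loss: ${expected_annual_loss:,.0f}")    # ~$636,000

    # Avoidance: one-time password cards plus software, amortized over 5 years.
    annual_avoidance_cost = (employees * 75 + 20_000) / 5           # $4,750
    print(f"Annual cost of avoidance: ${annual_avoidance_cost:,.0f}")
    print("Cost-effective?", annual_avoidance_cost < expected_annual_loss)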
3.3.4 Adding Up the Numbers
At the conclusion
of this exercise, you should have a multidimensional matrix
consisting of assets, risks, and possible losses. For each loss, you
should know its probability, the predicted loss, and the amount of
money required to defend against the loss. If you are very precise,
you will also have a probability that your defense will prove
inadequate.
The process of determining if each defense should or should not be
employed is now straightforward. You do this by multiplying each
expected loss by the probability of its occurring as a result of each
threat. Sort these in descending order, and compare each cost of
occurrence to its cost of defense.
This comparison results in a prioritized list of things you should
address. The list may be surprising. Your goal should be to avoid
expensive, probable losses before worrying about less likely,
low-damage threats. In many environments, fire and loss of
key personnel are much more likely to occur, and are more damaging
than a break-in over the network. Surprisingly, however,
it is break-ins that seem to occupy the attention and budget of most
managers. This practice is simply not cost-effective, nor does it
provide the highest levels of trust in your overall system.
To figure out what you should do, use the figures that you have
gathered for avoidance and recovery to decide how best to address
your high-priority items. Add the cost of recovery to the expected
loss, multiply that sum by the probability of occurrence, and compare
the result with the yearly cost of avoidance. If the cost of avoidance
is lower than the annualized loss you are defending against, you would
be advised to invest in the avoidance strategy, provided that you have
sufficient financial resources. If the cost of avoidance is higher,
consider doing nothing until other threats have been dealt with.
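
Here is a minimal Python sketch of that prioritization: compute
(expected loss + recovery cost) x probability for each threat, sort in
descending order, and compare each figure with the yearly cost of
avoidance. The power-outage and password entries reuse the figures from
the earlier examples; the fire entry is an invented illustration.

    # Rank threats by annualized exposure and compare against avoidance cost.

    threats = [
        # (name, expected loss, recovery cost, probability/yr, avoidance cost/yr)
        ("fire in computer room",  1_000_000, 250_000, 0.01,  8_000),   # hypothetical
        ("password sniffed",       1_000_000,       0, 0.636, 4_750),
        ("momentary power outage",    25_000,  10_000, 0.005, 15_000),
    ]

    def annualized_exposure(loss, recovery, probability):
        return (loss + recovery) * probability

    ranked = sorted(threats,
                    key=lambda t: annualized_exposure(t[1], t[2], t[3]),
                    reverse=True)

    for name, loss, recovery, prob, avoidance in ranked:
        exposure = annualized_exposure(loss, recovery, prob)
        decision = "defend" if avoidance < exposure else "defer/accept"
        print(f"{name:25s} exposure ${exposure:>10,.0f} "
              f"avoidance ${avoidance:>7,.0f}/yr -> {decision}")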
You can identify and reduce risks,
but you can never eliminate risk entirely.
For example, you may purchase a UPS to reduce the risk of a power
failure damaging your data. But the UPS may fail when you need it.
The power interruption may outlast your battery capacity. The
cleaning crew may have unplugged it last week to use the outlet for
their floor polisher.
A careful risk assessment will identify these
secondary
risks and help you plan for them as well. You might, for
instance, purchase a second UPS. But, of course, both units could
fail at the same time. There might even be an interaction between the
two units that you did not foresee when you installed them. The
likelihood of a power failure gets smaller and smaller as you buy
more backup power supplies and test the system, but it never becomes
zero.
Risk assessment can help you protect yourself and your organization
against human risks as well as natural ones. For example, you can use
risk assessment to help protect yourself against computer break-ins,
by identifying the risks and planning accordingly. But, as with power
failures, you cannot completely eliminate the chance of someone
breaking in to your computer.
This fact is fundamental to computer security: no matter how secure
you make a computer, it can always be broken into given sufficient
resources, time, motivation, and money, especially when coupled with
random chance.
Even systems that are certified according to the
Common
Criteria (successor to the Department of Defense's
"Orange Book," the
Trusted Computer System Evaluation Criteria)
are vulnerable to break-ins. One reason is that these systems are
sometimes not administered correctly. Another reason is that some
people using them may be willing to take bribes to violate security.
Computer access controls do no good if they're not
administered properly, exactly as the lock on a building will do no
good if it is the night watchman who is stealing office equipment at
2:00 a.m.
People are often the weakest link in a security system. The most
secure computer system in the world is wide open if the
system administrator cooperates with
those who wish to break into the machine. People can be compromised
with money, threats, or ideological appeals. People can also make
mistakes—such as accidentally sending email containing account
passwords to the wrong person.
Indeed, people are usually cheaper and easier to compromise than
advanced technological safeguards.
3.3.5 Best Practices
Risk
analysis has a long and successful history
in the fields of public safety and civil engineering. Consider the
construction of a suspension bridge. It's a
relatively straightforward matter to determine how much stress cars,
trucks, and severe weather will place on the
bridge's cables. Knowing the anticipated stress, an
engineer can compute the chance that the bridge will collapse over
the course of its life given certain design and construction choices.
Given the bridge's width, length, height,
anticipated traffic, and other factors, an engineer can compute the
projected destruction to life, property, and commuting patterns that
would result from the bridge's failure. All of this
information can be used to calculate cost-effective design decisions
and a reasonable maintenance schedule for the
bridge's owners to follow.
The application of risk analysis to the field of computer security
has been less successful. Risk analysis depends on the ability to
gauge the expected use of an asset, assess the likelihood of each
risk to the asset, identify the factors that enable those risks, and
calculate the potential impact of various choices—figures that
are devilishly hard to pin down. How do you calculate the risk that
an attacker will be able to obtain system administrator privileges on
your web server? Does this risk increase over time, as new security
vulnerabilities are discovered, or does it decrease over time, as the
vulnerabilities are publicized and corrected? Does a well-maintained
system become less secure or more secure over time? And how do you
calculate the likely damages of a successful penetration? Few
statistical, scientific studies have been performed on these
questions. Many people think they know the answers to these
questions, but research has shown that most people badly estimate
risk based on personal experience.
Because of the difficulty inherent in risk analysis, another approach
to securing computers, called best practices
or due
care, has emerged in recent years. This approach consists
of a series of recommendations, procedures, and policies that are
generally accepted within the community of security practitioners to
give organizations a reasonable level of overall security and risk
mitigation at a reasonable cost. Best practices can be thought of as
"rules of thumb" for implementing
sound security measures.
The best practices approach is not without its problems. The biggest
problem is that there really is no one set of "best
practices" that is applicable to all sites and
users. The best practices for a site that manages financial
information might have similarities to the best practices for a site
that publishes a community newsletter, but the financial site would
likely have additional security measures.
Following best practices does not ensure that your system will not
suffer a security-related incident. Most best practices require that
an organization's security office monitor the
Internet for news of new attacks and download patches from vendors
when they are made available. But even if you follow this regimen, an
attacker might still be able to use a novel, unpublished attack to
compromise your computer system. And if the person monitoring security
announcements goes on vacation, attackers gain a head start over your
process of installing needed patches.
The very idea that tens of thousands of organizations could or even
should implement the "best"
techniques available to secure their computers is problematical. The
"best" techniques available are
simply not appropriate or cost-effective for all organizations. Many
organizations that claim to be following best practices are actually
adopting the minimum standards commonly used for securing systems. In
practice, most best practices really aren't.
We recommend a combination of risk analysis and best practices.
Starting from a body of best practices, an educated designer should
evaluate risks and trade-offs, and pick reasonable solutions for a
particular configuration and management. For instance, servers should
be hosted on isolated machines, and configured with an operating
system and software providing the minimally required functionality.
The operators should be vigilant for changes, keep up to date on
patches, and prepare for the unexpected. Doing this well takes a
solid understanding of how the system works, and what happens when it
doesn't work. This is the approach that we will
explain in the chapters that follow.
3.3.6 Convincing Management
Security is not free. The more elaborate
your security measures become, the more expensive they become.
Systems that are more secure may also be more difficult to use,
although this need not always be the case. Security can
also get in the way of "power
users" who wish to exercise many difficult and
sometimes dangerous operations without authentication or
accountability. Some of these power users can be politically powerful
within your organization.
After you have completed your risk assessment and cost-benefit
analysis, you will need to convince your
organization's management of the need to act upon
the information. Normally, you would formulate a policy that is then
officially adopted. Frequently, this process is an uphill battle.
Fortunately, it does not have to be.
The goal of risk assessment and cost-benefit analysis is to
prioritize your actions and spending on security. If your business
plan is such that you should not have an uninsured risk of more than
$10,000 per year, you can use your risk analysis to determine what
needs to be spent to achieve this goal. Your analysis can also guide
what you do first and what you do second, and can identify which
things you can defer to later years.
Another benefit of risk assessment is that it helps to justify to
management that you need additional resources for security. Most
managers and directors know little about computers, but they do
understand risk and cost/benefit analysis. If you can show that your
organization is currently facing an exposure to risk that could total
$20,000,000 per year (add up all the expected losses plus recovery
costs for what is currently in place), then this estimate might help
convince management to fund some additional personnel and resources.
On the other hand, going to management with a vague
"We're really likely to see several
break-ins on the Internet after the next CERT/CC
announcement" is unlikely to produce anything other
than mild concern (if that).