Reliability, Availability, and Scalability

September 30th, 2009 西坪

Reliability, according to Wikipedia:

In general, reliability (systemic def.) is the ability of a person or system to perform and maintain its functions in routine circumstances, as well as hostile or unexpected circumstances.

The IEEE defines it as “. . . the ability of a system or component to perform its required functions under stated conditions for a specified period of time.”

The same page also notes:

Reliability may refer to:
Data reliability, a property of some disk arrays in computer storage

Availability, according to Wikipedia:

In telecommunications and reliability theory, the term availability has the following meanings:

1. The degree to which a system, subsystem, or equipment is operable and in a committable state at the start of a mission, when the mission is called for at an unknown, i.e., a random, time. Simply put, availability is the proportion of time a system is in a functioning condition.

Note 1: The conditions determining operability and committability must be specified.

Note 2: Expressed mathematically, availability is 1 minus the unavailability.

2. The ratio of (a) the total time a functional unit is capable of being used during a given interval to (b) the length of the interval.

Note 1: An example of availability is 100/168 if the unit is capable of being used for 100 hours in a week.
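The ratio in definition 2, together with the "1 minus the unavailability" note under definition 1, can be sketched in a few lines of Python. The function name `availability` and its parameters are made up for illustration; they do not come from the quoted text.

```python
def availability(usable_hours: float, interval_hours: float) -> float:
    """Ratio of usable time to the length of the interval (definition 2)."""
    if interval_hours <= 0:
        raise ValueError("interval must be positive")
    if not 0 <= usable_hours <= interval_hours:
        raise ValueError("usable time must lie within the interval")
    return usable_hours / interval_hours


# The example from Note 1: usable for 100 hours out of a 168-hour week.
weekly = availability(100, 168)

# Availability and unavailability always sum to 1.
unavailability = 1 - weekly

print(f"availability   = {weekly:.3f}")
print(f"unavailability = {unavailability:.3f}")
```

A unit that is usable 100 hours a week is therefore available a little under 60% of the time.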

Note 2: Typical availability objectives are specified either in decimal fractions, such as 0.9998, or sometimes in a logarithmic unit called nines, which corresponds roughly to a number of nines following the decimal point, such as “five nines” for 0.99999 reliability.
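One common way to make the "nines" unit concrete is to take the negative base-10 logarithm of the unavailability. The sketch below assumes that reading; the helper name `nines` is hypothetical.

```python
import math


def nines(availability: float) -> float:
    """Number of nines, taken here as -log10(1 - availability)."""
    if not 0 <= availability < 1:
        raise ValueError("availability must be in [0, 1)")
    return -math.log10(1 - availability)


# 0.99999 comes out at roughly five nines, and 0.9998 at roughly 3.7,
# which is why the quote says the unit corresponds only "roughly" to a
# count of nines after the decimal point.
for a in (0.9998, 0.99999):
    print(f"{a} -> about {nines(a):.2f} nines")
```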

There is another page about High Availability on Wikipedia:

High availability is a system design protocol and associated implementation that ensures a certain degree of operational continuity during a given measurement period.

Users want their systems, for example wrist watches, hospitals, airplanes or computers, to be ready to serve them at all times. Availability refers to the ability of the user community to access the system, whether to submit new work, update or alter existing work, or collect the results of previous work. If a user cannot access the system, it is said to be unavailable. Generally, the term downtime is used to refer to periods when a system is unavailable.
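To connect downtime with the availability figures quoted earlier, here is a minimal sketch that goes both ways: from an availability target to the downtime it allows, and from recorded outages back to a measured availability. The function names and the sample outages are assumptions made for illustration.

```python
from datetime import timedelta


def downtime_budget(availability: float, period: timedelta) -> timedelta:
    """Downtime allowed in `period` if the target availability is met exactly."""
    if not 0 <= availability <= 1:
        raise ValueError("availability must be in [0, 1]")
    return period * (1 - availability)


def measured_availability(period: timedelta, outages: list[timedelta]) -> float:
    """Fraction of `period` during which the system was not down."""
    total_down = sum(outages, timedelta())
    if total_down > period:
        raise ValueError("outages exceed the measurement period")
    return 1 - total_down / period


# "Five nines" over a 365-day year leaves a little over five minutes of downtime.
budget = downtime_budget(0.99999, timedelta(days=365))
print(f"five-nines budget: {budget.total_seconds() / 60:.2f} minutes per year")

# Hypothetical week with two outages totalling 68 hours.
week = timedelta(days=7)
outages = [timedelta(hours=60), timedelta(hours=8)]
print(f"measured availability: {measured_availability(week, outages):.3f}")
```

The second call reproduces the 100/168 example from the availability definition, since 68 hours of downtime in a week leaves 100 usable hours.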

Scalability, according to Wikipedia:

In telecommunications and software engineering, scalability is a desirable property of a system, a network, or a process, which indicates its ability to either handle growing amounts of work in a graceful manner or to be readily enlarged.

Methods of adding more resources for a particular application fall into two broad categories: scale up (vertically) and scale out (horizontally). Scale out is often the better choice for larger systems.
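The difference between the two categories can be illustrated with a toy capacity model. This is only a sketch under simplifying assumptions: load is a single requests-per-second number, nodes are identical, and capacity adds up linearly when scaling out. All names and figures are hypothetical.

```python
import math


def can_scale_up(load_rps: int, largest_node_rps: int) -> bool:
    """Scale up: one bigger machine must absorb the whole load."""
    return load_rps <= largest_node_rps


def nodes_needed(load_rps: int, per_node_rps: int) -> int:
    """Scale out: number of identical nodes needed, assuming linear scaling."""
    if per_node_rps <= 0:
        raise ValueError("per-node capacity must be positive")
    return max(1, math.ceil(load_rps / per_node_rps))


# Hypothetical numbers: 2500 requests/s of load, a commodity node that
# handles 1000 requests/s, and a largest available machine at 2000 requests/s.
load = 2500
print("fits on the largest single machine:", can_scale_up(load, 2000))
print("commodity nodes needed:", nodes_needed(load, 1000))
```

Once the load exceeds what the largest single machine can handle, scaling up is no longer an option, while scaling out only requires adding nodes. Real systems rarely scale perfectly linearly, so the node count here is a lower bound.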
