Cloud Buyers' Requirements Questionnaire : QA: QoS Aspects

 

Objective

Describe the requirements for the various aspects of Quality of Service (QoS).

QoS aspects encompass Reliability, Availability, and Serviceability (RAS), scalability, timeliness, and security – sometimes referred to as the “non-functional requirements” (NFRs) or the “ilities”.

QoS characterization includes the periodicity of each aspect, and relative prioritization of aspects at key time periods. The correlation of QoS and functional requirements includes periodicity.

QA.1: What are the tangible and intangible business impacts of the solution not achieving the minimum SLAs?

Example Responses

  1. Tangible: short-term and long-term decrease in revenue (for example, customers turn to alternatives) and/or increase in expenses (for example, regulatory penalties, or new customer acquisition)
  2. Intangible: loss of reputation

QA.2: What are your QoS requirements related to “Can I get it and keep it running”?

The aspects in this category include consumability, manageability, serviceability, agility, flexibility, and adaptability.

Consumability includes adoptability and usability.

Flexibility includes the ability to swap or change as allowed by the technology and the contract.

Example Responses

  1. A development project must be able to obtain a new test environment within 1 day and keep using it for a defined period (for example, 1 to 2 months).
  2. We must be able to provision a new customer within 30 minutes and meet the requirements for that customer's account validation.

QA.3: What are your QoS requirements related to “Is it running”?

The aspects in this category include availability, fault tolerance, recoverability, stability, reliability, and dependability.

Availability is typically measured in 9s. A “Five 9s” system is up 99.999% of the time, which allows a little over five minutes of downtime per year. Planned, scheduled outages for maintenance are typically excluded.
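The arithmetic behind the "nines" can be sketched as follows. This is a minimal illustration, assuming a 365-day year and that planned, scheduled maintenance is excluded from the downtime budget; the function name is hypothetical and not part of the questionnaire.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years are ignored


def allowed_downtime_minutes(nines: int) -> float:
    """Return the yearly downtime budget, in minutes, for an N-nines target."""
    unavailability = 10 ** -nines  # for example, 5 nines -> 0.00001
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(f"{n} nines: {allowed_downtime_minutes(n):.2f} minutes/year")
```

For five nines the budget works out to roughly 5.26 minutes per year, which matches the "little over five minutes" figure above.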

Fault tolerance avoids service disruption. A fully fault-tolerant design has no single points of failure (SPOFs), and often accommodates multiple failures within a service window – that is, it survives planned, unscheduled outages. Software fault-tolerance designs include exception handling and task rollback.

Recoverability is measured in terms of recovery time objective (RTO) and recovery point objective (RPO). RTO determines how quickly the system needs to be fully operational; RPO determines how much data loss can be tolerated.
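One way to test a design against these two objectives is sketched below. It assumes, purely for illustration, that data is protected by periodic backups, so the worst-case data loss equals the backup interval; the function and parameter names are hypothetical.

```python
from datetime import timedelta


def meets_recovery_objectives(
    backup_interval: timedelta,
    restore_time: timedelta,
    rpo: timedelta,
    rto: timedelta,
) -> bool:
    """Check a backup/restore design against RPO and RTO targets."""
    # Assumption: a failure just before the next backup loses one full interval of data.
    worst_case_data_loss = backup_interval
    return worst_case_data_loss <= rpo and restore_time <= rto


if __name__ == "__main__":
    ok = meets_recovery_objectives(
        backup_interval=timedelta(hours=1),
        restore_time=timedelta(minutes=10),
        rpo=timedelta(hours=4),
        rto=timedelta(minutes=15),
    )
    print("Objectives met" if ok else "Objectives missed")
```

Designs that replicate continuously rather than back up periodically would need a different estimate of worst-case data loss.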

Reliability is typically measured as mean time to failure (MTTF), mean time between failures (MTBF), or failures in time (FIT), the number of failures in a billion hours. MTTF is used along with mean time to repair (MTTR) to calculate the MTBF.
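The relationships between these measures can be sketched as follows. The sketch assumes the common convention MTBF = MTTF + MTTR, a constant failure rate when converting MTTF to failures per billion hours, and steady-state availability of MTTF / (MTTF + MTTR); the function names are hypothetical.

```python
HOURS_PER_BILLION = 1e9


def mtbf_hours(mttf_hours: float, mttr_hours: float) -> float:
    """Mean time between failures: time to fail plus time to repair."""
    return mttf_hours + mttr_hours


def steady_state_availability(mttf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the system is up over one failure/repair cycle."""
    return mttf_hours / (mttf_hours + mttr_hours)


def failures_per_billion_hours(mttf_hours: float) -> float:
    """Failure rate in failures per 10^9 hours (assumes a constant rate)."""
    return HOURS_PER_BILLION / mttf_hours


if __name__ == "__main__":
    mttf, mttr = 9998.0, 2.0
    print("MTBF (hours):", mtbf_hours(mttf, mttr))
    print("Availability:", steady_state_availability(mttf, mttr))
    print("Failures per billion hours:", failures_per_billion_hours(mttf))
```

This also shows how reliability and repair time together drive availability: shortening MTTR raises availability even when MTTF is unchanged.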

Example Responses

  1. “Five nines” availability, “eleven nines” availability
  2. No scheduled maintenance between 09:00 and 17:00 Mon-Fri
  3. System restart to take <15 minutes

QA.4: What are your QoS requirements related to “How is it running”?

The aspects in this category include system performance optimization: balancing scalability and throughput with the timeliness aspects of latency, predictability (determinism), and synchronicity.

Latency requirements vary widely, from less than 1 ms to more than 150 ms, and across applications from milliseconds to minutes.

Micro-level predictability is often classified as “hard” (for example, a Mission Control system) or “soft” (for example, a Plant Control system).

Macro-level predictability:

  • Planned and scheduled; for example, Financial Management mini-peaks at end of quarter and major peak at end of fiscal year
  • Planned and unscheduled; for example, Pharmaceutical certification on average five times a year, Florida hurricane response
  • Unplanned and unscheduled; for example, Air traffic control response to ash cloud

Throughput may be described in a range of 1 to 100,000 transactions per second, or “moderate” to “very high”. 1500 orders per second is moderate; the number of phone calls on Mother's Day is very high.

Synchronicity descriptors include tight tolerance, workload balancing, fairness, and time period correlated.

Example Responses

  1. 95% of responses to take <500 ms
  2. Normal maximum of 1,000 orders/sec but up to 10,000 orders/sec in 6 weeks before Christmas
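A percentile requirement such as the first example response can be checked against measured response times. The sketch below uses the nearest-rank percentile method, which is one of several accepted definitions; the function names and sample data are hypothetical.

```python
import math


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def meets_latency_target(samples_ms: list[float], pct: int = 95, limit_ms: float = 500) -> bool:
    """True if the pct-th percentile response time is below limit_ms."""
    return percentile(samples_ms, pct) < limit_ms


if __name__ == "__main__":
    # Hypothetical measurements: 95 fast responses and 5 slow ones.
    measured = [100.0] * 95 + [900.0] * 5
    print("Target met:", meets_latency_target(measured))
```

Stating the requirement as a percentile rather than an average keeps a small number of slow responses from being hidden by many fast ones.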

QA.5: What are your QoS requirements related to “Is it running in spec”?

“In spec” means that the solution is operating in accordance with the business’s mission and policies. The aspects in this category include governance and low impact footprint.

Governance includes prioritization of services and transactions, compliance, and access (security).

Low impact footprint includes social responsibility and sustainability (for example, carbon-neutrality and water usage).

Example Responses

  1. The defined service is within the catalog menu specification.
  2. The defined service uses the specified small, medium, or large service scope options to meet the Operating Level Agreements (OLAs) for the use of that service.
  3. Changes to the specification are managed within a governance change control program.
  4. Pay and conditions for operations staff must meet legal requirements for the countries where the staff are located.
  5. Operations staff to have no access to customer data.

QA.6: What is the relative weighting of your QoS requirements?

Identify which QoS requirements must be met, and those for which you are willing to negotiate less stringent terms.

Example Responses

  1. The peak availability and throughput requirements must be met, but the latency and low impact footprint requirements may be diminished.

The Open Group