What is a benchmark
A benchmark is generally defined as a set of published performance-testing outcomes for a given benchmarking standard, e.g. TPC, SPECjAppServer, RPE2, etc. Benchmarking standards have traditionally been put together by industry organizations (SPEC www.spec.org, TPC www.tpc.org, etc.) with the objective of giving manufacturers the ability to prove the suitability of their kit for a given workload while giving the end customer a view of how a system performs under that workload. Benchmarks are generally well-defined performance tests conducted within very controlled environments (note the emphasis on controlled). Most benchmarking standards very clearly specify the following -
- The nature of workload (transactions, batch, etc.) that needs to be generated during the test runs
- The intensity or rate of the workload (transactions, batch, etc.) that needs to be generated during the test runs
- The rate and depth of information disclosure required from the vendor performing the benchmark
Why performance benchmarks are required
Benchmarking standards evolved for some of the following reasons -
- To give vendors an opportunity to prove the scalability and performance of their kit
- To give customers the opportunity to compare the system performance for different types of hardware
- To give customers the opportunity to compare the price and features along with performance of the underlying system
Benchmarking standards were supposed to fill the vacuum in available data that would allow customers to compare kit from different manufacturers on price, features and performance. Without performance benchmarks it is very difficult, if not impossible, for customers to validate the suitability of a given system for a particular workload. In the absence of benchmarking standards and published benchmarks, the only approach would be to purchase the kit or perform a detailed Proof of Concept with each manufacturer for the intended workload.
However, for all the good that benchmarking organizations and their standards offer, there are some real downsides that customers should be well aware of.
What are some of the challenges with existing benchmarks
Benchmarking standards evolved over the last decade with the intention of giving vendors the opportunity to demonstrate the suitability of their kit for a given workload while giving customers the ability to compare the performance of the different types of kit available in the marketplace. The challenges with some of the existing benchmarking standards are -
- Applications and application configuration used by vendors to generate these benchmarking results are rarely available for public review
- Data and results from the performance benchmarks conducted by the various vendors are rarely available for public review
- The workload chosen by the various vendors and defined by the benchmark standards rarely reflects the true workload any customer would see in their environments
- These benchmarks are executed by vendors in a highly controlled environment on heavily specced, top-end hardware for a given hardware range. Production environments rarely include top-end hardware for any given hardware range. Most customer applications do not need top-end hardware for their computing requirements.
- Vendors generating the data that touts the suitability of their own systems for a given workload is in many ways a conflict of interest. The benchmarking setup is rarely reviewed by a third party, and the benchmarking results are rarely made publicly available for review.
- The configuration of the systems and the software is rarely available for public review along with details of the benchmarking results
As you have seen, benchmarking standards like SPEC and TPC evolved for the right reasons. However, the lack of transparency around the detailed results of these benchmarks, combined with the conflict of interest that arises when hardware vendors run the benchmarks themselves, led to the need for a third-party organization to set its own standard.
What is RPE2
RPE2, or Relative Performance Estimate 2, is a benchmarking standard defined by Ideas International (now owned by Gartner) that is aimed at helping clients understand and compare various characteristics of computer systems along the lines of pricing, hardware/software features and, most importantly, overall system performance. These benchmarking results, along with the comparisons, help customers make more informed investment decisions and identify kit suitable for a given type of workload.
Ideas International (the original designer of RPE2) developed RPE2 with the objective of providing customers with a comparative performance view of systems from different hardware manufacturers. RPE and RPE2 were put together by Ideas International (now Gartner) to address the challenges mentioned in the previous section, i.e. the lack of transparency around published benchmarking results and the conflict of interest that arises when vendors are responsible for both executing their own benchmarks and publishing the results.
A key point to note from Gartner’s perspective is that RPE2 is a theoretical performance estimate and not an actual observed measurement of server performance. It is largely based on published benchmark results and relative performance ratings from server manufacturers.
What makes up an RPE2 benchmark
The original version of RPE developed by Ideas International (and now owned by Gartner) consisted purely of a single lightweight OLTP (Online Transaction Processing) workload. This was later revised to a composite workload that combines several different benchmarking standards. Gartner defines RPE2 as follows -
RPE2 is a composite benchmark, meaning that server performance characteristics are captured and calibrated against multiple workload profiles represented by a mix of industry benchmarks that have the widest technology coverage. The published or estimated performance points for each server processor option are aggregated by calculating a geometric mean value. In the standard RPE2, all components are weighted equally to prevent RPE2 from skewing toward a single benchmark or workload type. Other weighting options are described below.
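The aggregation step described above can be sketched in a few lines of Python. This is only an illustration of an equal-weight geometric mean, not Gartner's actual RPE2 calculation: the component names (`bench_1` and so on), the scores, and the assumption that each score has already been normalized against a common baseline system are all hypothetical.

```python
import math


def composite_score(relative_scores):
    """Equal-weight geometric mean of per-benchmark relative scores.

    `relative_scores` maps a (hypothetical) benchmark name to a positive
    score that is assumed to be normalized against a baseline system.
    """
    if not relative_scores:
        raise ValueError("need at least one component score")
    if any(score <= 0 for score in relative_scores.values()):
        raise ValueError("component scores must be positive")
    # Averaging the logarithms and exponentiating gives the geometric mean
    # and avoids overflow when many large scores are multiplied together.
    log_sum = sum(math.log(score) for score in relative_scores.values())
    return math.exp(log_sum / len(relative_scores))


# Hypothetical server that is strong on one component only.
server = {"bench_1": 1.0, "bench_2": 1.0, "bench_3": 16.0}
geometric = composite_score(server)
arithmetic = sum(server.values()) / len(server)
print(f"geometric mean:  {geometric:.2f}")
print(f"arithmetic mean: {arithmetic:.2f}")
```

Because the geometric mean averages ratios rather than raw values, a single outstanding component lifts the composite far less than it would lift an arithmetic mean, which is consistent with the stated goal of not skewing toward one benchmark or workload type.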
A composite mix benchmark offers the following advantages:
- The multiple components represent a broader range of workloads and server architecture characteristics.
- Multiple components enable the impact of benchmark life cycles to be managed in a less disruptive manner; benchmark substitution can be handled within the existing framework, and the overall spectrum of results can be kept broadly consistent.
- Multiple components increase the likelihood that more absolute performance values contribute directly to the composite.
- Multiple components enable the incorporation of additional components and mitigate the enforced loss of a single component.
The initial RPE2 benchmark set was selected from all available industry and ISV benchmarks based on how complete their coverage was for the major manufacturers and server architectures, and the extent of their published results. The current RPE2 set includes the following six benchmark inputs in its calculation:
As additional performance points for missing technologies appear in other existing benchmarks, or if new industry benchmarks are developed that potentially satisfy our selection criteria, they will also be considered for inclusion within the RPE2 composite.
Gartner also notes that in 2010 RPE2 was expanded to include Workload Extensions. Workload Extensions use different weightings of the constituent benchmark components of RPE2 in order to highlight performance within specific workload profiles. The following RPE2 Workload Extensions were created:
- RPE2-ERP, highlighting the SAP SD Two-Tier component
- RPE2-Java, highlighting the SPECjbb2005 component
- RPE2-OLTP, highlighting the TPC-C component
- RPE2-Compute-Intensive, highlighting the SPEC CPU2006 components
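One plausible way to picture a Workload Extension is as a weighted geometric mean in which the highlighted component carries more weight than the others. The sketch below assumes exactly that. The source does not give the weights Gartner uses, so the weights, the scores, and the dictionary keys (which reuse only the component names mentioned in the list above) are invented for illustration.

```python
import math


def weighted_composite(scores, weights):
    """Weighted geometric mean of component scores.

    `scores` and `weights` must share the same keys. The weights are
    hypothetical and do not need to sum to 1; they are normalized here.
    """
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same components")
    if any(score <= 0 for score in scores.values()):
        raise ValueError("component scores must be positive")
    if any(weight < 0 for weight in weights.values()):
        raise ValueError("weights must be non-negative")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")
    log_sum = sum(weights[name] * math.log(scores[name]) for name in scores)
    return math.exp(log_sum / total_weight)


# Made-up relative scores for one server.
scores = {"sap_sd": 1.2, "specjbb2005": 2.5, "tpc_c": 1.0, "spec_cpu2006": 1.8}

# Equal weights reproduce the standard equal-weight composite.
standard = {name: 1.0 for name in scores}
# A hypothetical Java-oriented profile that emphasizes SPECjbb2005.
java_profile = {"sap_sd": 1.0, "specjbb2005": 5.0, "tpc_c": 1.0, "spec_cpu2006": 1.0}

print(f"standard composite:     {weighted_composite(scores, standard):.2f}")
print(f"Java-weighted composite: {weighted_composite(scores, java_profile):.2f}")
```

With these made-up numbers the Java-weighted figure is pulled toward the server's SPECjbb2005 score, which is the effect the extensions are described as having.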
History of RPE and Ideas International
Here is what Wikipedia has to say about Ideas International, the entity responsible for RPE and RPE2: “Founded in 1981, Ideas International (IDEAS) is an IT analyst company specializing in technology insight and comparisons of competitive server and storage technology. Acquired by Gartner in June 2012. Clients include IT end-users and IT vendors. IDEAS is the creator of RPE2, a computer benchmark that compares the relative performance of servers.
In 1986, IDEAS began distributing information about IT products in a printed volume called Competitive Profiles, which contained different pages (profiles) for each product. IDEAS began trading as a public company on the Australian Stock Exchange in 2001. In 2004, IDEAS acquired D.H. Brown Associates, a research company based in the United States focused on in-depth analysis of computing technologies. In 2009, the Australian Information Industry Association named IDEAS as the New South Wales winner of its iAward under the category Sustainability and Green IT. The following year in 2010 IDEAS won the New South Wales ICT Exporter of the Year iAward.”
You can read more about the history of Ideas International at Wikipedia.
Where can I find more information on RPE2
For more information on RPE and RPE2 you can visit the following links -
Trevor Warren is passionate about challenging the status quo and finding reasons to innovate. Over the past 16 years he has been delivering complex systems, has worked with very large clients across the world and is constantly looking for opportunities to bring about change. Trevor strives to combine his passion for delivering outcomes with his ability to build long-lasting professional relationships. You can learn more about the work he does at LinkedIn. You can download a copy of his CV at VisualCV. Visit the Github page for details of the projects he’s been hacking on.