ARM Native Processors deliver higher performance results

Hadoop on OCI Workload Brief

Big Data Hadoop on Oracle Cloud Ampere A1 instance
Jun 2022

Ampere - Empowering What’s Next

Oracle Cloud Infrastructure (OCI) offers Ampere® Altra® compute instances on the new Cloud Native Ampere A1 platform. The Ampere A1 platform can be deployed as bare metal servers or flexible VM shapes, giving customers full control of their entire cloud stack. The Ampere A1 VM shapes provide flexible sizing from 1-80 cores and 1-64 GB of memory per core, along with several key benefits such as deterministic performance, linear scalability, and a secure architecture with the best price-performance in the market.

The Apache Hadoop framework is designed for distributed processing of large data sets and is intended to scale out from a single server to thousands of machines, each offering local computation, storage, or both. When implemented in a cluster, the software has built-in resiliency to handle a failed server or a failed component in a server. Hadoop consists of four main modules: HDFS (Hadoop Distributed File System), YARN (Yet Another Resource Negotiator), MapReduce, and Hadoop Common. Applications collect data in various formats and load it into the cluster. The name node, which is the centerpiece of the HDFS file system, holds the metadata for all chunks of data, keeps the directory tree of all files in the file system, and tracks where across the cluster the file data is kept. A MapReduce job runs against this data in HDFS across the data nodes.

All of the above tasks are computationally intensive, so the entire cluster is best implemented on high-performance components. The data is pulled from HDFS, which demands high-performance storage. It is coordinated across different servers in the cluster, which demands a high-speed network. Finally, it must be processed quickly by thousands of tasks until it is ultimately aggregated by reducers to compose the final output.
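To make the map, shuffle, and reduce flow described above more concrete, here is a minimal single-process Python sketch of a TeraSort-style sort. It is an illustration only, not Hadoop's implementation: the function names, the split points, and the sample records are all assumptions made for this example.

```python
from bisect import bisect_right
from collections import defaultdict


def map_phase(records):
    """Emit (key, value) pairs; the key is the text before the first comma."""
    for record in records:
        key, _, value = record.partition(",")
        yield key, value


def shuffle_phase(pairs, split_points):
    """Range-partition the pairs so that every key sent to reducer i sorts
    before every key sent to reducer i + 1."""
    partitions = defaultdict(list)
    for key, value in pairs:
        partitions[bisect_right(split_points, key)].append((key, value))
    return partitions


def reduce_phase(partitions, num_reducers):
    """Each reducer sorts only its own partition; concatenating the reducer
    outputs in reducer order gives a globally sorted result."""
    output = []
    for reducer_id in range(num_reducers):
        output.extend(sorted(partitions.get(reducer_id, [])))
    return output


def sort_records(records, split_points):
    pairs = map_phase(records)
    partitions = shuffle_phase(pairs, split_points)
    return reduce_phase(partitions, len(split_points) + 1)


if __name__ == "__main__":
    # Hypothetical sample data and split points (three key ranges).
    sample = ["pear,3", "apple,1", "zebra,9", "mango,5", "kiwi,4"]
    for key, value in sort_records(sample, ["h", "q"]):
        print(key, value)
```

In a real cluster the map and reduce steps run as many parallel tasks on different data nodes, which is why the storage, network, and CPU demands described above all matter at once.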

Hadoop on OCI Ampere A1 Flex VM

Oracle Cloud Infrastructure uses Ampere Altra processors with an industry-leading 80 cores per CPU for the Ampere A1 shapes. All cores are capable of running at the maximum frequency of 3.0 GHz consistently. By combining Ampere's low-power design with OCI's high-performance infrastructure, Ampere A1 shapes offer the best price-performance in the cloud.

OCI’s A1 compute provides superior price-performance for big data applications when compared to its x86 peers. A1 shapes with Ampere Arm processors are a recommended choice for Hadoop applications due to the predictable and highly scalable nature of the architecture.

In this solution brief, we compare the performance of OCI A1 (Ampere Altra) VMs with OCI's S3 Standard (Intel Ice Lake), E3 (AMD Rome), and E4 (AMD Milan) flex VMs running Hadoop TeraSort.

Key Benefits

Consistency and Predictability: Ampere Altra processors are designed for cloud native usage, providing consistent and predictable performance for Hadoop solutions.

Scalable: With an innovative scale-out architecture, the Ampere Altra processor's high core count and compelling single-threaded performance, combined with consistent frequency on all cores, deliver up to 15-20% better performance on OCI Ampere A1 compute shapes, making them ideal for big data workloads.

Power Efficient: Industry-leading energy efficiency allows Ampere Altra processors to hit competitive levels of raw performance while consuming much less power than the competition, resulting in a lower carbon footprint.

Ampere Altra
  • 80 64-bit CPU cores up to 3.00 GHz
  • 64 KB L1 I-cache, 64 KB L1 D-cache per core
  • 1 MB L2 cache per core
  • 32 MB System Level Cache (SLC)
  • 2x full-width (128b) SIMD
  • Coherent mesh-based interconnect

Memory

  • 8x 72-bit DDR4-3200 channels
  • ECC and DDR4 RAS
  • Up to 16 DIMMs and 4 TB addressable memory

Connectivity

  • 128 lanes of PCIe Gen4
  • Coherent multi-socket support
  • 4 x16 CCIX lanes

Technology & Functionality

  • Arm v8.2+, SBSA Level 4
  • Advanced Power Management

Performance

  • SPECrate®2017 Integer Estimated: 300

Hadoop on OCI Architecture

Figure: Hadoop on OCI architecture

Benchmarking Configuration

Virtual machines were provisioned in a private network space as depicted above. Hadoop 3.3.1 (with aarch64 binaries) was installed on the test bed. We used the Intel HiBench benchmark tool on each of these VMs to generate a 250 GB dataset. The Hadoop TeraSort benchmark was then run on these VMs to capture throughput measured in MB/s.

  • A single VM was spun up on each of the architectures with the configuration outlined in the table below.
  • All the virtual machines had the same number of hardware threads (16), the same amount of memory, and the same storage.
  • The storage bandwidth was limited to 1000 MB/s across all the VMs. The maximum network bandwidth for an x86 VM with 8 OCPUs is 8 Gb/s on OCI, while an A1 instance with 16 OCPUs receives a maximum of 16 Gb/s. The A1 instance was throttled to 8 Gb/s in our benchmark to keep it on par with the x86 VMs.
  • A few changes, such as disabling transparent huge pages and reducing VM swappiness, were made to the guest operating system.
  • A few configuration parameters in Hadoop were tuned to maximize the utilization of CPU, memory, and storage.
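The bandwidth figures in the list above can be cross-checked with a little arithmetic. The sketch below assumes decimal units (1 Gb = 1000 Mb, 8 bits per byte); the brief does not state how its 1000 MB/s figure was derived, so treat this as one plausible reading rather than the authors' calculation. The function name is made up for this example.

```python
def gbit_per_s_to_mbyte_per_s(gbit_per_s):
    """Convert gigabits per second to megabytes per second (decimal units)."""
    return gbit_per_s * 1000 / 8


# Cap applied to every VM in the benchmark (8 Gb/s).
capped_mb_s = gbit_per_s_to_mbyte_per_s(8)

# What the 16-OCPU A1 instance could use before being throttled (16 Gb/s).
a1_unthrottled_mb_s = gbit_per_s_to_mbyte_per_s(16)

# Combined rated throughput of the two block volumes (2 x 480 MB/s).
storage_mb_s = 2 * 480

if __name__ == "__main__":
    print("Capped bandwidth (MB/s):", capped_mb_s)
    print("A1 unthrottled bandwidth (MB/s):", a1_unthrottled_mb_s)
    print("Rated storage throughput (MB/s):", storage_mb_s)
    print("Storage fits under the cap:", storage_mb_s <= capped_mb_s)
```

Under these assumptions the 8 Gb/s cap corresponds to the 1000 MB/s limit quoted above, and the two volumes together are rated just below it.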

VM and Hadoop Configuration

Setting          S3Flex     E3Flex     E4Flex     A1Flex
OCPU             8          8          8          16
Cores/Threads    8/16       8/16       8/16       16/16
Mem              96G        96G        96G        96G
Arch             x86_64     x86_64     x86_64     aarch64
Kernel           Oracle Linux 8.5 (all shapes)
Storage          iSCSI, 2 x 500G LUNs, VPU 50, 2 x 480 MB/s (all shapes)
JDK              Oracle JDK 8 EPP (all shapes)

Hadoop and Yarn Configuration

dfs.block.size - 256M

yarn.scheduler.minimum-allocation-mb - 1024

yarn.scheduler.maximum-allocation-mb - 65536

yarn.scheduler.minimum-allocation-vcores - 1

yarn.scheduler.maximum-allocation-vcores - 15

yarn.nodemanager.resource.cpu-vcores - 16

yarn.nodemanager.resource.memory-mb - 94208

mapreduce.map.memory.mb - 1024

mapreduce.reduce.memory.mb - 3072M

mapred.reduce.parallel.copies - 16

mapreduce.reduce.shuffle.parallelcopies - 16

mapreduce.map.java.opts - 2048M
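As a rough sketch of how the settings listed above might be turned into Hadoop site files, the Python snippet below groups them by property prefix and renders each group in the standard configuration/property/name/value XML layout. The grouping into hdfs-site.xml, yarn-site.xml, and mapred-site.xml follows the usual convention and is an assumption, since the brief does not say which file each setting was placed in. Values are copied verbatim from the list, including bare values such as 3072M and 2048M; a real deployment may need them in the form each property expects (for example, JVM flags for mapreduce.map.java.opts). The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Settings copied verbatim from the brief; no values have been adjusted.
SETTINGS = {
    "dfs.block.size": "256M",
    "yarn.scheduler.minimum-allocation-mb": "1024",
    "yarn.scheduler.maximum-allocation-mb": "65536",
    "yarn.scheduler.minimum-allocation-vcores": "1",
    "yarn.scheduler.maximum-allocation-vcores": "15",
    "yarn.nodemanager.resource.cpu-vcores": "16",
    "yarn.nodemanager.resource.memory-mb": "94208",
    "mapreduce.map.memory.mb": "1024",
    "mapreduce.reduce.memory.mb": "3072M",
    "mapred.reduce.parallel.copies": "16",
    "mapreduce.reduce.shuffle.parallelcopies": "16",
    "mapreduce.map.java.opts": "2048M",
}


def target_file(name):
    """Pick a site file from the property prefix (conventional, assumed)."""
    if name.startswith("dfs."):
        return "hdfs-site.xml"
    if name.startswith("yarn."):
        return "yarn-site.xml"
    return "mapred-site.xml"


def group_by_file(settings):
    """Return {site file name: {property name: value}}."""
    files = {}
    for name, value in settings.items():
        files.setdefault(target_file(name), {})[name] = value
    return files


def build_site_xml(properties):
    """Render one site file as a <configuration> document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    if hasattr(ET, "indent"):  # pretty-printing is available in Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    for filename, properties in group_by_file(SETTINGS).items():
        print("<!-- " + filename + " -->")
        print(build_site_xml(properties))
```

The script only prints the XML; writing the files into a Hadoop configuration directory is left out on purpose because that location depends on the installation.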

Benchmark

Hadoop TeraSort

The Intel HiBench benchmark tool was used on each of the VMs to generate a 250 GB dataset. The Hadoop TeraSort benchmark was then run on these VMs, and the TeraSort throughput in MB/s was captured.
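For readers who want to relate run time to the MB/s figure, throughput here is simply the amount of data sorted divided by the elapsed time. The sketch below assumes decimal units (1 GB = 1000^3 bytes, 1 MB = 1000^2 bytes), and the elapsed time is a hypothetical placeholder, not a result from this benchmark.

```python
def terasort_throughput_mb_s(dataset_bytes, elapsed_seconds):
    """Return throughput in MB/s for a sort of dataset_bytes bytes."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return dataset_bytes / elapsed_seconds / 1_000_000


# 250 GB dataset, as used in the brief (decimal gigabytes assumed).
DATASET_BYTES = 250 * 1000**3

# Hypothetical run time chosen only to illustrate the formula.
EXAMPLE_ELAPSED_SECONDS = 1000.0

if __name__ == "__main__":
    mb_s = terasort_throughput_mb_s(DATASET_BYTES, EXAMPLE_ELAPSED_SECONDS)
    print("Throughput for the example run (MB/s):", mb_s)
```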

Figure: Relative Hadoop TeraSort Performance on OCI
Figure: Relative Price Per Performance for Hadoop on OCI VMs

Observations

1. CPU utilization hovered around 80%, making this a fair comparison under high-load conditions.
2. Disk utilization of the iSCSI LUNs was around 90%, also near capacity.
3. A1 VMs performed well compared to the legacy x86 shapes. The graphs above were plotted using S3Flex as the baseline reference point.
4. The price-performance of Ampere A1 instances was observed to be 60% better than the Intel shape and 10-15% better than the AMD shapes.

Note: Price-performance was calculated from the OCI Compute pricing list for 16-core VMs with 96 GB of memory (Oct 2022). Storage costs were calculated from the OCI Storage pricing sheet for 2 x 500 GB iSCSI LUNs at 50 VPU (480 MB/s).
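The brief does not spell out the price-performance formula, so the following is only one plausible way to compute it: throughput divided by hourly cost, normalized to a baseline shape. The shape names, throughput values, and costs below are made-up placeholders, not OCI prices or measured results.

```python
def price_performance(throughput_mb_s, hourly_cost):
    """Performance per unit of cost (higher is better)."""
    if hourly_cost <= 0:
        raise ValueError("hourly_cost must be positive")
    return throughput_mb_s / hourly_cost


def relative_to_baseline(values, baseline):
    """Normalize every entry so that the baseline entry equals 1.0."""
    base = values[baseline]
    return {name: value / base for name, value in values.items()}


# Placeholder inputs: (throughput in MB/s, compute plus storage cost per hour).
EXAMPLE_SHAPES = {
    "shape_a": (100.0, 1.0),
    "shape_b": (120.0, 0.8),
}

if __name__ == "__main__":
    scores = {
        name: price_performance(throughput, cost)
        for name, (throughput, cost) in EXAMPLE_SHAPES.items()
    }
    for name, ratio in relative_to_baseline(scores, "shape_a").items():
        print(name, round(ratio, 2))
```

With real inputs, the hourly cost would combine the compute and block storage charges described in the note above.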

Conclusions

Oracle OCI A1 instances with Ampere Altra processors provide high performance for big data solutions like Hadoop. The performance advantage of the Ampere shapes, combined with their price advantage, provides up to 60% higher value when using OCI Ampere A1 shapes for Hadoop workloads.

For More Information

OCI Ampere A1 compute instances

Footnotes

All data and information contained herein is for informational purposes only and Ampere reserves the right to change it without notice. This document may contain technical inaccuracies, omissions and typographical errors, and Ampere is under no obligation to update or correct this information. Ampere makes no representations or warranties of any kind, including but not limited to express or implied guarantees of noninfringement, merchantability, or fitness for a particular purpose, and assumes no liability of any kind. All information is provided “AS IS.” This document is not an offer or a binding commitment by Ampere. Use of the products contemplated herein requires the subsequent negotiation and execution of a definitive agreement or is subject to Ampere’s Terms and Conditions for the Sale of Goods.

System configurations, components, software versions, and testing environments that differ from those used in Ampere’s tests may result in different measurements than those obtained by Ampere.

©2022 Ampere Computing. All Rights Reserved. Ampere, Ampere Computing, Altra and the ‘A’ logo are all registered trademarks or trademarks of Ampere Computing. Arm is a registered trademark of Arm Limited (or its subsidiaries). All other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.

Ampere Computing® / 4655 Great America Parkway, Suite 601 / Santa Clara, CA 95054 / amperecomputing.com

Created At : July 11th 2022, 10:04:53 pm
Last Updated At : July 30th 2024, 10:01:14 pm