In the previous blog posts of this series, we saw how to set up the benchmarking environment and ran two simple scenarios (Scenario 1, Scenario 2). This time we are going to add Hadoop to the mix to test a workload with significant memory, CPU and I/O impact.
Scenario 3
The image used for this test is an Ubuntu Xenial 16.04.1 LTS cloud image, which is available for download here (the image used for these tests is the Daily Build [20161119]). Hadoop has been configured to run in standalone mode (Single Node Cluster), with commands being sent via SSH from Rally over a tenant network. The official documentation on how to set up an Apache Hadoop 2.7.2 cluster (the latest stable version at the time of writing) can be found here.
This test consists of:
- booting a VM from an image where Hadoop is already installed and configured. The flavor being used for each VM has 4096MB RAM, 40GB disk and 2 vCPUs.
- waiting until the VM becomes active and is ready to process Hadoop jobs
- executing three different Hadoop jobs:
  - TeraGen -> a map/reduce program that generates the input data
  - TeraSort -> samples the input data generated by TeraGen and uses map/reduce to sort it
  - TeraValidate -> a map/reduce program that validates that the output from TeraSort is properly sorted
- deleting the VM
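The three Hadoop jobs in the steps above can be sketched as a small command builder. This is only an illustration, not the script used for these tests: the jar path, the working directory and the helper names `terasort_commands` and `as_shell_script` are assumptions, while `teragen`, `terasort` and `teravalidate` are the program names exposed by the Hadoop MapReduce examples jar.

```python
import shlex

# Assumed location of the examples jar for Hadoop 2.7.2; adjust to your install.
EXAMPLES_JAR = (
    "/usr/local/hadoop/share/hadoop/mapreduce/"
    "hadoop-mapreduce-examples-2.7.2.jar"
)


def terasort_commands(rows, workdir="/tmp/terasort", jar=EXAMPLES_JAR):
    """Return the three benchmark commands, in execution order, as argv lists.

    workdir is a hypothetical directory; each job reads the output
    directory of the previous one.
    """
    gen_dir = f"{workdir}/input"      # written by TeraGen
    sort_dir = f"{workdir}/output"    # written by TeraSort
    report_dir = f"{workdir}/report"  # written by TeraValidate
    return [
        ["hadoop", "jar", jar, "teragen", str(rows), gen_dir],
        ["hadoop", "jar", jar, "terasort", gen_dir, sort_dir],
        ["hadoop", "jar", jar, "teravalidate", sort_dir, report_dir],
    ]


def as_shell_script(commands):
    """Join the commands into one shell line that stops at the first failure."""
    return " && ".join(
        " ".join(shlex.quote(arg) for arg in cmd) for cmd in commands
    )


if __name__ == "__main__":
    print(as_shell_script(terasort_commands(10_000_000)))
```

Chaining the jobs with `&&` means a failed TeraGen or TeraSort run stops the sequence, so a broken iteration is reported as a failure rather than timed as a success.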
Part 1 – One VM in parallel
To begin, we ran the test with a single VM and the TeraGen input size set to 10,000,000 rows of 100 bytes each, which is about 1 GB of data.
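To put the row count in perspective, here is a short sketch that converts a TeraGen row count into a data size. The helper names `teragen_bytes` and `human_size` are made up for this example; the only fact it relies on is that TeraGen writes fixed-size 100-byte rows.

```python
ROW_SIZE_BYTES = 100  # TeraGen writes fixed-size 100-byte rows


def teragen_bytes(rows, row_size=ROW_SIZE_BYTES):
    """Total number of bytes TeraGen produces for the given row count."""
    return rows * row_size


def human_size(num_bytes):
    """Format a byte count in both decimal (GB) and binary (GiB) units."""
    gb = num_bytes / 10**9
    gib = num_bytes / 2**30
    return f"{gb:.2f} GB ({gib:.2f} GiB)"


if __name__ == "__main__":
    print(human_size(teragen_bytes(10_000_000)))  # prints: 1.00 GB (0.93 GiB)
```

Since TeraSort writes a sorted copy of the same size, each VM ends up holding roughly twice that amount of benchmark data, which fits comfortably on the 40GB disk of the flavor.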
- Results for KVM with Xenial Ubuntu 16.04.1 LTS (default kernel version 4.4.0-45-generic) as host operating system:
- Results for Hyper-V with Windows Server 2012 R2 as host operating system:
- Results for Hyper-V with Windows Server 2016 as host operating system:
Remarks on the results obtained so far: the average time is approximately the same for the two Hyper-V versions, with KVM being slightly slower than both.
Part 2 – 10 VMs in parallel
The following results have been obtained by running the tests with a load of 10 VMs in parallel.
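In Rally, a parallel load like this is normally expressed through the `runner` section of the task file. The exact task used for these runs is not shown in this post, so the sketch below is an assumption throughout: it supposes the `VMTasks.boot_runcommand_delete` scenario with a `constant` runner, and the image name, flavor name, script and the helper `build_rally_task` are hypothetical placeholders. Depending on the deployment, further scenario arguments (networking, for instance) may be required.

```python
import json


def build_rally_task(image_name, flavor_name, script, concurrency,
                     times=None, username="ubuntu"):
    """Build a Rally task dict that boots VMs, runs a script over SSH,
    and deletes them.

    concurrency is the number of iterations running at the same time;
    times is the total number of iterations and defaults to one per
    parallel slot.
    """
    times = concurrency if times is None else times
    return {
        "VMTasks.boot_runcommand_delete": [
            {
                "args": {
                    "image": {"name": image_name},    # hypothetical image name
                    "flavor": {"name": flavor_name},  # hypothetical flavor name
                    "username": username,             # default user on Ubuntu cloud images
                    "command": {
                        "interpreter": "/bin/sh",
                        "script_inline": script,
                    },
                },
                "runner": {
                    "type": "constant",
                    "times": times,
                    "concurrency": concurrency,
                },
            }
        ]
    }


if __name__ == "__main__":
    task = build_rally_task(
        image_name="xenial-hadoop",     # placeholder
        flavor_name="hadoop.flavor",    # placeholder
        script="echo 'run the Hadoop jobs here'",
        concurrency=10,
    )
    print(json.dumps(task, indent=2))
```

With this layout, moving from the single-VM run to the 10-VM run only changes the `concurrency` (and `times`) values; the scenario arguments stay the same.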
The size of the input data for TeraGen is still set to 10,000,000 rows of 100 bytes each:
- Results for KVM with Xenial Ubuntu 16.04.1 LTS (default kernel version 4.4.0-45-generic) as host operating system:
- Results for Hyper-V with Windows Server 2012 R2 as host operating system:
- Results for Hyper-V with Windows Server 2016 as host operating system:
After increasing the workload on the compute nodes by adding more parallel iterations to the test, we can see that Hyper-V is slightly faster than KVM in average time, although the differences in this case are small enough to be negligible. Time for some conclusions? Yes, in the next blog post!