Nswap2L: Heterogeneous Swapping for Cluster Computers
Tia Newhall, E. Ryerson Lehman-Borer, and Alec Pillsbury
Most supercomputers today are cluster computers. Cluster computers are built by combining many small, mass-produced computers, called nodes, and connecting them with a fast network so that they can communicate quickly and easily. Large jobs can be completed quickly by breaking them into small parts and assigning a part to each node. Unfortunately, it is difficult to keep the workload balanced across the nodes: often some nodes use all of their main memory (RAM) while other nodes in the cluster have plenty of free RAM. Typically, when a computer fills its RAM, it swaps some memory from RAM out to a hard disk. This is convenient because hard disk space is much cheaper and more abundant than RAM. The drawback is that hard disks are much slower than RAM, which makes swapping to them a costly operation. Nswap provides an alternative method of swapping. Rather than sending excess memory to the hard disk, Nswap finds a node in the cluster whose RAM has free space and sends the memory to be held there. Because sending a page over the cluster's fast network is much faster than writing it to disk, this is a better solution.
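As a rough illustration of the swap-out decision described above, the following sketch picks a remote node with enough free RAM and signals a fallback to disk when no node has room. The names (`Node`, `choose_swap_target`, `free_pages`) and the "most free RAM wins" rule are assumptions made for this example; they are not taken from the Nswap implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_pages: int  # hypothetical count of free RAM pages the node reports


def choose_swap_target(nodes: List[Node], pages_needed: int) -> Optional[Node]:
    """Return the node with the most free RAM that can hold the pages, else None."""
    candidates = [n for n in nodes if n.free_pages >= pages_needed]
    if not candidates:
        return None  # the caller would fall back to swapping to local disk
    return max(candidates, key=lambda n: n.free_pages)


# Made-up cluster state, for illustration only.
cluster = [Node("node1", 0), Node("node2", 512), Node("node3", 128)]
target = choose_swap_target(cluster, pages_needed=64)
print(target.name if target else "disk")
```

Returning `None` rather than raising an error keeps the fallback to disk an ordinary case for the caller to handle.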
Nswap2L builds on Nswap by choosing whether to swap memory to network RAM, hard disk, flash drive, or some other storage device. This summer our main focus was completing the implementation of Nswap2L. We spent much of our time on the prefetch path, which moves pages internally between devices. One reason to move a page is to place it on a device with faster read access times. Another is to benefit from read parallelism: spreading pages across several devices lets us read from all of them at once. Nswap2L must decide when to prefetch, which devices to prefetch between, how many pages to move, and which pages to move.
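To make three of these decisions concrete (which devices, how many pages, and which pages), here is a minimal sketch of one possible prefetch policy: move a small batch of pages from the slowest occupied device to the fastest device that still has room. Everything in it is an assumption for illustration, including the `Device` fields, the latency figures, and the choice to move the most recently stored pages. It does not model read parallelism or the question of when to prefetch, and it is not the policy Nswap2L uses.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Device:
    name: str
    read_latency_us: float  # assumed average read latency (placeholder values)
    capacity: int           # maximum number of pages the device can hold
    pages: List[int] = field(default_factory=list)  # page ids, oldest first


def plan_prefetch(devices: List[Device], batch_size: int) -> List[Tuple[int, str, str]]:
    """Plan (page, source, destination) moves from the slowest occupied device
    to the fastest device that has free room."""
    occupied = [d for d in devices if d.pages]
    if not occupied:
        return []
    src = max(occupied, key=lambda d: d.read_latency_us)
    faster = [
        d for d in devices
        if d.read_latency_us < src.read_latency_us and len(d.pages) < d.capacity
    ]
    if not faster:
        return []
    dst = min(faster, key=lambda d: d.read_latency_us)
    count = min(batch_size, len(src.pages), dst.capacity - len(dst.pages))
    if count <= 0:
        return []
    chosen = src.pages[-count:]  # assumed rule: most recently stored pages first
    return [(page, src.name, dst.name) for page in chosen]


devices = [
    Device("network_ram", 100.0, 4, [1, 2, 3]),
    Device("flash", 300.0, 8, [4]),
    Device("disk", 8000.0, 100, [10, 11, 12, 13]),
]
print(plan_prefetch(devices, batch_size=2))
```

In this example network RAM has room for only one more page, so the plan contains a single move even though the batch size is 2.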
When testing placement and prefetch policies, it is very useful to have an interface to Nswap2L that gives us feedback about how the system is performing and allows us to change system parameters, such as which policies to use or how large the Nswap cache should be, without recompiling. To provide this, we implemented an interface that lets us call special Nswap2L functions from user level by reading from and writing to special files in the /sys directory.
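From user level, an interface of this kind is used with ordinary file reads and writes. The sketch below shows the pattern with two small helpers. Because the actual paths and file names under /sys are not given here, a temporary directory stands in for the Nswap2L directory, and the file name `placement_policy` and the value `round_robin` are hypothetical.

```python
import tempfile
from pathlib import Path


def write_param(base: Path, name: str, value: str) -> None:
    """Set a parameter by writing its new value to the corresponding file."""
    (base / name).write_text(value + "\n")


def read_param(base: Path, name: str) -> str:
    """Read a parameter or statistic by reading the corresponding file."""
    return (base / name).read_text().strip()


# A temporary directory stands in for the real directory under /sys,
# whose path is not specified in this summary.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    write_param(base, "placement_policy", "round_robin")  # hypothetical file and value
    print(read_param(base, "placement_policy"))
```

On a real system, writing to such a file would invoke a function inside Nswap2L rather than store text on disk, which is how a policy can change without recompiling.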
Literature cited: Tia Newhall and Douglas Woos, Proceedings of IEEE Cluster Conference, Austin, TX, September 2011