Why A New OS

Why a whole new OS is required

Today’s cloud-based applications run a heavyweight stack: the hypervisor, which divides the hardware resources among virtual machines; the operating system, which divides the virtual machine’s resources among applications; and the application server, which divides the application’s resources among the end users.

 

Clearly, there is a lot of duplication going on. Each layer adds its own overhead in an attempt to abstract away and hide the problems caused by the lower layer. The result is inefficient and complex.

 

Enter OSv - the operating system for the cloud. On the one hand, designed and optimized to run on a hypervisor. On the other hand, designed to run an application stack without getting in the way. Designed for the cloud.

It’s cloudy out there and we love it

The public cloud era opens new horizons and opportunities for hi-tech businesses. Virtualization is dominant, resources are evergreen and agility is the key. The role of the OS in this form changes. The need to massively scale out forces developers to run multiple copies of identical Virtual Machines (VMs). VMs are the new process that needs to be lightweight, blazing fast, scalable and cheap. DevOps and PaaS solutions bypass the OS and allow developers to deploy their code directly to the cloud. All these goodies are rapidly penetrating the enterprise space with matching private cloud capabilities and virtual appliances that act as bridges to the public cloud

Virtualization 2.0

VMware had brought virtualization into the x86 world (recall that virtualization was introduced by IBM’s mainframes in the 70s). While VMware has done a fantastic job shifting the enterprise from physical to virtual, it stopped there. Amazon Web Services had wider vision; Amazon do not use terminology such as virtual machine. Amazon sees the entire user workload and solves scaling issues.

 

The right(tm) way - the new world is compose out of the smallest building blocks possible, running in clusters of multiple VMs. Common components are NoSQL databases, MemCacheD, front end webservers, backend webservers, etc. One single application is hosted within the virtual machine. That application uses a fraction of the guest OS capabilities - there is no hardware as the hypervisor owns that, there are no real users as this is a server, there are no other apps to schedule. Users pay for CPU cycles they don’t really need and need to maintain and tune a full blown generic OS.

 

Unlike new infrastructure such as hypervisors, NoSQL, PaaS, etc, the operating system hasn’t changed much. The same OS image that powers physical machines, from tiny embedded devices to room-size top500 supercomputers, is also used in the cloud.

Typical cloud workloads run application servers using Java, Ruby, Python and JavaScript (node.js). This historical evolution is not ideal - Java, Linux and the hypervisor implement parallel/duplicated mechanisms for protection and abstraction. These mechanisms are redundant when combined, and impose a large overhead in terms of CPU and memory.

 

The management efforts needed to maintain Linux are extensive. Thousands of packages, multiple security updates, complex tuning and specialists to manage. It doesn’t stop in the OS level. JVM workloads require manual tuning in a variety of ways and VM templates and instances needs to be software managed by tools like Puppet and Chef.

 

These tools help to manage multiple OS configurations but if we’ll examine their operation on a single OS instance, we’ll discover that their roots go to the Unix OS of the 1970s. They’re based on pushing human configuration files and strings to /etc files. Typical OSs still do not have a fully automated API. OSv takes a simpler approach, with a common REST API for all configuration and data collection.