You have probably heard a lot about Docker, because it has become a buzzword in the dev community. If you don’t know much about it, don’t worry: in this blog I will answer all your questions about Docker. I wanted to keep this an introductory, beginner-friendly blog, so I will cover the what, why, and how of it. After reading it you will be able to explain what Docker is, why we need it, what the difference between containerization and virtualization is, and much more. I hope you enjoy it 😀
Why do we need Docker?
Before we learn about Docker, we first need to understand why we need it and what problem it solves. Have you ever developed software for a client, optimized it, verified that it works fine, and handed it over, only to find that it doesn’t run on the client’s computer, even though you checked it multiple times on your own system? This is an incompatibility issue. It generally happens when your software’s dependencies are not satisfied on the client’s system: the libraries, frameworks, and OS configuration you used don’t match the ones on the client’s machine.
This is where Docker comes into the picture. Docker takes all the dependencies and configuration of your software and packages them into a container. That makes your software portable: it gets the same environment it had when you created it, so it runs the same way on any computer that can run Docker, regardless of how that system is configured.
What is Docker?
So, what is Docker? Docker is an open-source tool for creating containers and container-based applications. In other words, it is a containerization platform that simplifies building, shipping, and running applications. With its help, an application can be isolated from its underlying infrastructure, which makes software delivery faster.
What is a Docker container?
A Docker container is a package of your application and all its dependencies: libraries, networking, OS configuration, runtime, system tools, settings, and so on. It is lightweight and isolated from other containers. In simple words, it replicates the complete environment your application needs in order to run. It can then be shared with anyone, and it will run as smoothly on their system as it did on yours.
Suppose you created a Java application that uses Tomcat, so you installed Tomcat on your system and set up an environment with it. After development you pass the application to the testing team, who also have to download Tomcat and set up the environment before they can test it. After passing the testing round, your software is finally ready for deployment on the production server, but to run it there, the environment with Tomcat has to be set up all over again. There are two main problems here.
- You need to set up the environment every time the application moves to the next computer/server.
- The software may fail to run or hit an incompatibility issue if a different version of Tomcat is installed on the testing or production server, which wastes even more time.
How does containerization solve the issue?
As we saw in the case above, we need to install and set up the server every time we run or test the application on a different machine. Containerization instead packages your application along with the required server (in our case Tomcat), libraries, system settings, and any other configuration the application needs, and builds a Docker image from all of that. (You can think of an image as the blueprint of a container’s configuration, from which we can create multiple containers.) That image can then be shared with your testing team and deployed on the production server.
Now you might be thinking: okay, but what’s new here? This was also possible with virtualization, so why do we need Docker? The answer is that virtualization is not as efficient as containerization. We will compare the two shortly, but first let’s see what virtualization is.
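To make the blueprint idea a bit more concrete, here is a small Python sketch. It is only a toy model and does not talk to Docker at all: the `Image` and `Container` classes, the `run` helper, the image name, and the package versions are all made up for illustration. The point it tries to show is that every container gets its environment from the image, not from the machine it happens to run on.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass(frozen=True)
class Image:
    """Toy stand-in for a Docker image: a read-only blueprint."""
    name: str
    packages: tuple  # everything the app needs, fixed at build time


_next_id = count(1)


@dataclass
class Container:
    """Toy stand-in for a container created from an image."""
    image: Image
    host: str
    container_id: int = field(default_factory=lambda: next(_next_id))

    def environment(self):
        # The environment comes from the image, never from the host.
        return self.image.packages


def run(image, host):
    """Create a new container from the given image on the given host."""
    return Container(image=image, host=host)


# Hypothetical image contents, for illustration only.
tomcat_app = Image("my-java-app", ("tomcat", "java-runtime", "app.war"))

dev = run(tomcat_app, "developer-laptop")
test = run(tomcat_app, "testing-server")
prod = run(tomcat_app, "production-server")

for c in (dev, test, prod):
    print(c.container_id, c.host, c.environment())
```

All three containers are separate objects with their own IDs, yet they report exactly the same environment, which is the property that removes the repeated Tomcat setup from the story above.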
What is virtualization?
Virtualization is a technology that lets you create a virtual machine inside your system (a computer or server). You can think of a virtual machine as a computer inside a computer: it has its own OS but uses your computer’s or server’s resources. Now you might be wondering why anyone would need to create a virtual machine inside their system.
Why did we need virtualization?
Before virtualization, we needed a separate server for every single task (an approach known as “one app, one server”). Imagine buying a whole new server for just one task. Suppose you run a business and need to manage mail, a database, files, a website, and so on: you would have to buy a separate server for each task, and each task would typically use only about 30-40% of the server it was installed on, leaving the rest of the server’s resources idle. This is a nightmare for your savings account.
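A quick back-of-the-envelope calculation shows how much hardware sits idle in that setup. The numbers below are assumptions picked from the 30-40% range mentioned above, not measurements, and the calculation ignores real-world details like peak load and redundancy.

```python
import math

# Hypothetical workloads: percentage of one server that each task uses.
workloads = {"mail": 30, "database": 40, "files": 30, "web": 35}

# "One app, one server": one machine per task.
servers_one_app_one_server = len(workloads)

# Average utilization of those machines, in percent.
total_load = sum(workloads.values())
average_utilization = total_load / servers_one_app_one_server

# Minimum number of servers if the tasks could share hardware.
servers_if_shared = math.ceil(total_load / 100)

print("servers bought:", servers_one_app_one_server)
print("average utilization (%):", average_utilization)
print("servers needed if shared:", servers_if_shared)
```

With these made-up figures, four servers run at roughly a third of their capacity, while the same work would fit on two machines if the tasks could safely share them. That gap is what virtualization set out to close.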
How does it work?
Virtualization is done with the help of software called a hypervisor, which divides the physical resources (your computer’s or server’s RAM, CPU, storage, etc.) among the virtual environments and monitors them. If you’re having a hard time understanding hypervisors, think of your mom dividing cookies between you and your little brother and making sure the two of you don’t fight over them. The hypervisor takes care of the virtual machines in the same way, so basically, the hypervisor is a virtual machine’s mom! XD
The hypervisor is installed either on top of your OS or directly on the server hardware. It then takes your physical resources and divides them among your virtual environments as needed. Each virtual environment (guest machine) behaves like a complete new computer with its own OS, RAM, CPU, storage, and so on, and you can do whatever you want with it as if it were a real computer. That’s also why it’s called a virtual machine.
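If the cookie analogy helps, the same idea can be sketched in a few lines of Python. This is a deliberately simplified toy, not how a real hypervisor is implemented: the `ToyHypervisor` class, its method names, and the resource figures are all invented for this example. It only models the bookkeeping of handing out RAM and CPU cores and refusing requests that the physical machine cannot satisfy.

```python
class ToyHypervisor:
    """Hands out slices of a physical machine to virtual machines."""

    def __init__(self, total_ram_gb, total_cpus):
        self.free_ram_gb = total_ram_gb
        self.free_cpus = total_cpus
        self.vms = {}

    def create_vm(self, name, ram_gb, cpus):
        # Refuse the request if the physical machine cannot cover it.
        if ram_gb > self.free_ram_gb or cpus > self.free_cpus:
            raise RuntimeError(f"not enough free resources for {name}")
        self.free_ram_gb -= ram_gb
        self.free_cpus -= cpus
        self.vms[name] = {"ram_gb": ram_gb, "cpus": cpus}

    def destroy_vm(self, name):
        # Give the VM's share back to the pool.
        vm = self.vms.pop(name)
        self.free_ram_gb += vm["ram_gb"]
        self.free_cpus += vm["cpus"]


host = ToyHypervisor(total_ram_gb=16, total_cpus=8)
host.create_vm("mail", ram_gb=4, cpus=2)
host.create_vm("database", ram_gb=8, cpus=4)

try:
    host.create_vm("web", ram_gb=8, cpus=2)  # only 4 GB are still free
    web_created = True
except RuntimeError as err:
    web_created = False
    print(err)

print("free RAM (GB):", host.free_ram_gb, "free CPUs:", host.free_cpus)
```

Notice that each VM keeps its full share for as long as it exists, whether or not it is actually using it.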
If you are interested in learning more about virtualization, then you can read it here
Containerization vs virtualization
As we discussed earlier, virtualization creates a virtual machine inside your system, with its own OS installed, that takes its resources from your base machine. It does solve the “one app, one server” problem, but it creates another one, which is “one app, one VM” (don’t search for this on the web, I just made the name up 😁). This has three major problems:
- Size: Takes a huge amount of space and doesn’t utilize it
- Time: Takes too much time to start
- Integration: Hard to integrate
Size: → A virtual machine (VM) is not flexible with RAM and storage usage. Suppose you allocate 5GB of RAM to a VM instance and it only uses 3GB: the remaining 2GB is wasted, because memory allocated to a VM is typically reserved for it and is not available to anything else. This is a huge limitation, and this is where Docker takes the flag. By default, a container uses only as much memory as your application actually needs, so there is no such waste.
Time: → Starting a VM is just like starting your physical computer. It boots a full OS from scratch and loads all its binaries and libraries (binaries are compiled programs in machine-readable form; libraries are collections of functions that various programs can use), which makes it slow. Docker containers don’t boot a VM or a guest OS; they run on the host OS, so they start much faster than a VM.
Integration: → You can only run a limited number of tools comfortably in a single VM, so in practice you often end up creating a new VM instance for each tool. This can become an expensive infrastructure problem. Here too Docker wins: it lets you run as many instances of the tools you want, in the same or in different containers, and they can interact with one another after just a few commands. It’s also easy to create multiple copies of these containers.
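The 5GB example from the Size point can be written down as simple arithmetic. This is a simplified model under two assumptions: the VM keeps its whole allocation reserved, and the container runs without a memory limit so it takes only what the application uses. The function name `wasted_gb` is made up for this sketch.

```python
def wasted_gb(reserved_gb, used_gb):
    """Memory that is set aside but not actually used."""
    return max(reserved_gb - used_gb, 0)


app_usage_gb = 3

# VM: 5 GB is reserved up front, whatever the app really needs.
vm_reserved_gb = 5
vm_waste = wasted_gb(vm_reserved_gb, app_usage_gb)

# Container (in this model): memory taken equals memory used.
container_reserved_gb = app_usage_gb
container_waste = wasted_gb(container_reserved_gb, app_usage_gb)

print("VM waste (GB):", vm_waste)
print("container waste (GB):", container_waste)
```

Multiply that per-VM waste by the number of VMs on a host and the difference in how many applications fit on the same hardware becomes noticeable.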
Advantages of Docker
By this point you have already seen many advantages of Docker, but let’s list them properly.
- Isolation and throttling: Docker containers keep apps isolated from other apps and from the underlying system. This makes it easier to track and limit the system resources consumed by any single application, and it makes for a cleaner software stack.
- Portability: As you saw, portability is the main benefit of Docker. Any machine that supports the container’s runtime environment can run a Docker container. So Docker lets you focus on how your software works without worrying about compatibility issues.
- Composability: This is also one of the cool features of Docker. Composability makes it possible to compose several applications (like a web server, a database, and an in-memory cache) into a functional unit with easily changeable parts. Each application is provided by a different container and can be modified, swapped, updated, and maintained easily and independently of the others.
- Orchestration and scaling: Container orchestration tools provide a framework for managing containers and microservice architectures at scale. Some container orchestration tools for container lifecycle management are Docker Swarm and Apache Mesos.
- Version control and rollback: Docker images are made up of a series of layers that are combined into a single image. A new layer is created when the image changes, for example when the Dockerfile specifies an instruction such as RUN or COPY. Docker reuses these layers to speed up building new images and containers. Every new change is recorded, so you essentially have a built-in changelog that gives you full control over your container images, and you can roll back to a previous version whenever you want.
- Rapid deployment: Deployment with Docker is fast and smooth. It creates a container for each process, which you can quickly share with new apps, and deployment times are substantially shorter because no OS needs to boot when you add or move a container. Because of these shorter deployment times, it is also easy and cost-effective to create and destroy containers along with the data they produce.
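The layering behind the version control and rollback point can be pictured with a small Python sketch. It is a toy model only: `ToyImageStore`, the tags, and the instruction strings are invented for illustration, and plain strings stand in for layers, whereas Docker identifies real layers by their content. The sketch shows two things: a new version reuses the layers of the old one, and the old version stays available to roll back to.

```python
class ToyImageStore:
    """Keeps every tagged image as an immutable tuple of layers."""

    def __init__(self):
        self.tags = {}

    def build(self, tag, instructions, parent=None):
        # A new image is the parent's layers plus one layer per instruction.
        base = self.tags[parent] if parent else ()
        self.tags[tag] = base + tuple(instructions)
        return self.tags[tag]

    def shared_layers(self, tag_a, tag_b):
        # Count how many leading layers the two images have in common.
        shared = 0
        for a, b in zip(self.tags[tag_a], self.tags[tag_b]):
            if a != b:
                break
            shared += 1
        return shared


store = ToyImageStore()

# Hypothetical instructions, written in the style of a Dockerfile.
store.build("app:1", ["FROM tomcat", "COPY app.war /webapps/"])
store.build("app:2", ["RUN update-config"], parent="app:1")

print("layers in app:2:", store.tags["app:2"])
print("layers reused from app:1:", store.shared_layers("app:1", "app:2"))

# Rolling back is just running the older tag again; it was never modified.
rollback_target = store.tags["app:1"]
print("rollback to:", rollback_target)
```

Because building `app:2` only adds a layer on top of `app:1`, the rebuild is cheap and the previous version is untouched.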
Disadvantages of Docker
Everything has its pros and cons, so let’s see what the cons of Docker are and when you should not use it.
- Speed: Running your application in Docker adds some overhead, so it may not be quite as fast as running it directly on a bare-metal server.
- Security: On its own, Docker is not secure enough to stop arbitrary malicious code from affecting other containers or the host OS. You can achieve a high level of security with a Docker container if you configure it properly, but even then it will be less isolated than a VM or a bare-metal machine.
- GUI applications: Applications that need a GUI are not a good fit. In general, Docker was designed for isolated containers running console-based applications. However, there are a few ways, such as X11 forwarding, to run a graphical interface inside a Docker container.
- Data: By default, all files created inside a container are stored in a writable container layer. This layer is tied to that container on the host machine, so it can be difficult to get the data out if a different process needs it, and moving the data elsewhere is not easy. If the container is removed, all the data stored inside it is lost forever. So if your data is valuable, you have to store it outside the container, for example in a Docker volume.
- Cross-platform compatibility: Containers share the kernel of the host OS, so an application designed to run in a Docker container on Windows won’t run in a container on a Linux host, and vice versa. This is not an issue with a virtual machine, which brings its own OS.
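The Data point above is easier to see with a tiny model. The sketch below is a toy and does not use Docker: `ToyContainer`, its methods, and the file paths are invented for illustration. A plain dictionary plays the role of the container’s writable layer, and a second dictionary that lives outside the container plays the role of external storage such as a volume.

```python
class ToyContainer:
    """Container with a private writable layer and optional outside storage."""

    def __init__(self, volume=None):
        self.writable_layer = {}  # lives and dies with the container
        self.volume = volume      # owned by the host, outlives the container

    def write(self, path, data, persistent=False):
        if persistent:
            if self.volume is None:
                raise ValueError("no volume attached")
            self.volume[path] = data
        else:
            self.writable_layer[path] = data

    def remove(self):
        # Removing the container throws away its writable layer.
        self.writable_layer.clear()


orders_volume = {}  # stands in for storage managed outside the container

first = ToyContainer(volume=orders_volume)
first.write("/tmp/cache", "scratch data")
first.write("/data/orders.db", "important data", persistent=True)
first.remove()

# A fresh container attached to the same storage still sees the data.
second = ToyContainer(volume=orders_volume)
print("writable layer after removal:", first.writable_layer)
print("data visible to new container:", second.volume)
```

Anything written only to the writable layer is gone after removal, while the data written to the outside storage survives and is visible to the next container.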
What is Docker Hub?
Okay, if you made it this far, congrats! Here is one bonus topic for you: Docker Hub. Docker Hub is similar to GitHub. It is a hosted repository service from Docker where you can share and download container images. There are already tons of container images on Docker Hub that you can download and try on your system. Like GitHub, Docker Hub offers features such as private repositories, pushing and pulling container images, and more.
So, this was all about Docker, its uses, and its pros and cons. If you want to know more about it, you can read about it here
Data Scientist with 3+ years of experience in building data-intensive applications in diverse industries. Proficient in predictive modeling, computer vision, natural language processing, data visualization etc. Aside from being a data scientist, I am also a blogger and photographer.