
In short, the answer is yes. But let's look at some details first.

First things first
Docker is a very lightweight application engine that deploys VM-like containers. The containers share system-level resources, which allows easy deployment and multi-tenancy, but each one has its own network and process space, as well as a layered union-mount file system. Docker is written in Go. The three main components are:

  • docker client
  • docker daemon or server (REST API)
  • docker containers


How big is the docker container?
The docker container rootfs (loosely speaking, the operating system FS layer) and tmpfs can be of any size, depending on the service you are dockerizing. A small Python app can be under 1 MB, while a full-blown service can run to around 16 GB or whatever it is configured to be.

The docker image can be a few hundred megabytes, if that is what you are asking. Usually it takes fractions of a second to launch a container from an image.

Can I put a data analytics product in it?
Like I said, yes, you can. But let's break the problem down here.

  • Services - Docker centers around Service Oriented Architecture, or SOA. How will you reorganize your application into micro-level, self-sufficient services that can communicate with each other? Let's say you have a web app. You need the web engine server (WARs) to be dockerized, and there are plenty of examples on the internet of how to do this. You surely have a database instance, and that can live in a container. Then say you have a few daemons running for something: you need to make a call on where to put them. In short, the key design principle is to identify the services to dockerize. Then maybe start by writing your own dockerfile for one component and get the ball rolling from there.
  • Networking - Docker solves port conflicts in multi-tenancy by mapping ports. Each docker container has configurable ports exposed to the user, and these can be mapped statically or dynamically to physical ports on the host (a process abstracted by docker). Docker containers are also assigned IPs that are not discoverable outside the host. If your services are not colocated on one host, you might also need the host IPs, or you may have to configure the docker containers with unique, discoverable IPs.
  • The data - Docker does not work with all filesystems, so depending on how your files and other data are stored, this might become a tall order. But in general, you can expose a volume on the host, or even dockerize a volume, and make it available to dockerized services. So yes, data can be ported too.
  • Handling - You might want a resource manager like YARN to allocate containers. ZooKeeper or Consul can take care of failover. Consul has built-in support for configuration management too.
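
To make the networking and data points concrete, here is a minimal Python sketch that only builds `docker run` command lines with a port mapping (`-p host:container`) and a host volume (`-v host:container`). It does not execute anything. The `ServiceSpec` class, the image names, the ports and the paths are all hypothetical placeholders, not something from a real deployment.

```python
import shlex
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ServiceSpec:
    """Hypothetical description of one service to dockerize."""
    name: str
    image: str
    ports: Dict[int, int] = field(default_factory=dict)    # host port -> container port
    volumes: Dict[str, str] = field(default_factory=dict)  # host path -> container path


def docker_run_args(spec: ServiceSpec) -> List[str]:
    """Build the argument list for `docker run` without running it."""
    args = ["docker", "run", "-d", "--name", spec.name]
    for host_port, container_port in spec.ports.items():
        args += ["-p", f"{host_port}:{container_port}"]
    for host_path, container_path in spec.volumes.items():
        args += ["-v", f"{host_path}:{container_path}"]
    args.append(spec.image)
    return args


# Placeholder services: a web engine and a database with its data on the host.
web = ServiceSpec("web", "example/web-app", ports={8080: 8080})
db = ServiceSpec(
    "db",
    "example/database",
    ports={5432: 5432},
    volumes={"/srv/db-data": "/var/lib/data"},
)

for spec in (web, db):
    print(shlex.join(docker_run_args(spec)))
```

Keeping one spec per service mirrors the "identify the services to dockerize" step: each service gets its own image, its own ports and its own data volume.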


The price you pay

There is no free lunch! I am not talking here about limitations of Docker in general, such as monitoring, automation etc., but offering some insight into the design and development challenges likely to arise while dockerizing an application. I have written a more detailed answer on this topic: Neelanjana Dutta's answer to What are the technical key challenges you face when using Docker containers?

  • Service discovery - As you might have guessed already, where there is multi-tenancy, there is the problem of discovery. That's an added overhead to begin with, but in the long run it ensures good design, maintenance and performance. There is more good news, too: as docker is becoming the new big thing, service discovery apps are jumping on making themselves work with it. I have personally used ZooKeeper and Consul, and have heard about etcd. Consul works just fine and is very much tailored towards SOA and docker.
  • Missing features - As is evident in the very active docker Git community, there are a ton of feature requests and work in progress, for instance container self-registration and self-inspection. Currently, there are off-the-shelf clients that will serve the purpose.
  • Multiple processes - Let's say you have a monitoring daemon that continuously monitors all your services. You then need it in every container that runs one of your services. But docker is still maturing the multi-process aspect of a container. It is very good with microservices but not that great yet with a hybrid of SOA and legacy. Again, there are off-the-shelf solutions, such as a supervisor daemon, that can help you with this.
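
To illustrate what service discovery boils down to, here is a toy in-memory registry in Python. It is only a sketch of the idea (instances register themselves with a heartbeat, and clients look up the live ones); it is not the API of Consul, ZooKeeper or etcd, and the `ServiceRegistry` class, the addresses and the TTL value are all made up for the example.

```python
import time
from typing import Dict, List, Optional, Tuple

Address = Tuple[str, int]  # (host, port)


class ServiceRegistry:
    """Toy registry: real discovery tools are distributed and persistent."""

    def __init__(self, ttl_seconds: float = 10.0) -> None:
        self.ttl_seconds = ttl_seconds
        # service name -> {address: time of last heartbeat}
        self._instances: Dict[str, Dict[Address, float]] = {}

    def register(self, name: str, host: str, port: int,
                 now: Optional[float] = None) -> None:
        """Register an instance, or refresh its heartbeat if already known."""
        now = time.monotonic() if now is None else now
        self._instances.setdefault(name, {})[(host, port)] = now

    def lookup(self, name: str, now: Optional[float] = None) -> List[Address]:
        """Return the instances whose last heartbeat is within the TTL."""
        now = time.monotonic() if now is None else now
        seen = self._instances.get(name, {})
        return sorted(
            addr for addr, last in seen.items()
            if now - last <= self.ttl_seconds
        )


registry = ServiceRegistry(ttl_seconds=10.0)
# Two containers of the same service, on different hosts and mapped ports.
registry.register("db", "10.0.0.5", 49153)
registry.register("db", "10.0.0.6", 49154)
print(registry.lookup("db"))
```

The `now` parameter exists only so the expiry logic can be exercised without waiting; an instance that stops sending heartbeats simply drops out of `lookup` once its TTL passes.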

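For the multiple-processes point, this is roughly what a supervisor daemon does inside a container, reduced to a few lines of Python: start several child processes, watch them, and restart the ones that fail. It is a simplified sketch under my own assumptions (the `supervise` function, the restart limit and the sample commands are hypothetical), not a replacement for an off-the-shelf supervisor.

```python
import subprocess
import sys
import time
from typing import Dict, List


def supervise(commands: Dict[str, List[str]], max_restarts: int = 3,
              poll_interval: float = 0.05) -> Dict[str, int]:
    """Run all commands concurrently and restart any that exit non-zero.

    Each command is restarted at most max_restarts times. Returns how many
    times each command was started once every process has finished.
    """
    procs = {name: subprocess.Popen(cmd) for name, cmd in commands.items()}
    starts = {name: 1 for name in commands}
    while procs:
        for name, proc in list(procs.items()):
            code = proc.poll()
            if code is None:
                continue  # still running
            if code != 0 and starts[name] <= max_restarts:
                # Failed and restart budget remains: start it again.
                procs[name] = subprocess.Popen(commands[name])
                starts[name] += 1
            else:
                # Clean exit, or restart budget exhausted: stop watching it.
                del procs[name]
        time.sleep(poll_interval)
    return starts


# Stand-ins for a service and a monitoring daemon; one of them always fails.
service = [sys.executable, "-c", "pass"]
flaky_monitor = [sys.executable, "-c", "import sys; sys.exit(1)"]
print(supervise({"service": service, "monitor": flaky_monitor}, max_restarts=2))
```

A real supervisor would keep long-running daemons alive indefinitely and forward signals; the sketch stops as soon as every child has either exited cleanly or used up its restarts.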

Edit: Added some more details.
