The architecture of a web application can be as simple as one server with a LAMP stack, or as complex as hundreds of machines running clusters of different stacks.
I came across this question on Quora a few days ago – What does a web application architecture include?
The OP elaborated saying that he has a few keywords in his head, but has a hard time placing them in the big picture.
At first I thought to myself,
“man, this guy must be joking… that’s like asking what components go into a space shuttle – there’s tons of stuff!”
The next thought quickly followed:
“No one is going to answer this guy in enough detail”.
And that’s when I decided to dedicate ~15-20 minutes to writing up a lengthy answer that, hopefully, will satisfy the OP and give him a better understanding of the terms flying around.
One of the reasons I decided to put the effort into as full an answer as I could is that this isn’t the first time I’ve come across these types of questions. Web architecture and web technologies can be a very confusing topic, especially if all you’ve read is a tag cloud of buzzwords. Things don’t always add up in your head.
I shared my answer with my good friend Tom Goren who suggested I post this on the blog. So thanks to him, this is my answer:
Well first, by looking at your question it seems you have misunderstood the meaning of architecture from a web application point of view. Or maybe we just see it differently. Either way I will try to be as specific as I can both to your question and what I find architecture to be.
I see architecture as a flow diagram: from where the user enters all the way down to the CPU of the server and the power cord connected to it. The technologies, the methods, and how everything is arranged to form a complete product are what I think about when architecture comes to mind.
I’ll give you a top to bottom approach of the *common* stuff around today, while trying to address some of the stuff you brought into the question yourself so you can see where it stands. It’s going to be long, but bear with me and hopefully you’ll have a firm grasp when you’re done reading (I truly hope).
Here we go.
A Web Application architecture can vary greatly depending on the application at hand, its needs, its behavior and of course, the means at hand.
There are a lot of layers you can stack up when building a web application. Some are “mandatory” (like a front end, right? You have to be able to see and interact with something), while others are optional or added as needed.
Front-end / Client Side:
From an architecture point of view, the front-end files (HTML, CSS and JS) would be served up from various cache sources on a large web application; meaning the HTML might be served up from various “web accelerators” or reverse proxy caches (like Varnish), while the CSS and JS files can be served from a different source like a CDN (Akamai’s, Amazon’s CloudFront, Rackspace’s CDN or any other).
Back-end / Server side:
Below that, you have what’s normally called the “Backend” (or, depending on how deep your infrastructure is, a front backend: some complex web applications might have a backend to render views with a certain logic, and a “deeper” backend for more complex stuff like business logic).
The “Backend” is where you usually use a programming language (unlike HTML & CSS, which are markup and style sheet languages) such as PHP, Python, Ruby, Java or whatever else you fancy. The backend usually decides how to render the front-end depending on its business logic.
This is also where those “Frameworks” you mentioned come in. Think of a framework as a set of prewritten tools and scaffolding for common tasks, meaning you can use various parts of the framework to avoid writing your own. Also, if you do need to write your own solution to something, they can make it easier and simpler for you, making it less necessary to reinvent the wheel.
Each framework is language-specific, and they vary in how much they impose on the developer.
“Ruby on Rails” is a Ruby framework, “Django” is a Python framework, “Zend Framework” is a PHP framework and “Spring” is a Java framework, just to name a few.
Before a flame war starts: no framework is better than any other, they all have their advantages and drawbacks [but Django is the best :)]
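To make the “backend decides how to render the front-end” part a bit more concrete, here is a minimal sketch in Python using only the standard library and no framework at all. The names `render_greeting` and `app`, the `/` route and the `user` query parameter are all made up for illustration. The routing and response plumbing in `app` is roughly the kind of chore a framework would take off your hands.

```python
from html import escape
from urllib.parse import parse_qs


def render_greeting(name, logged_in):
    """Business logic decides which HTML the front end receives."""
    if logged_in:
        return f"<h1>Welcome back, {escape(name)}!</h1>"
    return "<h1>Hello, guest. Please log in.</h1>"


def app(environ, start_response):
    """A bare WSGI application: hand-written routing, no framework."""
    path = environ.get("PATH_INFO", "/")
    query = parse_qs(environ.get("QUERY_STRING", ""))

    if path == "/":
        # Hypothetical rule: a non-empty ?user=... counts as "logged in".
        name = query.get("user", [""])[0]
        status = "200 OK"
        body = render_greeting(name, logged_in=bool(name))
    else:
        status = "404 Not Found"
        body = "<h1>Not found</h1>"

    payload = body.encode("utf-8")
    start_response(status, [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]
```

If you want to poke at it in a browser, the standard library’s `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` should be enough for a local try-out (not for production).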
The backend layer might (I say might, because it really depends on the scale and performance you require) be using a cache layer of its own to cache certain data and avoid going “deeper” into the stack. Some caching servers/products are “Memcached” or “Redis”, for example [yes, Redis can also be used as a normal data store].
These are basically processes that your language of choice can communicate with and that are able to store and retrieve data in memory, meaning no disk I/O is needed, which makes them very fast to work with.
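Here is a rough sketch of how the backend typically uses such a cache: look in the cache first, and only go “deeper” on a miss. Everything here is a stand-in for illustration. `InMemoryCache` is a plain dictionary with expiry times pretending to be a cache server, and `SlowDatabase` is a hypothetical data source that just counts how often it gets hit.

```python
import time


class InMemoryCache:
    """Stand-in for a cache server (assumption: a dict with expiry times)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


class SlowDatabase:
    """Hypothetical data source; counts calls so the cache's effect is visible."""

    def __init__(self):
        self.calls = 0

    def load_profile(self, user_id):
        self.calls += 1
        return {"id": user_id, "name": f"user-{user_id}"}


def get_profile(cache, db, user_id, ttl_seconds=60):
    """Try the cache first; fall back to the database and remember the result."""
    key = f"profile:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    profile = db.load_profile(user_id)
    cache.set(key, profile, ttl_seconds)
    return profile
```

Calling `get_profile` twice for the same user hits the database once. With a real cache server the shape of the code stays the same, only the `get`/`set` calls go over the network to another process.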
In order for most web applications to function, they need to be able to store data somewhere. That somewhere is usually a database of some sort. Some big brands and household names are MySQL, PostgreSQL, Oracle, CouchDB, Redis and HBase, and the list is pretty endless.
Each one saves data in a different manner and offers different features. Some save tabled data (relational databases), while others save “paged” data, or even simple key-value pairs.
It really depends on what you need to save and how you want it saved.
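To show the difference in how data is saved, here is a small side-by-side sketch. It uses SQLite (bundled with Python) as the relational example and a plain dictionary as a stand-in for a key-value store. The `users` table, the `user:1` key and the sample names are all invented for the example.

```python
import json
import sqlite3

# Relational: rows in a table with fixed columns, queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO users (id, name, city) VALUES (?, ?, ?)",
    [(1, "Ann", "Oslo"), (2, "Bob", "Lima"), (3, "Cid", "Oslo")],
)
# The database can filter and sort on any column for you.
oslo_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM users WHERE city = ? ORDER BY name", ("Oslo",)
    )
]

# Key-value: an opaque value stored under a key (a dict stands in for the store).
kv_store = {}
kv_store["user:1"] = json.dumps({"name": "Ann", "city": "Oslo"})
# You get the value back by key; the store knows nothing about "city".
ann = json.loads(kv_store["user:1"])
```

The relational side can answer “who lives in Oslo?” directly, while the key-value side is very simple and fast as long as you already know the key you are after.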
Now your code needs to read and write data to the database… this is where the abstraction you were asking about comes into play and this also ties back to the frameworks we talked about earlier.
One way to create an abstraction is to use what’s generally called an ORM (object-relational mapping). This layer of code is supposed to separate the code that needs the data from the data source itself, thus enabling you to make changes to one without affecting the other.
One great example of this is Java’s JDO. Also, most modern frameworks supply their own ORMs with pluggable configurations for common database providers. Of course, each framework’s ORM is (generally) for working within that framework alone.
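A real ORM does far more than this, but the separation itself can be sketched with a tiny hand-rolled mapping layer. The `User` and `UserRepository` names are hypothetical and not taken from any framework. The point is that `greet` only ever sees `User` objects, so the SQL (or the whole database) could change without touching it.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class User:
    id: int
    name: str


class UserRepository:
    """Hypothetical mapping layer: the only place that knows about SQL."""

    def __init__(self, connection):
        self._conn = connection
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS users ("
            "id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def add(self, name):
        cursor = self._conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
        self._conn.commit()
        return User(id=cursor.lastrowid, name=name)

    def get(self, user_id):
        row = self._conn.execute(
            "SELECT id, name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        # Map the raw row to an object, or None when nothing matched.
        return User(*row) if row else None


def greet(repo, user_id):
    """Application code: works with User objects, never with SQL."""
    user = repo.get(user_id)
    return f"Hello, {user.name}" if user else "Hello, stranger"
```

Usage would look something like `repo = UserRepository(sqlite3.connect(":memory:"))`, then `repo.add("Ann")` and `greet(repo, 1)`.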
The server – Software:
Now you’ve got everything working, but something needs to be “answering calls” and serving up what you have to offer.
Everything could be running on a single machine, or on thousands of separate machines – this really depends on how your architecture is planned, and how much you need it – which is “the scale”.
The server’s role is to accept the “request” for some action and run it. Once it runs “it” (could be a database query, a dynamic script or a request for an image on the disk), it sends the result back. It serves.
For this, like everything, various vendors exist: Apache, Nginx, Tomcat... the list is endless. You can also play around with combinations of these various servers on various layers depending on your needs.
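The accept, run, send back loop can be seen in miniature with Python’s built-in `http.server` module. This is only a toy to show the idea and nowhere near what the servers named above do for you. The `/hello` path, the `HelloHandler` class and the `make_server` helper are made up for this sketch.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # "Run it": decide what this request should produce.
        if self.path == "/hello":
            status, body = 200, b"hello from the server\n"
        else:
            status, body = 404, b"not found\n"

        # "Send it back": status line, headers, then the body.
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet; the default logs every request to stderr


def make_server(port=0):
    """Bind to localhost; port 0 asks the OS for any free port (an assumption
    made so the sketch is easy to try without clashing with other programs)."""
    return HTTPServer(("127.0.0.1", port), HelloHandler)
```

To try it, call `server = make_server(8000)` followed by `server.serve_forever()`, then open `http://127.0.0.1:8000/hello` in a browser.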
The server – Hardware:
Everything we talked about has to be somewhere; a CPU somewhere has to process it.
This is basically a computer, not very different from the one you’re using now, only it has its muscles where it needs them.
It can be physical or virtual depending on your needs – but it is somewhere, in a data center, connected to the internet (hopefully) by the internet backbone.
Stateful / Stateless:
This isn’t so much an “is there a tool for that” question. The way you design and code is what determines whether your architecture is stateless or not. This property may also vary between different points of your architecture. For example, you might have stateless web servers for your guest visitors, but have stateful requests for your logged-in users.
A stateless design is usually easier to scale.
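Here is one way to picture the difference, as a sketch and not a recipe for real authentication. In the stateful version the server remembers who you are in its own memory (`sessions`), so every request has to reach a server that holds that memory. In the stateless version the request itself carries a signed token, so any server that knows the secret can handle it. `SECRET_KEY` and all the function names are invented for this example.

```python
import hashlib
import hmac
import secrets

SECRET_KEY = b"example-only-secret"  # assumption: shared by all servers

# Stateful: the server keeps the state in its own memory.
sessions = {}


def stateful_login(username):
    session_id = secrets.token_hex(16)
    sessions[session_id] = username
    return session_id


def stateful_whoami(session_id):
    # Only works on a server that still holds this session.
    return sessions.get(session_id)


# Stateless: the state travels with the request, protected by a signature.
def _sign(username):
    return hmac.new(SECRET_KEY, username.encode("utf-8"), hashlib.sha256).hexdigest()


def stateless_login(username):
    return f"{username}:{_sign(username)}"


def stateless_whoami(token):
    username, _, signature = token.rpartition(":")
    expected = _sign(username)
    if username and hmac.compare_digest(
        signature.encode("utf-8"), expected.encode("utf-8")
    ):
        return username
    return None  # missing, malformed, or tampered token
```

If you wipe `sessions` (think: a restart, or the request landing on a different machine), the stateful login is gone while the stateless token still checks out. That is a big part of why stateless designs are easier to spread over many servers.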
Summary – A final word:
Back to architecture: from all the examples I gave, the technologies you choose and how you put them together are what make your architecture.
You can mix n’ match different technologies and vendors to create various “stacks”, you can put up multiple sets of different stacks to combine into one big architecture – the possibilities are literally endless.
I hope I was able to put a little order into the chaos.
If you want to read up on the subject, check out these recommended books to get you started: