The only goal for a startup is to grow as fast as possible. In that context, the most difficult part is, as always, attracting and retaining customers. As a startup grows, providing a stable and reliable service becomes absolutely crucial, and not every system can grow easily. Today I want to share with you best practices that can increase the current capacity of your platform by 1000x or even more.
“A chain is only as strong as its weakest link.” As with a chain, when we talk about a website there are several things that can fail once the traffic increases. It could be:
– Inbound/outbound network traffic
– CPU load
– RAM load
– Database access (local or remote)
– Hard drive I/O (usually for the database)
– External services
Let’s analyze each one and see what improvement can be made in each area.
The network limit can be reached easily if your service provides high-bandwidth content like software, video or audio. There are two technical ways to improve this kind of performance issue.
The CPU is the best performance issue to have, since server capacity is easy to increase. Nevertheless, a website or an API server should not carry a heavy CPU load. Web servers and API servers are built to read values from and write values to the database, change a few of them and send them back to the customers. Big PDF generation, video transcription and file conversion shouldn’t be performed by a web server but by a dedicated app server. This is a classic issue that I see in growing start-ups.
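To illustrate the idea of keeping heavy work off the web server, here is a minimal sketch. It assumes an in-process queue and a thread standing in for a real job queue and a dedicated app server; `handle_request`, `convert_file` and `worker` are hypothetical names of mine, not part of any framework.

```python
import queue
import threading

# Hypothetical in-process stand-in for a job queue shared between
# the web server and a dedicated app server.
jobs = queue.Queue()
results = {}


def convert_file(name):
    # Placeholder for CPU-heavy work (PDF generation, file conversion...).
    return name.upper() + ".pdf"


def handle_request(job_id, name):
    """Web-server side: enqueue the work and answer immediately."""
    jobs.put((job_id, name))
    return {"status": "accepted", "job_id": job_id}


def worker():
    """App-server side: pull jobs from the queue and do the heavy lifting."""
    while True:
        item = jobs.get()
        if item is None:  # sentinel value: stop the worker
            jobs.task_done()
            break
        job_id, name = item
        results[job_id] = convert_file(name)
        jobs.task_done()


def run_demo():
    thread = threading.Thread(target=worker)
    thread.start()
    response = handle_request(1, "report")
    jobs.put(None)  # ask the worker to stop once the queue is drained
    jobs.join()
    thread.join()
    return response, results[1]
```

The web side only enqueues and replies, so its response time no longer depends on how long a conversion takes. In a real setup the queue and the worker would live on separate machines.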
In the beginning, it’s quite normal to build an app as fast as possible, so we just put the code into the web server. But at some point the traffic increases. Suppose, for instance, that a file conversion takes 10 seconds at 100% CPU load. As soon as we get 10 conversions in parallel, which is not that many by the way, the server will take 100 seconds to deliver the content, and this is usually more than the classic 60-second timeout. The problems start from there. There are two ways to avoid these problems: share the workload between several servers, or delegate the heavy processes to a dedicated app server.
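To make the arithmetic above concrete, here is a small sketch. It assumes purely CPU-bound, single-threaded jobs that start together and share the available cores equally; the function name and the `cores` parameter are my own additions for illustration.

```python
TIMEOUT_SECONDS = 60  # the classic timeout mentioned above


def time_to_finish(jobs, cpu_seconds_per_job, cores=1):
    """Seconds until every job is done when the jobs share the CPU equally.

    With more jobs than cores, each job only gets a fraction of a core,
    so everything finishes late together.
    """
    return cpu_seconds_per_job * max(1.0, jobs / cores)


def exceeds_timeout(jobs, cpu_seconds_per_job, cores=1):
    return time_to_finish(jobs, cpu_seconds_per_job, cores) > TIMEOUT_SECONDS


print(time_to_finish(10, 10))            # 100.0 seconds on a single core
print(exceeds_timeout(10, 10))           # True: past the 60-second timeout
print(time_to_finish(10, 10, cores=4))   # 25.0 seconds with four cores
```

Under these assumptions, the same ten conversions fit within the timeout once the work is spread over more cores or more servers.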
Note: Before you start sharing connections between several servers, make sure you have a shared session system; otherwise customers will be disconnected whenever their connection switches from one server to another. The best way is to store the sessions in a dedicated database. Any database will work, but it can be easier to use a NoSQL database or an in-memory store like Redis.
RAM consumption is much like CPU consumption; in fact, a process mainly consumes CPU and RAM. If your server doesn’t have enough RAM, then exactly as with the CPU issue, you should share the workload between several servers and consider delegating the big RAM-consuming processes to a dedicated app server.
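Here is a minimal sketch of why the shared session store matters. It assumes an in-memory class standing in for the dedicated session database (Redis or otherwise); `SharedSessionStore` and `WebServer` are hypothetical names used only for this illustration.

```python
import time
import uuid


class SharedSessionStore:
    """In-memory stand-in for a dedicated session database (e.g. Redis)."""

    def __init__(self, ttl_seconds=1800, clock=time.monotonic):
        self._data = {}
        self._ttl = ttl_seconds
        self._clock = clock

    def create(self, user):
        session_id = uuid.uuid4().hex
        self._data[session_id] = (user, self._clock() + self._ttl)
        return session_id

    def get(self, session_id):
        entry = self._data.get(session_id)
        if entry is None:
            return None
        user, expires_at = entry
        if self._clock() >= expires_at:  # expired: forget the session
            del self._data[session_id]
            return None
        return user


class WebServer:
    """Hypothetical web server that keeps no session state of its own."""

    def __init__(self, name, sessions):
        self.name = name
        self.sessions = sessions

    def login(self, user):
        return self.sessions.create(user)

    def whoami(self, session_id):
        user = self.sessions.get(session_id)
        if user is None:
            return f"{self.name}: please log in"
        return f"{self.name}: {user}"


store = SharedSessionStore()
server_a = WebServer("server-a", store)
server_b = WebServer("server-b", store)
isolated = WebServer("server-c", SharedSessionStore())  # its own private store

sid = server_a.login("alice")
print(server_b.whoami(sid))  # server-b: alice
print(isolated.whoami(sid))  # server-c: please log in
```

The customer logs in on one server and is still recognized by the other, because both read the same store. The server with a private store is the "constantly disconnected" case.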
This part, I think, is the trickiest one. The database is usually the heart of any information system, the only place where we store data. Because of that, a database is a lot more complicated to share across several servers: every server must see exactly the same data, and it’s difficult to keep updates in sync in real time across several servers. So the first thing to do when your app needs to scale is to identify which requests are the most frequent and which feature or function they serve. As we discussed previously, sessions shouldn’t be on the same database system as the main application. Sessions generate queries on almost every user request, which makes them a huge database consumer. In addition, if your application manages a huge quantity of simple information, like http://goo.gl links, then you shouldn’t store it in the main database; use a dedicated NoSQL database instead. Another option is a cache database like Redis: besides sessions, if your application needs to store a lot of temporary information, a Redis database is a great way to decrease your main database’s workload.
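As a rough illustration of how a cache takes load off the main database, here is a sketch of the cache-aside pattern. The pattern is my choice of technique, not something prescribed above; `TTLCache` is an in-memory stand-in for a store like Redis and `MainDatabase` is a hypothetical class that only counts the queries it receives.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache database such as Redis."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._data = {}
        self._ttl = ttl_seconds
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:  # stale entry: drop it
            del self._data[key]
            return None
        return value

    def set(self, key, value):
        self._data[key] = (value, self._clock() + self._ttl)


class MainDatabase:
    """Hypothetical main database that counts the queries it receives."""

    def __init__(self, rows):
        self.rows = rows
        self.queries = 0

    def fetch(self, key):
        self.queries += 1
        return self.rows.get(key)


def cached_fetch(db, cache, key):
    """Cache-aside: try the cache first, fall back to the database."""
    value = cache.get(key)
    if value is None:
        value = db.fetch(key)
        if value is not None:
            cache.set(key, value)
    return value


db = MainDatabase({"abc": "https://example.com/a-very-long-url"})
cache = TTLCache(ttl_seconds=60)
for _ in range(1000):
    cached_fetch(db, cache, "abc")
print(db.queries)  # 1: the other 999 lookups never reached the database
```

The trade-off is freshness: for up to `ttl_seconds` the cache may serve a value that has since changed in the main database, which is acceptable for temporary or rarely changing information.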
Now, if you’ve done everything possible to reduce your database workload, it’s time to think about a database cluster. There are many ways to share a database across several servers by splitting data or even tables across them. Whether you use MySQL, PostgreSQL, SQL Server or, even better, Oracle, you will find many options, and they will require a database expert to set up.
The other, and last, way to increase your database capacity is obviously to increase the server capacity with bigger CPUs, more RAM and much faster input/output access to the hard drive.
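To give a feel for what "splitting data across several servers" means, here is a minimal sketch of hash-based sharding done at the application level. This is only one possible approach and an assumption of mine; real clusters usually rely on the database’s own partitioning and replication features. `shard_for` and `ShardedStore` are hypothetical names, and plain dictionaries stand in for the servers.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard index in a stable, repeatable way."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Each dictionary stands in for one database server."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(shard_count=4)
for i in range(100):
    store.put(f"user:{i}", {"id": i})

print(store.get("user:42"))              # {'id': 42}
print([len(s) for s in store.shards])    # how the 100 rows were spread
```

Note that changing the number of shards moves most keys to a different shard, which is one of the reasons a real setup calls for a database expert.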
Today, to build a new website or application, I recommend using as many external services as possible. Whether it is for audio/video conversion, sending emails, etc., many sub-features are an entire field of expertise, and using someone else’s expertise is easier than rebuilding a system from scratch, especially when the only thing that matters is serving your customers, not wasting time and resources on building what someone else has already built. “Don’t reinvent the wheel.”
Nevertheless, pay attention to the subcontractor’s capacity. If a sub-service is unavailable, it’s usually your entire service that is impacted, so be sure to use reliable and well-known external services. For example, this is one of the reasons why it’s a good idea to use services like YouTube or Vimeo Pro.
This last part is an entire topic in itself, on which experts have written entire books. You just need to know that it’s unfortunately quite easy to overload a web server until it crashes. Just be aware of it. In addition to increasing your service capacity, deploy a web application firewall and QoS firewall rules to protect your service.
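As one example of the kind of rule a QoS layer can enforce, here is a sketch of a token-bucket rate limiter. The token bucket is my choice of illustration, not a description of any particular firewall, and the `TokenBucket` class with its parameters is hypothetical. In practice this logic usually lives in the firewall or reverse proxy rather than in application code.

```python
import time


class TokenBucket:
    """Allow short bursts, but cap the sustained request rate."""

    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self):
        """Return True if the request may pass, False if it must be rejected."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill according to elapsed time, never above the capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

A usage sketch, assuming one bucket per client address: `buckets.setdefault(ip, TokenBucket(10, 20)).allow()` would let a client burst up to 20 requests and then hold it to about 10 per second.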
First of all, thank you very much for taking the time to read this article. It would be amazing if you could let me know if this has been helpful to you and your startup.
What does it make you think of?
Are you currently facing some of these common issues?
Whatever the case might be, I’d be more than happy to answer any of your questions or comments, so don’t hesitate, reach out and I’ll answer you as soon as possible.