Friday, November 21, 2014

Service API: Throughput and Latency

There are at least two things you should consider when designing scalable Service APIs: throughput and latency. Throughput is the number of requests a service can process per unit of time, for example per second. Latency is the time it takes to process a single request. That much is obvious, thank you, Cap! But how are these two related to scalability?
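To make the relationship concrete, here is a back-of-the-envelope calculation using Little's Law (a standard queueing result, not something from this post); the numbers are invented for illustration:

```python
# Little's Law (a standard queueing result, not from the post):
#   concurrency = throughput * latency
latency_s = 0.050       # 50 ms to process a single request
concurrency = 8         # requests the service handles in parallel
throughput = concurrency / latency_s
print(f"throughput = {throughput:.0f} requests/second")  # 160 req/s
```

At a fixed latency, the only way to raise throughput is to handle more requests in parallel, which is exactly what scaling out does.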

Well, throughput can, in theory, be improved by scaling out your service: add more instances and your throughput grows. I say "in theory" because your multiple instances often end up waiting on each other for some shared resource, a database for example. That's why it is shouted from every corner that scalable architecture should be built into the product from the beginning.
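As a rough illustration of that bottleneck, here is a toy model (my own sketch; the timings and the serialized-database assumption are invented for the example) where per-instance capacity scales linearly but a shared database caps total throughput:

```python
# Toy model: each request needs 10 ms of instance CPU plus 5 ms on a
# shared, serialized database. Per-instance capacity scales linearly,
# but the shared database caps total throughput.
cpu_ms, db_ms = 10, 5
db_capacity = 1000 / db_ms                    # 200 req/s through the database
for instances in (1, 2, 4, 8, 16):
    instance_capacity = instances * 1000 / cpu_ms
    throughput = min(instance_capacity, db_capacity)
    print(f"{instances:2d} instances -> {throughput:.0f} req/s")
```

After two instances the database is the bottleneck, and further instances buy nothing.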

Latency is another story. You can't reduce latency by scaling out. Of course, if your system is overloaded, latency goes up; you add instances and latency drops back down. But you can't push it below a certain floor no matter how many instances you add. That's what I'm talking about!
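Here is a crude sketch of that floor (my assumption, using an M/M/1-style approximation per instance; none of these numbers come from the post): adding instances shrinks queueing delay, but the fixed per-request service time remains.

```python
# Crude M/M/1-style approximation: latency = service time + queueing delay.
# Queueing delay shrinks as instances are added; service time does not.
service_ms = 50.0                       # time to actually process one request
arrival_rate = 300.0                    # incoming requests per second
per_instance_rate = 1000.0 / service_ms # 20 req/s per instance

for instances in (16, 20, 30, 50, 100):
    utilization = arrival_rate / (instances * per_instance_rate)
    queueing_ms = service_ms * utilization / (1 - utilization)
    print(f"{instances:3d} instances: latency = {service_ms + queueing_ms:6.1f} ms")
```

Latency approaches the 50 ms service time asymptotically; no instance count takes it lower.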

So the moral of the post: do not add latency "by design" - you can't mitigate it later by scaling out.

But what does this have to do with Service APIs? How can we add latency "by design" to a Service API?
To be continued...
