Experimental Analysis of Server-Side Caching for Web Performance
Performance is a key determinant of user experience and system scalability in web applications, and caching is among the most widely adopted techniques for improving it. While caching has been extensively explored in the web performance optimization literature, there is little experimental work on the effect of simple in-memory caching in small-scale web applications. This paper addresses that gap by experimentally comparing two server-side web application configurations: one without caching and another with in-memory caching and a fixed time-to-live. The evaluation was conducted using a lightweight web server framework, and response times were measured using repeated HTTP requests under identical environmental conditions. The results show a significant reduction in response time for cached requests. The findings provide insight into the effectiveness of simple server-side caching for improving web application performance, making it suitable for educational environments and small-scale web applications where simplicity and reproducibility are critical.
💡 Research Summary
The paper presents a controlled experimental study that quantifies the impact of a simple in‑memory server‑side cache on the response time of a small‑scale web application. Two functionally identical Node.js/Express services were built: one without any caching (Server A) and one that stores the result of a simulated heavy computation in a RAM‑based cache with a fixed time‑to‑live (TTL) of 30 seconds (Server B). Both services expose the same HTTP GET endpoint, and the “heavy computation” is implemented as an artificial delay to mimic database access or intensive business logic.
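The cached configuration (Server B) can be sketched as a small TTL cache wrapped around the expensive computation. The sketch below is illustrative only: the function and variable names are not taken from the paper's repository, and the injectable clock is an addition for testability; only the 30‑second TTL and the hit/miss behavior come from the paper's description.

```javascript
// Illustrative sketch of a fixed-TTL in-memory cache (names are
// assumptions, not the paper's actual code). A value produced by an
// expensive function is stored with a timestamp; lookups within the
// TTL return the cached copy (hit), later lookups recompute (miss).

const TTL_MS = 30 * 1000; // 30-second time-to-live, as in the paper

function makeTtlCache(compute, ttlMs = TTL_MS, now = Date.now) {
  let entry = null; // { value, storedAt } or null before first request
  return function get() {
    const t = now();
    if (entry !== null && t - entry.storedAt < ttlMs) {
      return { value: entry.value, hit: true };  // cache hit: no recompute
    }
    entry = { value: compute(), storedAt: t };   // cache miss: recompute
    return { value: entry.value, hit: false };
  };
}
```

In an Express handler this wrapper would replace the direct call to the heavy computation, so only the first request inside each 30‑second window pays the full latency, matching the one‑miss‑then‑hits pattern reported in the results.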
The experimental methodology is deliberately minimalist to ensure reproducibility. All tests were run on the same physical machine under identical operating system, hardware, and runtime conditions. Ten sequential GET requests were sent to each server using Postman, with a fixed interval of five seconds between requests. Server‑side timestamps were recorded at the start and end of request handling, eliminating client‑side network latency from the measurements. Cache hit and miss events were logged, and the complete source code is publicly available on GitHub (https://github.com/developer-umar/Caching_experiment).
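The server-side measurement described above can be approximated as follows. This is a minimal sketch, not the repository's code: the busy-wait stands in for the paper's artificial delay, and the 50 ms work duration is an arbitrary placeholder, not the paper's actual delay value.

```javascript
// Sketch of server-side timing: timestamps are taken at the start and
// end of request handling, so the measured duration excludes
// client-side network latency, as in the paper's methodology.

function simulateHeavyComputation(ms) {
  // Busy-wait stand-in for database access or intensive business logic.
  const deadline = Date.now() + ms;
  while (Date.now() < deadline) { /* burn CPU to mimic expensive work */ }
  return 'payload';
}

function timedHandler(workMs) {
  const start = Date.now();                 // server-side start timestamp
  const body = simulateHeavyComputation(workMs);
  const elapsedMs = Date.now() - start;     // server-side end timestamp
  return { body, elapsedMs };
}
```

Logging `elapsedMs` per request (together with a hit/miss flag) yields exactly the data series the paper reports: a stable high latency for the uncached server, and a single slow request followed by fast ones for the cached server.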
Results show a stark contrast between the two configurations. Server A, which recomputes the payload on every request, exhibits a relatively stable response time of roughly 200–250 ms across all ten requests. Server B displays the same latency on the first request (cache miss) but drops dramatically to an average of 10–15 ms for the subsequent nine requests (cache hits). This corresponds to a 90–95 % reduction in response time after the cache is populated. The hit ratio after the initial miss is 80–90 %, confirming that the TTL‑based cache remains effective throughout the test window. Memory consumption for the cached data is negligible (a few kilobytes), and CPU utilization drops proportionally when the cache is hit.
The discussion emphasizes that even a rudimentary in‑memory cache can yield substantial performance gains in single‑instance, low‑complexity environments. The authors argue that developers of educational projects, prototypes, or small production services can achieve noticeable latency improvements without deploying heavyweight distributed caching solutions such as Redis or Memcached. However, they also acknowledge inherent limitations: the cache is volatile and loses its contents on server restart; it does not scale to multi‑instance or clustered deployments where cache coherence becomes a concern; and the experimental workload, being a fixed artificial delay, does not capture real‑world traffic patterns, concurrency, or data volatility.
Limitations listed in the paper include: (1) the exclusive focus on a single‑node setup, (2) the absence of persistence or sophisticated eviction policies beyond a static TTL, and (3) the use of a synthetic, low‑concurrency workload that may not reflect production traffic spikes.
In conclusion, the study validates that lightweight server‑side caching is an effective, easy‑to‑implement optimization for reducing redundant computation and improving response times in small‑scale web applications. The authors propose several avenues for future work: extending the experiments to multi‑instance or container‑orchestrated environments, exploring different TTL values and eviction strategies, measuring additional resource metrics such as memory footprint and CPU load, and subjecting the system to high‑concurrency, realistic request mixes. Such extensions would help delineate the boundary where simple in‑memory caching remains beneficial and where a transition to distributed caching becomes necessary.