Protecting Online Documents from Unauthorized External Access (in Bulgarian)
Modern multi-tier web applications and information systems store and process various types of data. Some data are kept in a database controlled by an external database management system, while other data are stored directly in the server’s file system. The database is secured by the database management system itself, but it is the programmer’s responsibility to design and implement security protection for the files managed by the information system. This paper summarizes existing rules and suggests new ones for the design and implementation of in-depth security protection of file resources published on the Internet against unauthorized external access. The paper is in Bulgarian.
💡 Research Summary
The paper addresses a critical yet often overlooked aspect of modern multi‑tier web applications: the security of files stored directly on the server’s file system. While relational databases benefit from built‑in authentication, authorization, and access control mechanisms provided by the DBMS, files that are served over HTTP lack comparable protection. Consequently, documents placed under the web root are vulnerable to a range of external attacks, including directory‑traversal, arbitrary file access, and URL‑guessing.
The authors begin by reviewing existing protection techniques. Traditional methods rely on web‑server configuration (e.g., Apache .htaccess, Nginx location blocks) to block direct access, on operating‑system file permissions (UNIX mode bits, ACLs) to restrict the web‑process user, and on naming conventions such as random or hashed filenames to make URLs hard to predict. Additional safeguards include MIME‑type validation, file‑size limits, antivirus scanning at upload time, mandatory HTTPS, and minimal client‑side caching. Although each of these measures contributes to security, they are often applied in isolation, leading to management complexity, inconsistent enforcement, and performance penalties.
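As a concrete instance of the web-server-configuration approach mentioned in the review, a typical Nginx rule can block all direct URL access to an uploads directory and allow delivery only through the application. The paths and location name below are illustrative, not taken from the paper:

```nginx
# Deny direct URL access to uploaded files; only the application
# (via an internal redirect such as X-Accel-Redirect) may trigger
# delivery from this location.
location /protected-uploads/ {
    internal;                      # reachable only through internal redirects
    alias /var/appdata/files/;     # physical storage, outside the public web root
}
```

An external request to `/protected-uploads/report.pdf` then returns 404, while the application can still instruct Nginx to stream the file after its own authorization check.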
To overcome these shortcomings, the paper proposes a comprehensive “multi‑layer file protection model” consisting of four interlocking tiers: physical isolation, application‑level authentication/authorization, transmission‑level encryption, and audit/monitoring.
- Physical Isolation – Sensitive files are stored outside the web‑root directory. The web server never serves them directly; instead, the application layer reads the files and streams them to the client, or the files are delegated to a dedicated object store (e.g., Amazon S3, Azure Blob). This eliminates the possibility of a simple URL reaching the file system.
- Application‑Level Authentication/Authorization – Access to files is mediated by a secure API. The API validates session tokens, JWTs, or OAuth2 access tokens and enforces role‑based or attribute‑based access control. Files are referenced by logical identifiers (IDs) rather than by path, and the mapping between IDs and physical locations is kept internal to the server.
- Transmission‑Level Encryption – All file transfers occur over TLS. For highly confidential documents, the server encrypts the file payload with a strong symmetric cipher (e.g., AES‑256) before streaming. Additionally, the system can generate time‑limited signed URLs (similar to pre‑signed S3 URLs) that expire after a configurable interval, preventing reuse of leaked links.
- Audit and Monitoring – Every request to the file‑serving API is logged with user ID, IP address, timestamp, outcome (success/failure), and token details. Logs are forwarded to a centralized SIEM platform where anomaly detection rules flag suspicious patterns such as repeated failed attempts, access from unusual geolocations, or abnormal request rates. Real‑time alerts enable rapid response and automated blocking.
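The first two tiers can be sketched in a few lines: a server-side table maps random logical IDs to physical paths outside the web root, and a gate function enforces role-based access before anything is streamed. This is a minimal illustration, not the paper's implementation; the names `FILE_TABLE`, `register_file`, and `serve_file` are assumptions introduced here:

```python
# Minimal sketch of physical isolation + application-level authorization.
# The ID-to-path table lives only on the server; clients never see paths.
import os
import uuid

STORAGE_ROOT = "/var/appdata/files"   # hypothetical location outside the web root
FILE_TABLE = {}                       # logical ID -> (physical path, required role)

def register_file(physical_name: str, required_role: str) -> str:
    """Store the mapping under a random logical ID; the path never leaves the server."""
    file_id = uuid.uuid4().hex
    FILE_TABLE[file_id] = (os.path.join(STORAGE_ROOT, physical_name), required_role)
    return file_id

def serve_file(file_id: str, user_roles: set):
    """Resolve the logical ID and enforce role-based access before streaming."""
    entry = FILE_TABLE.get(file_id)
    if entry is None:
        return 404, None              # unknown ID: no path disclosure at all
    path, required_role = entry
    if required_role not in user_roles:
        return 403, None              # authenticated but not authorized
    return 200, path                  # the caller then streams the file over TLS
```

In a real deployment the role check would consult the validated session token or JWT claims, and the streaming step would run behind the kind of internal-redirect setup described above.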
The paper codifies a set of concrete rules derived from the model:
- Path Mapping Rule – External URLs never expose the underlying file system path; a server‑side lookup table resolves logical IDs to physical locations.
- Filename Rule – User‑provided names are never used directly; uploaded files are renamed to UUIDs or SHA‑256 hashes.
- API Call Rule – File retrieval endpoints accept only POST requests with CSRF tokens, eliminating unsafe GET‑based downloads.
- Cache Control Rule – Browser caching is minimized; when a CDN is employed, signed URLs enforce short‑lived access.
- Periodic Validation Rule – Automated scripts regularly verify file integrity (e.g., checksum comparison) and audit permission settings.
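The time-limited signed URLs behind the Cache Control Rule can be built with a plain HMAC over the file ID and an expiry timestamp. The sketch below uses only the standard library; the secret, URL layout, and function names are assumptions for illustration, not the paper's scheme:

```python
# Hedged sketch of short-lived signed download links.
import hashlib
import hmac
import time

SECRET_KEY = b"server-side-secret"    # placeholder; kept server-side, never sent to clients

def sign_url(file_id: str, ttl_seconds: int, now=None) -> str:
    """Return a link valid for ttl_seconds; 'now' is injectable for testing."""
    expires = int((now if now is not None else time.time()) + ttl_seconds)
    payload = f"{file_id}:{expires}".encode()
    sig = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return f"/files/{file_id}?expires={expires}&sig={sig}"

def verify(file_id: str, expires: int, sig: str, now=None) -> bool:
    """Reject expired links and any link whose signature does not match."""
    if (now if now is not None else time.time()) > expires:
        return False                  # the link has expired
    payload = f"{file_id}:{expires}".encode()
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)   # constant-time comparison
```

Because the expiry is covered by the signature, a client cannot extend a leaked link by editing the `expires` parameter, which is exactly the property the model relies on when a CDN sits in front of the files.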
A practical evaluation was conducted in a corporate environment. Files were moved outside the web root, and the new API streamed them securely. Penetration tests that previously succeeded with directory‑traversal payloads were completely blocked. Signed URLs expired as intended, preventing long‑term leakage. The audit pipeline detected a burst of failed access attempts from an external IP and automatically blocked the source, demonstrating the effectiveness of real‑time monitoring.
In conclusion, the authors argue that achieving database‑level security for file resources requires a defense‑in‑depth strategy rather than a single line of defense. By integrating physical isolation, robust authentication/authorization, encrypted transmission, and systematic logging, the proposed framework provides a realistic, implementable roadmap for developers and operations teams. The paper also outlines future work, including automated permission verification tools and machine‑learning‑based anomaly detection to further strengthen file‑access security.