An Analogy Based Method for Freight Forwarding Cost Estimation

The author explored estimation by analogy (EBA) as a means of estimating the cost of international freight consignment. A version of the k-Nearest Neighbors algorithm (k-NN) was tested by predicting j

An Analogy Based Method for Freight Forwarding Cost Estimation

The author explored estimation by analogy (EBA) as a means of estimating the cost of international freight consignment. A version of the k-Nearest Neighbors algorithm (k-NN) was tested by predicting job costs from a database of over 5000 actual jobs booked by an Irish freight forwarding firm over a seven year period. The effect of a computer intensive training process on overall accuracy of the method was found to be insignificant when the method was implemented with four or fewer neighbors. Overall, the accuracy of the analogy based method, while still significantly less accurate than manually working up estimates, might be worthwhile to implement in practice, depending labor costs in an adopting firm. A simulation model was used to compare manual versus analytical estimation methods. The point of indifference occurs when it takes a firm more than 1.5 worker hours to prepare a manual estimate (at current Irish labor costs). Suggestions are given for future experiments to improve the sampling policy of the method to improve accuracy and to improve overall scalability.


💡 Research Summary

The paper investigates the feasibility of using Estimation by Analogy (EBA) to automate cost estimation for international freight forwarding. Drawing on a real‑world dataset of more than 5,000 jobs recorded over a seven‑year period by an Irish freight forwarding company, the authors implement a k‑Nearest Neighbors (k‑NN) model to predict the cost of a new shipment based on the most similar historical cases. Each record contains twelve attributes – origin and destination countries, mode of transport (sea, air, road), commodity type (general, hazardous, refrigerated, etc.), weight, volume, shipping date, customer, and the final invoiced amount – which are pre‑processed through imputation, one‑hot encoding, and z‑score normalization.

The experimental protocol varies the number of neighbors (k = 1 … 10) and evaluates performance using Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and R² via five‑fold cross‑validation. The results show that the lowest MAPE (≈ 22 %) is achieved with k = 3 or 4; beyond k = 5 the error rises, indicating that larger neighbor sets dilute the influence of high‑cost outliers such as urgent air shipments. A separate test compares a “computationally intensive” training phase (pre‑computing distance matrices and caching) with a lightweight, on‑the‑fly distance calculation. For k ≤ 4 the two approaches differ by less than 0.3 % in MAE, demonstrating that the simpler, less resource‑hungry implementation is sufficient for practical deployment.

To assess economic viability, the authors construct a simulation model that juxtaposes manual estimation (average time 1.0–2.5 hours per quote, Irish labor cost €30 / hour) against the automated EBA (estimated annual server and maintenance cost €5,000). The model identifies a “point of indifference” at roughly 1.5 hours of manual effort per quote: if a firm spends more than this per estimate, the automated method yields net cost savings. Consequently, in environments where skilled estimators are scarce or labor rates are high, EBA can be a financially attractive supplement.

The study acknowledges several limitations. The sampling policy is essentially random; it does not prioritize cases that are most informative for cost variance (e.g., customs duty spikes, seasonal demand, special handling requirements). The authors propose future work in three main directions: (1) weighted k‑NN where attributes that historically drive cost volatility receive higher influence; (2) a clustering‑first approach that narrows the candidate pool before distance computation, thereby improving scalability; and (3) benchmarking against more sophisticated machine‑learning techniques such as Random Forests, Gradient Boosting Machines, and deep neural networks. They also suggest extending the dataset to multiple countries and carriers to test generalizability.

In conclusion, the paper demonstrates that a straightforward k‑NN based analogue estimator, trained on a substantial real‑world freight database, can achieve reasonable accuracy—though still inferior to expert manual quotes—and can become cost‑effective when manual estimation exceeds 1.5 hours per job. The method is best positioned as a decision‑support tool that reduces estimator workload and promotes consistency, rather than a full replacement for human expertise. With the recommended methodological refinements, the authors argue that analogue‑based automation could see broader adoption across the logistics sector.


📜 Original Paper Content

🚀 Synchronizing high-quality layout from 1TB storage...