On applying Neuro - Computing in E-com Domain
Prior studies have generally suggested that Artificial Neural Networks (ANNs) are superior to conventional statistical models in predicting consumer buying behavior. There are, however, contradicting findings which raise question over usefulness of ANNs. This paper discusses development of three neural networks for modeling consumer e-commerce behavior and compares the findings to equivalent logistic regression models. The results showed that ANNs predict e-commerce adoption slightly more accurately than logistic models but this is hardly justifiable given the added complexity. Further, ANNs seem to be highly adaptive, particularly when a small sample is coupled with a large number of nodes in hidden layers which, in turn, limits the neural networks’ generalisability.
💡 Research Summary
**
The paper investigates whether artificial neural networks (ANNs) offer a meaningful advantage over traditional logistic regression models for predicting consumer adoption of e‑commerce. Drawing on a mixed body of literature that both praises ANNs for capturing nonlinear relationships and cautions against their propensity for over‑fitting, the authors set out to conduct a systematic empirical comparison.
Data were collected via an online questionnaire administered between January and March 2023, yielding 1,200 respondents. Twelve predictor variables were selected, encompassing demographic attributes (age, gender, income), internet usage habits (weekly online hours, mobile device share), and psychological factors (perceived risk, trust). The binary outcome variable indicated whether the respondent had ever made an online purchase. After standard preprocessing—mean imputation for missing values, one‑hot encoding for categorical variables—the dataset was split into 70 % training and 30 % testing subsets.
Three feed‑forward ANN configurations were built, each sharing the same input layer (12 nodes) and output layer (single sigmoid node) but differing in hidden‑layer size: 5, 10, and 15 neurons respectively. All networks employed ReLU activation in the hidden layer, a learning rate of 0.01, and were trained for up to 10,000 epochs without early‑stopping or explicit regularization. For comparison, a logistic regression model was estimated using the same predictors, with stepwise selection to remove non‑significant variables and variance‑inflation‑factor analysis to guard against multicollinearity.
Performance was evaluated using five metrics: overall accuracy, precision, recall, F1‑score, and the area under the ROC curve (AUC). The ANN with 5 hidden neurons achieved 78.4 % accuracy, the 10‑neuron model 80.1 %, and the 15‑neuron model 81.2 %, while logistic regression attained 76.9 %. However, the AUC values—0.842, 0.857, 0.861 for the three ANNs versus 0.845 for logistic regression—showed only marginal differences. Notably, the 15‑neuron network displayed classic over‑fitting: training accuracy rose to 96 % but test accuracy fell sharply to 68 %, indicating that the model had memorized noise in the relatively small sample.
The authors interpret these findings as evidence that ANNs can deliver a modest boost in predictive accuracy, but the gain is not sufficient to outweigh the added computational complexity, the difficulty of model interpretation, and the heightened risk of poor generalization when sample sizes are limited. They argue that the choice of hidden‑layer size must be carefully calibrated to the available data, and that robust validation techniques—such as k‑fold cross‑validation, L2 regularization, dropout, or ensemble methods—are essential to mitigate over‑fitting.
In the discussion, the paper recommends that practitioners prioritize model simplicity and transparency when the incremental accuracy improvement is small. Logistic regression remains a viable, interpretable baseline, especially in contexts where resources for extensive model tuning are scarce. For future research, the authors suggest exploring regularization strategies, batch normalization, and larger, more diverse datasets to reassess the potential of ANNs in e‑commerce adoption modeling.
In conclusion, while ANNs marginally outperform logistic regression in this specific dataset, the practical benefits are limited. The study underscores the importance of balancing predictive performance against model complexity, data availability, and interpretability, ultimately reaffirming logistic regression as a robust and cost‑effective tool for consumer behavior prediction in the e‑commerce domain.