2026-06-04
2026-04-30
2026-02-27
Manuscript received September 18, 2025; revised October 10, 2025; accepted November 10, 2025; published 26, 2026.
Abstract—Designing websites to meet user and client expectations often requires repeated refinement cycles, making the process time-consuming and resource-intensive. This study proposes an automated classification system that categorizes real-world websites by their salient structural design features to support data-driven automatic website generation. The system integrates Self-Organizing Maps (SOMs) with a novel image-processing pipeline that combines edge detection, gradient analysis, morphological filtering, and image smoothing to extract structural wireframe layouts from website screen captures. Experiments were conducted on three datasets: manually created wireframes, screen captures of the top 100 websites, and screen captures of the top 1500 websites ranked by SimilarWeb. The analysis revealed seven representative layout archetypes: dashboard interfaces, simple information pages, fixed-width product grids, informational pages with sidebars, basic search interfaces, multi-section content layouts, and tabular data interfaces. Classification quality was evaluated using topographic error, quantization error, the Silhouette coefficient, and the Davies-Bouldin index, demonstrating consistent and meaningful clustering. Our findings highlight the potential of SOM-based clustering for automatic website template generation, offering a scalable and data-driven foundation for design automation and frontend prototyping.
Keywords—website structure analysis, automatic website generation, web design clustering, salient design features, self-organizing maps
Cite: Thisaranie Kaluarachchi, Sumedhe Dissanayake, and Manjusri Wickramasinghe, "Clustering Websites by Salient Design Features," Journal of Image and Graphics, Vol. 14, No. 2, pp. 230-258, 2026.
Copyright © 2026 by the authors.
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC-BY-4.0).
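The abstract names quantization error and topographic error as two of the measures used to assess the SOM. As a rough illustration only, and not the authors' implementation, the following pure-Python sketch trains a small SOM on toy two-dimensional points and computes both measures. The function names (`train_som`, `quantization_error`, `topographic_error`), the grid size, the linear decay schedules, the 8-connected notion of grid adjacency, and the toy data are all assumptions made for this example; the paper's actual inputs are wireframe features extracted from screen captures.

```python
import math
import random


def bmu_order(x, weights):
    """Return node indices sorted by Euclidean distance to sample x (closest first)."""
    return sorted(range(len(weights)), key=lambda i: math.dist(x, weights[i]))


def train_som(data, rows, cols, epochs=30, lr0=0.5, sigma0=1.5, seed=0):
    """Train a rows x cols SOM; node i sits at grid cell (i // cols, i % cols)."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate (assumed schedule)
            sigma = max(sigma0 * (1.0 - frac), 0.3)  # shrinking neighborhood radius (assumed schedule)
            br, bc = divmod(bmu_order(x, weights)[0], cols)
            for i, w in enumerate(weights):
                r, c = divmod(i, cols)
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))  # Gaussian neighborhood on the grid
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


def quantization_error(data, weights):
    """Mean distance from each sample to its best-matching unit."""
    return sum(min(math.dist(x, w) for w in weights) for x in data) / len(data)


def topographic_error(data, weights, cols):
    """Share of samples whose two closest nodes are not adjacent on the grid."""
    errors = 0
    for x in data:
        first, second = bmu_order(x, weights)[:2]
        r1, c1 = divmod(first, cols)
        r2, c2 = divmod(second, cols)
        # Adjacency is taken as 8-connected here, which is an assumption.
        if max(abs(r1 - r2), abs(c1 - c2)) > 1:
            errors += 1
    return errors / len(data)


# Toy data: two Gaussian blobs, standing in for real layout feature vectors.
toy_rng = random.Random(1)
toy_data = [
    [toy_rng.gauss(cx, 0.05), toy_rng.gauss(cy, 0.05)]
    for cx, cy in [(0.2, 0.2), (0.8, 0.8)]
    for _ in range(50)
]
toy_som = train_som(toy_data, rows=3, cols=3)
print(quantization_error(toy_data, toy_som), topographic_error(toy_data, toy_som, cols=3))
```

Lower values are better for both measures: quantization error reflects how closely the node weights fit the data, while topographic error reflects how well the map preserves neighborhood relations.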