Restructuring Multilingual Web Sites
Tonella, Paolo; Ricca, Filippo; Pianta, Emanuele; Girardi, Christian
2002-01-01
Abstract
Current Web site development practice does not explicitly address the problems of multilingual sites. A site should provide the same information, navigation paths, page formatting, and organization regardless of the chosen language. This is typically ensured by adopting personal conventions for naming pages and placing them in the file system. Updates are then performed manually, and consistency depends on the programmers' ability not to miss any impact of a change. This paper proposes an extension to HTML, called MLHTML (Multi Lingual HTML), as the target representation of a restructuring process aimed at producing a maintainable and consistent multilingual Web site. MLHTML centralizes the language-dependent variants of a page in a single representation, in which shared parts are not duplicated. Existing sites can be migrated to MLHTML by means of the algorithms described in this paper. After the pages are classified according to their language, a page alignment technique is used to identify corresponding pages and to eliminate inconsistencies. Transformation into MLHTML can then be achieved automatically.