InterPARES Definition

n. ~ Information resources that rely on tags or markers, rather than a data model, to indicate different semantic elements.

Other Definitions

  • Janssen 2014 (†453 s.v. "semi-structured data"): Semi-structured data is data that is neither raw data, nor typed data in a conventional database system. It is structured data, but it is not organized in a rational model, like a table or an object-based graph. A lot of data found on the Web can be described as semi-structured. Data integration especially makes use of semi-structured data.
  • Wikipedia (†387 s.v. "semi-structured data"): A form of structured data that does not conform with the formal structure of data models associated with relational databases or other forms of data tables, but nonetheless contains tags or other markers to separate semantic elements and enforce hierarchies of records and fields within the data. Therefore, it is also known as self-describing structure.
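
To illustrate how tags can separate semantic elements and nest fields inside records, here is a minimal Python sketch that parses a small XML fragment. The fragment, its element names, and its values are invented for this example and are not taken from the sources quoted above.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element names and values are invented for illustration.
DOC = """
<records>
  <record>
    <title>Annual report</title>
    <creator>Finance office</creator>
  </record>
  <record>
    <title>Meeting minutes</title>
    <date>2014-03-01</date>
  </record>
</records>
"""

root = ET.fromstring(DOC)

# The tags inside the data, not an external table definition, name each field.
parsed = [{field.tag: field.text for field in record} for record in root]

for row in parsed:
    print(row)
```

The two records sit under the same parent element yet carry different fields, which a fixed table layout would have to declare in advance.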

Citations

  • Janssen 2014 (†453 s.v. "semi-structured data"): Some examples of semi-structured data would be BibTex files or a Standard Generalized Markup Language (SGML) document. Files that are semi-structured may contain rational data made up of records, but that data may not be organized in a recognizable structure. Some fields may be missing or contain information that can't be easily described in a database system. ¶In semi-structured data, the information that is contained within the data is normally associated with a database schema. This is why the information is sometimes called self-describing. (†622)
  • Wikipedia (†387 s.v. "semi-structured data"): In the semi-structured data, the entities belonging to the same class may have different attributes even though they are grouped together, and the attributes' order is not important. Semi-structured data is increasingly occurring since the advent of the Internet where full-text documents and databases are not the only forms of data any more and different applications need a medium for exchanging information. In object-oriented databases, one often finds semi-structured data. (†615)
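
The points made in these citations (entries of the same class may have different or missing attributes, and attribute order does not matter) can be sketched with a few JSON objects. This is only an illustration: the field names and values are hypothetical, and JSON is chosen here as one common semi-structured format rather than one named by the sources.

```python
import json

# Hypothetical bibliographic entries: names and values are invented for illustration.
RAW = """
[
  {"type": "book", "title": "A", "author": "X", "year": 2001},
  {"year": 2010, "type": "book", "title": "B"},
  {"type": "book", "title": "C", "author": "Y", "publisher": "Z"}
]
"""

entries = json.loads(RAW)

# Union of the attribute names that appear anywhere in the collection.
all_fields = sorted({key for entry in entries for key in entry})

# Attributes each entry lacks relative to that union.
missing = [sorted(set(all_fields) - set(entry)) for entry in entries]

# Two objects with the same attributes in a different order compare equal.
same = json.loads('{"a": 1, "b": 2}') == json.loads('{"b": 2, "a": 1}')

print(all_fields)
print(missing)
print(same)
```

Loading such entries into a single fixed-column table would require either empty cells for the missing attributes or a schema change each time a new attribute appears.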