eXtropia: the open web technology company

Introduction to XML For Web Developers
What is a Markup Language  

If you have decided to learn about XML, you are probably already familiar with the concepts behind HTML (HyperText Markup Language). So let's start from there.

HTML, as its name implies, is a markup language. As such, it is used to mark up text. But what exactly does it mean to mark up text?

Abstractly, marking up text is a methodology for encoding data with information about itself. Examples of markups (encoded data) are ubiquitous in the real world.

For example, back when you were slogging through high school, you probably used a bright yellow highlighter pen to highlight sentences in your schoolbooks (or at least you knew someone who did!). You did so because you thought that the highlighted sentences would be useful to review around exam time and you wanted a quick way to skim through the important points. Just like you, thousands of kids around the world did the exact same thing for the exact same reason.

By highlighting certain bits of text, you were effectively "marking up" the data. Essentially, you specified that certain sentences (data) were important by marking them in yellow. These sentences became encoded with the fact that they were important.

And what's more, since everyone followed the same standard of marking up, you could easily pick up a used textbook and, just from reading the highlighted sections, get a good idea of what the core points of the book were.

There are two crucial points to take away from this example. For markups to transmit useful information about data to a pool of users...

  1. a standard must be in place to define what a valid markup is - In the example above, a markup is defined as a bit of yellow ink atop text. In HTML, a markup is a tag.
  2. a standard must be in place to define what a markup means - In the example above, a yellow highlight means the highlighted text represents an important point. In HTML, each tag communicates its own layout or formatting meaning.
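
The two points above can be sketched in a few lines of Python. This is only an illustration, not a real HTML validator: the table TAG_MEANINGS and the helper describe_markup are hypothetical names invented for this sketch, and the vocabulary lists just three tags.

```python
import re

# Point 1: the standard says which markups are valid (here, a tiny tag vocabulary).
# Point 2: the standard says what each valid markup means.
# TAG_MEANINGS is a made-up table for illustration, not the HTML specification.
TAG_MEANINGS = {
    "b": "display the enclosed text in bold",
    "center": "center the enclosed text",
    "li": "display the text as a list item",
}

# Matches an opening tag such as <B> or <center> and captures its name.
OPEN_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9]*)>")


def describe_markup(text):
    """Return (tag, meaning) pairs for each opening tag found in text.

    Tags outside the agreed vocabulary get the meaning None, because
    without a shared standard the reader cannot know what they signify.
    """
    results = []
    for name in OPEN_TAG.findall(text):
        tag = name.lower()
        results.append((tag, TAG_MEANINGS.get(tag)))
    return results


if __name__ == "__main__":
    sample = "<CENTER><B>BOLD</B></CENTER> <BLINKY>hi</BLINKY>"
    for tag, meaning in describe_markup(sample):
        print(tag, "->", meaning or "unknown markup: no agreed meaning")
```

A tag such as the invented <BLINKY> is found but has no meaning, which is the highlighter problem in miniature: a mark that nobody has agreed on tells the reader nothing.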

Markups are also ubiquitous in the world of computers. They are used by word processors to specify formatting and layout, by communications programs to express the meaning of data sent over the wires, by database applications that must associate meaning and relationships with the data they serve, and by multimedia processing programs which must express meta-data about images or sound.

As data is sent through dumb computers and programs, it is essential that the data carries with it information necessary to communicate what the data means and/or what the receiver should do with that data.

Data with no context is meaningless, just as an unhighlighted book is bad news around exam time!

HTML is one of the more famous computer markup systems. HTML defines a set of tags that associate formatting rules with bits of text. Documents which have been marked up (which contain plain text as well as the tags that specify the rules for formatting that text) are read by an HTML processing application (a web browser for example) that knows how to display the text according to the rules.

For example, the <B> tag specifies a rule which instructs an HTML processing application to bold a specific bit of text. Similarly, the <CENTER> tag instructs the HTML processing application to center the text.
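
To make the idea of an "HTML processing application" concrete, here is a rough sketch of a toy one built on Python's standard html.parser module. It is an assumption-laden illustration, not how a real browser works: the class TinyRenderer, the function render, and the width parameter are hypothetical, and since plain text has no boldface, asterisks are used as a stand-in for bold.

```python
from html.parser import HTMLParser


class TinyRenderer(HTMLParser):
    """Toy 'HTML processing application' that understands only <B> and <CENTER>."""

    def __init__(self, width=40):
        super().__init__()
        self.width = width      # assumed width of the display, in characters
        self.bold = False       # True while inside <B>...</B>
        self.centered = False   # True while inside <CENTER>...</CENTER>
        self.lines = []         # rendered output, one string per run of text

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lowercase.
        if tag == "b":
            self.bold = True
        elif tag == "center":
            self.centered = True

    def handle_endtag(self, tag):
        if tag == "b":
            self.bold = False
        elif tag == "center":
            self.centered = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self.bold:
            text = "*" + text + "*"          # plain-text stand-in for boldface
        if self.centered:
            text = text.center(self.width)   # pad with spaces on both sides
        self.lines.append(text)


def render(markup, width=40):
    """Apply the formatting rules carried by the tags and return the output lines."""
    renderer = TinyRenderer(width)
    renderer.feed(markup)
    renderer.close()
    return renderer.lines


if __name__ == "__main__":
    for line in render("<CENTER><B>Hello</B></CENTER>"):
        print(line)
```

The tags themselves never appear in the output. They only switch rules on and off, and the text between them is displayed according to whichever rules are active at that moment.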

Thus <CENTER><B>BOLD</B></CENTER> would be displayed by an HTML processing application as the word BOLD in bold type, centered on the line.


You might imagine a client contact list which could look like the following bit of HTML code:

<UL>
<LI>Gunther Birznieks
<UL>
<LI>Client ID: 001
<LI>Company: Bob's Fish Store
<LI>Email: gunther@bobsfishstore.com
<LI>Phone: 662-9999
<LI>Street Address: 1234 4th St.
<LI>City: New York
<LI>State: New York
<LI>Zip: 10024
</UL>
<LI>Susan Czigonu
<UL>
<LI>Client ID: 002
<LI>Company: Netscape
<LI>Email: susan@eudora.org
<LI>Phone: 555-1234
<LI>Street Address: 9876 Hazen Blvd.
<LI>City: San Jose
<LI>State: California
<LI>Zip: 90034
</UL>
</UL>

The above HTML-encoded data would be displayed by an HTML processing application as:

  • Gunther Birznieks
    • Client ID: 001
    • Company: Bob's Fish Store
    • Email: gunther@bobsfishstore.com
    • Phone: 662-9999
    • Street Address: 1234 4th St.
    • City: New York
    • State: New York
    • Zip: 10024
  • Susan Czigonu
    • Client ID: 002
    • Company: Netscape
    • Email: susan@eudora.org
    • Phone: 555-1234
    • Street Address: 9876 Hazen Blvd.
    • City: San Jose
    • State: California
    • Zip: 90034
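
As a rough sketch of how a processing application might arrive at the indented bullets above, the following Python example walks a shortened version of the contact list. It assumes the list items are grouped with nested <UL> tags, since the nesting depth is what drives the indentation. The names ListRenderer, render_list, and SAMPLE are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser

BULLET = "\u2022"  # the bullet character shown in the rendered list


class ListRenderer(HTMLParser):
    """Toy renderer that turns nested <UL>/<LI> markup into indented bullets."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # how many <UL> lists are currently open
        self.in_item = False  # True after an <LI> until its text is seen
        self.lines = []

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            self.depth += 1
        elif tag == "li":
            self.in_item = True

    def handle_endtag(self, tag):
        if tag == "ul":
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.in_item:
            # Two spaces of indentation per open list level.
            self.lines.append("  " * self.depth + BULLET + " " + text)
            self.in_item = False


def render_list(markup):
    renderer = ListRenderer()
    renderer.feed(markup)
    renderer.close()
    return renderer.lines


# Shortened sample data, assumed to use nested <UL> tags.
SAMPLE = """<UL>
<LI>Gunther Birznieks
<UL>
<LI>Client ID: 001
<LI>Company: Bob's Fish Store
</UL>
<LI>Susan Czigonu
<UL>
<LI>Client ID: 002
</UL>
</UL>"""

if __name__ == "__main__":
    print("\n".join(render_list(SAMPLE)))
```

Note that the renderer only knows how deep each item sits in the list. Nothing in the markup tells it that "001" is a client ID or that "Netscape" is a company; those labels are just text to be displayed.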
