Advanced Web Technologies

Blog posts as part of the BSc Internet Application Development programme to discuss my experience while working on the tasks that will be given during the Advanced Web Technologies module.

Over the 11 weeks of this course, the following aspects of advanced web technologies were covered:
  • eXtensible Markup Language (XML) - XML is a markup language used to store and share data. An XML document must have a single root element, and its tags must be properly nested (forming a tree structure). Tags are case sensitive, and every element must have a closing tag. Attributes, whose values must be quoted, give further information about an element. When these rules are followed, the XML document is well-formed. A valid document is well-formed and also follows the restrictions specified in its Document Type Definition (DTD).[1]
  • Document Type Definition (DTD) - A DTD describes the elements of an XML document. A DTD can be declared either inside the XML document or externally. Elements are declared using the ELEMENT declaration, while attributes are declared using the ATTLIST declaration. In a DTD, entities are variables that act as shortcuts to text.[2] Entities also serve as escape sequences for special characters.[3]
  • XML CDATA - Parsed Character Data (PCDATA) is parsed by the XML parser, so markup and entity references inside it are interpreted. In a Character Data (CDATA) section the text is not parsed: the parser passes it through unchanged instead of treating it as markup.[4]
  • XML Encoding - An XML document can contain characters that are not part of the American Standard Code for Information Interchange (ASCII). To avoid errors, the encoding should be specified in the XML declaration. An alternative is to save the XML document as Unicode.[5]
  • XML Namespaces - Namespaces are used to avoid conflicts on element names.
  • Cascading Style Sheets (CSS) - Cascading Style Sheets can be used to style both the XML and HTML document. Usually in an XML document, eXtensible Stylesheet Language (XSL) is used for styling.
  • XML Path Language (XPath) - XPath is used to extract specific information from the XML document.
  • XLink and XPointer - XLink is used to embed hyperlinks inside the XML document. To make a hyperlink point to a particular part of the XML document, XPointer should be used.[6]
  • XML Schema - An XML schema is an alternative to a DTD, but has more benefits. XML schemas are written using XML and include data types and namespaces. In an XML schema, complex types (complexType) can contain attributes or child elements, whereas simple types (simpleType) cannot contain attributes or child elements.[7]
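
The tree structure described in the XML item above can be seen by parsing a small document. This is only a sketch using Python's standard xml.etree.ElementTree module; the note document and its priority attribute are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A small well-formed document: one root element, properly nested children,
# and a quoted attribute value (all names here are illustrative).
NOTE = """<note priority="high">
  <to>Heads of all Departments</to>
  <from>The Managing Director</from>
</note>"""

root = ET.fromstring(NOTE)

# The root element and its attributes
print(root.tag, root.attrib)

# The child elements nested directly under the root
for child in root:
    print(" ", child.tag, "->", child.text)
```

Each element becomes a node whose children can be visited in document order, which is what "forming a tree structure" means in practice.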

XML Editors

A web developer can create XML documents using XML editors instead of a normal text editor. These editors help the developer create a well-formed and valid XML document. Two known XML editors are:

Semantic Web Stack

To sum things up, the following languages and technologies (forming a stack)[8] are used to create the Semantic Web:

  • Unicode and Uniform Resource Identifier (URI) - Unicode is used to encode character sets that are international, whereas a URI is used to uniquely identify resources;
  • eXtensible Markup Language (XML) - The XML layer uses namespaces and XML schema definitions to ensure that the syntax used is common. Namespaces and XML schema have been briefly described above;
  • Resource Description Framework (RDF) - RDF is a framework used to represent information on resources;
  • RDF Schema (RDFS) - RDFS is an extension to RDF. It can be used to depict taxonomies (classification) of classes;
  • Web Ontology Language (OWL) - OWL is an extension to RDF and RDFS used to describe logics;
  • SPARQL Protocol and RDF Query Language (SPARQL) - It is used to query RDF data, RDFS and OWL ontologies;
  • Rules - Rule languages are used to produce rules. Examples of rule languages are Rule Interchange Format (RIF) and Semantic Web Rule Language (SWRL);
  • Proof and Trust - Results will be trusted when the inputs for the proof are trusted;
  • Cryptography - Digital signatures are used to verify the sources.

Sources:

[1] http://www.w3schools.com/xml/xml_summary.asp
[2] http://www.w3schools.com/dtd/dtd_summary.asp
[3] http://www.w3schools.com/dtd/dtd_entities.asp
[4] http://www.w3schools.com/xml/xml_cdata.asp
[5] http://www.w3schools.com/xml/xml_encoding.asp
[6] http://www.w3schools.com/xlink/xlink_intro.asp
[7] http://www.xaprb.com/blog/2006/03/16/simple-and-complex-types-in-xml-schema/
[8] http://www.obitko.com/tutorials/ontologies-semantic-web/semantic-web-architecture.html

QUICK QUESTIONS

  1. The following passage is to be found in the middle of a particular XML document:
    The heavily-used <service xlink:type = "simple"
    xlink:href ="http://www.thetrams.co.uk/croydon">
    Croydon Tramlink </service> provides a cross
    link to nearby <location>Wimbledon</location>,
    <location>Addington</location> and <location>Beckenham</location>
    .
    What can you say about how the text Croydon Tramlink will be treated by a browser such as Mozilla Firefox?

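    The text 'Croydon Tramlink' will be treated as a link that takes the reader to 'http://www.thetrams.co.uk/croydon'. It will be treated like this because the element carries the xlink:type="simple" and xlink:href attributes, provided the xlink prefix has been bound to the XLink namespace elsewhere in the document.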
  2. It’s possible to provide validation for a class of XML document using a Document Type Definition (.dtd) file, or using an XML schema. The DTD approach is easier. Why might you want to use the XML schema approach?

    As already explained in a previous post, a Document Type Definition (DTD) describes how an XML document should be constructed and is used to validate an XML document. Although the DTD is easier, it also has limitations, and an XML schema is an alternative to it. Some benefits of an XML schema over a DTD are the following[1]:
    • XML schemas include data types and namespaces;
    • XML schemas are written using XML;
    • XML schemas are more powerful.

LONGER QUESTIONS

  1. Here is an XML document:
    <?xml version="1.0" encoding="UTF-8"?>
    <book isbn="0836217462">
     <title>
      Being a Dog Is a Full-Time Job
     </title>
     <author>Charles M. Schulz</author>
     <character>
       <name>Snoopy</name>
       <friend-of>Peppermint Patty</friend-of>
       <since>1950-10-04</since>
       <qualification>
        extroverted beagle
       </qualification>
     </character>
     <character>
       <name>Peppermint Patty</name>
       <since>1966-08-22</since>
       <qualification>bold, brash and tomboyish</qualification>
     </character>
    </book>

    An XML schema is to be constructed, which will validate this document and other similar documents. Make notes on the elements etc that this document contains, and record any significant factors about them.

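    The elements and their datatypes are:
    • book - complex type (the root element; contains title, author and character elements, and has the isbn attribute)
    • title - string
    • author - string
    • character - complex type (contains name, friend-of, since and qualification)
    • name - string
    • friend-of - string (optional; missing from the second character)
    • since - date
    • qualification - string
    Attribute "isbn" is of string datatype.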
    <?xml version="1.0"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="book">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="title" type="xs:string"/>
            <xs:element name="author" type="xs:string" maxOccurs="unbounded"/>
            <!-- no type attribute here: the inline complexType defines the type -->
            <xs:element name="character" maxOccurs="unbounded">
              <xs:complexType>
                <xs:sequence>
                  <xs:element name="name" type="xs:string"/>
                  <xs:element name="friend-of" type="xs:string" minOccurs="0"/>
                  <xs:element name="since" type="xs:date"/>
                  <xs:element name="qualification" type="xs:string"/>
                </xs:sequence>
              </xs:complexType>
            </xs:element>
          </xs:sequence>
          <xs:attribute name="isbn" type="xs:string" use="required"/>
        </xs:complexType>
      </xs:element>
    </xs:schema>
    
    The XML schema produced above uses the maxOccurs and minOccurs attributes. In the "character" element and the "author" element, the maxOccurs attribute has been set to "unbounded" because a book may have many characters and more than one author. The element "friend-of" has the minOccurs attribute set to zero because the element is optional.
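
The occurrence rules described above can be mirrored by a hand-written check. This is only a sketch in Python, assuming the standard library alone; it is not XML Schema validation (the standard library has no XSD validator), and check_book is a hypothetical helper written for this post.

```python
import xml.etree.ElementTree as ET
from datetime import date

BOOK = """<book isbn="0836217462">
  <title>Being a Dog Is a Full-Time Job</title>
  <author>Charles M. Schulz</author>
  <character>
    <name>Snoopy</name>
    <friend-of>Peppermint Patty</friend-of>
    <since>1950-10-04</since>
    <qualification>extroverted beagle</qualification>
  </character>
  <character>
    <name>Peppermint Patty</name>
    <since>1966-08-22</since>
    <qualification>bold, brash and tomboyish</qualification>
  </character>
</book>"""


def check_book(xml_text):
    """Return a list of problems found; an empty list means every check passed."""
    book = ET.fromstring(xml_text)
    problems = []
    if book.get("isbn") is None:                # mirrors use="required"
        problems.append("missing isbn attribute")
    if len(book.findall("title")) != 1:         # default occurrence is exactly one
        problems.append("expected exactly one title")
    if not book.findall("author"):              # one or more authors
        problems.append("expected at least one author")
    characters = book.findall("character")
    if not characters:                          # one or more characters
        problems.append("expected at least one character")
    for character in characters:
        name = character.findtext("name", default="?")
        if len(character.findall("friend-of")) > 1:   # optional, at most one
            problems.append(f"{name}: more than one friend-of")
        since = character.findtext("since", default="").strip()
        try:
            date.fromisoformat(since)           # rough stand-in for xs:date
        except ValueError:
            problems.append(f"{name}: since is not a date")
    return problems


print(check_book(BOOK))
```

A real validator would also check element order and reject unknown elements; the sketch only shows how minOccurs, maxOccurs, use and the date type translate into concrete checks.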

Question Sheet: Lab 10b

Sources:

[1] http://www.w3schools.com/schema/schema_intro.asp
[2] http://www.w3schools.com/schema/schema_example.asp

QUICK QUESTIONS

  1. One of the advantages claimed for the "extended links", that the W3C consortium intended to be part of the XLink language, was that the definition of a particular hyperlink could be located, not in the local resource (the document where the link starts), or the remote resource (the document where the link ends), but in a quite different "third party" document. Why might this be an advantage?

    This might be an advantage because the third-party document is independent from the document that has the link. A linkbase (also known as a link database) is a file that consists of a number of third-party links.[1]
  2. The XLink language provides an attribute for a hyperlink called show – it has several possible values. What is the effect of providing such a link with each of the following attribute values?
    show="replace"
    show="new"
    show="embed"

    Which of these three attribute values is the default?

    The show attribute in the XLink language determines how the target of the link will be displayed. It can take the following values:[2]
    • show="new" - This value is used to open the page in a new window;
    • show="replace" - This value is used to replace the current page with the new page;
    • show="embed" - This value is used to put the target inside the current page or inline;
    • show="other" - This value is used to search for an alternative markup inside the page;[3]
    • show="none" - This value contains no instructions.
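    The XLink specification does not itself set a default for show. A browser that supports simple links follows a link without a show attribute as if replace had been given, so replace is the default in practice.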

LONGER QUESTIONS

  1. Here is an XML document:
    <?xml version="1.0"?>
    <!DOCTYPE memo SYSTEM "memo.dtd">
    <?xml-stylesheet href="stylesheet02.css" type="text/css"?>
    <memo>
     <heading>memo 1334</heading>
     <date>date: 11 November 09</date>
     <time>time: 09:30</time>
     <sender>from: The Managing Director</sender>
     <addressee>to: Heads of all Departments</addressee>
     <message>I think we should be making wind-turbines. Have a look at this website. Tell me what you think. </message>
    </memo>


    The accompanying .dtd file looks like this:
    <?xml version= "1.0" ?>
    <!DOCTYPE memo [
    <!ELEMENT memo (heading, date, time, sender, addressee, message)>
    <!ELEMENT heading (#PCDATA)>
    <!ELEMENT date (#PCDATA)>
    <!ELEMENT time (#PCDATA)>
    <!ELEMENT sender (#PCDATA)>
    <!ELEMENT addressee (#PCDATA)>
    <!ELEMENT message (#PCDATA)>
    ]>

    At the point where the document says this website, there is supposed to be a hyperlink that
    takes the reader to the website:
    http://engineering.suite101.com/article.cfm/wind_power.

    1. Amend the document, so that the link is in fact there. Make any necessary changes to the .dtd file as well.

      The XML file will look like the following:
        <?xml version="1.0"?>
        <!DOCTYPE memo SYSTEM "memo.dtd">
        <?xml-stylesheet href="stylesheet02.css" type="text/css"?>
        <memo xmlns:xlink="http://www.w3.org/1999/xlink">
          <heading>memo 1334</heading>
          <date>date: 11 November 09</date>
          <time>time: 09:30</time>
          <sender>from: The Managing Director</sender>
          <addressee>to: Heads of all Departments</addressee>
          <message>I think we should be making wind-turbines. Have a look at <websiteLink xlink:type="simple" xlink:href="http://engineering.suite101.com/article.cfm/wind_power">this website</websiteLink>. Tell me what you think. </message>
        </memo>
      

      The DTD file will look like the following:
        <!-- memo.dtd is an external DTD, so it holds only declarations (no XML declaration or DOCTYPE wrapper) -->
        <!ELEMENT memo (heading, date, time, sender, addressee, message)>
        <!ELEMENT heading (#PCDATA)>
        <!ELEMENT date (#PCDATA)>
        <!ELEMENT time (#PCDATA)>
        <!ELEMENT sender (#PCDATA)>
        <!ELEMENT addressee (#PCDATA)>
        <!-- message now mixes text with the new link element -->
        <!ELEMENT message (#PCDATA | websiteLink)*>
        <!ELEMENT websiteLink (#PCDATA)>
        <!ATTLIST memo xmlns:xlink CDATA #FIXED "http://www.w3.org/1999/xlink">
        <!ATTLIST websiteLink
           xlink:type (simple | extended) "simple"
           xlink:href CDATA #REQUIRED>
      
    2. Suppose that the heading of one of the sections in the target website is <A NAME="WE Elec Facts">Wind Energy Electricity Facts</A>, including the tags as shown. What changes would you have to make to the link in the managing director’s memo, to make the hyperlink finish at that point rather than at the wind_power document as a whole?

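      To point to a part of a document rather than the whole document, an XPointer is appended to the URI as a fragment identifier. The href value must stay on one line, and the expression selects the A element by its NAME attribute:
      <websiteLink xlink:type="simple"
      xlink:href="http://engineering.suite101.com/article.cfm/wind_power#xpointer(//A[@NAME='WE Elec Facts'])">this website</websiteLink>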
  2. Here is another XML document:
    <?xml version="1.0"?>
    <!DOCTYPE memo SYSTEM "memo.dtd">
    <?xml-stylesheet href="stylesheet02.css" type="text/css"?>
    <memo>
     <heading>memo 1335</heading>
     <date>date: 11 November 09</date>
     <time>time: 09:45</time>
     <sender>from: The Managing Director</sender>
     <addressee>to: Heads of all Departments</addressee>
     <message>I think we should be making solar panels. Have a look at this website. Tell me what you think. </message>
    </memo>

    At the point where the document says this website, there is supposed to be a hyperlink that takes the reader to a suitable website. Find one, and amend the document, so that the link is in fact there. Is it necessary to make any changes to the .dtd file, or can we use the file as you amended it before?

    The <websiteLink> tag will be as follows:
      <websiteLink xlink:type="simple" xlink:href="http://www.solarpages.co.uk/Solar-Panels-Information/">this website</websiteLink>

    The .dtd file that has been amended before can still be used.
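
To see how a program reads such a link, the sketch below pulls the simple XLinks out of a cut-down version of the solar panel memo. It assumes Python's standard xml.etree.ElementTree module; simple_links is a hypothetical helper, and the memo is trimmed to the message element only.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

MEMO = """<memo xmlns:xlink="http://www.w3.org/1999/xlink">
  <message>I think we should be making solar panels. Have a look at
  <websiteLink xlink:type="simple"
      xlink:href="http://www.solarpages.co.uk/Solar-Panels-Information/">this website</websiteLink>.
  Tell me what you think.</message>
</memo>"""


def simple_links(xml_text):
    """Return (link text, target URI) pairs for every simple XLink found."""
    root = ET.fromstring(xml_text)
    links = []
    for element in root.iter():
        # ElementTree stores namespaced attribute names as "{namespace-uri}local-name".
        if element.get(f"{{{XLINK_NS}}}type") == "simple":
            links.append((element.text, element.get(f"{{{XLINK_NS}}}href")))
    return links


print(simple_links(MEMO))
```

The attributes are matched by namespace URI rather than by the xlink prefix, which is why the namespace declaration on the memo element matters.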

Question Sheet: Lab 10a

Sources:

[1] Xpath, XLink, XPointer, and XML: a practical guide to Web hyperlinking and transclusion - Pg 61
[2] http://webdesign.about.com/od/xlink/a/how-to-write-xlink.htm
[3] http://www.cafeconleche.org/books/bible2/chapters/ch19.html

QUICK QUESTIONS

  1. Suppose that a CSS file is used to determine how an XML document will appear when viewed in a browser. Suppose that the CSS file contains two rules, one dictating that a particular piece of text will appear in bold type, the other dictating that it will not. What will happen?

    When two rules select the same text with the same specificity, the rule that comes later in the CSS file is applied. So if the first rule dictates that the text will appear in bold type and the second dictates that it will not, the text will not be bold. If one of the rules has a more specific selector, that rule is applied regardless of the order.
  2. An XML document contains the sentence "The grand old Duke of York, he had 10000 men." Would XPath be able to extract the piece of data "10000" from such a document?

    XPath is used to select nodes from the XML documents using path expressions[1], so a path expression on its own returns the whole sentence and not the number inside it. One possible way to extract the piece of data "10000" is by using the substring() function.[2] Character positions are counted from 1, and "10000" starts at position 36:
      substring(source-string, start-position, number-of-characters)
      substring("The grand old Duke of York, he had 10000 men.", 36, 5)
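
The start position is easy to get wrong by one, so here is a small sketch in Python that mimics the function for whole-number arguments. The helper xpath_substring is hypothetical and only imitates the 1-based counting; it is not an XPath engine.

```python
def xpath_substring(text, start, length):
    """Mimic XPath substring() for whole-number arguments (positions count from 1)."""
    begin = max(start - 1, 0)              # convert to Python's 0-based index
    end = max(start - 1 + length, 0)
    return text[begin:end]


sentence = "The grand old Duke of York, he had 10000 men."

print(xpath_substring(sentence, 36, 5))    # 10000
print(sentence.index("10000") + 1)         # 36, the 1-based start position
```

Starting at position 35 instead would return the space before the number followed by only four of its digits.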

LONGER QUESTIONS

  1. Download the following file from the OasisPlus CMT3315 web page for Unit 10 Learning Materials: 'chemElements2.xml'
    However, the file is supposed to be displayed as a table, with five columns. The top row of the table is supposed to be headings for the 5 columns: this row is supposed to have a distinctive background colour. The next 99 rows are supposed to show details of 99 of the chemical elements – these rows are also supposed to have a distinctive background colour, different from the heading.

    1. Open JCreator and open the chemElements2.xml file in it. Add a line to the document that will cause it to be viewed (in the browser) in conjunction with a CSS file called stylesheet01.css

      <?xml-stylesheet href="stylesheet01.css" type="text/css"?>
    2. Write the file, stylesheet01.css, and store it in the same folder. The content of this CSS file should cause the table to appear in the Mozilla Firefox browser, as described above. Make your own decisions about suitable typefaces, borders, background colours, alignment, etc.

      chemElements {text-align: center; display: table; border: thin dotted black; width: 50%; margin-left: 25%; margin-top: 5px; font-family: Arial; font-size: 12px; color: #181818;}
      tableHead {display: table-row; font-weight: bold;}
      anumHead, nameHead, symbolHead, mptHead, bptHead {display: table-cell; padding: 5px; border: thin dotted black; background-color: #666633;}
      element {display: table-row; background-color: #E0E0E0;}
      anum, name, symbol, mp, bp {display: table-cell; padding: 5px; border: thin dotted black;}
      
      The above CSS will produce the following table:
      [Screenshot: chemElements2.xml]

  2. Consider the following XML document:
    <?xml version="1.0"?>
    <!DOCTYPE musicList SYSTEM "musicList.dtd">
    <?xml-stylesheet href="stylesheet04.css" type="text/css"?>
    <musicList
    number="2" title="miscellaneous CDs"
    xmlns:cdlist="http://middlesex_press.co.uk/CDcollection">
      <cd number="711">
       <title>The Best of Ivor Cutler</title>
       <artist>Ivor Cutler</artist>
       <tracks total="19"/>
       <cdlist:refnum> POL767 </cdlist:refnum>
      </cd>
      <cd number="712">
       <title>Penderecki’s First Symphony</title>
       <artist>Middlesex Symphony Orchestra</artist>
       <tracks total="5"/>
       <cdlist:refnum> DGM987 </cdlist:refnum>
      </cd>
      <cd number="713">
       <title>Penderecki’s Last Symphony</title>
       <artist>Middlesex Symphony Orchestra</artist>
       <cdlist:refnum> DGM988 </cdlist:refnum>
       <tracks total="5"/>
      </cd>
      <cd number="714">
       <title>Boris the Spider Rides Again</title>
       <artist>The Renegades</artist>
       <cdlist:refnum> CHR328 </cdlist:refnum>
       <tracks total="19"/>
      </cd>
    </musicList>

    Provide XPath expressions which will do the following:

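    1. Select all the elements subordinate to the root node.

      //*
      (The root node is the document itself. /* on its own selects only the document element, musicList.)

    2. Select all track elements that have a total attribute with the value of 5.

      //tracks[@total=5]

    3. Select all elements that contain the word "Penderecki" in their title.

      /musicList/cd[contains(title, 'Penderecki')]
      (/musicList/cd/title[contains(., 'Penderecki')] selects the title elements themselves.)

    4. Select any elements that have titles with greater than 11 characters.

      /musicList/cd[string-length(title) > 11]

    5. Select all the siblings of the first cd element

      /musicList/cd[1]/following-sibling::* | /musicList/cd[1]/preceding-sibling::*
      (/musicList/cd[1]/* would select the children of the first cd, not its siblings.)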
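
The queries in this exercise can be tried out in Python. The sketch below assumes the standard xml.etree.ElementTree module, which supports only a small subset of XPath (attribute predicates work, but contains(), string-length() and the sibling axes do not), so three of the selections are emulated in plain Python. The document is a trimmed copy of the music list without the DOCTYPE line.

```python
import xml.etree.ElementTree as ET

MUSIC = """<musicList number="2" title="miscellaneous CDs"
    xmlns:cdlist="http://middlesex_press.co.uk/CDcollection">
  <cd number="711"><title>The Best of Ivor Cutler</title>
    <artist>Ivor Cutler</artist><tracks total="19"/>
    <cdlist:refnum>POL767</cdlist:refnum></cd>
  <cd number="712"><title>Penderecki’s First Symphony</title>
    <artist>Middlesex Symphony Orchestra</artist><tracks total="5"/>
    <cdlist:refnum>DGM987</cdlist:refnum></cd>
  <cd number="713"><title>Penderecki’s Last Symphony</title>
    <artist>Middlesex Symphony Orchestra</artist>
    <cdlist:refnum>DGM988</cdlist:refnum><tracks total="5"/></cd>
  <cd number="714"><title>Boris the Spider Rides Again</title>
    <artist>The Renegades</artist>
    <cdlist:refnum>CHR328</cdlist:refnum><tracks total="19"/></cd>
</musicList>"""

root = ET.fromstring(MUSIC)
cds = root.findall("cd")

# tracks elements with total="5"; ElementTree compares attribute values as strings.
five_tracks = root.findall(".//tracks[@total='5']")

# cd elements whose title contains "Penderecki" (contains() emulated in Python).
penderecki = [cd.get("number") for cd in cds if "Penderecki" in cd.findtext("title")]

# cd elements whose title is longer than 11 characters (string-length() emulated).
long_titles = [cd.get("number") for cd in cds if len(cd.findtext("title")) > 11]

# Siblings of the first cd: every other child of the same parent.
siblings = [child.get("number") for child in root if child is not cds[0]]

print(len(five_tracks), penderecki, long_titles, siblings)
```

A full XPath 1.0 engine would accept the expressions directly; the emulation here is only meant to make the expected results visible.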

Question Sheet: Lab 9

Sources:

[1] http://www.w3schools.com/xpath/xpath_syntax.asp
[2] XPath: navigating XML with XPath 1.0 and 2.0 : kick start - Pg 119

QUICK QUESTIONS

  1. You have a set of legal documents. Each has four sections: the title, the case, the background, and the judgement, in that order. Each has been made into an XML document by inserting a prolog and suitable tags. You want to write a CSS file that will display these documents using a suitable browser.

    1. Can you write the CSS file in such a way that it will display the title, then the judgement, then the background, then the case?

      XSL (eXtensible Stylesheet Language) is the usual way to control how an XML document is presented, and it can reorder content. CSS can also be used to style both XML and HTML documents, but it does not change the order of the elements. To display the title, judgement, background and case in that order using CSS, each section can be given display:block and placed with absolute positioning.
    2. Can you write the CSS file in such a way that it will display just the title, and the judgement?

      Yes. An element that is given the declaration display:none; is not displayed at all. Applying this declaration to the case and background elements leaves only the title and the judgement on screen.
    3. If the CSS file is called legalWrit.css, what processing instruction should you put in the prolog of the XML document(s)?

      <?xml-stylesheet type="text/css" href="legalWrit.css"?>
  2. What is the difference between a URI and a URL?

    URI stands for Uniform Resource Identifier: a string of characters that identifies a resource. URL stands for Uniform Resource Locator and is a subset of URI. A URL identifies a resource by its location, that is, where the resource is and how to reach it.[1]
  3. Why does the XML language allow namespaces?

    In an XML document, namespaces are used so that there are no conflicts between element names.[1] In XML, a prefix bound to a namespace is placed in front of the element name, like the following:
      <root>
       <s:student xmlns:s="http://student_research.com/name">
        <s:name>Stephanie Vella</s:name>
       </s:student>
       <m:module xmlns:m="http://student_research.com/module">
        <m:name>Advanced Web Technologies</m:name>
       </m:module>
      </root>

    Since there is a <name> element in both <student> and <module>, namespaces have been included to tell them apart.
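
A parser shows the effect clearly. The sketch below assumes Python's standard xml.etree.ElementTree module and parses the example above; the prefix map passed to find() belongs to the script and does not have to match the prefixes used in the document.

```python
import xml.etree.ElementTree as ET

DOC = """<root>
 <s:student xmlns:s="http://student_research.com/name">
  <s:name>Stephanie Vella</s:name>
 </s:student>
 <m:module xmlns:m="http://student_research.com/module">
  <m:name>Advanced Web Technologies</m:name>
 </m:module>
</root>"""

root = ET.fromstring(DOC)

# Prefix-to-URI map used only for the searches below.
namespaces = {
    "s": "http://student_research.com/name",
    "m": "http://student_research.com/module",
}

student_name = root.find("s:student/s:name", namespaces)
module_name = root.find("m:module/m:name", namespaces)

# Both elements are called "name", but their full names differ by namespace URI.
print(student_name.tag)
print(module_name.tag)
print(student_name.text, "|", module_name.text)
```

The parser expands each prefixed name into the namespace URI plus the local name, so the two name elements can never be confused.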

LONGER QUESTIONS

  1. Here is a short XML document. Type it out, as a new file in JCreator. Save it under the name memo1.xml in a suitable directory in your file system. Notice that the JCreator editor picks out the different components in different colours, to aid you in detecting errors.

    <?xml version="1.0"?>
    <?xml-stylesheet href="stylesheet01.css" type="text/css"?>
    <!DOCTYPE memo>
    <memo>
       <id>Message: 1334</id>
       <date>18 November 09</date>
       <time>09:30</time>
       <from>From: The Managing Director</from>
       <to>To: Heads of all Departments</to>
       <message>We must increase production. And increasing sales would be no bad thing either.</message>
    </memo>

    Now open another tab in JCreator and type the following style sheet out. Save it under the name stylesheet01.css in the same folder as memo1.xml. Notice that, this time, the editor does not pick out the different components in different colours.
      memo {display: block; margin: 1em;}
      id {display: block; margin: 1em; font-style: italic; font-size:200%}
      date {display: block; margin: 1em;color: "dark blue"; text-align: left;}
      time {display: block; margin: 1em;color: aqua; text-align: left;}
      from, to {display: block; margin: 1em;color: green; text-align: left;}
      message {display: block; margin: 1em;color: blue; text-align: left;}

    Now use the Mozilla Firefox browser to view the file memo1.xml. 


    What was the point of putting "display: block" into the CSS file in each of the 6 lines?

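    XML elements have no built-in presentation, so without a display property the browser runs them together as inline text. With "display:block" each element is presented as a block, like a heading or a paragraph: it starts on a new line, and the margin of 1em then puts a space above and below it.[2]
    [Screenshot: memo.xml and using CSS]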
  2. We want the chapter we were working on last week (“Chapter 2: Volcanic winter”) to be displayed on screen in a web browser. Here are some of the features we would like it to have: the font for the text to be Palatino, or failing that Times New Roman, or failing that any serif face. Type size to be 12 pt. The chapter heading to be the same font, but 24 pt and bold and italic and blue. The poem lines to be the same font, but italic. Background colour to be parchment: use the colour #FCFBC4. Both the chapter heading and the main text are to be indented from the left margin by 1 em. The lines of poetry are to be indented from the left margin by 2 ems.

    1. Write a CSS file that will enable the chapter to be displayed in this way. Call it stylesheet4.css

       chapters {
        font-family: Palatino, "Times New Roman", serif;
        font-size: 12pt;
        margin-left: 1em;
        background-color: #FCFBC4;
       }
       chapterHead {
        font-size: 24pt;
        font-style: italic;
        font-weight: bold;
        color: blue;
       }
       line {
        font-style: italic;
        margin-left: 2em;
       }
      
       The above CSS will produce the following:
    2. Write a different CSS file, with different display properties, and adjust your XML file so that it is displayed using this one instead. Use display properties that seem appropriate to you.

       chapters {
        font-family: Palatino, "Times New Roman", serif;
        font-size: 12pt;
        margin-left: 1em;
        background-color: #FCFBC4;
        display: block;
       }
       chapterHead {
        font-size: 24pt;
        font-style: italic;
        font-weight: bold;
        color: blue;
        display: block;
       }
       line {
        font-style: italic;
        margin-left: 2em;
        display: block;
       }
      
      The above CSS will produce the following:

Question Sheet: Lab 8

Sources:

[1] http://www.w3schools.com/XML/xml_namespaces.asp
[2] http://www.quirksmode.org/css/display.html

QUICK QUESTIONS

  1. People who prepare XML documents sometimes put part of the document in a CDATA section.

    1. Why would they do that?

      Ordinary text inside the XML document is parsed by the parser. Text inside a CDATA section is not parsed as markup.[1] This means that characters such as '<' and '&' can be written inside the CDATA section as they are.
    2. How is the CDATA section indicated?

      A CDATA section starts with " <![CDATA[ " and ends with " ]]> ".[1]
    3. If CDATA sections hadn’t been invented, would there be any other way to achieve the same effect?

      If CDATA sections hadn't been invented, one could write the entity references '&amp;' instead of '&', '&lt;' instead of '<' and '&gt;' instead of '>'. Comments are not an equivalent: the parser does skip over them, but their text is not part of the document's content.
  2. What is a parser and what does it have to do with validity?

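    An XML parser reads the XML document and converts it into an XML Document Object Model (DOM).[2] Every parser checks that the document is well-formed and stops with an error if it is not. A validating parser also checks the document against its DTD or schema and reports the document as invalid if it breaks those rules.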
  3. You write a .dtd file to accompany a class of XML documents. You want one of the elements, with the tag <trinity>, to appear exactly three times within the document element of every document in this class. Is it possible for the .dtd file to specify this?

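    A .dtd file has no occurrence indicator for an exact count. The indicators only allow an element to appear:
    • One or more times - (+);
    • Zero or more times - (*);
    • Zero or one time - (?).
    It is still possible to require exactly three occurrences by naming the element three times in the content model of the document element, for example (trinity, trinity, trinity), provided the three occurrences come at a fixed place in the sequence.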
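
The answers about CDATA sections and parsers can be checked with a short sketch. It assumes Python's standard xml.etree.ElementTree module, which is a non-validating parser: it checks well-formedness only and does not check a document against a DTD. The note element and the helper is_well_formed are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The same text written once in a CDATA section and once with entity references.
with_cdata = ET.fromstring("<note><![CDATA[if (a < b && b > c)]]></note>")
with_entities = ET.fromstring("<note>if (a &lt; b &amp;&amp; b &gt; c)</note>")

# Both forms give the application exactly the same character data.
print(with_cdata.text == with_entities.text)


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it reports an error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A raw '<' in ordinary text is read as the start of a tag, so parsing fails.
print(is_well_formed("<note>if (a < b)</note>"))

# Tags are case sensitive, so these two do not match.
print(is_well_formed("<trinity></Trinity>"))
```

Checking validity against a DTD needs a validating parser, which is outside what this sketch covers.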

LONGER QUESTIONS

  1. The following (found in the question sheet) is one of the documents that featured in last week’s exercises. As mentioned before, this is to be "Chapter 2: Volcanic winter" in a book.

    1. Write a suitable prolog for this document.

      <?xml version="1.0" encoding="UTF-8"?>
      <!DOCTYPE chapters SYSTEM "chapter2.dtd">

    2. Write a .dtd file to act as the Document Type Description for this document. Or modify the one you wrote last week, if you wrote one.

      <!-- chapter2.dtd is an external DTD, so it holds only declarations (no DOCTYPE wrapper) -->
      <!ELEMENT chapters (chapter+)>
      <!ELEMENT chapter (text)>
      <!ELEMENT text (paragraph+)>
      <!-- paragraphs mix plain text with index and poem elements -->
      <!ELEMENT paragraph (#PCDATA | index | poem)*>
      <!ELEMENT index (#PCDATA)>
      <!ELEMENT poem (line+)>
      <!ELEMENT line (#PCDATA)>
      <!ATTLIST chapter num CDATA #REQUIRED>
      <!ATTLIST chapter title CDATA #REQUIRED>

    3. Put tags into the document. Obviously, there must be a document element. But also, the poem needs special treatment (because of the way it will be displayed) and, in fact, each line of the poem needs special treatment (you can spot the places where the lines start, by the capital letters). The mention of the poets at Geneva needs to be identified, because it will feature in the index, and so do the pyroclastic flows and Mount Tambora and Sumbawa and the year without a summer and the famines.

      <?xml version="1.0" encoding="UTF-8"?>
      <!DOCTYPE chapters SYSTEM "chapter2.dtd">
      <chapters>
       <chapter num="2" title="Volcanic Winter">
        <text>
         <paragraph>
         A volcanic winter is very bad news. The worst eruption in recorded history happened at <index>Mount Tambora</index> in 1815. It killed about 71 000 people locally, mainly because the <index>pyroclastic flows</index> killed everyone on the island of <index>Sumbawa</index> and the tsunamis drowned the neighbouring islands, but also because the ash blanketed many other islands and killed the vegetation. It also put about 160 cubic kilometres of dust and ash, and about 150 million tons of sulphuric acid mist, into the sky, which started a volcanic winter throughout the northern hemisphere. The next year was <index>the year without a summer</index>. No spring, no summer – it stayed dark and cold all the year round. This had its upside. In due course, all that ash and mist in the upper atmosphere made for some lovely sunsets, and Turner was inspired to paint this. The Lakeland poets took a holiday at Lake <index>Geneva</index>, and the weather was so horrible that Lord Byron was inspired to write this.
         </paragraph>
         <paragraph>
          <poem>
          <line>The bright sun was extinguish'd, and the stars</line>
          <line>Did wander darkling in the eternal space,</line>
          <line>Rayless, and pathless, and the icy earth</line>
          <line>Swung blind and blackening in the moonless air;</line>
          <line>Morn came and went – and came, and brought no day.</line>
          </poem>
         </paragraph>
         <paragraph>
         Mary Shelley was inspired to write Frankenstein. The downside was that there were <index>famines</index> throughout Europe, India, China and North America, and perhaps 200 000 people died of starvation in Europe alone.
         </paragraph>
        </text>
       </chapter>
      </chapters>
      

  2. This chapter obviously needs some pictures. You have available the following, and you decide to include them in the chapter, at appropriate places:
    • a picture of Sumbawa, after the volcanic eruption. It’s in a file sumbawa.jpg. Caption: "Sumbawa, after the volcanic eruption".
    • a picture of Lake Geneva, in 1816. It’s in a file Geneva1816.jpg. Caption: "Lake Geneva, during the summer of 1816".
    • a picture of Mary Shelley. It’s in a file MaryShelley.jpg. Caption: "Mary Shelley, author of Frankenstein".
    Amend your two files so that they can cope with these pictures and captions.

    The following is the DTD file:
      <!-- external DTD: declarations only, no DOCTYPE wrapper -->
      <!ELEMENT chapters (chapter+)>
      <!ELEMENT chapter (text)>
      <!ELEMENT text (paragraph+)>
      <!-- paragraphs mix plain text with index, poem and image elements -->
      <!ELEMENT paragraph (#PCDATA | index | poem | image)*>
      <!ELEMENT index (#PCDATA)>
      <!ELEMENT poem (line+)>
      <!ELEMENT image EMPTY>
      <!ELEMENT line (#PCDATA)>
      <!ATTLIST chapter num CDATA #REQUIRED>
      <!ATTLIST chapter title CDATA #REQUIRED>
      <!-- source names one of the unparsed entities declared below -->
      <!ATTLIST image source ENTITY #REQUIRED>
      <!ATTLIST image caption CDATA #REQUIRED>
      <!NOTATION jpg PUBLIC "image/jpeg">
      <!ENTITY sumbawa SYSTEM "sumbawa.jpg" NDATA jpg>
      <!ENTITY geneva SYSTEM "Geneva1816.jpg" NDATA jpg>
      <!ENTITY maryShelley SYSTEM "MaryShelley.jpg" NDATA jpg>
    
    The following is the XML file:
    <?xml version="1.0" encoding="UTF-8"?>
    <!-- The name in the DOCTYPE must match the root element, chapters -->
    <!DOCTYPE chapters SYSTEM "chapter2.dtd">
    <chapters>
      <chapter num="2" title="Volcanic Winter">
        <text>
          <paragraph>
          A volcanic winter is very bad news. The worst eruption in recorded history happened at <index>Mount Tambora</index> in 1815. It killed about 71 000 people locally, mainly because the <index>pyroclastic flows</index> killed everyone on the island of <index>Sumbawa</index><image source="sumbawa" caption="Sumbawa, after the volcanic eruption"/> and the tsunamis drowned the neighbouring islands, but also because the ash blanketed many other islands and killed the vegetation. It also put about 160 cubic kilometres of dust and ash, and about 150 million tons of sulphuric acid mist, into the sky, which started a volcanic winter throughout the northern hemisphere. The next year was <index>the year without a summer</index>. No spring, no summer – it stayed dark and cold all the year round. This had its upside. In due course, all that ash and mist in the upper atmosphere made for some lovely sunsets, and Turner was inspired to paint this. The Lakeland poets took a holiday at Lake <index>Geneva</index><image source="geneva" caption="Lake Geneva, during the summer of 1816"/>, and the weather was so horrible that Lord Byron was inspired to write this.
          </paragraph>
          <paragraph>
            <poem>
              <line>The bright sun was extinguish'd, and the stars</line>
              <line>Did wander darkling in the eternal space,</line>
              <line>Rayless, and pathless, and the icy earth</line>
              <line>Swung blind and blackening in the moonless air;</line>
              <line>Morn came and went – and came, and brought no day.</line>
            </poem>
          </paragraph>
          <paragraph>
          Mary Shelley<image source="maryShelley" caption="Mary Shelley, author of Frankenstein"/> was inspired to write Frankenstein. The downside was that there were <index>famines</index> throughout Europe, India, China and North America, and perhaps 200 000 people died of starvation in Europe alone.
          </paragraph>
        </text>
      </chapter>
    </chapters>
    
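    In the DTD, the image element, its source and caption attributes, the jpg notation and one entity per picture were declared. In the XML file, the following image elements were added:
    <image source="sumbawa" caption="Sumbawa, after the volcanic eruption"/>
    <image source="geneva" caption="Lake Geneva, during the summer of 1816"/>
    <image source="maryShelley" caption="Mary Shelley, author of Frankenstein"/>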
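
The link between an image element and its picture file can be followed programmatically. The sketch below is only an illustration, not part of the lab answer: it assumes a cut-down document with an internal DTD subset, and the function name collect_images is made up for this example. It uses Python's standard xml.parsers.expat module, which reports entity declarations but does not validate against the DTD.

```python
import xml.parsers.expat

# Cut-down document with an internal DTD subset (assumed for illustration).
DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE paragraph [
<!ELEMENT paragraph (#PCDATA | image)*>
<!ELEMENT image EMPTY>
<!ATTLIST image source ENTITY #REQUIRED
                caption CDATA #REQUIRED>
<!NOTATION jpg PUBLIC "image/jpeg">
<!ENTITY sumbawa SYSTEM "sumbawa.jpg" NDATA jpg>
]>
<paragraph>Sumbawa<image source="sumbawa"
  caption="Sumbawa, after the volcanic eruption"/></paragraph>
"""


def collect_images(xml_bytes):
    """Return (file name, caption) pairs for every image element."""
    files = {}   # entity name -> system identifier (the file name)
    images = []

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation_name):
        # Unparsed (NDATA) entities carry a notation name and no value.
        if notation_name is not None:
            files[name] = system_id

    def on_start(tag, attrs):
        # Resolve the entity named in the source attribute to its file.
        if tag == "image":
            images.append((files.get(attrs["source"]), attrs["caption"]))

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.StartElementHandler = on_start
    parser.Parse(xml_bytes, True)
    return images


print(collect_images(DOC))
```

The parser never opens sumbawa.jpg; it only hands the file name and notation to the application, which is the point of an unparsed entity.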

Question Sheet: Lab 7

Sources:

[1] http://www.w3schools.com/xml/xml_cdata.asp
[2] http://www.w3schools.com/xml/xml_parser.asp

QUICK QUESTIONS

  1. What exactly does a DTD do in XML?

    A DTD defines the structure of an XML document: which elements and attributes may appear, and how they may be nested.[1] For an XML document to be valid, it must be well-formed and must also follow the restrictions specified in its DTD.[2]
  2. You’ve written an XML document, with the XML declaration <?xml version= "1.0"?> at the start. You realize that the text contains some Arabic characters. Which of the following should you do:
    1. change the XML declaration to <?xml version= "1.0" encoding="ISO 8859-6"?>
    2. change the XML declaration to <?xml version= "1.0" encoding="UTF-8"?>
    3. do nothing: the declaration is fine as it is.

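    Do nothing: the declaration is fine as it is, provided the file is actually saved as UTF-8. When the encoding declaration is left out, an XML parser assumes UTF-8 (or UTF-16 if the file starts with a byte order mark), and UTF-8 can represent Arabic characters. Option 2 would also work, since it only states the default explicitly.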
  3. Can you use a binary graphics file in an XML document?

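    Yes, but not directly: the binary data is not embedded in the document itself. Instead, the graphics file is declared as an unparsed external entity, using the NDATA keyword followed by a notation name that identifies the format (GIF, JPEG, etc.). The notation itself must be declared with a NOTATION declaration. Example:[3] <!ENTITY colliepic SYSTEM "lassie.jpg" NDATA JPEG>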
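
The answer to question 2 can be checked with a short experiment. This is a minimal sketch using Python's standard xml.etree.ElementTree module; the sample Arabic word, the greeting element and the helper names are all assumptions made for the illustration. The same document is encoded twice, once as UTF-8 and once as ISO-8859-6, and both byte strings are parsed with no encoding declaration.

```python
import xml.etree.ElementTree as ET

# Sample Arabic text, written with escapes (assumed for illustration).
ARABIC = "\u0633\u0644\u0627\u0645"
DOC = '<?xml version="1.0"?><greeting>' + ARABIC + "</greeting>"


def parse_text(raw_bytes):
    """Parse raw bytes as XML and return the root element's text."""
    return ET.fromstring(raw_bytes).text


def parses(raw_bytes):
    """Return True if the bytes parse as well-formed XML."""
    try:
        ET.fromstring(raw_bytes)
    except ET.ParseError:
        return False
    return True


utf8_bytes = DOC.encode("utf-8")             # file saved as UTF-8
iso_arabic_bytes = DOC.encode("iso-8859-6")  # file saved as ISO-8859-6

print(parse_text(utf8_bytes) == ARABIC)  # UTF-8 is the assumed default
print(parses(iso_arabic_bytes))          # bytes are not valid UTF-8
```

With no encoding declaration the parser treats the bytes as UTF-8, so the first file is read correctly and the second is rejected. The declaration only needs changing when the file is saved in some other encoding.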

LONGER QUESTIONS

  1. I decide to produce a book called "Toba: the worst volcanic eruption of all". I ask 3 colleagues to write three text files entitled: "Chapter 1: The mystery of Lake Toba’s origins". "Chapter 2: Volcanic winter". "Chapter 3: What Toba did to the human race". All three text files are placed into a folder c:\bookproject\chapters on the hard drive on my computer. I insert at the start of each file, and at the end. I name the three files chap1.xml, chap2.xml, and chap3.xml respectively. I draw up the title page, title page verso and contents page of the book like this:


    Toba: the worst volcanic eruption of all

         John Platts

         Jack Brilliant

         Jill Bright

         Joe Clever

    STC Press

    Malta

    Copyright © 2010

    STC Press



    Published by STC Press Ltd., Malta

    ISBN: 978-0-596-52722-0

    Contents

    Chapter 1: The mystery of Lake Toba’s origins

    Chapter 2: Volcanic winter

    Chapter 3: What Toba did to the human race

    Then I construct an XML document that encompasses the whole book.
    1. Provide this XML document
    2. Provide the accompanying .dtd file

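    To avoid conflicts between element names (for example, publisher appears on both the title page and the title page verso), each part of the book has been given its own namespace. The following is the XML file: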
    <?xml version="1.0"?>
    <!DOCTYPE b:book SYSTEM "book.dtd">
    <b:book xmlns:b="http://middlesex_press.co.uk/book">
      <tp:titlePage xmlns:tp="http://middlesex_press.co.uk/title_page">
        <tp:title>Toba: the worst volcanic eruption of all</tp:title>
        <tp:authors>
          <tp:author>
            <tp:firstName>John</tp:firstName>
            <tp:lastName>Platts</tp:lastName>
          </tp:author>
          <tp:author>
            <tp:firstName>Jack</tp:firstName>
            <tp:lastName>Brilliant</tp:lastName>
          </tp:author>
          <tp:author>
            <tp:firstName>Jill</tp:firstName>
            <tp:lastName>Bright</tp:lastName>
          </tp:author>
          <tp:author>
            <tp:firstName>Joe</tp:firstName>
            <tp:lastName>Clever</tp:lastName>
          </tp:author>
        </tp:authors>
        <tp:publisher>STC Press</tp:publisher>
        <tp:location>Malta</tp:location>
      </tp:titlePage>
      <tpv:titlePageVerso xmlns:tpv="http://middlesex_press.co.uk/title_page_verso">
        <!-- &copy; is an HTML entity and is not predefined in XML, so a character reference is used -->
        <tpv:copyright>Copyright &#169; 2010</tpv:copyright>
        <tpv:publisher>STC Press</tpv:publisher>
        <tpv:publishedBy>STC Press Ltd., Malta</tpv:publishedBy>
        <tpv:ISBN>978-0-596-52722-0</tpv:ISBN>
      </tpv:titlePageVerso>
      <c:contents xmlns:c="http://middlesex_press.co.uk/contents">
        <c:chapter num="1" title="The mystery of Lake Toba's origins"/>
        <c:chapter num="2" title="Volcanic winter"/>
        <c:chapter num="3" title="What Toba did to the human race"/>
      </c:contents>
      <ch:chapters xmlns:ch="http://middlesex_press.co.uk/chapters">
        <!-- Each chapter file is pulled in through an external entity declared in the DTD -->
        <ch:chapter num="1" title="The mystery of Lake Toba's origins"> &ch1; </ch:chapter>
        <ch:chapter num="2" title="Volcanic winter"> &ch2; </ch:chapter>
        <ch:chapter num="3" title="What Toba did to the human race"> &ch3; </ch:chapter>
      </ch:chapters>
    </b:book>
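
A file with this many prefixed tags is easy to get wrong, so it is worth checking that it is well-formed before validating it against the DTD. The sketch below is one possible check using Python's standard xml.etree.ElementTree module. The function name and the urn:example namespace URIs are placeholders invented for the illustration, and the three broken snippets show typical slips: a closing tag without its prefix, a prefix that was never bound, and an attribute value with a missing closing quote.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return None if xml_text is well-formed, else the parser's message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


# Placeholder namespace URIs, assumed for illustration only.
GOOD = ('<tp:author xmlns:tp="urn:example:title_page">'
        '<tp:firstName>Jill</tp:firstName></tp:author>')

# Closing tag is missing its tp: prefix.
BAD_CLOSE = ('<tp:author xmlns:tp="urn:example:title_page">'
             '<tp:firstName>Jill</tp:firstName></author>')

# Element uses the prefix tps, but only tpv is bound.
BAD_PREFIX = '<tps:titlePageVerso xmlns:tpv="urn:example:verso"/>'

# Attribute value has no closing quote.
BAD_QUOTE = '<c:contents xmlns:c="urn:example:contents></c:contents>'

for name, text in [("GOOD", GOOD), ("BAD_CLOSE", BAD_CLOSE),
                   ("BAD_PREFIX", BAD_PREFIX), ("BAD_QUOTE", BAD_QUOTE)]:
    print(name, "->", well_formedness_error(text))
```

This parser only checks well-formedness and namespace bindings. It does not read book.dtd, so a document that passes here can still be invalid.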
    
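    The following is the DTD file. DTDs are not namespace-aware, so every element is declared under its prefixed name and the xmlns attributes are declared explicitly: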
    <!-- book.dtd: an external DTD file holds only the declarations, with no DOCTYPE wrapper -->
    <!ELEMENT b:book (tp:titlePage, tpv:titlePageVerso, c:contents, ch:chapters)>
    <!ELEMENT tp:titlePage (tp:title, tp:authors, tp:publisher, tp:location)>
    <!ELEMENT tpv:titlePageVerso (tpv:copyright, tpv:publisher, tpv:publishedBy, tpv:ISBN)>
    <!ELEMENT c:contents (c:chapter+)>
    <!ELEMENT ch:chapters (ch:chapter+)>
    <!ELEMENT tp:title (#PCDATA)>
    <!ELEMENT tp:authors (tp:author+)>
    <!ELEMENT tp:publisher (#PCDATA)>
    <!ELEMENT tp:location (#PCDATA)>
    <!ELEMENT tp:author (tp:firstName, tp:lastName)>
    <!ELEMENT tp:firstName (#PCDATA)>
    <!ELEMENT tp:lastName (#PCDATA)>
    <!ELEMENT tpv:copyright (#PCDATA)>
    <!ELEMENT tpv:publisher (#PCDATA)>
    <!ELEMENT tpv:publishedBy (#PCDATA)>
    <!ELEMENT tpv:ISBN (#PCDATA)>
    <!ELEMENT c:chapter (#PCDATA)>
    <!ELEMENT ch:chapter (#PCDATA)>
    <!ATTLIST b:book xmlns:b CDATA #FIXED "http://middlesex_press.co.uk/book">
    <!ATTLIST tp:titlePage xmlns:tp CDATA #FIXED "http://middlesex_press.co.uk/title_page">
    <!ATTLIST tpv:titlePageVerso xmlns:tpv CDATA #FIXED "http://middlesex_press.co.uk/title_page_verso">
    <!ATTLIST c:contents xmlns:c CDATA #FIXED "http://middlesex_press.co.uk/contents">
    <!ATTLIST ch:chapters xmlns:ch CDATA #FIXED "http://middlesex_press.co.uk/chapters">
    <!ATTLIST c:chapter num CDATA #REQUIRED>
    <!ATTLIST c:chapter title CDATA #REQUIRED>
    <!ATTLIST ch:chapter num CDATA #REQUIRED>
    <!ATTLIST ch:chapter title CDATA #REQUIRED>
    <!-- External parsed entities that pull in the three chapter files -->
    <!ENTITY ch1 SYSTEM "chap1.xml">
    <!ENTITY ch2 SYSTEM "chap2.xml">
    <!ENTITY ch3 SYSTEM "chap3.xml">
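
The effect of the namespaces can be seen by asking a namespace-aware parser for the two publisher elements separately. The following is a minimal sketch using Python's standard xml.etree.ElementTree module on a cut-down version of the book. The cut-down document and the variable names are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

# Cut-down version of the book, keeping only the two publisher elements.
DOC = """<b:book xmlns:b="http://middlesex_press.co.uk/book">
  <tp:titlePage xmlns:tp="http://middlesex_press.co.uk/title_page">
    <tp:publisher>STC Press</tp:publisher>
  </tp:titlePage>
  <tpv:titlePageVerso xmlns:tpv="http://middlesex_press.co.uk/title_page_verso">
    <tpv:publisher>STC Press</tpv:publisher>
  </tpv:titlePageVerso>
</b:book>"""

# Prefix-to-URI map used only for the searches below.
NS = {
    "tp": "http://middlesex_press.co.uk/title_page",
    "tpv": "http://middlesex_press.co.uk/title_page_verso",
}

root = ET.fromstring(DOC)
title_page_publishers = root.findall(".//tp:publisher", NS)
verso_publishers = root.findall(".//tpv:publisher", NS)

# ElementTree expands each prefix to its namespace URI in the tag name.
print(title_page_publishers[0].tag)
print(verso_publishers[0].tag)
```

Both elements have the local name publisher, but the parser sees them as different names because the namespace URI is part of each name. The prefixes themselves are only shorthand for the URIs.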
    

Question Sheet: Lab 6

Sources:

[1] http://www.w3schools.com/xml/xml_dtd.asp
[2] http://www.informit.com/guides/content.aspx?g=xml&seqNum=223
[3] http://xml.silmaril.ie/graphics.html