XML Interview Questions with Answers Page II


From freshersonline.com



1. Give some examples of XML DTDs or schemas that you have worked with.

Although XML does not require data to be validated against a DTD, many of the benefits of the technology come from being able to validate XML documents against business or technical architecture rules. Asking which DTDs a developer has worked with gives insight into their general exposure to the technology. The ideal candidate will know several of the commonly used DTDs such as FpML, DocBook, HRML, and RDF, and will have experience designing a custom DTD for a particular project where no standard existed.


2. Using XSLT, how would you extract a specific attribute from an element in an XML document?

Successful candidates should recognize this as one of the most basic applications of XSLT. If they are not able to construct a reply similar to the example below, they should at least be able to identify the components necessary for this operation: xsl:template to match the appropriate XML element, xsl:value-of to select the attribute value, and the optional xsl:apply-templates to continue processing the document.


Extract Attributes from XML Data

Example 1.

<xsl:template match="element-name">
  Attribute Value:
  <xsl:value-of select="@attribute"/>
  <xsl:apply-templates/>
</xsl:template>


3. When constructing an XML DTD, how do you create an external entity reference in an attribute value?

Every interview session should have at least one trick question. Although this is possible in SGML, XML DTDs do not support external entity references in attribute values. It is more important for the candidate to respond to this question in a logical way than to know the somewhat obscure answer.


4. Does XML replace HTML?

No. XML itself does not replace HTML. Instead, it provides an alternative which allows you to define your own set of markup elements. HTML is expected to remain in common use for some time to come, and XHTML reformulates HTML in XML syntax. XML is designed to make the writing of DTDs much simpler than with full SGML. (See the question on DTDs for what one is and why you might want one.)


5. Do I have to know HTML or SGML before I learn XML?

No, although it's useful because a lot of XML terminology and practice derives from two decades' experience of SGML.

Be aware that ‘knowing HTML’ is not the same as ‘understanding SGML’. Although HTML was written as an SGML application, browsers ignore most of it (which is why so many useful things don't work), so just because something is done a certain way in HTML browsers does not mean it's correct, least of all in XML.


6. Which parts of an XML document are case-sensitive?

All of it, both markup and text. This is significantly different from HTML and most other SGML applications. It was done to allow markup in non-Latin-alphabet languages, and to obviate problems with case-folding in writing systems which are caseless.

  • Element type names are case-sensitive: you must follow whatever combination of upper- or lower-case you use to define them (either by first usage or in a DTD or Schema). So you can't say <BODY>…</body>: upper- and lower-case must match; thus <Img/>, <IMG/>, and <img/> are three different element types;
  • For well-formed XML documents with no DTD, the first occurrence of an element type name defines the casing;
  • Attribute names are also case-sensitive, for example the two width attributes in <PIC width="7in"/> and <PIC WIDTH="6in"/> (if they occurred in the same file) are separate attributes, because of the different case of width and WIDTH;
  • Attribute values are also case-sensitive. CDATA values (eg Url="MyFile.SGML") always have been, but NAME types (ID and IDREF attributes, and token list attributes) are now case-sensitive as well;
  • All general and parameter entity names (eg &Aacute;), and your data content (text), are case-sensitive as always.
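
The case rules above can be checked with any XML parser. The following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module; the helper name is_well_formed is invented for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The start-tag and end-tag names differ only in case, so they do not match.
mismatched = is_well_formed("<BODY>text</body>")
matched = is_well_formed("<body>text</body>")

# width and WIDTH are two separate attributes, so both may appear on one element.
pic = ET.fromstring('<PIC width="7in" WIDTH="6in"/>')
attribute_names = sorted(pic.attrib)

print(mismatched, matched, attribute_names)
```

If the two attribute names were the same, a duplicate attribute on one element would itself be a well-formedness error; the parser accepts this element only because it treats them as distinct.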


7. How can I make my existing HTML files work in XML?

Either convert them to conform to some new document type (with or without a DTD or Schema) and write a stylesheet to go with them; or edit them to conform to XHTML.

It is necessary to convert existing HTML files because XML does not permit end-tag minimisation (missing </p>, etc), unquoted attribute values, and a number of other SGML shortcuts which have been normal in most HTML DTDs. However, many HTML authoring tools already produce almost (but not quite) well-formed XML.

You may be able to convert HTML to XHTML using Dave Raggett's HTML Tidy program, which can clean up some of the formatting mess left behind by inadequate HTML editors, and even separate out some of the formatting to a stylesheet, but there is usually still some hand-editing to do.
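
To see why conversion is needed, the following sketch feeds an HTML-style fragment and an XHTML-style fragment to an XML parser. It assumes Python's standard xml.etree.ElementTree module; the two fragments and the helper name parse_error are made up for illustration.

```python
import xml.etree.ElementTree as ET


def parse_error(text):
    """Return None if text is well-formed XML, otherwise the parser's message."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as exc:
        return str(exc)


# HTML habits: omitted </p> end-tags and an unquoted attribute value.
html_style = "<body><p>First<p>Second<img src=pic.png></body>"

# The same content with every element closed and the attribute value quoted.
xhtml_style = '<body><p>First</p><p>Second<img src="pic.png"/></p></body>'

html_error = parse_error(html_style)
xhtml_error = parse_error(xhtml_style)

print("HTML-style:", html_error)
print("XHTML-style:", xhtml_error)
```

This checks well-formedness only; it does not check conformance to any of the XHTML DTDs.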


8. Is there an XML version of HTML?

Yes, the W3C recommends using XHTML, which is ‘a reformulation of HTML 4 in XML 1.0’. This specification defines HTML as an XML application, and provides three DTDs corresponding to the ones defined by HTML 4.* (Strict, Transitional, and Frameset).

The semantics of the elements and their attributes are as defined in the W3C Recommendation for HTML 4. These semantics provide the foundation for future extensibility of XHTML. Compatibility with existing HTML browsers is possible by following a small set of guidelines (see the W3C site).


9. If XML is just a subset of SGML, can I use XML files directly with existing SGML tools?

Yes, provided you use up-to-date SGML software which knows about the WebSGML Adaptations TC to ISO 8879 (the features needed to support XML, such as the variant form for EMPTY elements; some aspects of the SGML Declaration such as NAMECASE GENERAL NO; multiple attribute token list declarations, etc).

An alternative is to use an SGML DTD to let you create a fully-normalized SGML file, but one which does not use empty elements; and then remove the DocType Declaration so it becomes a well-formed DTDless XML file. Most SGML tools now handle XML files well, and provide an option switch between the two standards.


10. What's a Document Type Definition (DTD) and where do I get one?

A DTD is a description in XML Declaration Syntax of a particular type or class of document. It sets out what names are to be used for the different types of element, where they may occur, and how they all fit together. (A Schema does the same thing in XML Document Syntax, and allows more extensive data-checking.)

For example, if you want a document type to be able to describe Lists which contain Items, the relevant part of your DTD might contain something like this:

<!ELEMENT List (Item)+>

<!ELEMENT Item (#PCDATA)>

This defines a list as an element type containing one or more items (that's the plus sign); and it defines items as element types containing just plain text (Parsed Character Data or PCDATA). Validators read the DTD before they read your document so that they can identify where every element type ought to come and how each relates to the other, so that applications which need to know this in advance (most editors, search engines, navigators, and databases) can set themselves up correctly. The example above lets you create lists like:


<List>
  <Item>Chocolate</Item>
  <Item>Music</Item>
  <Item>Surfing</Item>
</List>


(The indentation in the example is just for legibility while editing: it is not required by XML.)

A DTD provides applications with advance notice of what names and structures can be used in a particular document type. Using a DTD and a validating editor means you can be certain that all documents of that particular type will be constructed and named in a consistent and conformant manner.

DTDs are not required for processing well-formed documents, but they are needed if you want to take advantage of XML's special attribute types like the built-in ID/IDREF cross-reference mechanism; or the use of default attribute values; or references to external non-XML files (‘Notations’); or if you simply want a check on document validity before processing.
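
As a rough illustration of default attribute values, the sketch below parses a document whose DTD is held in the internal subset. It assumes Python's standard xml.etree.ElementTree module, whose underlying parser does not validate but does apply attribute defaults declared in the internal subset. The priority attribute is invented for this example and is not part of the List DTD shown earlier.

```python
import xml.etree.ElementTree as ET

# The DTD sits inside the DOCTYPE declaration (the internal subset).
# The ATTLIST line gives Item a hypothetical "priority" attribute
# with the default value "normal".
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE List [
  <!ELEMENT List (Item)+>
  <!ELEMENT Item (#PCDATA)>
  <!ATTLIST Item priority CDATA "normal">
]>
<List>
  <Item>Chocolate</Item>
  <Item priority="high">Music</Item>
</List>"""

root = ET.fromstring(DOCUMENT)

# The first Item carries no priority attribute in the document itself,
# so its value comes from the default declared in the DTD.
priorities = [(item.text, item.get("priority")) for item in root.iter("Item")]
print(priorities)
```

Without the DOCTYPE section, the first Item would have no priority attribute at all. Note that this parser will not report a document that breaks the content model; checking validity needs a validating parser.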

There are thousands of DTDs already in existence in all kinds of areas (see the SGML/XML Web pages for pointers). Many of them can be downloaded and used freely; or you can write your own (see the question on creating your own DTD). Old SGML DTDs need to be converted to XML for use with XML systems: read the question on converting SGML DTDs to XML, but most popular SGML DTDs are already available in XML form.

The alternatives to a DTD are various forms of Schema. These provide more extensive validation features than DTDs, including character data content validation.


11. Does XML let me make up my own tags?

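No, it lets you make up names for your own element types. Tags and elements are not the same thing: a tag is the markup that marks the start or end of an element (for example <Item> and </Item>), while the element is the whole unit, tags plus content. If you think they are the same thing you are already in considerable trouble.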


12. How do I create my own document type?

Document types usually need a formal description, either a DTD or a Schema. Whilst it is possible to process well-formed XML documents without any such description, trying to create them without one is asking for trouble. A DTD or Schema is used with an XML editor or API to guide and control the construction of the document, making sure the right elements go in the right places.

Creating your own document type therefore begins with an analysis of the class of documents you want to describe: reports, invoices, letters, configuration files, credit-card verification requests, or whatever. Once you have the structure correct, you write code to express this formally, using DTD or Schema syntax.


13. How do I write my own DTD?

You need to use the XML Declaration Syntax (very simple: declaration keywords begin with <! rather than just the open angle bracket). For example, a DTD for a shopping list might read:

<!ELEMENT Shopping-List (Item)+>

<!ELEMENT Item (#PCDATA)>


It says that there shall be an element called Shopping-List and that it shall contain elements called Item: there must be at least one Item (that's the plus sign) but there may be more than one. It also says that the Item element may contain only parsed character data (PCDATA, ie text: no further markup).

Because there is no other element which contains Shopping-List, that element is assumed to be the ‘root’ element, which encloses everything else in the document. You can now use it to create an XML file: give your editor the declarations:

<?xml version="1.0"?>

<!DOCTYPE Shopping-List SYSTEM "shoplist.dtd">


(assuming you put the DTD in that file). Now your editor will let you create files according to the pattern:

<Shopping-List>
  <Item>Chocolate</Item>
  <Item>Sugar</Item>
  <Item>Butter</Item>
</Shopping-List>


It is possible to develop complex and powerful DTDs of great subtlety, but for any significant use you should learn more about document systems analysis and document type design. See for example Developing SGML DTDs: From Text to Model to Markup (Maler and el Andaloussi, 1995): this was written for SGML but perhaps 95% of it applies to XML as well, as XML is much simpler than full SGML; see the list of restrictions which shows what has been cut out.


Warning

Incidentally, a DTD file never has a DOCTYPE Declaration in it: that only occurs in an XML document instance (it's what references the DTD). A DTD file also never has an XML Declaration at the top. Unfortunately there is still software around which inserts one or both of these.


14. Can a root element type be explicitly declared in the DTD?

No. This is done in the document's Document Type Declaration, not in the DTD.
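
A small sketch of where the root element type is named, assuming Python's standard xml.dom.minidom module. The parser is non-validating and does not read the external DTD, so shoplist.dtd does not need to exist for this to run.

```python
from xml.dom import minidom

# The name after <!DOCTYPE is the declared root element type.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE Shopping-List SYSTEM "shoplist.dtd">
<Shopping-List>
  <Item>Chocolate</Item>
</Shopping-List>"""

doc = minidom.parseString(DOCUMENT)

declared_root = doc.doctype.name          # name given in the DOCTYPE declaration
system_id = doc.doctype.systemId          # where the DTD is said to live
actual_root = doc.documentElement.tagName  # the element that really is the root

print(declared_root, system_id, actual_root)
```

Because nothing is validated here, a mismatch between the declared and actual root would go unreported; a validating parser would flag it as a validity error.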


15. How do I get XML into or out of a database?

Ask your database manufacturer: most provide XML import and export modules to connect XML applications with databases. In some trivial cases there will be a 1:1 match between field names in the database table and element type names in the XML Schema or DTD, but in most cases some programming will be required to establish the desired match. This can usually be stored as a procedure so that subsequent uses are simply commands or calls with the relevant parameters.

In less trivial, but still simple, cases, you could export by writing a report routine that formats the output as an XML document, and you could import by writing an XSLT transformation that formats the XML data as a load file.
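
The simple case can be sketched as follows, assuming Python with its standard sqlite3 and xml.etree.ElementTree modules. The items table, its columns, and the function names are invented for this example. The import side is done directly in Python here rather than through an XSLT transformation and a load file, since Python's standard library has no XSLT processor.

```python
import sqlite3
import xml.etree.ElementTree as ET


def import_items(conn, xml_text):
    """Insert the text of each Item element as a row; return the row count."""
    root = ET.fromstring(xml_text)
    names = [(item.text,) for item in root.iter("Item")]
    conn.executemany("INSERT INTO items (name) VALUES (?)", names)
    return len(names)


def export_items(conn):
    """Format the rows of the items table as a Shopping-List document."""
    root = ET.Element("Shopping-List")
    for (name,) in conn.execute("SELECT name FROM items ORDER BY id"):
        ET.SubElement(root, "Item").text = name
    return ET.tostring(root, encoding="unicode")


# An in-memory database with one hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")

source = (
    "<Shopping-List>"
    "<Item>Chocolate</Item><Item>Sugar</Item><Item>Butter</Item>"
    "</Shopping-List>"
)
imported = import_items(conn, source)
exported = export_items(conn)
print(imported, exported)
```

Here the mapping is 1:1 (one column, one element type). Anything richer, such as nested elements or several tables, needs the extra mapping code described above.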
