public abstract class StartTagType extends TagType
A start tag type is any TagType that starts with the character '<' (as with all tag types), but whose second character is not '/'.
This includes types for many tags which stand alone, without a corresponding end tag, and would not intuitively be categorised as a "start tag". For example, an HTML comment in a document is represented as a single start tag that spans the whole comment, and does not have an end tag at all.
The singleton instances of all the standard start tag types are available in this class as static fields.
Because all StartTagType instances must be singletons, the '==' operator can be used to test for a particular tag type instead of the equals(Object) method.
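The identity comparison described above can be illustrated with a small standalone model. The MiniTagType class and its isComment method are hypothetical names used only for this sketch; they mimic the singleton pattern and are not part of the library.

```java
// Hypothetical stand-in for a tag type class that only exposes singleton instances.
final class MiniTagType {
    static final MiniTagType COMMENT = new MiniTagType("comment");
    static final MiniTagType NORMAL = new MiniTagType("normal");

    private final String description;

    // Private constructor: no instances exist other than the static fields above.
    private MiniTagType(String description) {
        this.description = description;
    }

    String getDescription() {
        return description;
    }

    // Because every type is a singleton, reference equality is sufficient.
    static boolean isComment(MiniTagType type) {
        return type == COMMENT;
    }
}
```

With the library itself, the analogous test would presumably compare the type of a tag against a static field such as StartTagType.COMMENT.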
See Also:
EndTagType
Modifier and Type | Field and Description |
---|---|
static StartTagType | CDATA_SECTION: The tag type given to a CDATA section (<![CDATA[ ... ]]>). |
static StartTagType | COMMENT: The tag type given to an HTML comment (<!-- ... -->). |
static StartTagType | DOCTYPE_DECLARATION: The tag type given to a document type declaration (<!DOCTYPE ... >). |
static StartTagType | MARKUP_DECLARATION: The tag type given to a markup declaration (<!ELEMENT ... > \| <!ATTLIST ... > \| <!ENTITY ... > \| <!NOTATION ... >). |
static StartTagType | NORMAL: The tag type given to a normal HTML or XML start tag (<name ... >). |
static StartTagType | SERVER_COMMON: The tag type given to a common server tag (<% ... %>). |
static StartTagType | SERVER_COMMON_COMMENT: The tag type given to a common server comment tag (<%-- ... --%>). |
static StartTagType | SERVER_COMMON_ESCAPED: The tag type given to an escaped common server tag (<\% ... %>). |
static StartTagType | UNREGISTERED |
static StartTagType | XML_DECLARATION: The tag type given to an XML declaration (<?xml ... ?>). |
static StartTagType | XML_PROCESSING_INSTRUCTION: The tag type given to an XML processing instruction (<?PITarget ... ?>). |
Modifier | Constructor and Description |
---|---|
protected | StartTagType(java.lang.String description, java.lang.String startDelimiter, java.lang.String closingDelimiter, EndTagType correspondingEndTagType, boolean isServerTag, boolean hasAttributes, boolean isNameAfterPrefixRequired): Constructs a new StartTagType object with the specified properties. |
Modifier and Type | Method and Description |
---|---|
boolean | atEndOfAttributes(Source source, int pos, boolean isClosingSlashIgnored): Indicates whether the specified source document position is at the end of a tag's attributes. |
protected StartTag | constructStartTag(Source source, int begin, int end, java.lang.String name, Attributes attributes): Internal method for the construction of a StartTag object of this type. |
EndTagType | getCorrespondingEndTagType() |
boolean | hasAttributes(): Indicates whether a start tag of this type contains attributes. |
boolean | isNameAfterPrefixRequired(): Indicates whether a valid XML tag name is required directly after the prefix. |
protected Attributes | parseAttributes(Source source, int startTagBegin, java.lang.String tagName): Internal method for the parsing of Attributes. |
public static final StartTagType UNREGISTERED
The tag type given to an unregistered start tag (< ... >).
See the documentation of the Tag.isUnregistered() method for details.
Property | Value |
---|---|
Description | unregistered |
StartDelimiter | < |
ClosingDelimiter | > |
IsServerTag | false |
NamePrefix | (empty string) |
CorrespondingEndTagType | null |
HasAttributes | false |
IsNameAfterPrefixRequired | false |
<"This is not recognised as any of the predefined tag types in this library">
See Also:
EndTagType.UNREGISTERED
public static final StartTagType NORMAL
The tag type given to a normal HTML or XML start tag (<name ... >).
Property | Value |
---|---|
Description | normal |
StartDelimiter | < |
ClosingDelimiter | > |
IsServerTag | false |
NamePrefix | (empty string) |
CorrespondingEndTagType | EndTagType.NORMAL |
HasAttributes | true |
IsNameAfterPrefixRequired | true |
<div class="NormalDivTag">
public static final StartTagType COMMENT
The tag type given to an HTML comment (<!-- ... -->).
An HTML comment is an area of the source document enclosed by the delimiters <!-- on the left and --> on the right.
The HTML 4.01 specification section 3.2.4 states that the end of comment delimiter may contain white space between the "--" and ">" characters, but this library does not recognise end of comment delimiters containing white space.
In the default configuration, any non-server tag appearing within an HTML comment is ignored by the parser. See the documentation of the tag parsing process for more information.
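The delimiter rule above can be sketched on a plain string. This is a minimal standalone model, not the library's parser: the class CommentEndSketch and the method findCommentEnd are hypothetical names, and server tags inside the comment are not handled.

```java
final class CommentEndSketch {
    // Returns the index just past the closing "-->" of a comment that begins at
    // 'begin', or -1 if 'begin' is not at "<!--" or no closing delimiter follows.
    static int findCommentEnd(String text, int begin) {
        final String start = "<!--";
        final String close = "-->";
        if (!text.startsWith(start, begin)) {
            return -1;
        }
        // Only the exact sequence "-->" counts; "-- >" is not a closing delimiter here.
        int closePos = text.indexOf(close, begin + start.length());
        return closePos < 0 ? -1 : closePos + close.length();
    }
}
```

In this model a comment terminated by "-- >" is simply reported as unterminated, which mirrors the stricter behaviour described above.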
Property | Value |
---|---|
Description | comment |
StartDelimiter | <!-- |
ClosingDelimiter | --> |
IsServerTag | false |
NamePrefix | !-- |
CorrespondingEndTagType | null |
HasAttributes | false |
IsNameAfterPrefixRequired | false |
<!-- This is a comment -->
public static final StartTagType XML_DECLARATION
The tag type given to an XML declaration (<?xml ... ?>).
An XML declaration is often referred to in texts as a special type of processing instruction with the reserved PITarget name of "xml".
Technically it is not an XML processing instruction at all, but is still a type of SGML processing instruction.
According to section 2.8 of the XML 1.0 specification, a valid XML declaration can specify only "version", "encoding" and "standalone" attributes in that order. This library parses the attributes of an XML declaration in the same way as those of a normal tag, without checking that they conform to the specification.
Property | Value |
---|---|
Description | XML declaration |
StartDelimiter | <?xml |
ClosingDelimiter | ?> |
IsServerTag | false |
NamePrefix | ?xml |
CorrespondingEndTagType | null |
HasAttributes | true |
IsNameAfterPrefixRequired | false |
<?xml version="1.0" encoding="UTF-8"?>
public static final StartTagType XML_PROCESSING_INSTRUCTION
The tag type given to an XML processing instruction (<?PITarget ... ?>).
An XML processing instruction is a specific form of SGML processing instruction with the following two additional constraints:
- It must be closed with the characters '?>' instead of just a single '>' character.
- It requires a PITarget (a name following the '<?' start delimiter).
This library does not include a predefined generic tag type for SGML processing instructions as the only forms in which they are found in HTML documents are the more specific XML processing instruction and the XML declaration, both of which have their own dedicated predefined tag type.
There is no restriction on the contents of an XML processing instruction. In particular, it can not be assumed that the processing instruction contains attributes, in contrast to the XML declaration.
Note that registering the PHPTagTypes.PHP_SHORT tag type overrides this tag type.
This is because they both have the same start delimiter, so the one registered latest takes precedence over the other.
See the documentation of the PHPTagTypes class for more information.
Property | Value |
---|---|
Description | XML processing instruction |
StartDelimiter | <? |
ClosingDelimiter | ?> |
IsServerTag | false |
NamePrefix | ? |
CorrespondingEndTagType | null |
HasAttributes | false |
IsNameAfterPrefixRequired | true |
<?xml-stylesheet href="standardstyle.css" type="text/css"?>
public static final StartTagType DOCTYPE_DECLARATION
The tag type given to a document type declaration (<!DOCTYPE ... >).
Information about the document type declaration can be found in the HTML 4.01 specification section 7.2, and the XML 1.0 specification section 2.8.
The "!DOCTYPE" tag name is required to be in upper case in the source document, but all tag properties are stored in lower case because this library performs all parsing in lower case.
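The effect of storing properties in lower case can be sketched as a comparison against a lower-cased copy of the source text. This is a minimal standalone model: the class LowerCaseMatchSketch, its constant and its method are hypothetical names, and the sketch assumes that lower-casing does not change the length of the text (true for ASCII markup).

```java
final class LowerCaseMatchSketch {
    // Hypothetical delimiter, stored in lower case like the properties described above.
    static final String DOCTYPE_START_DELIMITER = "<!doctype";

    // Compares a lower-cased copy of the source text against the lower-case delimiter,
    // so the match succeeds regardless of the case used in the source document.
    static boolean startsDoctypeAt(String sourceText, int pos) {
        String parseText = sourceText.toLowerCase(java.util.Locale.ROOT);
        return parseText.startsWith(DOCTYPE_START_DELIMITER, pos);
    }
}
```

A real parser would lower-case the text once rather than on every call; the copy is rebuilt here only to keep the sketch short.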
Property | Value |
---|---|
Description | document type declaration |
StartDelimiter | <!doctype |
ClosingDelimiter | > |
IsServerTag | false |
NamePrefix | !doctype |
CorrespondingEndTagType | null |
HasAttributes | false |
IsNameAfterPrefixRequired | false |
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
public static final StartTagType MARKUP_DECLARATION
The tag type given to a markup declaration (<!ELEMENT ... > | <!ATTLIST ... > | <!ENTITY ... > | <!NOTATION ... >).
The name of a markup declaration tag must be one of "!element", "!attlist", "!entity" or "!notation".
These tag names are required to be in upper case in the source document, but all tag properties are stored in lower case because this library performs all parsing in lower case.
Markup declarations usually appear inside a document type definition (DTD), which is usually an external document to the HTML or XML document, but they can also appear directly within the document type declaration which is why they must be recognised by the parser.
Property | Value |
---|---|
Description | markup declaration |
StartDelimiter | <! |
ClosingDelimiter | > |
IsServerTag | false |
NamePrefix | ! |
CorrespondingEndTagType | null |
HasAttributes | false |
IsNameAfterPrefixRequired | true |
<!ELEMENT BODY O O (%flow;)* +(INS|DEL) -- document body -->
public static final StartTagType CDATA_SECTION
The tag type given to a CDATA section (<![CDATA[ ... ]]>).
A CDATA section is a specific form of a marked section. This library does not include a predefined generic tag type for marked sections, as the only type of marked sections found in HTML documents are CDATA sections.
The HTML 4.01 specification section B.3.5 and the XML 1.0 specification section 2.7 contain definitions for a CDATA section.
There is inconsistency between the SGML and HTML/XML specifications in the definition of a marked section.
SGML requires the presence of a space between the "<![" prefix and the keyword, and allows a space after the keyword.
The XML specification forbids these spaces, and the examples given in the HTML specification do not include them either.
This library only recognises CDATA sections that do not include the spaces.
The "![CDATA[" tag name is required to be in upper case in the source document according to the HTML/XML specifications, but all tag properties are stored in lower case because this makes it more efficient for the library to perform case-insensitive parsing of all tags.
In the default configuration, any non-server tag appearing within a CDATA section is ignored by the parser. See the documentation of the tag parsing process for more information.
Property | Value |
---|---|
Description | CDATA section |
StartDelimiter | <![cdata[ |
ClosingDelimiter | ]]> |
IsServerTag | false |
NamePrefix | ![cdata[ |
CorrespondingEndTagType | null |
HasAttributes | false |
IsNameAfterPrefixRequired | false |
<script type="text/javascript">
//<![CDATA[
function min(a,b) {return a<b ? a : b;}
//]]>
</script>
public static final StartTagType SERVER_COMMON
The tag type given to a common server tag (<% ... %>).
Common server tags include ASP, JSP, PSP, ASP-style PHP, eRuby, and Mason substitution tags.
This tag, the escaped common server tag and the common server comment tag are the only standard tag types that define server tags. They are included as standard tag types because of the common server tag's widespread use in many platforms, including those listed above.
Property | Value |
---|---|
Description | common server tag |
StartDelimiter | <% |
ClosingDelimiter | %> |
IsServerTag | true |
NamePrefix | % |
CorrespondingEndTagType | null |
HasAttributes | false |
IsNameAfterPrefixRequired | false |
<%@ include file="header.html" %>
public static final StartTagType SERVER_COMMON_ESCAPED
The tag type given to an escaped common server tag (<\% ... %>).
Some of the platforms that support the common server tag also support a mechanism to escape that tag by adding a backslash (\) before the percent (%) character.
Although rarely used, this tag type allows the parser to recognise these escaped tags in addition to the common server tag itself.
Property | Value |
---|---|
Description | escaped common server tag |
StartDelimiter | <\% |
ClosingDelimiter | %> |
IsServerTag | true |
NamePrefix | \% |
CorrespondingEndTagType | null |
HasAttributes | false |
IsNameAfterPrefixRequired | false |
<\%@ include file="header.html" %>
public static final StartTagType SERVER_COMMON_COMMENT
The tag type given to a common server comment tag (<%-- ... --%>).
Some of the platforms that support the common server tag, such as JSP, also support a server-based comment tag that allows nested server tags.
Property | Value |
---|---|
Description | common server comment tag |
StartDelimiter | <%-- |
ClosingDelimiter | --%> |
IsServerTag | true |
NamePrefix | %-- |
CorrespondingEndTagType | null |
HasAttributes | false |
IsNameAfterPrefixRequired | false |
<%-- this server side comment contains a <%="nested"%> server tag --%>
protected StartTagType(java.lang.String description, java.lang.String startDelimiter, java.lang.String closingDelimiter, EndTagType correspondingEndTagType, boolean isServerTag, boolean hasAttributes, boolean isNameAfterPrefixRequired)
Constructs a new StartTagType object with the specified properties.
As StartTagType is an abstract class, this constructor is only called from sub-class constructors.
Parameters:
description - a description of the new start tag type useful for debugging purposes.
startDelimiter - the start delimiter of the new start tag type.
closingDelimiter - the closing delimiter of the new start tag type.
correspondingEndTagType - the corresponding end tag type of the new start tag type.
isServerTag - indicates whether the new start tag type is a server tag.
hasAttributes - indicates whether the new start tag type has attributes.
isNameAfterPrefixRequired - indicates whether a name is required after the prefix.
public final EndTagType getCorrespondingEndTagType()
Returns the type of end tag required to pair with a start tag of this type to form an element.
This can be represented by the following expression, which is always true given an arbitrary element that has an end tag:
element.getStartTag().getStartTagType().getCorrespondingEndTagType() == element.getEndTag().getEndTagType()
Start Tag Type | Corresponding End Tag Type |
---|---|
UNREGISTERED | null |
NORMAL | EndTagType.NORMAL |
COMMENT | null |
XML_DECLARATION | null |
XML_PROCESSING_INSTRUCTION | null |
DOCTYPE_DECLARATION | null |
MARKUP_DECLARATION | null |
CDATA_SECTION | null |
SERVER_COMMON | null |
SERVER_COMMON_ESCAPED | null |
SERVER_COMMON_COMMENT | null |
Returns:
the type of end tag required to pair with a start tag of this type to form an Element.
See Also:
EndTagType.getCorrespondingStartTagType()
public final boolean hasAttributes()
Indicates whether a start tag of this type contains attributes.
The attributes start at the end of the name and continue until the closing delimiter is encountered. If the character sequence representing the closing delimiter occurs within a quoted attribute value it is not recognised as the end of the tag.
The atEndOfAttributes(Source, int pos, boolean isClosingSlashIgnored) method can be overridden to provide more control over where the attributes end.
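The rule that a closing delimiter inside a quoted attribute value does not end the tag can be sketched as a simple scan. This is a minimal standalone model on a plain string, not the library's attribute parser; the class AttributeEndSketch and the method findClosingDelimiter are hypothetical names.

```java
final class AttributeEndSketch {
    // Returns the index of the first occurrence of closingDelimiter at or after
    // 'from' that is not inside a single- or double-quoted value, or -1 if none.
    static int findClosingDelimiter(String text, int from, String closingDelimiter) {
        char quote = 0; // 0 means "not currently inside a quoted value"
        for (int i = from; i < text.length(); i++) {
            char c = text.charAt(i);
            if (quote != 0) {
                if (c == quote) {
                    quote = 0; // the matching quote closes the value
                }
            } else if (c == '"' || c == '\'') {
                quote = c; // an opening quote starts a value
            } else if (text.startsWith(closingDelimiter, i)) {
                return i;
            }
        }
        return -1;
    }
}
```

For the text `<a title="x > y" href="z">`, a scan starting after the tag name skips the '>' inside the title value and reports the final '>' instead.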
Start Tag Type | Has Attributes |
---|---|
UNREGISTERED | false |
NORMAL | true |
COMMENT | false |
XML_DECLARATION | true |
XML_PROCESSING_INSTRUCTION | false |
DOCTYPE_DECLARATION | false |
MARKUP_DECLARATION | false |
CDATA_SECTION | false |
SERVER_COMMON | false |
SERVER_COMMON_ESCAPED | false |
SERVER_COMMON_COMMENT | false |
Returns:
true if a start tag of this type contains attributes, otherwise false.
public final boolean isNameAfterPrefixRequired()
Indicates whether a valid XML tag name is required directly after the prefix.
If this property is true, the name of the tag consists of the prefix followed by an XML tag name.
If this property is false, the name of the tag consists of only the prefix.
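The two cases above can be sketched as a function that derives a tag name from a prefix. This is a minimal standalone model: the class TagNameSketch and its methods are hypothetical names, and isNameChar is only a rough approximation of the characters permitted in an XML name (it does not enforce the stricter rules for the first character).

```java
final class TagNameSketch {
    // Rough approximation of the characters allowed in an XML name.
    private static boolean isNameChar(char c) {
        return Character.isLetterOrDigit(c) || c == '-' || c == '_' || c == '.' || c == ':';
    }

    // Returns the tag name starting at nameBegin (the position just after '<'),
    // or null if the prefix does not match or a required name is missing.
    static String tagName(String text, int nameBegin, String namePrefix,
                          boolean isNameAfterPrefixRequired) {
        if (!text.startsWith(namePrefix, nameBegin)) {
            return null;
        }
        if (!isNameAfterPrefixRequired) {
            return namePrefix; // the name consists of only the prefix
        }
        int afterPrefix = nameBegin + namePrefix.length();
        int end = afterPrefix;
        while (end < text.length() && isNameChar(text.charAt(end))) {
            end++;
        }
        if (end == afterPrefix) {
            return null; // no name follows the prefix
        }
        return text.substring(nameBegin, end); // prefix followed by the name
    }
}
```

Under these assumptions, a processing instruction with prefix "?" yields a name such as "?xml-stylesheet", while a comment with prefix "!--" yields just "!--".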
Start Tag Type | Name After Prefix Required |
---|---|
UNREGISTERED | false |
NORMAL | true |
COMMENT | false |
XML_DECLARATION | false |
XML_PROCESSING_INSTRUCTION | true |
DOCTYPE_DECLARATION | false |
MARKUP_DECLARATION | true |
CDATA_SECTION | false |
SERVER_COMMON | false |
SERVER_COMMON_ESCAPED | false |
SERVER_COMMON_COMMENT | false |
Returns:
true if a valid XML tag name is required directly after the prefix, otherwise false.
public boolean atEndOfAttributes(Source source, int pos, boolean isClosingSlashIgnored)
Indicates whether the specified source document position is at the end of a tag's attributes.
This method is called internally while parsing attributes to detect where they should end.
It can be assumed that the specified position is not inside a quoted attribute value.
The default implementation simply compares the parse text at the specified position with the closing delimiter, and is equivalent to:
source.getParseText().containsAt(getClosingDelimiter(),pos)
The isClosingSlashIgnored parameter is only relevant in the NORMAL start tag type, which makes use of it to cater for the '/' character that can occur before the closing delimiter in empty-element tags.
Its value is always false when passed to other start tag types.
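The default comparison and the role of isClosingSlashIgnored can be sketched on a plain string. This is a standalone model with hypothetical names (EndOfAttributesSketch, atEndDefault, atEndNormal). The atEndNormal variant is only an assumption about how a normal start tag might treat the closing slash; it is not taken from the library's source.

```java
final class EndOfAttributesSketch {
    // Model of the default behaviour: true if closingDelimiter occurs at pos.
    static boolean atEndDefault(String parseText, int pos, String closingDelimiter) {
        return parseText.startsWith(closingDelimiter, pos);
    }

    // Assumed variant for a normal start tag: ">" always ends the attributes,
    // and "/>" also ends them unless the closing slash is to be ignored.
    static boolean atEndNormal(String parseText, int pos, boolean isClosingSlashIgnored) {
        return parseText.startsWith(">", pos)
                || (!isClosingSlashIgnored && parseText.startsWith("/>", pos));
    }
}
```

In this model, position 4 of `<br />` (the slash) counts as the end of the attributes only when isClosingSlashIgnored is false.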
Parameters:
source - the Source document.
pos - the character position in the source document.
isClosingSlashIgnored - indicates whether the name of the start tag being tested is incompatible with an empty-element tag.
Returns:
true if the specified source document position is at the end of a tag's attributes, otherwise false.
protected final StartTag constructStartTag(Source source, int begin, int end, java.lang.String name, Attributes attributes)
Internal method for the construction of a StartTag object of this type.
Intended for use from within the constructTagAt(Source, int pos) method.
protected final Attributes parseAttributes(Source source, int startTagBegin, java.lang.String tagName)
Internal method for the parsing of Attributes.
Intended for use from within the constructTagAt(Source, int pos) method.
The returned Attributes segment begins at startTagBegin+1+tagName.length(), and ends straight after the last attribute found before the tag's closing delimiter.
Only returns null if the segment contains a major syntactical error or more than the default maximum number of minor syntactical errors.
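As a worked example of the begin position given above, the following standalone sketch evaluates startTagBegin+1+tagName.length() for a sample string. The class AttributesBeginSketch and its method are hypothetical names, and the sample text is illustrative only.

```java
final class AttributesBeginSketch {
    // Mirrors the expression startTagBegin+1+tagName.length():
    // one character for '<' plus the length of the tag name.
    static int attributesBegin(int startTagBegin, String tagName) {
        return startTagBegin + 1 + tagName.length();
    }
}
```

For the text `<p><div class="x">`, the div start tag begins at position 3, so the attributes segment would begin at 3 + 1 + 3 = 7, which is the space directly after the name.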
Parameters:
source - the Source document.
startTagBegin - the position in the source document at which the start tag is to begin.
tagName - the name of the start tag to be constructed.
Returns:
the Attributes of the start tag to be constructed, or null if too many errors occur while parsing.