PageMixer API - 3.1
java.lang.Object
  |
  +--jp.ne.dti.lares.foozy.pagemixer.parser.LoosePageState
A simple page parser implementation.
This implementation focuses only on markup language syntax.
It does not check the validity of (1) tag names or (2) data structure; in other words, it aims to parse as many markup language sources as possible.
It can parse the tokens listed below. Some of them exist to support XHTML, XML, BML and so on.
text
comment (<!-- -->)
start tag (<name>) with attributes
end tag (</name>)
empty tag (<name/>) with attributes
DOCTYPE (<!DOCTYPE ... >) with internal subset
CDATA (<![CDATA[ ... ]]>)
embedded script (<% ... %>) like JSP/ASP and so on
Processing Instruction (<?name ... ?>)
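The syntax-only approach described above can be illustrated with a small classifier that looks at delimiters and nothing else. This is a minimal sketch, not the parser's actual logic: the class `TokenKindSketch`, the enum `Kind`, and the method `classify` are hypothetical names, and it assumes each fragment is already a complete token, whereas the real class works character by character.

```java
class TokenKindSketch {
    /** The nine token kinds from the list above; this enum is hypothetical. */
    enum Kind { TEXT, COMMENT, START_TAG, END_TAG, EMPTY_TAG, DOCTYPE, CDATA, SCRIPT, PI }

    /** Classifies one complete fragment by its delimiters only. */
    static Kind classify(String s) {
        if (!s.startsWith("<")) return Kind.TEXT;
        // Longer prefixes are tested first so "<!--" is not mistaken for a tag.
        if (s.startsWith("<!--")) return Kind.COMMENT;
        if (s.startsWith("<![CDATA[")) return Kind.CDATA;
        if (s.startsWith("<!DOCTYPE")) return Kind.DOCTYPE;
        if (s.startsWith("<?")) return Kind.PI;
        if (s.startsWith("<%")) return Kind.SCRIPT;
        if (s.startsWith("</")) return Kind.END_TAG;
        // The tag name is never checked against a list of known names.
        return s.endsWith("/>") ? Kind.EMPTY_TAG : Kind.START_TAG;
    }

    public static void main(String[] args) {
        String[] samples = {
            "plain text", "<!-- note -->", "<p class=\"x\">", "</p>", "<br/>",
            "<!DOCTYPE html>", "<![CDATA[ a < b ]]>", "<% out.print(1); %>",
            "<?xml version=\"1.0\"?>", "<not-a-known-tag>"
        };
        for (String s : samples) {
            System.out.println(classify(s) + "  " + s);
        }
    }
}
```

Note that `<not-a-known-tag>` is still reported as a start tag, which mirrors the statement that tag-name validity is ignored.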
Constructor Summary
LoosePageState()
    Constructor.
LoosePageState(SymbolSet symbolSet)
    Constructor.
LoosePageState(SymbolSet symbolSet, boolean ignoreCase)
    Constructor.
Method Summary
protected Token createCDATAToken()
protected Token createCommentToken()
protected Token createDeclToken()
protected Token createEndTagToken()
protected Token createPIToken()
protected Token createScriptToken()
protected Token createStartTagToken(boolean empty)
protected Token createTextToken()
protected void ensureTokenName(java.lang.String name)
protected void fixAttrName()
protected void fixAttrValue()
protected void fixAttrValue(char c)
protected void fixDeclValue(char c)
protected java.lang.String fixString()
protected Symbol fixSymbol(int type)
protected void fixTokenName(int type)
Token flush()
    Indicates that there are no more characters to parse.
Token input(char c)
    Parses the next character.
protected void push(char c)
    Pushes a character onto the queue.
void reset()
    Resets the current status.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Constructor Detail
public LoosePageState()
This is equivalent to LoosePageState(HTMLSymbolSet.SET, true).
public LoosePageState(SymbolSet symbolSet)
This is equivalent to LoosePageState(symbolSet, true).
public LoosePageState(SymbolSet symbolSet, boolean ignoreCase)
If you specify true for 'ignoreCase', the names of start/end tag tokens and attributes are lower-cased so that they resolve to the same symbol. You can then mix upper-case and lower-case spellings of the same name within one page.
ATTENTION: "DOCTYPE" for DocTypeToken and "CDATA" for CDATAToken are upper-cased, and the names of processing instructions are always parsed case-sensitively.
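The case rules above can be summarized in a few lines of code. This is a minimal sketch under stated assumptions, not the library's implementation: `NameCaseSketch`, `Kind`, and `normalize` are hypothetical names, and the sketch reads "are upper-cased" as "normalized to upper case regardless of 'ignoreCase'".

```java
import java.util.Locale;

class NameCaseSketch {
    /** Hypothetical categories of names a parser has to resolve. */
    enum Kind { TAG, ATTRIBUTE, DECLARATION, PI }

    /** Returns the name as it would be handed to symbol lookup. */
    static String normalize(Kind kind, String name, boolean ignoreCase) {
        switch (kind) {
            case DECLARATION:
                // Assumption: "DOCTYPE" and "CDATA" are always reported upper-cased.
                return name.toUpperCase(Locale.ROOT);
            case PI:
                // Processing instruction names are always case-sensitive.
                return name;
            default:
                // Tag and attribute names are unified only when ignoreCase is true.
                return ignoreCase ? name.toLowerCase(Locale.ROOT) : name;
        }
    }

    public static void main(String[] args) {
        System.out.println(normalize(Kind.TAG, "TABLE", true));
        System.out.println(normalize(Kind.TAG, "TABLE", false));
        System.out.println(normalize(Kind.ATTRIBUTE, "HREF", true));
        System.out.println(normalize(Kind.PI, "Xml-Stylesheet", true));
    }
}
```

With 'ignoreCase' set to true, `<TABLE>` and `<table>` map to the same name, while a processing instruction target such as `Xml-Stylesheet` is left untouched.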
Parameters:
symbolSet - used to resolve the names of tokens and attributes
ignoreCase - whether the names of tokens are case-insensitive
Method Detail
protected void push(char c)
Parameters:
c - character to push
See Also:
fixSymbol(int), fixString()
protected void fixTokenName(int type) throws PageParseException
protected void fixAttrName() throws PageParseException
protected Symbol fixSymbol(int type) throws PageParseException
protected void ensureTokenName(java.lang.String name) throws PageParseException
protected java.lang.String fixString()
protected Token createTextToken()
protected Token createStartTagToken(boolean empty)
protected void fixAttrValue()
protected void fixAttrValue(char c)
protected Token createEndTagToken()
protected Token createCommentToken()
protected Token createDeclToken() throws PageParseException
protected void fixDeclValue(char c)
protected Token createScriptToken()
protected Token createCDATAToken() throws PageParseException
protected Token createPIToken()
public Token input(char c) throws PageParseException
Description copied from interface: PageState
Inputs the next character of the page being parsed, and returns a Token if one is recognized. Almost every Token (in fact, an instance of a derived class) consists of many characters, so this method frequently returns null.
Specified by:
input in interface jp.ne.dti.lares.foozy.pagemixer.parser.PageState
Parameters:
c - next character in page to parse
public Token flush() throws PageParseException
Description copied from interface: PageState
Indicates that there are no more characters in the page being parsed, and returns a Token if one is recognized. This method works like "input(EOF)".
Specified by:
flush in interface jp.ne.dti.lares.foozy.pagemixer.parser.PageState
public void reset()
Description copied from interface: PageState
Resets the current status of the page state so that it can be reused. This method must not throw PageParseException, although PageState.flush() may.
Specified by:
reset in interface PageState
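The input/flush/reset contract described above suggests a simple driver loop: feed characters one at a time, collect each non-null Token, call flush at the end, and reset before reuse. The following is a minimal sketch under stated assumptions. To stay self-contained it defines stand-ins for `Token` and `PageState` rather than using the PageMixer classes, and `WordState` is a toy that splits on whitespace, not a markup parser. The real methods declare PageParseException, which the stand-ins omit.

```java
import java.util.ArrayList;
import java.util.List;

class PageStateLoopSketch {
    /** Stand-in for the library's Token; it only carries raw text here. */
    static final class Token {
        final String text;
        Token(String text) { this.text = text; }
    }

    /** Stand-in mirroring the three PageState methods documented above. */
    interface PageState {
        Token input(char c);   // null until a whole token has been seen
        Token flush();         // behaves like "input(EOF)"
        void reset();          // must not throw
    }

    /** Toy implementation: one token per whitespace-separated word. */
    static final class WordState implements PageState {
        private final StringBuilder buf = new StringBuilder();

        public Token input(char c) {
            if (!Character.isWhitespace(c)) {
                buf.append(c);
                return null;           // token not complete yet
            }
            return take();
        }

        public Token flush() { return take(); }

        public void reset() { buf.setLength(0); }

        private Token take() {
            if (buf.length() == 0) return null;
            Token t = new Token(buf.toString());
            buf.setLength(0);
            return t;
        }
    }

    /** Drives any PageState over a whole page and collects token texts. */
    static List<String> parse(PageState state, String page) {
        List<String> out = new ArrayList<>();
        try {
            for (int i = 0; i < page.length(); i++) {
                Token t = state.input(page.charAt(i));
                if (t != null) out.add(t.text);   // most calls return null
            }
            Token last = state.flush();           // end of input
            if (last != null) out.add(last.text);
        } finally {
            state.reset();                        // leave the state reusable
        }
        return out;
    }

    public static void main(String[] args) {
        PageState state = new WordState();
        System.out.println(parse(state, "one two"));
        System.out.println(parse(state, "three"));   // same instance, reused
    }
}
```

Calling reset in a finally block keeps the state object reusable even if parsing stops early, which fits the requirement that reset itself never fails.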