PageMixer API - 3.1

jp.ne.dti.lares.foozy.pagemixer.mixer
Class PageParser

java.lang.Object
  |
  +--jp.ne.dti.lares.foozy.pagemixer.mixer.PageParser
All Implemented Interfaces:
Producer.Factory

public class PageParser
extends java.lang.Object
implements Producer.Factory

Parses the given HTML file and creates a producer.

This class uses a stateful (and therefore not thread-safe) PageState object, so it is not thread-safe either.

See main(java.lang.String[]) for use as a stand-alone application.
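
Because the parser holds mutable state, one common way to use such a class from several threads is to give each thread its own instance. The sketch below illustrates that pattern with ThreadLocal. StatefulParser is a hypothetical stand-in written only for this example (it is not part of PageMixer), and its upper-casing "parse" is a placeholder for real parsing work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PerThreadParserSketch {
    /** Hypothetical stand-in for a parser that keeps mutable state between calls. */
    static final class StatefulParser {
        private final StringBuilder state = new StringBuilder(); // not thread-safe

        String parse(String source) {
            state.setLength(0); // discard state left by the previous call
            for (char c : source.toCharArray()) {
                state.append(Character.toUpperCase(c));
            }
            return state.toString();
        }
    }

    // Each thread lazily creates its own parser, so no parser state is shared.
    private static final ThreadLocal<StatefulParser> PARSER =
            ThreadLocal.withInitial(StatefulParser::new);

    static String parseOnCurrentThread(String source) {
        return PARSER.get().parse(source);
    }

    /** Parses every input on its own thread and returns the results in input order. */
    static List<String> parseConcurrently(List<String> inputs) {
        String[] results = new String[inputs.size()];
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < inputs.size(); i++) {
            final int index = i;
            Thread t = new Thread(
                    () -> results[index] = parseOnCurrentThread(inputs.get(index)));
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // join makes the worker's write to results visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return Arrays.asList(results);
    }

    public static void main(String[] args) {
        System.out.println(parseConcurrently(Arrays.asList("a", "bc", "def")));
    }
}
```

With PageParser itself, the same idea would mean constructing a separate PageParser (and a separate PageState) per thread rather than sharing one.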


Constructor Summary
PageParser()
          Constructor.
PageParser(PageState pageState)
          Constructor.
 
Method Summary
 Producer create(java.io.InputStream stream, java.lang.String encoding)
          Create "Producer".
static void main(java.lang.String[] args)
          Invoke as stand-alone application.
 Producer parse(java.io.Reader reader)
          Parse page and create token producer.
 Producer parse(java.lang.String filename)
          Parse the file corresponding to the specified filename.
 Producer parse(java.lang.String filename, java.lang.String encoding)
          Parse the file corresponding to the specified filename.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PageParser

public PageParser()
Constructor.

Create parser with LoosePageState.


PageParser

public PageParser(PageState pageState)
Constructor.

Create parser with specified PageState.

Parameters:
pageState - state to be used in parsing.
Method Detail

parse

public final Producer parse(java.lang.String filename)
                     throws java.io.IOException,
                            PageParseException
Parse the file corresponding to the specified filename.

This method creates an InputStreamReader that wraps a FileInputStream.

As a result, the file is read with the system default encoding. Use parse(String, String) instead if you need to specify the encoding.

Parameters:
filename - of target HTML file
Returns:
producer to produce token sequence representing given page
Throws:
java.io.IOException - thrown when I/O fails for some reason
PageParseException - thrown when the PageState throws it
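
The following sketch shows why relying on the default encoding can be risky: the same bytes decode to different text depending on the charset given to InputStreamReader. It uses only the standard library. The class name, the decode helper, and the sample page content are made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DefaultEncodingSketch {
    /** Decodes the bytes through an InputStreamReader using the given charset. */
    static String decode(byte[] bytes, Charset charset) throws IOException {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            StringBuilder text = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
            return text.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // A small page saved as UTF-8; the accented letter occupies two bytes.
        byte[] utf8Page = "<p>caf\u00e9</p>".getBytes(StandardCharsets.UTF_8);

        // Decoding with the charset the file was written in restores the text.
        System.out.println(decode(utf8Page, StandardCharsets.UTF_8));
        // Decoding with a different charset turns those two bytes into two wrong characters.
        System.out.println(decode(utf8Page, StandardCharsets.ISO_8859_1));
        // This is the charset a reader falls back to when none is specified.
        System.out.println(Charset.defaultCharset());
    }
}
```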

parse

public final Producer parse(java.lang.String filename,
                            java.lang.String encoding)
                     throws java.io.IOException,
                            PageParseException
Parse the file corresponding to the specified filename.

This method creates an InputStreamReader that wraps a FileInputStream, using the specified encoding.

Parameters:
filename - of target HTML file
encoding - of target HTML file
Returns:
producer to produce token sequence representing given page
Throws:
java.io.IOException - thrown when I/O fails for some reason
PageParseException - thrown when the PageState throws it
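
As a rough illustration of the wrapping described above, the sketch below opens a file as a Reader with an explicit encoding using only the standard library. It is not the PageParser source. The class and method names are hypothetical, and the file-handle cleanup on a rejected encoding name is a choice made for this example.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

class ExplicitEncodingReaderSketch {
    /** Opens the named file as a Reader that decodes with the given encoding. */
    static Reader open(String filename, String encoding) throws IOException {
        InputStream in = new FileInputStream(filename);
        try {
            // Throws UnsupportedEncodingException (an IOException) for unknown names.
            return new InputStreamReader(in, encoding);
        } catch (IOException e) {
            in.close(); // do not leak the file handle when the encoding is rejected
            throw e;
        }
    }

    /** Reads the whole file into a String, closing the reader afterwards. */
    static String readAll(String filename, String encoding) throws IOException {
        try (Reader reader = open(filename, encoding)) {
            StringBuilder text = new StringBuilder();
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
            return text.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // Expects: <filename> <encoding>, for example: page.html UTF-8
        System.out.println(readAll(args[0], args[1]));
    }
}
```

A Reader obtained this way is the kind of argument that parse(java.io.Reader) accepts.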

parse

public Producer parse(java.io.Reader reader)
               throws java.io.IOException,
                      PageParseException
Parse page and create token producer.
Parameters:
reader - to read page from
Returns:
producer to produce token sequence representing given page
Throws:
java.io.IOException - thrown when I/O fails for some reason
PageParseException - thrown when the PageState throws it

create

public Producer create(java.io.InputStream stream,
                       java.lang.String encoding)
                throws java.io.IOException,
                       PageParseException
Create "Producer".
Specified by:
create in interface Producer.Factory
Parameters:
stream - to read the page source from
encoding - encoding of the stream, used to create the "InputStreamReader"
Since:
PageMixer 3.0

main

public static void main(java.lang.String[] args)
Invoke as stand-alone application.
USAGE: java jp.ne.dti.lares.foozy.pagemixer.mixer.PageParser InputFilename OutputFilename [StateClass]
