

Trim sequence

This section explains how to remove part of an HTML page with the PageMixer framework.

The explanation uses a filter that removes everything from a "<span>" start tag to its corresponding "</span>" end tag in an HTML page.


<p><span class="Auth-Welcome">Welcome !</span></p>

Before processing

<p></p>

After processing

In other words, the filter trims from the whole sequence the token sub-sequence that runs from the start tag token named "span" to its corresponding end tag token.
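The idea of trimming up to the "corresponding" end tag can be pictured with a small stand-alone sketch. It uses no PageMixer class: tokens are plain strings, and the class name TrimSketch and its trim method are hypothetical names chosen for illustration. A depth counter is one simple way to pair a start tag with its own end tag when the same tag is nested.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone sketch; it uses no PageMixer class.
class TrimSketch {
    // Removes every sub-sequence that starts at a token equal to startTag
    // and ends at the corresponding endTag, counting nested pairs.
    static List<String> trim(List<String> tokens, String startTag, String endTag) {
        List<String> result = new ArrayList<String>();
        int depth = 0; // greater than 0 while inside a target sub-sequence
        for (String token : tokens) {
            if (token.equals(startTag)) {
                depth++;
            } else if (token.equals(endTag) && depth > 0) {
                depth--;
                continue; // the end tag itself is trimmed too
            }
            if (depth == 0) {
                result.add(token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> page =
            Arrays.asList("<p>", "<span>", "Welcome !", "</span>", "</p>");
        System.out.println(trim(page, "<span>", "</span>")); // prints [<p>, </p>]
    }
}
```

Because the counter only returns to zero at the end tag that matches the first start tag, a nested "<span>" inside the target does not end the trimming early.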

Overview

Class diagram

The class diagram for this section:

[Figure: Class diagram]

The classes you must define are colored; the others are already defined.

Object diagram

The object diagram for this section:

[Figure: Object diagram]

Sequence diagram

The sequence diagram for this section:

[Figure: Sequence diagram]

Class names

This tutorial uses abbreviated class names. The full names are shown below.

Classes of PageMixer framework

Notation            Full name
ConsumerContext     jp.ne.dti.lares.foozy.pagemixer.mixer.ConsumerContext
HTMLSymbolSet       jp.ne.dti.lares.foozy.pagemixer.HTMLSymbolSet
SequenceEditFilter  jp.ne.dti.lares.foozy.pagemixer.SequenceEditFilter
SequenceTrimFilter  jp.ne.dti.lares.foozy.pagemixer.mixer.SequenceTrimFilter
SequenceWatcher     jp.ne.dti.lares.foozy.pagemixer.mixer.SequenceWatcher

Tutorial specific classes

Notation            Full name
Bootstrap           pagemixer.filter.Bootstrap
WelcomeTrimFilter   pagemixer.filter.WelcomeTrimFilter

Create SequenceWatcher

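As described above, this section makes the filter remove everything from "<span>" to its corresponding "</span>" in an HTML page. But it is not a good idea to remove every such part of a given HTML page.

So the target sub-sequence must meet the following conditions, which the code below encodes: it starts with a "span" start tag whose "class" attribute has the value "Auth-Welcome", and it ends with the corresponding end tag.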

NOTE: In this tutorial, the "class" attribute is used to identify the target token or token sub-sequence.

Of course, you can use another attribute, such as "ID", and you may think that "ID" is the proper way to identify a target in an HTML document. You may also think that the "class" attribute should be used only for presentation.

But the "ID" attribute implies uniqueness within an HTML document.

"class" is used here because I do not want you to assume that a target may appear only once in a document when using the PageMixer framework.

As this tutorial described earlier, the PageMixer framework treats filtering and finding the target of filtering as separate functions. For sub-sequence filtering, they are implemented by "SequenceEditFilter" and "SequenceWatcher" respectively.
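The split between finding and editing can be pictured with a small stand-alone sketch. It is not the PageMixer API: the interfaces Watcher and Editor, the helper between, and the constant DISCARD are all hypothetical names, and tokens are plain strings. For brevity this sketch ignores nested targets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone sketch; it uses no PageMixer class.
class WatcherSplitSketch {
    // Finding: decides where a target sub-sequence starts and ends.
    interface Watcher {
        boolean isFirst(String token);
        boolean isLast(String token);
    }

    // Editing: decides what replaces the tokens that the watcher found.
    interface Editor {
        List<String> edit(List<String> subSequence);
    }

    static Watcher between(final String first, final String last) {
        return new Watcher() {
            public boolean isFirst(String token) { return token.equals(first); }
            public boolean isLast(String token) { return token.equals(last); }
        };
    }

    // An editor that discards the whole sub-sequence.
    static final Editor DISCARD = new Editor() {
        public List<String> edit(List<String> subSequence) {
            return new ArrayList<String>();
        }
    };

    static List<String> run(List<String> tokens, Watcher watcher, Editor editor) {
        List<String> result = new ArrayList<String>();
        List<String> found = null; // non-null while collecting a target
        for (String token : tokens) {
            if (found == null && watcher.isFirst(token)) {
                found = new ArrayList<String>();
            }
            if (found == null) {
                result.add(token); // outside any target: pass through
                continue;
            }
            found.add(token);
            if (watcher.isLast(token)) {
                result.addAll(editor.edit(found));
                found = null;
            }
        }
        if (found != null) {
            result.addAll(found); // unterminated target: pass through unchanged
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> page =
            Arrays.asList("<p>", "<span>", "Welcome !", "</span>", "</p>");
        System.out.println(run(page, between("<span>", "</span>"), DISCARD));
        // prints [<p>, </p>]
    }
}
```

Keeping the two roles apart means the same editor can be reused with a different watcher, and the same watcher with a different editor.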

First, create a SequenceWatcher instance to tell SequenceEditFilter which tokens are the first and last of the target sub-sequence.

In this case, "SequenceWatcher.NameAttr" is used as the SequenceWatcher. It is a sequence watcher that checks the name and an attribute of the first token.


new SequenceWatcher.NameAttr(HTMLSymbolSet.SET.SPAN,
                             HTMLSymbolSet.SET.CLASS,
                             "Auth-Welcome")

SequenceWatcher to trim between "span"
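The kind of check such a watcher performs on the first token can be sketched as follows. This is a stand-alone illustration, not the PageMixer implementation: the class StartTag and the method matches are hypothetical, and comparing the attribute value by exact equality is an assumption made here for simplicity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; it uses no PageMixer class.
class NameAttrSketch {
    // Simplified start tag token: a name plus its attributes.
    static final class StartTag {
        final String name;
        final Map<String, String> attributes = new HashMap<String, String>();

        StartTag(String name) {
            this.name = name;
        }

        StartTag with(String attribute, String value) {
            attributes.put(attribute, value);
            return this;
        }
    }

    // True when the tag has the expected name and the given attribute
    // holds exactly the expected value (exact match is an assumption).
    static boolean matches(StartTag tag, String name, String attribute, String value) {
        return tag.name.equals(name) && value.equals(tag.attributes.get(attribute));
    }

    public static void main(String[] args) {
        StartTag welcome = new StartTag("span").with("class", "Auth-Welcome");
        StartTag other = new StartTag("span").with("class", "Note");
        System.out.println(matches(welcome, "span", "class", "Auth-Welcome")); // prints true
        System.out.println(matches(other, "span", "class", "Auth-Welcome"));   // prints false
    }
}
```

A "span" without the expected "class" value is left alone, which is what keeps the filter from removing every "span" in the page.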

Concretize SequenceTrimFilter

Next, derive a class from "SequenceTrimFilter" to trim the sub-sequence found by the specified SequenceWatcher.

SequenceTrimFilter is a class derived from SequenceEditFilter that specializes in trimming sub-sequences, and it has two derived classes of its own.

One is SequenceTrimFilter.Whole, which discards the whole sub-sequence. The other is SequenceTrimFilter.Inside, which discards the sub-sequence except for its first and last tokens.

In this section, the new filter class is derived from SequenceTrimFilter.Whole.
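The difference between the two variations can be seen on a list of string tokens. The sketch below is stand-alone and uses no PageMixer class; trimWhole and trimInside are hypothetical helpers that take the positions of the first and last tokens of an already found sub-sequence.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone sketch; it uses no PageMixer class.
class TrimModeSketch {
    // Discards tokens[first..last], including both boundary tokens.
    static List<String> trimWhole(List<String> tokens, int first, int last) {
        List<String> result = new ArrayList<String>(tokens.subList(0, first));
        result.addAll(tokens.subList(last + 1, tokens.size()));
        return result;
    }

    // Discards only the tokens strictly between first and last.
    // Assumes first < last, as for a start tag and its end tag.
    static List<String> trimInside(List<String> tokens, int first, int last) {
        List<String> result = new ArrayList<String>(tokens.subList(0, first + 1));
        result.addAll(tokens.subList(last, tokens.size()));
        return result;
    }

    public static void main(String[] args) {
        List<String> page =
            Arrays.asList("<p>", "<span>", "Welcome !", "</span>", "</p>");
        System.out.println(trimWhole(page, 1, 3));  // prints [<p>, </p>]
        System.out.println(trimInside(page, 1, 3)); // prints [<p>, <span>, </span>, </p>]
    }
}
```

The first result corresponds to the "After processing" page shown at the top of this section, where the "span" tags themselves are gone.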


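package pagemixer.filter;

import jp.ne.dti.lares.foozy.pagemixer.HTMLSymbolSet;
import jp.ne.dti.lares.foozy.pagemixer.mixer.ConsumerContext;
import jp.ne.dti.lares.foozy.pagemixer.mixer.SequenceTrimFilter;
import jp.ne.dti.lares.foozy.pagemixer.mixer.SequenceWatcher;

public class WelcomeTrimFilter
    extends SequenceTrimFilter.Whole
{
    // key used to look up the trim condition in the context
    private final Object key_;

    private static final HTMLSymbolSet SET = HTMLSymbolSet.SET;

    // value of the "class" attribute that marks the target "span"
    private static final String ATTR_VALUE = "Auth-Welcome";

    ////////////////////////////////////////

    public WelcomeTrimFilter(Object key){
        super(new SequenceWatcher.NameAttr(SET.SPAN,
                                           SET.CLASS,
                                           ATTR_VALUE));

        key_ = key;
    }

    /////////////////////////////////
    // Concretization of class Whole

    // trim only when no value is stored under the key
    protected boolean shouldTrim(ConsumerContext context){
        return (null == context.getValue(key_));
    }
}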

Derived class from SequenceTrimFilter.Whole

To concretize SequenceTrimFilter.Whole, you must define the method "shouldTrim(ConsumerContext)". Its return value says whether the sequence should be trimmed at that moment, which lets you decide at execution time rather than at construction time.

In the above example, trimming depends on whether the context contains a value for the specified key: the sub-sequence is trimmed only when no value is stored.
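Why a hook method allows the decision to be made at execution time can be shown with a stand-alone sketch. None of the classes below belong to PageMixer: Context, TrimFilter and KeyAbsentTrimFilter are hypothetical stand-ins, and the "sequence" is reduced to a single string.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; it uses no PageMixer class.
class ShouldTrimSketch {
    // Stand-in for a per-execution context holding key/value pairs.
    static final class Context {
        private final Map<Object, Object> values = new HashMap<Object, Object>();

        void setValue(Object key, Object value) {
            values.put(key, value);
        }

        Object getValue(Object key) {
            return values.get(key);
        }
    }

    // Base class: the hook is consulted on every execution,
    // not when the filter object is constructed.
    static abstract class TrimFilter {
        protected abstract boolean shouldTrim(Context context);

        String apply(String target, Context context) {
            return shouldTrim(context) ? "" : target;
        }
    }

    // Same rule as in the example above: trim only when the key has no value.
    static final class KeyAbsentTrimFilter extends TrimFilter {
        private final Object key;

        KeyAbsentTrimFilter(Object key) {
            this.key = key;
        }

        protected boolean shouldTrim(Context context) {
            return context.getValue(key) == null;
        }
    }

    public static void main(String[] args) {
        TrimFilter filter = new KeyAbsentTrimFilter("Auth.Welcome"); // built once
        Context withoutValue = new Context();
        Context withValue = new Context();
        withValue.setValue("Auth.Welcome", Boolean.TRUE);
        System.out.println("[" + filter.apply("Welcome !", withoutValue) + "]"); // prints []
        System.out.println("[" + filter.apply("Welcome !", withValue) + "]");    // prints [Welcome !]
    }
}
```

One filter instance gives different results for different contexts, so the same filter object can be shared across executions whose conditions differ.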

Connect and mix

Now everything needed is ready. The execution code is shown below (see WelcomeTrimFilter for details).

try{
    // key to set/get trim condition
    final Object key = "Auth.Welcome";

    Bootstrap bootstrap = new Bootstrap.Default(filename)
    {
        protected void prepare(ConsumerContext context)
        {
            // "trimmed" is whether sub-sequence
            // should be trimmed or not
            if(!trimmed){
                // put (dummy) value into context
                context.setValue(key, key);
            }
        }
    };

    // apply the filter
    bootstrap.execute(new WelcomeTrimFilter(key));
}
catch(Exception e){
    e.printStackTrace(System.err);
}
Connect and Mix

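The sample HTML input file is "index.en.html", under "src/demo/servlet/war/WEB-INF/page/demosite" in the distribution.

In the above code, "trimmed" is a boolean that says whether the sequence should be trimmed. When it is true, no value is stored under the key, so "shouldTrim" returns true and the sub-sequence is removed.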

