Parser (Java Platform SE 8)

java.lang.Object
- javax.swing.text.html.parser.Parser

All Implemented Interfaces:

DTDConstants

Direct Known Subclasses:

DocumentParser
```
public class Parser
extends Object
implements DTDConstants
```
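A simple DTD-driven HTML parser. The parser reads an HTML file from an InputStream and calls various methods (which should be overridden in a subclass) when tags and data are encountered.
Unfortunately, there are many badly implemented HTML parsers out there, and as a result there are many badly formatted HTML files. This parser attempts to parse most HTML files. This means that the implementation sometimes deviates from the SGML specification in favor of HTML.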
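
The callback style described above can be sketched as a subclass that records each event in order. This is a minimal sketch, not the JDK's own usage: `EventLogDemo`, `EventLogParser`, and `Html32` are hypothetical names, and the sketch assumes the bundled HTML 3.2 DTD can be loaded through `ParserDelegator`'s protected static `createDTD` method.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.html.parser.DTD;
import javax.swing.text.html.parser.Parser;
import javax.swing.text.html.parser.ParserDelegator;
import javax.swing.text.html.parser.TagElement;

class EventLogDemo {
    // Hypothetical helper: extends ParserDelegator only to reach its
    // protected static createDTD, which reads the bundled "html32" DTD.
    static class Html32 extends ParserDelegator {
        static DTD load() throws IOException {
            return createDTD(DTD.getDTD("html32"), "html32");
        }
    }

    // Records every callback the parser makes, in the order it makes them.
    static class EventLogParser extends Parser {
        final List<String> events = new ArrayList<>();

        EventLogParser(DTD dtd) {
            super(dtd);
        }

        @Override
        protected void handleStartTag(TagElement tag) {
            events.add("<" + tag.getElement().getName() + ">");
        }

        @Override
        protected void handleEndTag(TagElement tag) {
            events.add("</" + tag.getElement().getName() + ">");
        }

        @Override
        protected void handleEmptyTag(TagElement tag) {
            events.add("<" + tag.getElement().getName() + "/>");
        }

        @Override
        protected void handleText(char[] text) {
            events.add("text:" + new String(text));
        }

        @Override
        protected void handleTitle(char[] text) {
            events.add("title:" + new String(text));
        }
    }

    // Parses the given HTML string and returns the recorded events.
    static List<String> events(String html) {
        try {
            EventLogParser parser = new EventLogParser(Html32.load());
            parser.parse(new StringReader(html));
            return parser.events;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Hi</title></head>"
                + "<body><p>Hello</p></body></html>";
        events(html).forEach(System.out::println);
    }
}
```

Collecting events as strings keeps the sketch easy to inspect; a real subclass would more likely build its own document model inside the callbacks.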

The parser treats \r and \r\n as \n. Newlines after start tags and before end tags are ignored, just as specified in the SGML/HTML specification.

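The HTML specification does not specify very well how spaces are to be coalesced. Specifically, the following scenarios are not discussed (note that a space should be used here, but I am using &nbsp to force the space to be displayed):

'<b>blah <i> <strike> foo', which can be treated as: '<b>blah <i><strike>foo'

as well as: '<p><a href="xx"> <em>Using</em></a></p>', which appears to be treated as: '<p><a href="xx"><em>Using</em></a></p>'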

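If strict is false, then when a tag that breaks flow (TagElement.breaksFlow) or trailing whitespace is encountered, all whitespace is ignored until a non-whitespace character is encountered. This appears to give behavior closer to the popular browsers.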
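
To see how the strict flag affects whitespace in practice, a subclass can set the inherited field and print the text chunks it receives. This is a sketch under assumptions: `StrictModeDemo`, `ChunkParser`, and `Html32` are hypothetical names, and the DTD is loaded through `ParserDelegator.createDTD`. The exact chunks are left for the reader to observe rather than stated here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.html.parser.DTD;
import javax.swing.text.html.parser.Parser;
import javax.swing.text.html.parser.ParserDelegator;

class StrictModeDemo {
    // Hypothetical helper that loads the bundled HTML 3.2 DTD.
    static class Html32 extends ParserDelegator {
        static DTD load() throws IOException {
            return createDTD(DTD.getDTD("html32"), "html32");
        }
    }

    // Collects body text chunks; the strict flag is set by the caller.
    static class ChunkParser extends Parser {
        final List<String> chunks = new ArrayList<>();

        ChunkParser(DTD dtd, boolean strict) {
            super(dtd);
            this.strict = strict; // protected field inherited from Parser
        }

        @Override
        protected void handleText(char[] text) {
            chunks.add(new String(text));
        }

        @Override
        protected void handleTitle(char[] text) {
            // Ignored so that only body text lands in the chunk list.
        }
    }

    static List<String> chunks(String html, boolean strict) {
        try {
            ChunkParser parser = new ChunkParser(Html32.load(), strict);
            parser.parse(new StringReader(html));
            return parser.chunks;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String html = "<html><head><title>t</title></head><body>"
                + "<p><b>blah <i> <strike> foo</strike></i></b></p>"
                + "</body></html>";
        // Brackets make leading and trailing spaces visible.
        for (boolean strict : new boolean[] {false, true}) {
            System.out.print("strict=" + strict + ":");
            for (String chunk : chunks(html, strict)) {
                System.out.print(" [" + chunk + "]");
            }
            System.out.println();
        }
    }
}
```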

See Also:

DTD, TagElement, SimpleAttributeSet

Field Summary

Fields
Modifier and Type	Field and Description
`protected DTD`	`dtd`
`protected boolean`	`strict` This flag determines whether or not the Parser will be strict in enforcing SGML compatibility.

Fields inherited from interface javax.swing.text.html.parser.DTDConstants
ANY, CDATA, CONREF, CURRENT, DEFAULT, EMPTY, ENDTAG, ENTITIES, ENTITY, FIXED, GENERAL, ID, IDREF, IDREFS, IMPLIED, MD, MODEL, MS, NAME, NAMES, NMTOKEN, NMTOKENS, NOTATION, NUMBER, NUMBERS, NUTOKEN, NUTOKENS, PARAMETER, PI, PUBLIC, RCDATA, REQUIRED, SDATA, STARTTAG, SYSTEM

Constructor Summary

Constructors
Constructor and Description
`Parser(DTD dtd)`

Method Summary

Modifier and Type	Method and Description
`protected void`	`endTag(boolean omitted)` Handle an end tag.
`protected void`	`error(String err)`
`protected void`	`error(String err, String arg1)`
`protected void`	`error(String err, String arg1, String arg2)`
`protected void`	`error(String err, String arg1, String arg2, String arg3)` Invoke the error handler.
`protected void`	`flushAttributes()`
`protected SimpleAttributeSet`	`getAttributes()`
`protected int`	`getCurrentLine()`
`protected int`	`getCurrentPos()`
`protected void`	`handleComment(char[] text)` Called when an HTML comment is encountered.
`protected void`	`handleEmptyTag(TagElement tag)` Called when an empty tag is encountered.
`protected void`	`handleEndTag(TagElement tag)` Called when an end tag is encountered.
`protected void`	`handleEOFInComment()`
`protected void`	`handleError(int ln, String msg)` An error has occurred.
`protected void`	`handleStartTag(TagElement tag)` Called when a start tag is encountered.
`protected void`	`handleText(char[] text)` Called when PCDATA is encountered.
`protected void`	`handleTitle(char[] text)` Called when an HTML title tag is encountered.
`protected TagElement`	`makeTag(Element elem)`
`protected TagElement`	`makeTag(Element elem, boolean fictional)` Makes a TagElement.
`protected void`	`markFirstTime(Element elem)` Marks the first time a tag has been seen in a document.
`void`	`parse(Reader in)` Parse an HTML stream, given a DTD.
`String`	`parseDTDMarkup()` Parses the Document Type Declaration markup declaration.
`protected boolean`	`parseMarkupDeclarations(StringBuffer strBuff)` Parse markup declarations.
`protected void`	`startTag(TagElement tag)` Handle a start tag.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail
- dtd
```
protected DTD dtd
```
- strict
```
protected boolean strict
```
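  This flag determines whether or not the Parser will be strict in enforcing SGML compatibility. If false, it will be lenient with certain common classes of erroneous HTML constructs. Strict or not, in either case an error will be recorded.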
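
Since errors are recorded whether or not strict is set, overriding handleError is one way to inspect them. The sketch below is illustrative only: `ErrorLogDemo`, `ErrorLogParser`, and `Html32` are hypothetical names, the DTD loading through `ParserDelegator.createDTD` is an assumption, and the wording of the messages is whatever the parser passes in.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.html.parser.DTD;
import javax.swing.text.html.parser.Parser;
import javax.swing.text.html.parser.ParserDelegator;

class ErrorLogDemo {
    // Hypothetical helper that loads the bundled HTML 3.2 DTD.
    static class Html32 extends ParserDelegator {
        static DTD load() throws IOException {
            return createDTD(DTD.getDTD("html32"), "html32");
        }
    }

    // Keeps every error the parser reports instead of discarding it.
    static class ErrorLogParser extends Parser {
        final List<String> errors = new ArrayList<>();

        ErrorLogParser(DTD dtd, boolean strict) {
            super(dtd);
            this.strict = strict;
        }

        @Override
        protected void handleError(int ln, String msg) {
            errors.add("line " + ln + ": " + msg);
        }
    }

    static List<String> errors(String html, boolean strict) {
        try {
            ErrorLogParser parser = new ErrorLogParser(Html32.load(), strict);
            parser.parse(new StringReader(html));
            return parser.errors;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Contains a stray end tag and a tag the DTD does not define.
        String html = "<html><head><title>t</title></head>"
                + "<body><p>ok</b><bogus>x</p></body></html>";
        System.out.println("lenient: " + errors(html, false));
        System.out.println("strict:  " + errors(html, true));
    }
}
```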

Constructor Detail
- Parser
```
public Parser(DTD dtd)
```

Method Detail

getCurrentLine
```
protected int getCurrentLine()
```
Returns:

the line number of the line currently being parsed
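
A typical use of getCurrentLine is to tag callbacks with a source position. The following sketch records the line on which each element name first starts; `LineNumberDemo`, `LineParser`, and `Html32` are hypothetical names, and the DTD loading through `ParserDelegator.createDTD` is an assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.swing.text.html.parser.DTD;
import javax.swing.text.html.parser.Parser;
import javax.swing.text.html.parser.ParserDelegator;
import javax.swing.text.html.parser.TagElement;

class LineNumberDemo {
    // Hypothetical helper that loads the bundled HTML 3.2 DTD.
    static class Html32 extends ParserDelegator {
        static DTD load() throws IOException {
            return createDTD(DTD.getDTD("html32"), "html32");
        }
    }

    // Remembers the parser's current line when each element first starts.
    static class LineParser extends Parser {
        final Map<String, Integer> firstLine = new LinkedHashMap<>();

        LineParser(DTD dtd) {
            super(dtd);
        }

        @Override
        protected void handleStartTag(TagElement tag) {
            firstLine.putIfAbsent(tag.getElement().getName(), getCurrentLine());
        }
    }

    static Map<String, Integer> startLines(String html) {
        try {
            LineParser parser = new LineParser(Html32.load());
            parser.parse(new StringReader(html));
            return parser.firstLine;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String html = "<html>\n<head><title>t</title></head>\n"
                + "<body>\n<p>Hello</p>\n</body>\n</html>";
        startLines(html).forEach(
                (name, line) -> System.out.println(name + " starts near line " + line));
    }
}
```

The output says "near" because the value is the parser's line counter at the moment the callback fires, which is not guaranteed to match the line of the tag's opening bracket.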

makeTag

protected TagElement makeTag(Element elem,
                             boolean fictional)

Makes a TagElement.

makeTag

protected TagElement makeTag(Element elem)

getAttributes

protected SimpleAttributeSet getAttributes()

flushAttributes
```
protected void flushAttributes()
```
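
Neither getAttributes nor flushAttributes is described on this page, so the following is only a sketch of one plausible usage pattern: read the attributes of the current tag inside a callback, then flush them so they do not carry over to the next tag. The assumptions are that attributes stay in the set until flushed and that known attribute names are keyed by `HTML.Attribute` constants. `LinkDemo`, `LinkParser`, and `Html32` are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.html.HTML;
import javax.swing.text.html.parser.DTD;
import javax.swing.text.html.parser.Parser;
import javax.swing.text.html.parser.ParserDelegator;
import javax.swing.text.html.parser.TagElement;

class LinkDemo {
    // Hypothetical helper that loads the bundled HTML 3.2 DTD.
    static class Html32 extends ParserDelegator {
        static DTD load() throws IOException {
            return createDTD(DTD.getDTD("html32"), "html32");
        }
    }

    // Collects the href attribute of every <a> start tag.
    static class LinkParser extends Parser {
        final List<String> hrefs = new ArrayList<>();

        LinkParser(DTD dtd) {
            super(dtd);
        }

        @Override
        protected void handleStartTag(TagElement tag) {
            if ("a".equals(tag.getElement().getName())) {
                // Assumption: known attributes are keyed by HTML.Attribute.
                Object href = getAttributes().getAttribute(HTML.Attribute.HREF);
                if (href != null) {
                    hrefs.add(href.toString());
                }
            }
            flushAttributes(); // assumed necessary to avoid carry-over
        }

        @Override
        protected void handleEmptyTag(TagElement tag) {
            flushAttributes();
        }
    }

    static List<String> links(String html) {
        try {
            LinkParser parser = new LinkParser(Html32.load());
            parser.parse(new StringReader(html));
            return parser.hrefs;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String html = "<html><head><title>t</title></head><body><p>"
                + "<a href=\"x.html\">X</a> <a href=\"y.html\">Y</a>"
                + "</p></body></html>";
        links(html).forEach(System.out::println);
    }
}
```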

handleText

protected void handleText(char[] text)

Called when PCDATA is encountered.

handleTitle
```
protected void handleTitle(char[] text)
```
Called when an HTML title tag is encountered.
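
As a small illustration, handleTitle can be overridden to capture the document title on its own. `TitleDemo`, `TitleParser`, and `Html32` are hypothetical names, and loading the DTD through `ParserDelegator.createDTD` is an assumption of this sketch.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.html.parser.DTD;
import javax.swing.text.html.parser.Parser;
import javax.swing.text.html.parser.ParserDelegator;

class TitleDemo {
    // Hypothetical helper that loads the bundled HTML 3.2 DTD.
    static class Html32 extends ParserDelegator {
        static DTD load() throws IOException {
            return createDTD(DTD.getDTD("html32"), "html32");
        }
    }

    // Stores the title text and ignores everything else.
    static class TitleParser extends Parser {
        String title; // stays null if the document has no title text

        TitleParser(DTD dtd) {
            super(dtd);
        }

        @Override
        protected void handleTitle(char[] text) {
            title = new String(text);
        }
    }

    static String title(String html) {
        try {
            TitleParser parser = new TitleParser(Html32.load());
            parser.parse(new StringReader(html));
            return parser.title;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(title("<html><head><title>Welcome</title></head>"
                + "<body><p>x</p></body></html>"));
    }
}
```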

handleComment

protected void handleComment(char[] text)

Called when an HTML comment is encountered.
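
A comment collector is a short way to see this callback in action. In this sketch, `CommentDemo`, `CommentParser`, and `Html32` are hypothetical names and the DTD loading through `ParserDelegator.createDTD` is an assumption. Comment text is trimmed because any surrounding whitespace is passed through as part of the comment.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.html.parser.DTD;
import javax.swing.text.html.parser.Parser;
import javax.swing.text.html.parser.ParserDelegator;

class CommentDemo {
    // Hypothetical helper that loads the bundled HTML 3.2 DTD.
    static class Html32 extends ParserDelegator {
        static DTD load() throws IOException {
            return createDTD(DTD.getDTD("html32"), "html32");
        }
    }

    // Gathers the trimmed text of each comment in document order.
    static class CommentParser extends Parser {
        final List<String> comments = new ArrayList<>();

        CommentParser(DTD dtd) {
            super(dtd);
        }

        @Override
        protected void handleComment(char[] text) {
            comments.add(new String(text).trim());
        }
    }

    static List<String> comments(String html) {
        try {
            CommentParser parser = new CommentParser(Html32.load());
            parser.parse(new StringReader(html));
            return parser.comments;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String html = "<html><head><title>t</title></head>"
                + "<body><!-- note --><p>x</p></body></html>";
        comments(html).forEach(System.out::println);
    }
}
```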

handleEOFInComment
```
protected void handleEOFInComment()
```

handleEmptyTag

protected void handleEmptyTag(TagElement tag)
                       throws ChangedCharSetException

Called when an empty tag is encountered.

Throws: ChangedCharSetException
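
Elements that the DTD declares as empty, such as br, hr, and img in HTML 3.2, arrive through this callback rather than through handleStartTag. A minimal sketch that lists them follows; `EmptyTagDemo`, `EmptyTagParser`, and `Html32` are hypothetical names, and the DTD loading through `ParserDelegator.createDTD` is an assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.html.parser.DTD;
import javax.swing.text.html.parser.Parser;
import javax.swing.text.html.parser.ParserDelegator;
import javax.swing.text.html.parser.TagElement;

class EmptyTagDemo {
    // Hypothetical helper that loads the bundled HTML 3.2 DTD.
    static class Html32 extends ParserDelegator {
        static DTD load() throws IOException {
            return createDTD(DTD.getDTD("html32"), "html32");
        }
    }

    // Lists the names of the empty tags in the order they are reported.
    static class EmptyTagParser extends Parser {
        final List<String> names = new ArrayList<>();

        EmptyTagParser(DTD dtd) {
            super(dtd);
        }

        @Override
        protected void handleEmptyTag(TagElement tag) {
            // The override does not need to declare ChangedCharSetException.
            names.add(tag.getElement().getName());
        }
    }

    static List<String> emptyTags(String html) {
        try {
            EmptyTagParser parser = new EmptyTagParser(Html32.load());
            parser.parse(new StringReader(html));
            return parser.names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String html = "<html><head><title>t</title></head><body>"
                + "<p>a<br>b</p><hr><p><img src=\"a.png\"></p>"
                + "</body></html>";
        System.out.println(emptyTags(html));
    }
}
```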

handleStartTag

protected void handleStartTag(TagElement tag)

Called when a start tag is encountered.

handleEndTag

protected void handleEndTag(TagElement tag)

Called when an end tag is encountered.

handleError

protected void handleError(int ln,
                           String msg)

An error has occurred.

error

protected void error(String err,
                     String arg1,
                     String arg2,
                     String arg3)

Invoke the error handler.

error

protected void error(String err,
                     String arg1,
                     String arg2)

error

protected void error(String err,
                     String arg1)

error
```
protected void error(String err)
```

startTag
```
protected void startTag(TagElement tag)
                 throws ChangedCharSetException
```
Handle a start tag. The new tag is pushed onto the tag stack. The attribute list is checked for required attributes.

Throws:

ChangedCharSetException

endTag
```
protected void endTag(boolean omitted)
```
Handle an end tag. The end tag is popped from the tag stack.
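
The push and pop behavior of startTag and endTag can be observed from a subclass by mirroring the stack with a counter in handleStartTag and handleEndTag. This sketch assumes those two callbacks fire once per push and once per pop, including for end tags the parser supplies itself. `DepthDemo`, `DepthParser`, and `Html32` are hypothetical names, as is the DTD loading through `ParserDelegator.createDTD`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.html.parser.DTD;
import javax.swing.text.html.parser.Parser;
import javax.swing.text.html.parser.ParserDelegator;
import javax.swing.text.html.parser.TagElement;

class DepthDemo {
    // Hypothetical helper that loads the bundled HTML 3.2 DTD.
    static class Html32 extends ParserDelegator {
        static DTD load() throws IOException {
            return createDTD(DTD.getDTD("html32"), "html32");
        }
    }

    // Tracks nesting depth as tags are opened and closed.
    static class DepthParser extends Parser {
        int depth;
        int maxDepth;

        DepthParser(DTD dtd) {
            super(dtd);
        }

        @Override
        protected void handleStartTag(TagElement tag) {
            depth++;
            maxDepth = Math.max(maxDepth, depth);
        }

        @Override
        protected void handleEndTag(TagElement tag) {
            depth--;
        }
    }

    // Returns {maximum depth reached, depth after parsing finished}.
    static int[] depths(String html) {
        try {
            DepthParser parser = new DepthParser(Html32.load());
            parser.parse(new StringReader(html));
            return new int[] {parser.maxDepth, parser.depth};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] d = depths("<html><head><title>t</title></head>"
                + "<body><p><b>x</b></p></body></html>");
        System.out.println("max depth " + d[0] + ", final depth " + d[1]);
    }
}
```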

markFirstTime
```
protected void markFirstTime(Element elem)
```
Marks the first time a tag has been seen in a document.

parseDTDMarkup
```
public String parseDTDMarkup()
                      throws IOException
```
Parses the Document Type Declaration markup declaration. Currently ignores it.

Throws:

IOException

parseMarkupDeclarations
```
protected boolean parseMarkupDeclarations(StringBuffer strBuff)
                                   throws IOException
```
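Parse markup declarations. Currently only handles the Document Type Declaration markup. Returns true if it is a markup declaration, false otherwise.

Throws:

IOException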

parse

public void parse(Reader in)
           throws IOException

Parse an HTML stream, given a DTD.

Throws: IOException
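
Because parse accepts any Reader, the same subclass works for strings, files, or network streams. The sketch below extracts plain text from whatever Reader it is given; `PlainTextDemo`, `TextParser`, and `Html32` are hypothetical names, and the DTD loading through `ParserDelegator.createDTD` is an assumption.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.html.parser.DTD;
import javax.swing.text.html.parser.Parser;
import javax.swing.text.html.parser.ParserDelegator;

class PlainTextDemo {
    // Hypothetical helper that loads the bundled HTML 3.2 DTD.
    static class Html32 extends ParserDelegator {
        static DTD load() throws IOException {
            return createDTD(DTD.getDTD("html32"), "html32");
        }
    }

    // Appends every text chunk to a buffer, dropping all markup.
    static class TextParser extends Parser {
        final StringBuilder out = new StringBuilder();

        TextParser(DTD dtd) {
            super(dtd);
        }

        @Override
        protected void handleText(char[] text) {
            out.append(text);
        }

        @Override
        protected void handleTitle(char[] text) {
            // Title text is left out of the extracted body text.
        }
    }

    static String plainText(Reader in) {
        try {
            TextParser parser = new TextParser(Html32.load());
            parser.parse(in);
            return parser.out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Reader in = new StringReader("<html><head><title>t</title></head>"
                + "<body><p>Hello <b>world</b></p></body></html>");
        System.out.println(plainText(in));
    }
}
```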

getCurrentPos
```
protected int getCurrentPos()
```

Copyright © 1993, 2014, Oracle and/or its affiliates. All rights reserved.
