Using an entity resolver
Published: 2019-06-23


Most XML and HTML developers are familiar with entity references, the odd little XML constructs that begin with an ampersand (&) and end with a semicolon (;). Probably the most common use of entity references is to escape characters that would otherwise be read as markup, such as &lt; to represent the opening angle bracket (<, also known as the less-than symbol) that begins an XML or HTML element.

Some developers use another kind of entity reference as a way to include in an XML document some material from a different source -- often a piece of content that's intended to be shared across multiple XML documents, or perhaps an item such as a Flash animation sequence that cannot be converted to XML. That kind of entity reference, called an external entity reference, saves the time you might otherwise spend copying and pasting the content over and over again. For instance, you might use an entity reference (such as &copyright;) in an XML document to reference boilerplate sections in a different document that someone else is responsible for keeping up to date. (External entities must be declared in a DTD, though. XML Schema has no way to declare them, so without a DTD external entity references won't work, and there's no point in trying.)

Parsing entity references, and the associated problems

When XML parsing occurs, the parser resolves each external entity reference in an XML document using the location (the system ID) declared in the DTD. (This tip focuses on DTDs, since that is where entities are declared.) During resolution, the parser locates the referenced content and inserts it into the XML. This means that when you manipulate the parsed document (in Java, C, Perl, PHP, Python, or whatever other language you are using), the referenced content appears just as any other content would. As long as everything works properly, you don't have to worry about handling each piece of referenced content individually. Complications, however, may cause this simple process to break down.
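To make the normal case concrete, here is a minimal sketch of that expansion using the JDK's built-in DOM parser. The class name ExpandedEntityDemo, the element name article, and the temporary file are illustrative assumptions, not part of the original tip, and the sketch relies on the parser's default settings (a hardened configuration may refuse to load external entities).

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class: shows that expanded entity content reads like ordinary text.
class ExpandedEntityDemo {

    // Writes entityContent to a temporary file, declares that file as an
    // external entity in an internal DTD subset, and returns the text of
    // the root element after parsing.
    static String parseWithExternalEntity(String entityContent) throws Exception {
        Path entityFile = Files.createTempFile("copyright", ".xml");
        try {
            Files.write(entityFile, entityContent.getBytes(StandardCharsets.UTF_8));

            String xml =
                "<!DOCTYPE article [\n"
                + "  <!ENTITY copyright SYSTEM \"" + entityFile.toUri() + "\">\n"
                + "]>\n"
                + "<article>&copyright;</article>";

            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            // The entity reference has been replaced by the file's content,
            // so it shows up as plain text inside <article>.
            return doc.getDocumentElement().getTextContent();
        } finally {
            Files.deleteIfExists(entityFile);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseWithExternalEntity("Copyright Example Corp."));
    }
}
```

The entity here lives in a local temporary file, so it resolves quickly. The complications start when the system ID points somewhere less convenient.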

You may, for example, need a live network connection in order for the resolution to work properly because so many referenced entities refer to a remote URL somewhere (for instance, http://www.ibm.com/developerWorks/copyright.xml in the example code in Listing 3). The resolution (opening up a connection, pulling down content, closing the connection, and so on) also may slow down the parsing process. You may begin to wonder if there's a way to provide cached, local copies of the referenced pieces of content or another way to circumvent the entity-resolution process. I'm happy to report that there is.

A simple way to resolve external entity references

As long as you're using the Simple API for XML (SAX), you're in luck! And since DOM and JDOM parsers are typically built on SAX under the hood, this simple solution works for all three APIs. SAX defines an interface, org.xml.sax.EntityResolver, that provides just the functionality you want. This interface defines only one method, as shown in Listing 1:

package org.xml.sax;

import java.io.IOException;

public interface EntityResolver {
    // The SAX interface declares both exceptions
    public InputSource resolveEntity(String publicID, String systemID)
        throws SAXException, IOException;
}

The sole method in this interface, resolveEntity(), provides a means to step into the entity-resolution process. Because each external entity declaration in the DTD has a system ID, and optionally a public ID, specifying how to resolve the content, you can match these up in this method and implement your own behavior. For example, consider the DTD fragment in Listing 2 that defines the copyright external entity reference:

<!ENTITY copyright SYSTEM "http://www.ibm.com/developerWorks/copyright.xml">

Here, there is no public ID, and the system ID is http://www.ibm.com/developerWorks/copyright.xml. So, you could create a class called CopyrightResolver as shown in Listing 3.

package com.ibm.developerWorks;

import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class CopyrightResolver implements EntityResolver {
    public InputSource resolveEntity(String publicID, String systemID)
        throws SAXException {
        // Constant-first comparison stays safe even if systemID is null
        if ("http://www.ibm.com/developerWorks/copyright.xml".equals(systemID)) {
            // Return local copy of the copyright.xml file.
            // InputSource expects a system ID (a URI), hence the file: scheme.
            return new InputSource("file:///usr/local/content/localCopyright.xml");
        }
        // If no match, returning null makes process continue normally
        return null;
    }
}

In this simple implementation, the parser invokes resolveEntity() every time it needs to resolve an external entity. If the system ID for the entity matches the URL in that method, a local XML document (localCopyright.xml) is returned instead of whatever resource is located at the supplied system ID. (In other words, when the entity's system ID matches the URL, the local XML document is loaded and returned as the input source, so nothing has to be fetched over the network from that URL. If the method returns null, the parser still loads the entity from the location given by the system ID. That is the main job of an EntityResolver.) In this way, you can "short circuit" the process and supply your own data for a given public or system ID. Be sure to always return null if no match occurs, so that entity resolution occurs normally in non-special cases.
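If you have more than one entity to short-circuit, the same idea extends to a lookup table. The following is only a sketch: CachingEntityResolver and its register() method are hypothetical names of my own, not part of SAX, and the content is held in memory purely to keep the example self-contained.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical helper (not part of SAX): serves registered entities from memory.
class CachingEntityResolver implements EntityResolver {

    // Maps a system ID to the replacement content held locally.
    private final Map<String, String> localCopies = new HashMap<>();

    // Associates a system ID with the content to serve in its place.
    void register(String systemID, String content) {
        localCopies.put(systemID, content);
    }

    @Override
    public InputSource resolveEntity(String publicID, String systemID) {
        String content = localCopies.get(systemID);
        if (content == null) {
            // No local copy: returning null lets the parser resolve normally.
            return null;
        }
        InputSource source = new InputSource(new StringReader(content));
        // Keep the original system ID so error messages still point at
        // the entity the document actually asked for.
        source.setSystemId(systemID);
        return source;
    }

    public static void main(String[] args) {
        CachingEntityResolver resolver = new CachingEntityResolver();
        resolver.register("http://www.ibm.com/developerWorks/copyright.xml",
                          "Copyright notice served from memory");

        // A registered system ID yields a local source; anything else yields null.
        System.out.println(resolver.resolveEntity(null,
            "http://www.ibm.com/developerWorks/copyright.xml") != null);
        System.out.println(resolver.resolveEntity(null,
            "http://example.com/other.xml") == null);
    }
}
```

Keying on the system ID mirrors Listing 3. A real application might also key on the public ID, or read the local copies from disk instead of holding them as strings.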

That's about all there is to it. You can register your entity resolver on your parser as shown in Listing 4.

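import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

// Create an XMLReader and register the resolver before parsing.
// (XMLReaderFactory is deprecated since Java 9; SAXParserFactory also works.)
XMLReader reader = XMLReaderFactory.createXMLReader();
reader.setEntityResolver(new CopyrightResolver());
reader.parse(new InputSource("article.xml"));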
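The same registration works on a DOM parser, because javax.xml.parsers.DocumentBuilder accepts an EntityResolver as well. Here is a minimal sketch under a few assumptions of my own: the class name DomResolverDemo and the inline sample document are illustrative, and the local copy is an in-memory string rather than a file.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class: registers an entity resolver on a DOM parser.
class DomResolverDemo {

    static final String COPYRIGHT_URL =
        "http://www.ibm.com/developerWorks/copyright.xml";

    // Parses a document that references the remote copyright entity, but
    // serves localCopyright in its place; returns the root element's text.
    static String parseArticle(String localCopyright) throws Exception {
        String xml =
            "<!DOCTYPE article [\n"
            + "  <!ENTITY copyright SYSTEM \"" + COPYRIGHT_URL + "\">\n"
            + "]>\n"
            + "<article>&copyright;</article>";

        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // EntityResolver has a single abstract method, so a lambda is enough.
        // Returning null for other system IDs keeps normal resolution intact.
        builder.setEntityResolver((publicID, systemID) ->
            COPYRIGHT_URL.equals(systemID)
                ? new InputSource(new StringReader(localCopyright))
                : null);

        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseArticle("Local copyright notice"));
    }
}
```

Because the resolver supplies the content itself, this sketch should parse without any network access, which is the whole point of the technique.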

So there you have it. If you can obtain local copies of entity reference content, or if you need to substitute your own content for an entity reference, use the SAX EntityResolver interface. This should help speed your applications and increase the flexibility of your XML documents. Enjoy!

Reposted from: https://my.oschina.net/u/2381372/blog/2050807
