Enterprise Content Management on the Java Platform @ JDJ
Author: unknown  Date: 2005-08-10 18:15  Source: Java channel  Editor: chinaitpower

Java Web applications have long needed a standards-based API for Enterprise Content Management (ECM). ECM is an essential requirement for Web applications on the Internet, intranets, and extranets. ECM vendors expose proprietary APIs in various languages, which has kept ECM architectures from being interoperable. JSR-170 defines a new set of APIs to standardize the interface to ECM products. It aims to make the ECM product pluggable, much as the JDBC API lets application code be independent of database products. JSR-170 has been actively supported by several ECM vendors and approved for public review. Its adoption depends on enterprises demanding it from ECM vendors, and it remains to be seen whether these vendors will forgo their unfair advantage.

In this article, we explain the lifecycle management services associated with "content" for Java developers building enterprise applications, focusing on the new and emerging JSR-170. Enterprise Content Management (ECM) is about managing the lifecycle of content in an enterprise, and managing that lifecycle requires a robust architecture. The lifecycle of content, as shown in Figure 1, begins when it is authored along with some metadata. It's formally represented in digital form and uploaded to a server using Web protocols. It's then processed, which typically consists of sorting, classifying, and storing it in a form that's easy to query and search later. Content gets served to an authenticated and authorized user either in isolation or merged and aggregated with other content. Not all users are interested in the same kind of content, so content has to be customized to suit individual user preferences, display device characteristics, and localization and internationalization requirements. In Figure 1, users from various domains want to access content through various devices, so the same content has to be rendered on various client devices. On top of that, given the innumerable types of content and associated standards, a single general-purpose Content Management System (CMS) is seldom sufficient for an enterprise. Enterprise architectures deploy more than one ECM product, each with its own APIs for lifecycle management services, which increases complexity for developers. A unified and standardized API such as JSR-170 can simplify the task of managing content across various vendor products and frameworks.

Enterprise Content Management
The ECM lifecycle comprises a set of independent tasks, all of which are part of a large workflow. The workflow begins by identifying the roles required for ECM lifecycle tasks and assigning users to those roles. Typical roles are content creators, reviewers, translators, classifiers, approvers, deployers, managers, etc. Once users, roles, and groups are identified, the data must be provisioned in the enterprise identity management repository. The next step is content creation, which is done with tools that vary by content type. The creation exercise has localization and internationalization requirements. In an automated system, content can also be aggregated from multiple sources. Both the manual creation tasks and the automated tasks have to consider requirements for the transport protocol. Once the content is submitted to a server, it has to be versioned. All content has to have metadata that describes the content's characteristics. The metadata can be defined by the content author at the time of creation, extracted at the time of classification, or both.

Once submitted, content often has to be translated for an international audience. It may also have to be transformed based on visual formatting requirements specified in templates and other style sheets.

Once the content is transformed, it has to be assigned to the appropriate placeholders for dynamic rendering. The scope of ECM can also extend to content delivery. Delivery involves the assembly of dynamic content. It also requires the construction of an index and the ability to search the site for all of its content. There may be personalization requirements and consumers may have a preference about how they want the content to be structured. Consumers may have various authorization privileges based on their roles. Many of these content delivery requirements are also applicable to portal architectures. Portals use ECM solutions as a back-end service. The scope of this article is restricted to the Java interfaces dealing with content repositories.

Java Content Repository Model
The purpose of this API is to provide a standard, implementation-independent way to access content bi-directionally at a granular level in a content repository. The challenge is to allow enough flexibility in the API so it can be used for both hierarchical (path-based addressing) and non-hierarchical (UUID-based addressing) repository models. The API should be easy to use from the programmer's point of view, and at the same time its core focus should be to interface with a repository and not venture into areas that might be regarded as "content applications." ECM products share some common base features and distinguish themselves with some unique ones. One objective of the API is to be easy to implement on top of existing content repositories. The other is to standardize some of the more complex functionality needed by advanced content-related applications. To accelerate the adoption of this standard interface by ECM vendors, JSR-170 has taken a multi-step approach to the implementation of the APIs. Level 1 of the API defines a set of basic repository operations: read, update, and delete functions, the assignment of types to content items, serialization, and search. Level 2 defines the advanced repository functions needed for more sophisticated content management, such as transactions, versioning, access permissions, locking, and hard links between content items.

The repository as exposed through level 1 of the JCR is a tree structure very much like the Unix file system. It comprises nodes that can have zero or more child nodes. It should also support CRUD (Create/Read/Update/Delete) operations on the nodes and provide for assigning node types and the means to search the repository. Nodes can have zero or more child properties. It should be possible to retrieve and traverse nodes and properties. A path syntax has been defined to navigate the tree. The repository has three layers of isolation. javax.jcr.Repository is an interface; an object implementing this interface represents a persistent data store. javax.jcr.Workspace is an interface; objects implementing this interface serve as a private view whose activities are only visible to users in this workspace. Changes made to this view have to be committed with an explicit checkin operation. A third type of isolation is between the workspace and the nodes (objects) in memory. A repository is similar to the well-known Concurrent Versions System (CVS), but there are some subtle differences. JCR doesn't distinguish, and rightfully so, between content and its metadata. It's up to the application to define its preferred conventions. JCR can be implemented on top of a file system, WebDAV, an RDBMS, etc. Figure 2 shows a high-level JCR architecture.
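
The tree-and-path idea can be pictured with a small in-memory model. The sketch below is only an illustration: ContentNode, ContentTreeSketch, and the sample path are hypothetical names, not part of the javax.jcr API, and the model ignores node types, workspaces, and persistence.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory model of the repository tree; not the javax.jcr API.
final class ContentNode {
    final String name;
    final Map<String, ContentNode> children = new LinkedHashMap<>();
    final Map<String, Object> properties = new LinkedHashMap<>();

    ContentNode(String name) {
        this.name = name;
    }

    // Creates a child node, or returns the existing child with that name.
    ContentNode addNode(String childName) {
        return children.computeIfAbsent(childName, ContentNode::new);
    }

    void setProperty(String propertyName, Object value) {
        properties.put(propertyName, value);
    }

    // Resolves a relative path such as "news/2005/article" one segment at a time.
    // Returns null when any segment is missing.
    ContentNode getNode(String relPath) {
        ContentNode current = this;
        for (String segment : relPath.split("/")) {
            if (segment.isEmpty()) {
                continue;
            }
            current = current.children.get(segment);
            if (current == null) {
                return null;
            }
        }
        return current;
    }
}

final class ContentTreeSketch {
    static ContentNode buildSampleTree() {
        ContentNode root = new ContentNode("");
        ContentNode article = root.addNode("news").addNode("2005").addNode("article");
        article.setProperty("title", "JSR-170 overview");
        return root;
    }

    public static void main(String[] args) {
        ContentNode root = buildSampleTree();
        ContentNode article = root.getNode("news/2005/article");
        System.out.println(article.properties.get("title")); // JSR-170 overview
    }
}
```

Keeping child nodes and properties in separate maps mirrors the distinction the text draws between the two kinds of children; a real repository would add typing and access checks on top.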

An ECM application that's protected through JAAS retrieves a handle to a JCR Repository object using the Java Naming and Directory Interface (JNDI). It populates a Credential object by pulling attributes from JAAS and invokes the Repository object's login method. In return it receives a ticket, which is like a session. Using the ticket, it retrieves one or more workspaces. The workspace provides APIs to navigate the node tree and modify the nodes and their properties. JCR provides APIs to copy and move nodes around, and to import and export nodes to external systems. A node can be serialized as an XML document. Likewise, an XML document that conforms to some schema can be imported and attached to an existing tree. In a nutshell, JCR is similar to a Java DOM (Document Object Model) API with an ECM-friendly syntax.

As we said before, the motivation for having two levels of API for this JSR is so this complex set of APIs can be adopted by the industry in a phased way. A JCR repository is viewed as a collection of workspaces, each of which organizes the information in it in a graph (or tree) structure shown in Figure 3. Level 1 of the API defines a standard way to acquire a handle to a workspace in a repository, to authenticate to the workspace, and to access or manipulate data in a workspace at the content-element level.

A client authenticates to a Workspace in a Repository by presenting a set of Credentials and, when authentication is successful, gets a Ticket. Although the specification doesn't mandate it, a handle to a Repository is generally obtained via a JNDI lookup, as shown in Figure 4. A Ticket can be thought of as a particular user's copy of a workspace. Every ticket must have a corresponding Workspace object associated with it, but a workspace can have zero or more tickets associated with it. The information in a workspace (represented as a tree or a graph) can be manipulated through a ticket as long as the user has permission to make those manipulations. The user can also get a handle to a workspace and manipulate the data in that workspace directly instead of using a ticket associated with the workspace. When a user manipulates data through the methods of a Ticket object, the changed data must be explicitly saved (through a Ticket.save call, for example) before it's persisted. Any data changes made directly to the Workspace object are persisted automatically.
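
The difference between saving through a ticket and writing directly to a workspace can be sketched as follows. WorkspaceSketch and TicketSketch are hypothetical stand-ins that model only the persistence rule described above; they are not the JSR-170 interfaces, and authentication is left out entirely.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a workspace: changes made here are persisted at once.
final class WorkspaceSketch {
    private final Map<String, String> persisted = new HashMap<>();

    void setDirect(String path, String value) {
        persisted.put(path, value);
    }

    String read(String path) {
        return persisted.get(path);
    }

    TicketSketch openTicket() {
        return new TicketSketch(this);
    }
}

// Hypothetical stand-in for a ticket: changes are held until save() is called.
final class TicketSketch {
    private final WorkspaceSketch workspace;
    private final Map<String, String> pending = new HashMap<>();

    TicketSketch(WorkspaceSketch workspace) {
        this.workspace = workspace;
    }

    void setProperty(String path, String value) {
        pending.put(path, value);
    }

    // The ticket sees its own unsaved changes layered over the workspace.
    String read(String path) {
        return pending.containsKey(path) ? pending.get(path) : workspace.read(path);
    }

    // Pushes every pending change into the workspace, then clears the buffer.
    void save() {
        pending.forEach(workspace::setDirect);
        pending.clear();
    }
}

final class TicketSaveDemo {
    public static void main(String[] args) {
        WorkspaceSketch workspace = new WorkspaceSketch();
        TicketSketch ticket = workspace.openTicket();
        ticket.setProperty("/news/title", "Draft");
        System.out.println(workspace.read("/news/title")); // null: not saved yet
        ticket.save();
        System.out.println(workspace.read("/news/title")); // Draft
    }
}
```

The pending map plays the role of the user's private copy: other readers of the workspace see nothing until save() runs.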

The data in a workspace is organized as Items in a tree structure. Items are either Nodes or Properties. Properties can't have children and are leaves in the tree. The enterprise's content data is stored as values of properties. Nodes can have children that are other nodes or properties. Every node has one or more node types associated with it, one of which must be the primary node type associated with that node. A primary node type defines certain characteristics (e.g., the types) that the node's properties and child nodes are allowed and/or required to have, i.e., the node type enforces certain constraints on the node's children. Every node in the repository has a special property called jcr:primaryType that records the name of its primary node type. The primary node type uniformly enforces constraints on all child nodes and properties. If a child node is to have unique additional constraints or characteristics, the parent node may also be assigned one or more mixin types. Every node that has a mixin node type also has a system-assigned property called jcr:mixinTypes that records its mixin node types.
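
One way to picture how primary and mixin types constrain a node is the sketch below. It is an assumption-laden illustration: NodeTypeSketch, TypedNodeSketch, and the type names my:article and my:rated are invented here, and it checks properties only, not child nodes. Only the property names jcr:primaryType and jcr:mixinTypes come from the text.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical node type: a name plus the property names it allows and their Java types.
final class NodeTypeSketch {
    final String name;
    final Map<String, Class<?>> allowedProperties = new HashMap<>();

    NodeTypeSketch(String name) {
        this.name = name;
    }

    NodeTypeSketch allow(String propertyName, Class<?> valueClass) {
        allowedProperties.put(propertyName, valueClass);
        return this;
    }
}

// Hypothetical node that checks every property against its primary and mixin types.
final class TypedNodeSketch {
    private final List<NodeTypeSketch> types = new ArrayList<>();
    final Map<String, Object> properties = new HashMap<>();

    TypedNodeSketch(NodeTypeSketch primaryType) {
        types.add(primaryType);
        properties.put("jcr:primaryType", primaryType.name);
    }

    // Adds a mixin type and refreshes the jcr:mixinTypes bookkeeping property.
    void addMixin(NodeTypeSketch mixin) {
        types.add(mixin);
        List<String> mixinNames = new ArrayList<>();
        for (NodeTypeSketch type : types.subList(1, types.size())) {
            mixinNames.add(type.name);
        }
        properties.put("jcr:mixinTypes", mixinNames);
    }

    // Accepts the property only if some assigned type allows that name and value type.
    void setProperty(String name, Object value) {
        for (NodeTypeSketch type : types) {
            Class<?> expected = type.allowedProperties.get(name);
            if (expected != null && expected.isInstance(value)) {
                properties.put(name, value);
                return;
            }
        }
        throw new IllegalArgumentException("Property not allowed by node types: " + name);
    }
}

final class NodeTypeDemo {
    public static void main(String[] args) {
        NodeTypeSketch article = new NodeTypeSketch("my:article").allow("title", String.class);
        NodeTypeSketch rated = new NodeTypeSketch("my:rated").allow("rating", Long.class);
        TypedNodeSketch node = new TypedNodeSketch(article);
        node.setProperty("title", "Hello");
        node.addMixin(rated);
        node.setProperty("rating", 5L);
        System.out.println(node.properties.get("jcr:mixinTypes")); // [my:rated]
    }
}
```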

As with XML document elements, name collisions for items in a repository are avoided by prefixing the item name with its namespace. The prefix is a short name for a URI (as in an XML document), and all JCR-compliant repositories have a namespace registry. There are reserved prefixes that are pre-assigned to certain namespaces. For example, the prefix jcr (e.g., jcr:content) is reserved for built-in node types and the prefix mix is reserved for mixin node types. Figure 5 illustrates how a new namespace prefix can be registered before use. Reading data from a workspace consists of accessing the contained properties and extracting property values (the content). Properties can be accessed directly by specifying the path or by traversing the tree from a parent node. The Level 1 API specification covers:

  • Tree traversal to access nodes and properties
  • Reading and writing property values
  • Creation and deletion of items (nodes or properties)
  • Identification of node types
  • Searching the repository
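
The namespace registry mentioned before this list can be thought of as a prefix-to-URI map. The following is a minimal sketch under that assumption: NamespaceRegistrySketch, the acme prefix, and every URI shown are placeholders invented for illustration, and the real registry interface may look different.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical prefix-to-URI registry; the real registry interface may differ.
final class NamespaceRegistrySketch {
    private final Map<String, String> prefixToUri = new HashMap<>();

    NamespaceRegistrySketch() {
        // Reserved prefixes named in the text; the URIs are placeholders.
        prefixToUri.put("jcr", "urn:example:jcr");
        prefixToUri.put("mix", "urn:example:mix");
    }

    // Registers a new prefix; reserved or already used prefixes are rejected.
    void register(String prefix, String uri) {
        if (prefixToUri.containsKey(prefix)) {
            throw new IllegalArgumentException("Prefix already registered: " + prefix);
        }
        prefixToUri.put(prefix, uri);
    }

    // Expands "prefix:local" into "{uri}local"; unprefixed names pass through.
    String expand(String qualifiedName) {
        int colon = qualifiedName.indexOf(':');
        if (colon < 0) {
            return qualifiedName;
        }
        String prefix = qualifiedName.substring(0, colon);
        String uri = prefixToUri.get(prefix);
        if (uri == null) {
            throw new IllegalArgumentException("Unknown prefix: " + prefix);
        }
        return "{" + uri + "}" + qualifiedName.substring(colon + 1);
    }
}

final class NamespaceDemo {
    public static void main(String[] args) {
        NamespaceRegistrySketch registry = new NamespaceRegistrySketch();
        registry.register("acme", "http://example.com/acme");
        System.out.println(registry.expand("acme:title")); // {http://example.com/acme}title
    }
}
```
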
Figure 6 illustrates a subset of the level 1 API for reading data from the repository. Each property has to have one of eight associated types. These types are represented by integer constants defined in the PropertyType class. They are STRING, BINARY, DOUBLE, LONG, BOOLEAN, DATE, SOFTLINK, and REFERENCE. A SOFTLINK is a special string type: a string representation of a repository path. A REFERENCE is also a special string type: a node UUID. The UUID identifies a node in the workspace and, unlike a SOFTLINK, it must be valid and must point to a particular node. It's important to note that not every node will have a UUID. Only nodes with the mixin type mix:referenceable will have an associated unique identifier. This UUID is stored as the value of the node property jcr:uuid. This property is read-only and is automatically assigned by the system when the node type mix:referenceable is assigned to a node. Referenceable nodes aren't required for level 1 API compliance, so their use is explained in detail in the next section.
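
The contrast between the two link types can be sketched as follows. This is an illustration under stated assumptions: PropertyKind is an enum used here for readability (the text says the specification uses integer constants in PropertyType), and LinkSketch with its methods is a hypothetical helper, not a JCR class.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// The eight property types named in the text, modeled as an enum for illustration.
enum PropertyKind { STRING, BINARY, DOUBLE, LONG, BOOLEAN, DATE, SOFTLINK, REFERENCE }

// Hypothetical workspace index used to show how the two link types differ.
final class LinkSketch {
    private final Map<String, String> uuidToPath = new HashMap<>();

    // Marks the node at the given path as referenceable and returns its new UUID.
    String makeReferenceable(String path) {
        String uuid = UUID.randomUUID().toString();
        uuidToPath.put(uuid, path);
        return uuid;
    }

    // A REFERENCE must point at an existing node, so it is validated when stored.
    String storeReference(String uuid) {
        if (!uuidToPath.containsKey(uuid)) {
            throw new IllegalArgumentException("No referenceable node with UUID " + uuid);
        }
        return uuid;
    }

    // A SOFTLINK is only a path string; nothing checks that the target exists.
    String storeSoftLink(String path) {
        return path;
    }
}

final class LinkDemo {
    public static void main(String[] args) {
        LinkSketch workspace = new LinkSketch();
        String uuid = workspace.makeReferenceable("/news/article");
        workspace.storeReference(uuid);               // accepted: the node exists
        workspace.storeSoftLink("/no/such/node");     // accepted: paths are not validated
        try {
            workspace.storeReference("not-a-known-uuid");
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```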

The value of a property can be obtained, even if the client doesn't know the type of the property, by using the special holder class, Value (see Figure 7). Modifying the workspace takes the form of adding child nodes or properties to a node, removing a node or property, and modifying property values. When adding a node, the node type of that node can be specified. If the node type isn't specified, the system assigns the node type of the parent node. When setting a property, it's possible to specify the property name, property value, and property type. If the property type is omitted, the system tries to identify the property type from the Java type of the content. For example, if an input stream is provided, as in the example below, the system assumes a BINARY property type.
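
Inferring a property type from the Java type of the value might look like the sketch below. Only the InputStream-to-BINARY rule comes from the text; the other mappings, the class name PropertyTypeInference, and the use of type names as strings are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Calendar;

// Hypothetical helper; only the InputStream -> BINARY rule is taken from the text.
final class PropertyTypeInference {
    // Returns the name of the property type suggested by the value's Java type.
    static String inferType(Object value) {
        if (value == null) {
            throw new IllegalArgumentException("Cannot infer a type from null");
        }
        if (value instanceof InputStream) return "BINARY";
        if (value instanceof String) return "STRING";
        if (value instanceof Long || value instanceof Integer) return "LONG";
        if (value instanceof Double || value instanceof Float) return "DOUBLE";
        if (value instanceof Boolean) return "BOOLEAN";
        if (value instanceof Calendar) return "DATE";
        throw new IllegalArgumentException("No property type for " + value.getClass().getName());
    }

    public static void main(String[] args) {
        InputStream image = new ByteArrayInputStream(new byte[] {1, 2, 3});
        System.out.println(inferType(image));   // BINARY
        System.out.println(inferType("Hello")); // STRING
        System.out.println(inferType(42L));     // LONG
    }
}
```

Passing the type explicitly remains the safer choice when a value could map to more than one type, for example a string that is meant to be a SOFTLINK.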

As mentioned above, changes made through the Ticket object aren't persisted until explicitly saved. If Ticket.save() is called after making changes to one or more nodes and properties, all changes are persisted. If Node.save() is invoked, all changes to that node and its children are persisted. If Node.save(true) is invoked, only changes to the node itself (not the changes to its children) are saved, as shown in Figure 8. It's possible for a node to have multiple parents. To accommodate this, the Node interface has a Node.addExistingNode(<relpath>) method to add an existing node as a child of a node. This method adds a child node that's a REFERENCE type. This means that the child node (the one with multiple parents) must have a UUID (i.e., a system property "jcr:uuid"). Note that only nodes created with the mixin node type mix:referenceable have a system-generated UUID property, so only these nodes can have multiple parents. When adding an existing node as a child of another node, a cyclical relationship (where a node becomes its own descendant) is prohibited; the system throws a ConstraintViolationException if this is detected on the addExistingNode() call.
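
The cycle check for shared nodes can be sketched with a simple reachability test. SharedNode is a hypothetical class, and the sketch throws IllegalStateException as a stand-in for the ConstraintViolationException named above so that it compiles without the JCR library; it also leaves out the UUID requirement.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node that may be shared by several parents; not the javax.jcr API.
final class SharedNode {
    final String name;
    final List<SharedNode> children = new ArrayList<>();

    SharedNode(String name) {
        this.name = name;
    }

    // True if target is this node or can be reached by following child links.
    boolean reaches(SharedNode target) {
        if (this == target) {
            return true;
        }
        for (SharedNode child : children) {
            if (child.reaches(target)) {
                return true;
            }
        }
        return false;
    }

    // Adds an existing node as a child, refusing any link that would make
    // a node its own descendant.
    void addExistingNode(SharedNode existing) {
        if (existing.reaches(this)) {
            throw new IllegalStateException(
                "Adding " + existing.name + " under " + name + " would create a cycle");
        }
        children.add(existing);
    }
}

final class SharedNodeDemo {
    public static void main(String[] args) {
        SharedNode a = new SharedNode("a");
        SharedNode b = new SharedNode("b");
        SharedNode shared = new SharedNode("shared");
        a.addExistingNode(shared);
        b.addExistingNode(shared); // the same node now has two parents
        try {
            shared.addExistingNode(a); // a would become its own descendant
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because every insertion is checked, the graph stays acyclic and the recursive reaches() call always terminates.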

Level 2 of JCR requires a provider to support several additional features for compliance, such as transactions, versioning, observation, access control, and locking. We'll provide more detail on these features in a future online version.

Summary
JCR is a refreshing and welcome addition to the realm of enterprise content management. The motivation is well-founded and the goals are well-defined and scoped. It has had active representation from several leading Content Management System vendors from conception all the way through the final vote. Some vendors, such as Jahia.org and Venetica.com, already offer compliant implementations. It can certainly make applications that interface with ECM a lot easier to write, as JDBC did for applications that use databases. It remains to be seen how enterprises adopt this standard and demand compliance and support from leading vendors. OpenCMS, though unrelated to JSR-170, is another Open Source CMS, available from openCMS.org.
