Internet-based Document Management: Leveraging Digital Archive Retrieval Systems into the Next Century
I. The Current Situation
II. Key trends in Intranet-based digital archive and retrieval systems
A. A revolution in printing
B. The move towards “active” documents
C. E-mail innovations
D. Push and pull technologies
E. Convergence of key document management technologies
F. Acceptance of client/server and thin clients
G. DARS integration into common dashboards
H. Intranet-based workflow and Internet-based statement processing
III. The Response
A. Web-enable your DARS systems
B. Integrate DARS into a truly digital document environment
C. Move away from microfiche and dependence on paper-based document strategies
D. Rethink workflow
E. Rethink statement processing and delivery
F. Use DARS in new and innovative ways
IV. Integration of Workflow and DARS
A. Routing and tracking internal reports
B. Tracking LAN documents for permanent archive
C. Capturing and archiving corporate e-mail
D. Distributing and tracking documents within an intranet
E. Document repurposement
F. Proactive document delivery
V. Summary
VI. Definitions
Figures and Tables
Figure 1 – DARS emerging areas of opportunity
and advantage
I. The Current Situation
The Internet has changed everything. Technical strategies that seemed sound a
few short years ago now fall short. Intranets -- company-specific versions of
networks with Internet-like capabilities -- are quickly becoming the norm.
Moreover, Extranets -- company to company intranet communication -- are also
taking hold. Soon, browser-like thin clients will become the primary method for
access to many applications, both internal and external to the company, and
between organizations. Already Web browsers are a common part of any
desktop, as common as word processing and spreadsheets, paving the way for
an explosion in the use of thin clients.
The age of the Internet, intranets and extranets is here. The network is truly the
computer.
Some of the key technical drivers behind the explosion of Internet/intranet-based
digital archive and retrieval systems (DARS) are:
- A revolution in printing.
- The move towards “active” documents.
- Innovative use of e-mail.
- The rise of push and pull technologies.
- Convergence of key document management technologies driven by the rise
of the truly digital document.
- The acceptance of client/server and thin clients as an enterprise-wide
architecture and application framework, with the Internet, intranets and
extranets as a logical extension of that architecture/framework.
- The need to integrate DARS into an intranet-based “common dashboard.”
- The evolution of intranet-based workflow and Internet-based statement
processing.
As a response, we recommend the following:
- Web-enable your DARS infrastructure.
- Integrate document management and document output into a true end-to-end
digital document infrastructure, based on a digital automated document
factory (ADF) that extends into your user and customer base through
intranets/extranets and the Internet.
- As a result, move away from microfiche and dependence on paper-based
document strategies and towards a fully automated digital document
environment.
- Rethink the concept of workflow, understanding that new technologies offer
new opportunities to enhance your competitive edge by extending and
redefining the nature of work.
- Rethink how statements are created, developed and delivered, as well as
the purpose of statements beyond the purely informational.
- Integrate DARS and the Internet, intranets and extranets in new and
innovative ways.
As Figure 1 illustrates, production-level documents are by far the largest
category of documents handled by most organizations. Because of their sheer
volume, they cost more to create, store and manage than any other document
type. We call them transaction documents.
It is precisely these types of documents, the transaction documents, that are
starting to be stored and viewed on the Web. A large number of these documents
are statements and invoices. Others are confirmations, tax documents, advices
and so on. These documents are playing the largest role in electronic
commerce, and are the primary focus of this paper.
II. Key Trends in Intranet-based Digital Archive and Retrieval Systems
Technical Drivers
Internet, intranet and extranet technology is evolving faster than most pundits
thought possible and, as a result, is defining new ways of accessing and
distributing corporate data. Given the digital nature of archive data, Internet
technology can be used to both simplify and speed access to data stored within
a digital archive and retrieval system (DARS) infrastructure through a
company-specific intranet.
Several technical trends, listed in the overview above, are driving the move
toward intranet-based DARS technologies. Let’s look at each in detail.
A. A Revolution in Printing
A revolution, albeit a quiet one, is under way in document printing. A primary
reason for this is the quick acceptance and adoption of Adobe’s PDF page
description standard for Web-based applications.
Unlike HTML, PDF offers extremely high quality print capability. This means that
Internet and intranet documents can be used acceptably as presentation-quality
printed documents. More and more, Internet/intranet applications will be the
vehicle through which printing takes place.
The next step for PDF will be support for high-speed, high-volume printers. Once
this occurs, PDF will become the lingua franca for digitally-stored material. No
translation from other formats will be necessary, and documents formerly limited
to the corporate archive and volume printing will be opened up for a variety of
other uses.
Strategic Bottom Line:
Think about your future print strategy, and make sure to include PDF
compatibility in that strategy. Not to do so will continually undermine your
company’s ability to leverage its digital archive across the enterprise.
B. Rise of “active documents”
The nature and scope of customer documents is transforming because of the
Internet. Customer documents, once passive purveyors of dry information, are
becoming active and alive, specific to the customer, and leveraged in surprising
new ways.
An active document includes “active links” to other applications or websites
of interest to the customer. These links can be provided automatically, or
selected based on a customer profile or the customer’s own choices. In
addition, targeted marketing will become the norm through customized,
customer-specific Internet documents.
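To make the idea of profile-driven links concrete, here is a minimal sketch in Python. It is only an illustration: the `ActiveLink` class, the `select_links` function, the topic names and the example.com URLs are all hypothetical and do not describe any particular DARS product. Links with no topics are shown to everyone; targeted links are shown only when they match one of the customer's interests.

```python
from dataclasses import dataclass, field


@dataclass
class ActiveLink:
    """One link that may be embedded in a customer document (hypothetical)."""
    label: str
    url: str
    topics: set = field(default_factory=set)  # empty set means "show to everyone"


def select_links(links, customer_interests):
    """Return untargeted links plus links matching at least one interest."""
    interests = set(customer_interests)
    return [link for link in links if not link.topics or link.topics & interests]


# Illustrative catalogue; labels, URLs and topics are assumptions.
CATALOGUE = [
    ActiveLink("Contact customer service", "https://example.com/service"),
    ActiveLink("Mortgage rates", "https://example.com/mortgages", {"home"}),
    ActiveLink("Retirement planning", "https://example.com/retirement", {"retirement"}),
]

for link in select_links(CATALOGUE, ["home"]):
    print(f"{link.label}: {link.url}")
```

A customer whose profile lists "home" would see the service link and the mortgage link, but not the retirement link.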
Strategic Bottom Line:
Documents should no longer be passive, but should become active. Include in
your DARS technology the ability to create, distribute and respond quickly to
active documents. If you don’t, your competition will.
C. Innovative use of E-mail
E-mail is growing geometrically. Everyone uses it, everyone is deluged by it, and
yet everyone finds it indispensable. Why not leverage it?
E-mail must be archived in the future (more on this later). Further,
e-mail should be a (if not the) primary method of distributing documents in your
enterprise. It makes sense that the primary method of communication would
evolve into the primary method of report and document distribution.
Leery about the print quality of e-mail? How about e-mail in high quality PDF
format? We believe the direction of e-mail is just that.
Strategic Bottom Line:
E-mail will only grow in importance. You should get on the bandwagon and use it
to its fullest. Leverage e-mail through an e-mail archive strategy, an e-mail report
and document distribution strategy, and an e-mail print strategy. You’ll find
productivity up, paper costs down, and communications enhanced throughout
your organization.
D. Push and Pull Technologies
There are two fundamental methods of accessing corporate data over the
Internet. One gives users access to corporate data through a Web browser
(typically thought of as a “pull” technology); the other distributes data to
end users (often referred to as “push” technology).
Browsers
Archives of value documents are often used for customer service applications.
Often corporations have a legal obligation to maintain customer documents for a
minimum of seven years. Archiving corporate data within a DARS infrastructure
and providing secure and quick access to data from a Web browser will provide
corporations with service differentiation as well as many competitive advantages.
This same technology may also be used to satisfy internal access to archived
documents. Browsers not only reduce the complexities of installing and
maintaining another application across the varied LAN / WAN configurations
used within a corporation, but also provide a standard and common access
method to all archive data.
Internet Document Distribution
A typical DARS infrastructure is designed to handle large volumes of archive
documents. Once these documents are archived, only a small percentage of them
are retrieved. A “push” strategy may also be used to distribute some of these
documents over the Internet.
One of the most practical and effective uses of a document “push” strategy is to
distribute reports and other documents to corporate users -- in short, to provide
a new method for report distribution. This is discussed in more detail in the next
section.
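The difference between the two models can be sketched in a few lines of Python. This is a toy, in-memory stand-in for an archive, not a description of any real DARS interface; the `Archive` class and its method names are assumptions made for illustration. "Pull" is a user asking for one document on demand, while "push" is the archive sending each newly stored document to whoever subscribed to that report type.

```python
class Archive:
    """Tiny in-memory stand-in for a DARS archive (illustrative only)."""

    def __init__(self):
        self._docs = {}           # doc_id -> (report_type, text)
        self._subscriptions = {}  # report_type -> list of user names
        self.outbox = []          # (user, doc_id) pairs pushed so far

    def subscribe(self, user, report_type):
        """Register a user to receive every new document of this type."""
        self._subscriptions.setdefault(report_type, []).append(user)

    def store(self, doc_id, report_type, text):
        """Archive a document, then push it to every subscriber of its type."""
        self._docs[doc_id] = (report_type, text)
        for user in self._subscriptions.get(report_type, []):
            self.outbox.append((user, doc_id))

    def retrieve(self, doc_id):
        """Pull: return the text of one document requested on demand."""
        return self._docs[doc_id][1]


archive = Archive()
archive.subscribe("alice", "daily-sales")
archive.store("r-001", "daily-sales", "Sales report text")
archive.store("s-001", "statement", "Statement text")

print(archive.outbox)             # documents that were pushed
print(archive.retrieve("s-001"))  # a document that was pulled
```

In this sketch only the sales report is pushed, because nobody subscribed to statements; the statement stays in the archive until someone pulls it.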
Strategic Bottom Line:
Organizations that understand the reality of browsers as the most pervasive
“pull” technology as well as new forms of report distribution as one of the most
compelling “push” technologies will find themselves able to save costs, increase
productivity and create competitive advantage.
E. Convergence of key document management technologies
The rise of the truly digital document means that we can actually start realizing
the technologists’ dream of the “paperless” office. But beyond that, the digital
document means a new way to think about how we deliver documents to
customers. Shortly, we can think in terms of the “paperless customer.”
What makes this possible is the rapid convergence of several key technologies:
- Web browsers
- New archive and retrieval systems
- Report distribution
- Workflow
- E-mail
- PDF
- Java and thin clients
As already described, Web browsers are the key method for accessing
intranet/extranet and Internet-based applications. They are standard, common,
and inexpensive. They are the “common dashboard” and framework for
applications access for the near future.
New archive and retrieval systems, based on computer output to laser disk
(COLD) and CD technologies, allow for the archiving and real-time retrieval of
any document from any time period instantly. These technologies are the
“backend” of the ADF infrastructure.
Report distribution technologies allow for the timely and efficient distribution of
reports on an enterprise basis. Report distribution is often the raison d’être of
document management. Here’s where user needs explicitly drive the technology.
Mainframe-based report distribution systems (RDS) have been very successful.
Numerous vendors have designed and developed products that distribute mainframe
reports to users electronically in a mainframe environment. Using the core
strengths of an effective DARS infrastructure to process and archive legacy
mainframe reports, in conjunction with Internet/intranet-based report
distribution, provides a strong alternative to these RDS products.
A typical DARS archive capability can be augmented with the ability to extract
reports based on predefined user requirements, bundle these reports with a
viewer or convert them to HTML, and distribute them to end users over the
corporate intranet.
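The extract-and-convert step can be sketched as follows. This is a minimal illustration in Python under our own assumptions: the report records, the `type` field and both function names are hypothetical, and a real system would read reports from the archive rather than from a list. The conversion simply wraps the fixed-width report text in an HTML page and escapes any characters that a browser would otherwise treat as markup.

```python
import html


def extract_for_user(reports, wanted_types):
    """Select the archived reports whose type is in the user's predefined list."""
    return [r for r in reports if r["type"] in wanted_types]


def report_to_html(title, text):
    """Wrap a fixed-width text report in a minimal HTML page, escaping markup."""
    return (
        "<html><head><title>{0}</title></head>"
        "<body><h1>{0}</h1><pre>{1}</pre></body></html>"
    ).format(html.escape(title), html.escape(text))


# Illustrative report records; the fields and contents are assumptions.
REPORTS = [
    {"type": "ledger", "title": "General Ledger", "text": "Acct  Debit < Credit"},
    {"type": "payroll", "title": "Payroll", "text": "Name  Gross"},
]

pages = [
    report_to_html(r["title"], r["text"])
    for r in extract_for_user(REPORTS, {"ledger"})
]
print(pages[0])
```

Using a `<pre>` element keeps the column alignment of a legacy mainframe report intact when it is viewed in a browser.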
Workflow is the other area where user needs are paramount. Workflow
determines how documents are handled in day-to-day business transactions and
processes. Workflow is discussed in more detail in later sections.
As already described, e-mail is quickly becoming as universal as the telephone.
E-mail provides specific advantages over voice communication, including the fact
that it can easily be archived and tracked.
We’ve mentioned that PDF -- the Web-based page description standard from
Adobe -- is gaining ground on, if not usurping, HTML as the key method (most often
through Acrobat) of viewing documents. Why? Again, because of the high fidelity
of PDF printing, and its wide compatibility with existing printer installations.
Thin clients and Java are quickly becoming the application model and
development platform of choice for client/server. Many organizations are moving
to this new paradigm for building client/server applications (with haunting
reminiscences of mainframe computing) because of the very real limitations of
full-blown client/server applications based on “fat” clients, which often
perform unacceptably and are too hard to maintain. More on this next.
Strategic Bottom Line:
Companies must understand how Web browsers, archive technologies, report
distribution and workflow are converging and offer the opportunity to rethink how
documents are handled in their organizations. They must also grasp the
importance of e-mail, PDF, and thin clients. Not to do so will mean a loss of
competitive advantage.
F. Growing acceptance of client/server and thin clients
Client/server is becoming more accepted as an enterprise-wide infrastructure.
While some companies have moved to client/server wholesale, others are more
hesitant. But even here we see UNIX and Windows NT at least forming the
highest tier of a tiered client/server architecture, with the mainframe staying on
as the primary data repository and transaction system.
We believe that UNIX is and will remain the operating system of choice for
high-powered DARS environments, even while other operating systems are dominant
on the client side of the equation; Windows NT, specifically, provides the
standard platform required for ubiquitous and cheap intranet access.
As already mentioned, thin clients are a key driver behind the transformation of
DARS. Based on the explosion of browsers and the Web, thin client applications
are gaining acceptance. We see the workstation now as more of a commodity
front end, with the real meat of applications residing on the server side. But far
from being “dumb terminals”, thin clients have intelligence that is specific to the
user: profile information, for example, that gives each user’s access to the
Internet its own specific flavor. Soon, the user’s profile will be created and
maintained by intelligent agents that reside on the client.
Ironically, thin clients also allow for greater corporate control. Thin client
software is easier to distribute and maintain, and thus makes it easier to
maintain corporate standards. It takes control away from users in areas where
it’s irrelevant, and gives them control where it is relevant. The bottom line is
that we’re finally getting smart about the use of client/server.
Strategic Bottom Line:
Client/server and thin clients are here to stay. UNIX is the most powerful
backend server operating system, while Windows NT provides both a
backend for lower-volume users and a standard client interface that is requisite
for the Internet revolution we are seeing. In order to maintain competitive
advantage, you’ll want to consider a mixed client/server strategy that includes
UNIX and Windows NT as the backends for your intranet-based DARS
capability. More and more, your front-end applications will be based on
standardized and cheap thin clients.
G. DARS integration into common dashboards
The Gartner Group has coined the term digital archive and retrieval systems
(DARS) to characterize the converging technologies that form the digital
document infrastructure in the future, and this acronym has been the inspiration
for this study. At the core of this infrastructure is the automated document factory
(ADF) that was described in detail in the first Executive Brief in this series.
The common dashboard for DARS will be an intranet-based browser that is
common, standardized, cheap and can access a variety of applications (whether
PDF or HTML based). This dashboard will be the standard method for
accessing your DARS infrastructure.
Strategic Bottom Line:
Common, browser-based “dashboards” are quickly becoming the norm for
application access, particularly for those applications that have enterprise-wide
scope. If you seek competitive advantage, it is better to accept this trend and
work toward common dashboards now rather than to insist on a more proprietary
or fragmented scheme.
H. Intranet-based workflow/Internet-based statement processing
Workflow is rapidly converging with DARS. This means that the very nature of
workflow can be redefined and extended (we deal with this more in a subsequent
section). Further, statement processing will soon be redefined by Internet
capabilities.
Redefined workflow means that because of the capabilities of high-speed
archives, the “time” aspect of the archive ceases to influence the realities of
work. All documents are available at any time from any place. This dramatically
changes how work processes are handled.
Further, advanced features embedded in workflow will also extend its nature and
scope. Data mining, report mining, and intelligent agents that perform many of
the rote tasks of today will greatly enhance the productivity and effectiveness of
workers.
Delivery of statements over the Internet redefines not only how customers
access statements, but what the statement itself is. Once a statement can
provide “hot links” to other areas, the statement becomes a “window” or
“customer profile” that can be used to market to your customer in numerous new
ways.
Strategic Bottom Line:
Intranet-based workflow and Internet-based statement delivery will redefine how
work is done and how customers think of statements. Companies that
understand this fully will be able to dramatically streamline their operations and
customer service, as well as offer new services to customers and market to
customers in new and innovative ways.
III. The Response
A. Web-enable your DARS systems
The first step toward realizing the power of Internet/intranet-based DARS is to
Web-enable your current DARS infrastructure. This means giving access,
through browsers or thin client applications, to corporate documents that had
previously been out of reach to both your users and your clients.
Once you do this, innovation will in some ways come from the “bottom up,”
meaning that your user community will begin to see new and innovative ways to
use the Web-enabled infrastructure. A few possibilities are outlined a little later,
but we’re sure your own personnel will dream up uses as yet unthought of.
B. Integrate DARS into a truly digital document environment
When we say a “truly digital document environment” we mean one that supports
the input, creation, maintenance, and distribution of documents in purely digital
form, and hypothetically requires no paper-based printed artifact. In other words,
the creation of documents that support business transactions would become
totally digital in nature.
We recommend this for many reasons, as follows:
- Digital documents can be leveraged more easily across the DARS
infrastructure once they are captured, archived and indexed.
- They can be easily distributed via a Web-based document distribution
mechanism, whether in house or external to your customers.
- They are better candidates (if they are captured textually) for full text retrieval
and intelligent searches.
- They cut down on paper costs.
- They are better suited for report and data mining.
C. Move away from microfiche and dependence on paper-based document strategies
Many industry pundits have stated that “Microfiche is Dead.” We realize that
many companies have substantial investment in microfiche and for this reason
the technology will be with us for a while. But the trend is nonetheless away from
microfiche; in fact, those companies that linger too long will find themselves at a
competitive disadvantage. According to the Gartner Group:
“Enterprises today that continue to rely heavily on paper and microfiche
for storage and retrieval of mission-critical files will experience substantially
higher administrative support costs and inferior customer services,
situating themselves for both reduced profitability and market share loss until
they implement a strategic DARS environment.”
The proverbial writing, you might say, is on the wall.
D. Rethink workflow
Rethinking the concept of workflow means reevaluating what work means when
taking new capabilities into consideration. We recommend that companies
actively and aggressively evaluate how they can rethink their work processes
given the increased functionality offered by the new DARS environments.
Primary among these capabilities is innovative and proactive report distribution.
Second is data and report mining. Third is the use of intelligent agent
technology. All of these are discussed in more detail in Section IV.
E. Rethink statement processing and delivery
Rethinking statements is going to have a profound impact on the way you do
business. We recommend a total revamping of how statements are handled to
take into consideration new developments in DARS technology.
Specifically, this means:
- Evaluate and implement an Internet-based statement delivery system.
- Seek to use Internet-based statements in useful ways to link those
statements back to archived information, or to provide links into marketing-related
information.
- Rethink how statements are created, developed and delivered, as well as
the purpose of statements beyond the purely informational.
- Integrate electronic billing with electronic payment (such as through
Quicken).
- Seek to deliver documents including invoices and payment records directly to
PCs on a scheduled basis.
- Provide direct, ad hoc customer access to documents including invoices and
statements for viewing.
To date, most documents that record commercial transactions have been on
paper or microfiche. These payment and billing records are certain to be sent
over the Internet and put on Web sites for customer viewing and downloads.
These electronic transaction documents will become part of digital repositories
on the Internet. They will become the source documents for Internet commerce,
and will be linked to more traditional systems such as accounts receivable and
accounts payable databases. These databases in turn will also become vital
elements in Internet payment mechanisms.
F. Use DARS in new and innovative ways
A DARS infrastructure integrated effectively into an intranet, an extranet or the
Internet provides many opportunities to put new technologies to work, including:
- Broadcasting of documents through “push” technologies.
- Switch from passive to active documents -- documents that are more than a
passive source of information and that provide some kind of added value through
the customer’s interaction with them, whether that be through a hyperlink to
another related document/application or through gathering information on
customer behavior.
- Switch to intelligent documents -- documents that “change themselves” via
customer interaction or customer profiles -- documents that customize the
type of information presented based on who is viewing them.
- Incorporation of video stream in documents -- video-based graphics can be
used for marketing and educational purposes.
- Incorporation of hyperlinks into documents -- documents can provide instant
links to other related documents, web sites, or other applications.
- Computer output documents as source documents for electronic commerce --
provide easy means to generate a sale or information request through
documents that would previously be thought of as “informational only.”
- Changing role of output documents from “internally-oriented financial
documents” to “customer-facing documents” with sales and marketing
purposes -- again, documents become active mechanisms for marketing,
commerce, intelligence gathering, and the like.
- Use of computer output documents as a way of empowering the customer and
making them a part of the organization -- create value-added services
by giving your customer access to critical data and market information.
These are just a few of the ways we think you can use a DARS infrastructure in
conjunction with the Internet or intranets. As the technology evolves, there are
sure to be dozens of other uses that are only limited by the creativity of your
organization.
Figure 1 shows the new DARS environment. It shows how DARS technology
emerges into ever increasing areas of opportunity and advantage through the
Internet, intranets, and extranets.
IV. Integration of Workflow and DARS
A. Routing and tracking internal reports
We all know how reports are distributed. Every morning, either somebody drops
off a report on a desk or a number of desks, or personnel go by the data center
or report distribution center and pick up their daily report.
All this is changing.
With new DARS technology, reports can be delivered electronically via e-mail.
The notification comes that the report is ready, and with a click of the mouse the
report attachment is available for work.
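As a rough illustration of what such a delivery might involve, the sketch below builds a notification message with the report attached as a PDF, using the `email` package from the Python standard library. The function name, the addresses and the placeholder report bytes are our own assumptions; actually sending the message (for example through a corporate mail server) is left out.

```python
from email.message import EmailMessage


def build_report_email(sender, recipient, report_name, report_bytes):
    """Build a notification e-mail with the report attached as a PDF file."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Report ready: {report_name}"
    msg.set_content(f"The report '{report_name}' is attached.")
    msg.add_attachment(
        report_bytes,
        maintype="application",
        subtype="pdf",
        filename=f"{report_name}.pdf",
    )
    return msg


# Hypothetical addresses and placeholder content, for illustration only.
message = build_report_email(
    "reports@example.com", "user@example.com", "daily-sales", b"%PDF-1.4 placeholder"
)
for part in message.iter_attachments():
    print(part.get_filename(), part.get_content_type())
```

Because the message is an ordinary mail object, the same copy that goes to the user could also be handed to the archive for indexing.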
The report remains digital. It is also archived and indexed, and available for later
use at any time. It is stored textually -- meaning that the body of the text is
available for intelligent queries or use in data and report mining. The report is
leveraged for the future.
Under traditional paper-based report distribution, end users typically
wait for the report before they start working. By utilizing workflow technology in
conjunction with a DARS infrastructure, workflow systems are now capable of
mimicking the traditional report distribution method electronically.
B. Tracking LAN documents for permanent archive
Dynamic, LAN-generated documents may be tracked by a workflow sub-system
and then archived to the DARS once they are ready for permanent archive.
C. Capturing and archiving corporate e-mail
Using workflow integrated with DARS makes it possible to capture all incoming
and outgoing e-mail messages. Once captured, these messages may be routed to
the DARS for permanent archive.
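One way to picture the capture step is as a small routine that parses each message and pulls out the fields an archive would index on. The Python sketch below uses the standard `email` package; the choice of index fields and the `index_email` name are assumptions for illustration, and the sample message is invented.

```python
from email import message_from_string, policy


def index_email(raw_message):
    """Parse a raw message and return fields usable as archive index keys."""
    msg = message_from_string(raw_message, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    return {
        "from": str(msg["From"]),
        "to": str(msg["To"]),
        "subject": str(msg["Subject"]),
        # The default policy parses the Date header into a datetime object.
        "date": msg["Date"].datetime.date().isoformat() if msg["Date"] else None,
        "text": body.get_content().strip() if body else "",
    }


# Invented sample message, for illustration only.
RAW = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Q3 invoice\n"
    "Date: Mon, 05 Oct 1998 09:30:00 -0400\n"
    "\n"
    "Please find the Q3 invoice details below.\n"
)

record = index_email(RAW)
print(record["date"], record["subject"])
```

Keeping the body text alongside the header fields is what later makes full text retrieval over archived mail possible.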
D. Distributing and tracking documents within an intranet
As your DARS is enhanced to handle routing and tracking in an Internet,
intranet and extranet environment, this same technology may be used to
provide corporations with an alternative to traditional report distribution. The
new paradigm, Internet Document Delivery (IDD), will provide customers with a
much more cost-effective method of routing and tracking documents.
E. Document repurposement
Once documents are captured, indexed and archived in a high-speed archive
system (COLD or CD), they can be “repurposed” much more easily. What
does this mean? It means that documents can be brought together, synthesized,
massaged, and repurposed into another document.
Today, this is done manually on a daily basis, requiring extensive labor
to hunt down documents, rekey information, and analyze information so that it
can be “repackaged” into a new document.
Report mining tools and data mining tools allow for intelligent searches on
documents so that much of the manual task of gathering disparate documents
together can be automated. Further, intelligent agents can be used to
automatically create certain kinds of documents based on pre-defined rules and
objects.
Let us paint a picture. Faxes come into the company. They are fed into the
COLD archive and indexed appropriately so that they can be retrieved
intelligently later. Other documents as well are captured, indexed, and stored on
your COLD system. Reports are generated and stored as well, again indexed
and ready for repurposement.
The indexing of these documents essentially adds value to them by allowing a
variety of intelligent searches on them, instantaneously and at any time. Add an
intranet-based intelligent search and distribution engine to the mix that
proactively packages, notifies and delivers documents to users on a daily,
weekly or monthly basis (what we call proactive digital document delivery) and
you’ve got power. You’ve taken an archive technology, coupled with a powerful
indexing scheme and retrieval capability, added an intelligent search engine or
agents, and you’ve got the future of workflow.
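The combination of an index and a proactive delivery engine can be sketched in miniature. The Python below is a toy under our own assumptions: a simple word-level inverted index stands in for the archive's indexing scheme, and each user's "saved query" stands in for a delivery profile. The class name, function name, document ids and sample text are all hypothetical.

```python
import re
from collections import defaultdict


class DocumentIndex:
    """Toy inverted index: maps each word to the documents containing it."""

    def __init__(self):
        self._postings = defaultdict(set)  # word -> set of document ids

    def add(self, doc_id, text):
        """Index every word of a captured document."""
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            self._postings[word].add(doc_id)

    def search(self, query):
        """Return sorted ids of documents containing every word in the query."""
        words = re.findall(r"[a-z0-9]+", query.lower())
        if not words:
            return []
        hits = set.intersection(*(self._postings.get(w, set()) for w in words))
        return sorted(hits)


def build_digest(index, saved_queries):
    """Map each user to the documents matching that user's saved query."""
    return {user: index.search(query) for user, query in saved_queries.items()}


index = DocumentIndex()
index.add("fax-17", "Purchase order from Acme for 40 widgets")
index.add("rpt-02", "Daily sales report: Acme account overdue")
index.add("rpt-03", "Daily sales report: no exceptions")

digest = build_digest(index, {"carol": "acme", "dave": "daily sales"})
print(digest)
```

Running `build_digest` on a daily, weekly or monthly schedule and mailing each user the resulting list would give a very small version of the proactive delivery described above.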
Now if we have a predefined, stable process that is the focus of this manual
effort, we certainly have a candidate for automation. Daily reports, for example.
As mentioned previously, what if the paper report delivered every day could be
delivered every day through e-mail? The document is digital and remains digital
throughout the work process. The paper is eliminated; the whole process of
tracking and routing the document is handled through the network, with the “hard
copy” only existing digitally, on the COLD system.
But what about ad hoc requirements? We suggest the equivalent of the Digital
Yellow Pad (DYP). In a nutshell, the convergence of COLD and workflow can
help here too because indexed digital documents can be accessed much more
quickly and easily, and “repackaged” using data mining tools that can extract
discrete levels of information from them quickly. You can “repurpose” a variety
of documents into a new document much more quickly, and do so while
remaining completely digital in the process. Your personal digital assistant finally
will do more than hold phone numbers, because you’ll download documents to
the PDA, or access them at home later over the Net.
V. Summary
The rise of the digital document will have a profound impact on businesses and
their customers. It will also create new opportunities to market to and attract
customers to other products and services.
The Internet provides a way to “push” statements to customers, but once
pushed, customers can then “pull” selectively from the statement. Further, this
“push/pull” model balances the need to deliver documents with the opportunity
to exploit your customer’s attention once you’ve got it. Statements are a tried
and true method of getting your customer’s attention, and once you have it
through an Internet-based document, the possibilities for marketing to your
customer are limited only by the imagination of your technologists and
marketers. Statements become “active” and can be used in new and innovative
ways.
Internally, workflow must be redefined. Report distribution needs to be
automated, streamlined -- and documents captured and archived in such a way
that they can be leveraged across the enterprise over time. This is what DARS
offers to you.
INSCI is committed to a vision of the truly digital document based on the
foundation of the automated document factory (ADF) that extends the value
chain of digital documents beyond the printed paper document we see today.
Digital documents will allow us to create not only the “paperless office” but the
“paperless customer” -- and it is likely, in fact, that the customer will be
paperless long before the office.
For more information about how INSCI can help you integrate DARS into the
Internet, intranets and extranets, feel free to give us a call at 508-870-4000.