Sunday, November 30, 2008

DON'T Make PDFs Searchable?

Fellow blogger Patrick C. Walsh had an interesting post today about Google's recent move in making imaged PDFs searchable on the World Wide Web.

Mr. Walsh thinks searchability can be a bad thing when it comes to document imaging. And, frankly, after hearing him out, I think he may have a point.

Here's his post, and here is the nut graph on why PDFs don't need to be searchable in most cases:

"I have been told that searchable PDFs will be a very good thing for intranets but I just don’t get it. The poor users put their search terms in and, as all documents are searchable, they will get a mountain of results back. Then, when they click on a result, it will land them on a document containing the search term.

This can be a problem as a lot of documents aren’t set up like the best web pages and if they are PDFs from external sources, e.g. legislation, H & S advice etc., you won’t be able to change them anyway. The problem is context."

It is absolutely true that context can be a problem with searchable PDFs. If users do not find search intuitive and easy to use, search isn't much use.

In that sense, Walsh may be correct in stating that for the average corporation, making scanned docs searchable across the enterprise may not be the best allocation of resources. Better to file appropriate scanned docs in appropriate folders...

And make people search the old-fashioned way: using their brains, instead of relying on keywords.

Thursday, November 27, 2008

thanksgiving day

today, we have much to be thankful for, including the ability to blog from a mobile phone, as i am doing right now, so pardon the lack of proper capitalization. including, too, document imaging solutions that work. thanks for those. thanks for the disaster recovery files that don't take up any space. thanks for the medical records that cost nothing to transport from hospital to hospital. thanks for the searchability and scalability. thanks for not just scanning documents, but sharing them. thanks for the chance to use a weak economy to prove document imaging's roi is the real deal. thanks for fewer filing cabinets. thanks for better document imaging options and services, and happy thanksgiving to you and yours.

Sunday, November 23, 2008

When Disaster Strikes, Document Imaging Survives?

In the course of scouring the Internet, one runs upon some good things. If you missed it, check out this wonderful story in the Atlanta Journal-Constitution the other day.

It is about a pastor and a businessman striving to reform and revitalize a tough neighborhood in northwest Atlanta, where a beloved 92-year-old grandmother had recently been shot and killed by the police.

The businessman owns a document imaging company in Atlanta. His name is Mr. Gordon, he donates money to the cause, and he had this to say:

“There’s so many good people who’ve lived a lifetime here that go to sleep afraid at night. We don’t believe people should live in that kind of fear.”

Though the connection may seem tenuous, one can surmise that Mr. Gordon's ability to relate to and connect with this community could have something to do with his status as the owner of a document imaging company.

How so?

Because the strength of document imaging is in its interest in combating disaster. Or not necessarily disaster itself, but the loss that comes with it.

All those papers--wills, memos, letters--disappeared. Into nothing, a hurricane, a tsunami...

A neighborhood at war.

No easy task, to outlast those, but document imaging prevents some of those losses. Sent out over email, maintained on backup Internet-connected databases, scoured by searchers of relevance, the images of documents can escape disaster--and live to tell about it.

Wednesday, November 19, 2008

Winter Reading: Document Imaging

As the winter months approach, it's time to curl up with your favorite novel. Or, if you're considering implementing a document imaging solution for your business, your favorite Microsoft SharePoint textbook.

There are some great books out right now that address the potential--and potential pitfalls--of making SharePoint a part of your business life.

Here are a couple to consider:

Essential SharePoint 2007, by Scott Jamison.

This is a quite substantial book that can teach you how to:

-- Define optimal, workable collaboration strategies
-- Build SharePoint applications people want to use
-- Provide your customers with state-of-the-art sites, blogs, and wikis
-- Implement forms-based workflow to optimize virtually any business process
-- Organize and staff SharePoint support teams
-- Migrate efficiently from SharePoint 2003

Another book to consider is SharePoint 2007 User's Guide: Learning Microsoft's Collaboration and Productivity Platform, by Seth Bates and Tony Smith.

Be careful when ordering books on document imaging, however. Many of them are quite old, from the 1990s. And while they may be instructive on basic principles, they do not reflect the current environment.

Since the 1990s, document imaging has become part of the larger Enterprise Content Management space, so look for books with that more general tag.

Happy reading!

Tuesday, November 18, 2008

Document Imaging "Extras"

You can get ten different price quotes for the same document imaging job. Why the discrepancy?

Let's look at a few "extras" that can push the price of a job up or down.

-- Volume. Generally speaking, the higher the volume, the lower the "per piece" charge. However, very high volume jobs require special machinery, and thus may cost more.

-- Size. If your documents are irregularly sized, your job is not standard and will probably cost a little more based on its format-specific nature.

-- Packaged Documents. Sometimes, the document imaging services company will be presented with a stack of papers that are bound or stuck together. Someone has to separate them, and if it's the document imaging company, that'll cost a bit more.

-- Data Services. Once you've scanned all the documents, it's time to decide where they will be stored, and how the data will be organized. Document imaging service companies are often great consultants on these issues, but they do charge for the consultations and/or software.

As with a mechanic, the answer to the question "How much do you charge?" is always, "It depends." When you know what it depends on, you not only understand the mechanic's point of view, but are able to negotiate with him better on price.
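
To make the "it depends" concrete, here is a rough sketch of how those extras might add up in a quote. Every number in it is made up for illustration (the per-page rates, the volume cutoff, the surcharges, the hourly consulting fee), and the function name estimate_quote is hypothetical. Real vendors price differently, so treat this as a way to think about a quote, not a price list.

```python
def estimate_quote(pages, irregular_size=False, needs_prep=False, consulting_hours=0):
    """Rough quote in dollars. All rates below are hypothetical."""
    # Volume: assume the per-page rate drops once a job reaches 10,000 pages.
    rate_cents = 10 if pages < 10_000 else 8

    # Size: assume irregularly sized documents add a per-page surcharge.
    if irregular_size:
        rate_cents += 3

    # Packaged documents: assume unbinding and separating adds a per-page surcharge.
    if needs_prep:
        rate_cents += 2

    # Data services: assume consulting is billed at a flat $100 per hour.
    total_cents = pages * rate_cents + consulting_hours * 100 * 100
    return total_cents / 100


plain_job = estimate_quote(5000)
messy_job = estimate_quote(5000, irregular_size=True, needs_prep=True, consulting_hours=2)
print(plain_job, messy_job)
```

Same page count, very different totals. That gap is the kind of thing to ask each vendor to break down line by line.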

Monday, November 17, 2008

Document Imaging and the Obama Presidency

Of course it's impossible to see the future, but that doesn't prevent anyone from trying. In the current economic straits, moreover, accurately spotting trends that can impact your business is especially valuable. All of which brings us to the questions:

What will document imaging look like in 2009? Will there be major changes in the industry? New technologies? Exciting startups?

John Mancini over at AIIM had an interesting post the other day based on a survey in which respondents who work in the document imaging industry pointed out the three vertical markets where they saw the most opportunity in 2009:

Healthcare / Medical -- 59%
Insurance -- 38%
State and Local Government -- 33%

These thoughts make a lot of sense, because they seem to pinpoint the change this nation experienced two weeks ago: the election of Barack Obama.

One of the main pieces of legislation Obama sponsored while he was in the Senate was called "Google for Government." Read about it here. The idea was to create a Web site that showed where Federal tax dollars were being spent.

Obama has promised to apply this same transparency standard to the government as a whole. This will no doubt create the need for document imaging services to move the tons and tons of paperwork into digital formats, so the people can see them.

Only time will tell if that project is successful, but the trying is bound to be profitable for the document imaging industry as a whole, especially the companies that get those contracts.

Sunday, November 16, 2008

Document Imaging: Think Integration

One of the great things about document imaging is the ability to move paper around without moving paper around, allowing people in different locations to access the same documents at the same time, if need be.

But in order to fully exploit the potential of this aspect of document imaging, we need to think about integrating existing software and hardware into an overall well-functioning system.

There are many companies that promise to do exactly that, such as this one and this one. However, the important thing is for the business professional to insist on, and drive, this integration.

For example, say you are a real estate broker looking to scan and archive all of your past signed contracts. Fantastic idea. But consider tying in your document imaging solution with an eFax capability, where you can essentially email faxes.

Because real estate contracts need hard signatures, faxing is still a big part of the business. That shouldn't stop you from implementing a document imaging solution, but it should make you think about how you can integrate for better performance in the real world.

Likewise with peripheral devices such as cameras, older printers, and laptops. When everything works together, everything works better.

Friday, November 14, 2008

Controversy Comes to Document Imaging

Document imaging is not usually considered a controversial subject, but the argument between authors and Google has been pretty heated over the past few years--and it is also a reminder that security of imaged documents is always a concern.

It all started when Google got the bright idea to scan library books and make them available on the Internet. This seemed like a great public service, but what about the authors who wrote those books? Essentially, their work would be given away for free.

Google argued hard and publicly for its side, as did certain authors and authors' unions. A major-league copyright suit ensued, and it was only recently settled. Read the whole settlement here (if you have a few spare months on your hands).

The upshot of the settlement is that the program to scan books and put them on the Web, Google Books, can and will continue. But Google will have to pay a royalty to authors, and can only use small portions of scanned books, unless it gets explicit permission to use the whole text from the author or the author's representative.

Check out Google Books here, and consider, too, how this issue might affect your own document imaging situation. Many organizations, for example, use document scanning and imaging to retain sensitive and private data, such as medical records or legal paperwork.

The whole thing is just one more reminder that if you're thinking about document imaging, you need to be thinking about document security. Because far from disappearing once documents go into the computer, they might just live forever, available to everyone.

If they make it onto the World Wide Web.

Thursday, November 13, 2008

The Document Imaging Social Network: Information Zen

Social networking is not just for college kids on Facebook. Social networking applications for business are becoming increasingly common.

AIIM, the industry trade group for those of us interested in document imaging and related topics, recently opened up a social network called Information Zen, located here.

Though the site is new, it has already attracted about 1,000 members. If you are interested in document imaging, scanning, or the overall Enterprise Content Management issue, you should definitely check out this site.

One of our favorite posts was this one, which invited forum members to describe their best experience with a vendor. Document imaging services are of widely varying quality, so this type of community feedback from real users is invaluable to prospective buyers.

Also see the "Ask the Experts" function, such as this discussion of the concept of Digital Mailroom, whereby all documents that arrive in the mail are scanned and distributed throughout the company. The quality of the responses on this site is quite high.

We hope that this will continue, and that vendors will not "game the system" by lurking on the site only to shamelessly plug their own products.

In short, get on this site ASAP if you are interested in document imaging. Interact with your peers, learn, and make better decisions about document imaging solutions.

Thank you, AIIM.

Wednesday, November 12, 2008

Is OCR Ready for Prime Time?

We blogged the other day about Google's new document imaging initiative, which promises to make "hard" PDF documents searchable on the World Wide Web. The technology being used to accomplish this feat, we noted, is called Optical Character Recognition, or OCR.

OCR is one of those technologies that has a definite cool factor. Through OCR software, a computer can "read" a hard copy document and render it into bits and bytes--in short, make a paper document digital.

At that point, sharing and searching that document becomes much easier, because the digital world is not dependent on things like planes, trains, and automobiles. You want something, it's at your fingertips.

But is OCR really ready for widespread corporate use? For example, say you run a law firm, and you are interested in implementing a document imaging solution, but your files are hard for a non-lawyer human being to comprehend, let alone a non-lawyer computer.

Can you rely on OCR technology to transport your paperwork to your computer accurately and completely?

The answer is: yes, but. OCR software has bugs and troubles just like any other software, and quality depends on your provider (so check around). Nevertheless, accuracy rates on typed words are 99%.

Cursive writing, however, is still being figured out.
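
It helps to see what a 99% accuracy rate means once you multiply it out. The little sketch below is only back-of-the-envelope arithmetic. It assumes the 99% figure applies per word, and the page and file sizes (400 words per page, 500 pages) are invented for the example.

```python
def expected_misreads(word_count, accuracy):
    """Expected number of misread words at a given per-word accuracy."""
    return word_count * (1.0 - accuracy)


words_per_page = 400   # assumed size of a typed page
pages_in_file = 500    # assumed size of one client file

per_page = round(expected_misreads(words_per_page, 0.99))
per_file = round(expected_misreads(words_per_page * pages_in_file, 0.99))
print(per_page, per_file)
```

Under those assumptions, that is a handful of misread words on every page and a couple of thousand across the file. For a law firm, that may be reason enough to budget for a human spot check on the documents that matter most.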

Saturday, November 8, 2008

An Issue of Standards: Document Imaging Needs Those

There are about five major trade groups devoted to advancing the interests of the document imaging industry. Here is a good list of those, for your reference.

Among those, there is one organization that works specifically to promote a particularly important need of the document imaging industry: standardization. TWAIN was founded in 1992 and has been very aggressive since then, agitating for greater and greater standardization between the great variety of hardware and software that make up this large community.

The fact that this organization (a not-for-profit) has been around so long, since the early days of document imaging and scanning, not only attests to the foresight of its founders, but encourages all of us to appreciate--and, more than that, use--this standardization to better effect.

Sharing scanned documents is, once again, the prime example of this better effect. The TWAIN board of advisors is a who's who of the document imaging industry, which might imply that they'd be too competitive with each other to cooperate, but the opposite is true.

Because sharing scanned documents empowers all players in the market. TWAIN is a testament to that. If you are interested in becoming a member of this organization, visit this page. You can sign your company up to head a sub-group supporting the larger mission of standardized, work-together document imaging systems, worldwide.

Tuesday, November 4, 2008

See the Forest: Enterprise Content Management

Document imaging is part of a larger system, commonly called Enterprise Content Management. As we research all the various document scanning and imaging solutions available, we can sometimes miss the forest for the trees, as the saying goes.

This is regrettable. Even if your business is small by comparison to companies that need products with lofty terms like "Enterprise Content Management," think about what this Harvard Business School-sounding concept means--or could mean--for your business.

If you are new to this conversation or even just looking for a great resource to learn more, visit the Association for Information and Image Management, or AIIM, Web site. It's here and it's rich. To start with a definition:

"Enterprise Content Management (ECM) is the technologies used to capture, manage, store, preserve, and deliver content and documents related to organizational processes. ECM tools and strategies allow the management of an organization's unstructured information, wherever that information exists."

Document imaging is one technology among many in that ecosystem.

This is important to realize not to make some abstract point, but because it makes us consider the real-life workings of these interconnected technologies, i.e. stuff needs to work together. This is why a company like IBM is a major provider of ECM products: because they make everything, everything should (theoretically) work together. The old standardization sell.

However, that is not an argument for using IBM products. It is merely an argument for thinking ahead of time about whether your document imaging system is going to fit into your overall goal of archiving and using the information important to your business.

It is merely an argument for seeing the forest, not just the trees.

Monday, November 3, 2008

Google Making Scanned Docs Searchable Worldwide

Being able to search within your organization for scanned documents is a necessary aspect of the most successful document imaging systems. But maybe, in focusing so much on intra-organizational search, we're missing the bigger picture.

For instance, Internet giant Google recently revamped its technology in hopes of making scanned documents as searchable as any other Web page.

As this post indicates and your experience may corroborate, PDF files have been showing up in Google search results for a while now. But before, Google was merely reading the "metadata" of these pages, i.e. the tags attached to them, such as document title, keywords, etc.

But now, Google has created a solution whereby actual scanned documents are "read" by the Google search technology through a method called "Optical Character Recognition." Read about how this is being done at the Google blog, here. Right now, it only works on PDF docs.
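
The difference between the old metadata-only approach and a full-text approach can be sketched in a few lines. This is a toy illustration, not how Google does it. The ScannedPdf class, the two search functions, and the sample documents are all hypothetical, and the sketch assumes the OCR step has already turned each page image into plain text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScannedPdf:
    title: str
    keywords: List[str]
    ocr_text: str  # text recovered from the page image (assumed already extracted)


def metadata_search(docs, term):
    """Match only the tags attached to the file: title and keywords."""
    term = term.lower()
    return [d.title for d in docs
            if term in d.title.lower()
            or any(term in k.lower() for k in d.keywords)]


def full_text_search(docs, term):
    """Match the tags and also the words on the scanned page itself."""
    term = term.lower()
    return [d.title for d in docs
            if term in d.title.lower()
            or any(term in k.lower() for k in d.keywords)
            or term in d.ocr_text.lower()]


docs = [
    ScannedPdf("Office Lease 2007", ["lease", "real estate"],
               "The tenant shall pay a security deposit of two months rent."),
    ScannedPdf("Safety Handbook", ["safety"],
               "Fire exits must remain clear at all times."),
]

print(metadata_search(docs, "deposit"))   # nothing: the word is not in any tag
print(full_text_search(docs, "deposit"))  # finds the lease via its page text
```

The word "deposit" appears only on the scanned page, so a metadata-only search never sees it. Once the page text is searchable, the lease turns up.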

This is a very interesting development, and perhaps a huge one. Even today, think of all the paper documents that are not searchable to the world at large. As humongous as the Web already is, it's growing by leaps and bounds every day!

We are going to have to keep a sharp eye on this developing story.
 