Introduction to the Manual


The EPC and the World of Digitization

The Electronic Publishing Center was established as part of the Oklahoma State University Library in 1996. Since then, it has digitized numerous documents for online presentation.

Digitization is a long, careful process, but it is an important step toward preserving aging documents and disseminating the information contained in these documents to a large audience.


Purpose of the Manual

The purpose of this manual is to introduce new employees to the process of digitization. Most of you have probably never worked in digitization before, and some of you may not even really know what is involved in making a physical document an online resource.

By the time you finish reading and working through this manual, you still may not know all there is to know about digitization, but you will have a better understanding of how the Electronic Publishing Center (EPC) acquires projects, scans documents, runs OCR (optical character recognition) on them, proofs them, and uses XML to mark them up for presentation and publication on the Internet.

Anyone who has worked at the EPC and knows the process may also use this manual as a refresher. We get busy working on one part of digitization and forget some of the steps involved in another part, so this manual can also help us remember what is involved in the different aspects of digitization.


Learning with Examples

We have tried to provide you with detailed instructions that will help get you started in scanning, OCR'ing, proofing and marking up documents.

However, it is important for you to understand that each project is different and brings with it different guidelines for OCR'ing, marking up, and so on. We recommend reading this manual so you can get acquainted with procedures for each step in the digitization process, but it never hurts to get thorough instructions from other EPC employees when it is time for you to work on a particular project.

To help you understand the instructions more clearly, we have tried to include examples whenever possible. We have also given you many screenshots to better illustrate what we're trying to describe, and we've included a Glossary and an Index. You may still need help, so don't hesitate to ask questions!


The EPC and Digitization

We hope you're not intimidated by the process of digitization, because once you get started, it really isn't very difficult. You do need to have patience, however, so don't rush through. Especially in our line of work, it is better to do it correctly than quickly!

For more details about the EPC and to see examples of what we do, visit


Terms and Functions You Should Know Before Beginning

File Types/Extensions and Descriptions

File Type (Extension) | Extended Name | Description
JPEG or JPG (.jpg) | Joint Photographic Experts Group | a file format that is best for keeping your document size small; ideal for web pages
TIFF (.tif) | Tagged Image File Format | a more complex file format designed for image processing; slower execution time; recommended for black and white images
TXT (.txt) | Text Only | text files serve as a common denominator between applications that do not import each other's formats
XML (.xml) | Extensible Markup Language | used for defining data elements on a web page
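
To make the XML entry above a little more concrete, here is a minimal sketch, written in Python, of what "defining data elements" can look like in practice. The element names (document, title, date, page) and the sample text are hypothetical and chosen only for illustration; they are not the EPC's actual markup scheme, which differs from project to project.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for one scanned document; the tag names are
# illustrative assumptions, not the EPC's real project guidelines.
SAMPLE = """<document>
  <title>Sample Report</title>
  <date>1907</date>
  <page number="1">Text of the first page goes here.</page>
</document>"""


def describe(xml_text):
    """Return a dict of the labeled data elements found in xml_text."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("title"),
        "date": root.findtext("date"),
        "pages": [page.get("number") for page in root.findall("page")],
    }


if __name__ == "__main__":
    print(describe(SAMPLE))
```

Each pair of tags labels one piece of information, so a program can pick out the title or the date without guessing where it sits in the text. Plain .txt files carry no such labels, which is why marking up comes after OCR'ing and proofing.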

Mouse Functions and Descriptions

Mouse Function | Description
click | press the left mouse button once (same as "single click")
double click | press the left mouse button twice in quick succession
right click | press the right mouse button once
single click | press the left mouse button once (same as "click")