Introduction to EDI, Groupware, DBMS, and TPS
End-user Computing - An Historical Perspective
Large-scale use of computers in business dates back to the 1960's and 1970's. At that time, unlike today, the standard computing environment was one that involved a central, multi-user computer system. The computer was accessed by typing program source code and data onto 80-column punch cards, using keypunch machines the size of a desk that were either shared among multiple users or were the work location for clerical "data entry" staff. After the program and data were keyed in, the cards were fed into a card-reader for transmission to the central system, where the jobs were executed in turn and the results sent back to line printers or card-punch output devices. Thirty minutes was a typical turn-around time for a job that would require 2 seconds of CPU time.
In the early 1970's, communication with the central system shifted over to terminals that were capable of printing up to 80 characters per row on continuous form paper, and which communicated with the central system over dedicated wiring or acoustic-coupled modems that could handle speeds of 300 bits/sec. By the late 1970's, video display terminals with 24 rows of 80 characters had become common, and connection speeds for dedicated wiring were up to 9600 bits/sec, with the fastest modems being direct-wire connected and operating at 1,200 bits/sec.
Electronic Data Interchange
Potential Problems
How do you know that the company whose order your computer just accepted really wants that stuff and will pay for it promptly when you deliver it?
How do you prevent embezzlement, fraud, etc.? With conventional systems there is an "audit trail" of paperwork. Can an electronic audit trail be assured? This problem is an issue for all on-line systems, but even more so for EDI.
PC-based EDI
PC-based EDI is typically the minimal system that will permit a company to do business with trading partners that have decided to do business only by EDI. Most of the advantages of EDI will not occur with PC-based systems, but for small companies the cost of full-function EDI may well exceed the benefits, so PC-based systems will continue to be sensible choices for many situations.
Data Warehouse
A standard legacy DBMS environment provides for:
- The efficient operation of day-to-day business.
- Meeting the system performance requirements for rapid updates to keep up with the flow of business (for example, TRIPS).
- Tightly combining multiple functions, so that order entry, manufacturing materials orders, payroll, production scheduling, and billing, for example, are all driven off of the same unified system.
- Maintaining a robust environment that ensures data integrity.
- Providing standard reports to users, staff, and management (for example, DARS degree audits).
Where such environments are often less successful is in creating reports that the original software design did not anticipate being of interest (for example, generating a list of all faculty who teach at least one course in RTEC 203 during the year, so that they can be asked what improvements to the room would be most beneficial).
A data warehouse (also known as a "data mart" or "information warehouse") DBMS environment:
- Is optimized for the easy creation of ad hoc reports: reports that may be requested only one time, or that may be modified somewhat from time to time. This involves two features:
  - A relational design.
  - A graphical-user-interface report generator running on a Windows or Macintosh desktop system, communicating through a high-speed network to the warehouse system.
- Runs on separate hardware from the production legacy system, so that any inefficiencies in hardware or software operation will not impact timely responses from the legacy system. Typically the legacy system is running on an IBM mainframe and the data warehouse is running on RISC hardware and a Unix operating system.
- Can be designed initially with a cheaper hardware system, because if one of the reports comes back in fifteen minutes instead of five, it is no big deal. If the system becomes so popular that performance becomes an issue, it will be evident to all managers involved that the time saved for them and their staff is worth the investment to upgrade the hardware.
- Typically contains redundant data (whereas the opposite is true in OLTP systems). For example, a customer's name will be repeated in many tables to eliminate "joins" (a particularly slow process in relational databases) so as to speed processing and make the applications more "user friendly."
- Is normally "read only."
- Will typically be loaded at any given time with "last night's data" - it is intended to be used for reports of trends, averages over the last month, and such. It is not intended to be used for minute-by-minute management of operations. The loading of updates from the legacy system to the warehouse each evening may impose a significant burden on the legacy system, and so requires careful design.
- Typically contains both lightly summarized and highly summarized data (for example, sales by region, salesman, or quarter). Sales applications are the most common and oldest "warehouse" applications.
- Is likely to be found in industry or commerce. The oldest warehouse we've found on the web for a university dates from late 1993.
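The redundant-data and "last night's data" points above can be illustrated with a small sketch. It uses Python's built-in sqlite3 module purely for convenience; the table names, column names, and sample rows are all hypothetical, and a real warehouse would run on separate hardware from the production system rather than in the same database as shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Production (legacy) tables: normalized, customer name stored once.
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                         region TEXT, amount REAL);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Baker');
    INSERT INTO orders VALUES (10, 1, 'East', 100.0), (11, 1, 'West', 50.0),
                              (12, 2, 'East', 25.0);

    -- Warehouse table: the name is repeated on every row (redundant data).
    CREATE TABLE wh_sales (order_id INTEGER, customer_name TEXT,
                           region TEXT, amount REAL);
""")


def nightly_load(conn):
    """Rebuild the warehouse table from production; the join is paid once here."""
    conn.execute("DELETE FROM wh_sales")
    conn.execute("""
        INSERT INTO wh_sales
        SELECT o.order_id, c.name, o.region, o.amount
        FROM orders AS o JOIN customers AS c ON c.customer_id = o.customer_id
    """)
    conn.commit()


def sales_by_customer(conn):
    """Ad hoc report that reads only the warehouse table, with no join."""
    rows = conn.execute("""
        SELECT customer_name, SUM(amount) FROM wh_sales
        GROUP BY customer_name ORDER BY customer_name
    """)
    return rows.fetchall()


nightly_load(conn)
```

After the load, a report such as sales_by_customer(conn) never touches the customers table, which is the trade the warehouse makes: extra storage and a nightly copy in exchange for simpler, join-free queries.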
On-line or Real-time vs. Batch Processing
The key issue is human efficiency vs. computer system efficiency: batch processing permits each day's work to be accomplished with a cheaper machine. On-line processing reduces to a minimum the time people spend "just waiting around," but only if the system hardware and software have so much extra capacity that they can meet the peak load, which means that there are many hours of the day when the system is "loafing."
General purpose operating systems, such as Digital's VMS, provide for a robust mix of real-time and batch processing on the same system. With priority-based scheduling of the CPU (common in multi-user operating systems) the interactive users will get prompt attention whenever they need it, but the otherwise idle time for the CPU will instead be assigned to completion of the less urgent jobs submitted for batch processing.
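As a rough illustration of priority-based scheduling, the following sketch simulates a single CPU that runs one job per clock tick and always picks the highest-priority job that is ready. It is a toy model, not a description of how VMS or any other operating system is implemented; the job tuple layout, the priority numbers, and the function name are assumptions made for the example.

```python
import heapq

INTERACTIVE, BATCH = 0, 1  # assumed convention: lower number = higher priority


def run_schedule(jobs):
    """Simulate the CPU tick by tick.

    jobs: list of (arrival_tick, priority, name, ticks_needed) tuples.
    Returns a list naming the job that ran (or "idle") at each tick.
    """
    pending = sorted(jobs)  # ordered by arrival tick
    ready = []              # heap of (priority, arrival, name, remaining)
    timeline = []
    tick = 0
    while pending or ready:
        # Admit every job that has arrived by this tick.
        while pending and pending[0][0] <= tick:
            arrival, priority, name, need = pending.pop(0)
            heapq.heappush(ready, (priority, arrival, name, need))
        if ready:
            # Run the highest-priority ready job for one tick.
            priority, arrival, name, remaining = heapq.heappop(ready)
            timeline.append(name)
            if remaining > 1:
                heapq.heappush(ready, (priority, arrival, name, remaining - 1))
        else:
            timeline.append("idle")
        tick += 1
    return timeline
```

For example, run_schedule([(0, BATCH, "payroll", 3), (1, INTERACTIVE, "query", 1)]) returns ["payroll", "query", "payroll", "payroll"]: the batch job uses the otherwise idle CPU, but the interactive request is served as soon as it arrives.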
Transaction Processing
On-line DBMS
- The distinction between DBMS and TPS arose early in the mainframe era, when databases usually operated in batch mode. A modern, on-line DBMS is by its nature also a TPS, although it may be optimized for reading (query transactions) rather than writing (update transactions), or may for other reasons have limited performance capabilities when used in a transaction processing mode.
EDI and TPS
- Electronic Data Interchange provides one of the "source data capture" methods, in addition to Point-of-Sale systems.
Bar Codes
- Bar Codes provide another machine-based data entry method. Modern bar codes include far more information than the simple grocery-store UPC symbols. Ohio University's Russ College of Engineering supports one of the world's leading programs in the use of bar codes for industrial engineering.
Sanity Checks
- Any time data first enter a system, cautious programmers will expend significant effort to ensure that the data are correct, thereby avoiding the consequences of bad data and the need for later corrections. The simplest of the entry-time data confirmation routines goes by the name of "sanity checks": Do the data provided have believable characteristics for the field they were entered into? For example, does a field for a person's first name include only letters? Does a field for a telephone number consist only of digits? When sanity checks are imposed in the software, design decisions need to be made as to whether (and if so, how) they can be overridden to accommodate exceptional cases.
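A minimal sketch of such entry-time checks follows, using the two example questions from the text. The field names, the regular expressions, and the override flag are illustrative assumptions; a production system would need rules chosen for its own data, and a policy for who may override a check.

```python
import re

# Hypothetical field rules, taken from the two examples in the text.
FIELD_RULES = {
    "first_name": re.compile(r"[A-Za-z]+"),  # letters only
    "telephone": re.compile(r"[0-9]+"),      # digits only
}


def sanity_check(field, value, override=False):
    """Return (accepted, message) for a value entered into a field."""
    rule = FIELD_RULES.get(field)
    if rule is None or rule.fullmatch(value):
        # Fields without a rule are accepted unchecked in this sketch.
        return True, "ok"
    if override:
        # An operator has explicitly confirmed an exceptional value.
        return True, "accepted by override"
    return False, f"{field!r} value {value!r} failed sanity check"
```

A name such as "O'Brien" fails the letters-only rule, which is exactly the kind of exceptional case the override is meant for: sanity_check("first_name", "O'Brien") is rejected, while sanity_check("first_name", "O'Brien", override=True) is accepted and flagged as an override.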
Two-Phase Commit
- TPS often involves updates to multiple locations: for example, the customer invoice history, the production schedule, and the shipping schedule. Significant effort needs to be expended to ensure that if a system failure (or operator error) occurs while the files are being updated, they will either be left in a known consistent state, or information will be on hand to restore them to a known consistent state (either "backing out" or completing the partial transaction). The so-called "Two-Phase Commit" is one of the standard strategies used to achieve this robustness.
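The general shape of a two-phase commit can be sketched as follows. This is a simplified in-memory model under stated assumptions: the Participant class, its method names, and the fail_on_prepare flag are hypothetical, and the sketch omits the durable logging and coordinator-failure recovery that a real implementation depends on.

```python
class Participant:
    """One location to be updated (in-memory stand-in for a file or database)."""

    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare  # simulates a failure in phase one
        self.committed = []   # updates made permanent
        self.pending = None   # update staged during phase one

    def prepare(self, update):
        # Phase one: stage the update and vote on whether it can be committed.
        if self.fail_on_prepare:
            return False
        self.pending = update
        return True

    def commit(self):
        # Phase two (all voted yes): make the staged update permanent.
        self.committed.append(self.pending)
        self.pending = None

    def rollback(self):
        # Phase two (someone voted no): back out the staged update.
        self.pending = None


def two_phase_commit(participants, update):
    """Apply update to all participants or to none; return True if committed."""
    prepared = []
    for p in participants:
        if p.prepare(update):
            prepared.append(p)
        else:
            # Any "no" vote aborts the transaction everywhere.
            for q in prepared:
                q.rollback()
            return False
    for p in participants:
        p.commit()
    return True
```

The point of the two phases is that nothing becomes permanent until every location has confirmed it is able to proceed, so a failure during the first phase leaves all locations in their previous consistent state.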
Dick Piccard revised this file (http://oak.cats.ohiou.edu/~piccard/mis300/edintro.html) on October 12, 1998.
Thanks to Judy Collins, co-leader of Computer Services' data warehousing project, for reading and commenting on the first draft of the data warehousing discussion.
Please E-Mail comments or suggestions to "piccard@ohio.edu".