In this post, we study database applications, database management approaches, support systems, their activities, functions, advantages and disadvantages. We also learn about different kinds of useful file organization and data structures or models.
- File or table is a group of related records
- File are classified according to (1) application, (2) type of data they contain or (3) permanence
- Application: payroll file, inventory file etc.
- Data type: document file, graphical image file etc.
- Permanence: master file (monthly, yearly), and transaction file (daily, weekly)
A database is an integrated collection of logically related data elements. It’s a single, large repository of data which can be used simultaneously by all departments and users of an organisation. It consolidates file previously stored in separate files into a common pool of data elements. Data stored in a database are independent of (1) application programs using them and (2) type of storage devices on which they are stored.
Management of data as a resource
Database management approach (DBMA)
Development of databases and Database management software is the foundation of managing organisational data. DBMA consolidates data records and objects into databases that can be accessed by many different application programs. In DBMA, an important software package called Database Management System (DBMS) serves as a software interface between users and database. DBMS helps users easily access the records in a database. Database Management Involves use of DBMS to control how databases;
- Are created
- Interrogated and
- Maintained to provide information needed by users and organisations
Example, customers records in banking are needed for several applications such as check processing, Automated Teller systems, bank credit cards, savings accounts, and installment loan accounting. These data can be consolidated into one database.
DBMA has three activities
Updating and maintaining common databases to reflect new business transactions and other events requiring changes to an organisation’s records (operational applications). Providing information needed for each end user’s application programs that share the data in common databases.
Providing an inquiry/response and reporting capability through DBMS software so that end users can use web browsers and the internet or corporate intranets to easily interrogate databases, generate reports and receive quick responses to their ad hoc requests for information (analytical applications).
Functions of DBMS
Database management packages like MS Access or Lotus, allow end-users to easily develop the database they need. In large organisations database development is vested in hands of Database Administrators and specialists. This improves the security and integrity of the organisational databases.
Database interrogation ability is the major benefit of database management systems. End users can use a DBMS by asking from a database using a query language or a report generator. They can receive immediate response in the form of video displays or printed reports.
Databases of organisations need to be updated continuously to reflect new business transactions and other events. This database maintenance process is accomplished by transaction processing programs and other end user application packages with support of DBMS.
End users, system analysts and other application developers can use internal 4GL programming languages and inbuilt software development tools provided by many DBMS packages to develop custom application programs. Example, you can use DBMS to easily develop data entry screens, forms, reports, or web pages of a business application.
Commercial DBMS include;
- MS-SQL server
- IBM DB2
Advantages of DBMS software
Reduced data redundancy
Data redundancy means that the same data files appear in many different files and often in different formats. In a file-based system, separate files tend to repeat the same data over and over again. In database approach same information is available to different users.
Data is accurate, consistent and up to date. In DBMS reduced data redundancy increases chances of data integrity.
More programs independence
With files-based management file systems, different files were often written by different programmers using different file formats, hence programs dependence. DBMS programs and files formats are the same, thus one or several programmers can spend less time in maintaining files.
Increased user productivity
DBMS are fairly easy to use i.e. users can get their requests without having to resort to technical manipulations. Users don’t need wait for professionals to provide what they need.
Although various departments may share data in common, access to specific information can be limited to selected users. E.g. through the use of passwords, a student financial, medical and grade information in a university database is made available to legitimate individuals.
Disadvantages of DBMS software
Installing and maintaining a database is expensive particularly in large organisations and user training is costly.
Possibility of unauthorized user to get past the safeguards. If successful, they access all files not few. Natural disasters e.g. earthquakes, fire etc. Others include theft, hardware or software problems. It is advised to make backup copies of files and store them elsewhere.
DB may hold information they should not and be used for unintended purpose, perhaps intruding people’s privacy. E.g. when medical data are used to evaluate the employee for a job promotion. Manipulation of sound, photos or video.
File organization and Data structures
A file-based system
Is a collection of application programs that perform services for the end users such as accounting packages, personnel management, report generators etc. Each application program defines and manages its own data. Departments own their respective files. Predecessor of the database.
A sequential file: records have been arranged into ascending or descending order of a key field. When new records are added the sequence is maintained in respect of magnetic tape by the insertion of records in the correct sequence in a new reel of tape. It is necessary to generate a completely new tape when applying file amendments because it is not possible to physically insert or delete them on the same tape.
On a disk file when records cannot be stored in the correct location on a track, the logical sequence of records, not the physical one, can be maintained using pointers and an overflow area.
Advantages of a sequential file
- Very efficient when a majority of the records are updated
- Files easy to design
- Can be stored on an inexpensive storage media such as magnetic tape
- Old master file and transaction files are automatic back-up files since updating produces a new file
Disadvantages of a sequential file
- If only a few records need updating, entire file must still be processed and a new file written
- Master files must be sorted into same key field sequence
- Transaction files must be sorted into same key field sequence as master files they are updating
- Files are only current immediately after an update
- Records cannot be accessed directly
- Sequential organisation
A Direct file: each record, as with sequential file, contains a key field. However, records need not appear on the file in a key field sequence. Any record stored on a direct file can be accessed if its location or its address is known. Previous records need not be accessed.Each record on a file is assigned to a storage address.
Not used for processing an entire file, but where a limited number of records need to be accessed quickly and directly as the case of online inquiry. Used by airlines to obtain up-to-date flight information quickly and directly. Generally maintained on magnetic disks, optical disks (when the file is static), or on mass storage systems.
Advantage of direct files
- Terminals can be used to update files online as they occur, keeping the file current
- No need for separate transaction file as transactions are processed as they occur
- Any record can be accessed in milliseconds
- Not necessary to read entire file to update it; only master records to be updated need be accessed
- Several files can be updated at the same time as transactions occur
- Files do not have to be sorted into key field sequence
Disadvantages of direct files
- Back-ups must be created because updating destroys old records
- Appropriate algorithm/address often difficult to devise
- Accidental destruction a problem
- Security difficult to control because many users have access
- More storage space may be required than for a sequential file
- Does not facilitate sequential processing
- Hardware and software requirements greater than for sequential files
An indexed file: records are stored on the file in sequential order according to key field. Offers the simplicity of sequential file processing as well as a capability for direct access. As the records are recorded, the system establishes one or more indexes to associate the key field value(s) with the storage location of the record on the file.
The system then uses indexes to access desired records directly. Indexes function in the same way as book indexes, that is in order to access a record directly indexes are searched for the key value(s) of the desired record, which provide the record to be accessed from the file. Indexes files are maintained on magnetic disk, optical disk (static files), or on a mass storage system.
Advantages of indexed files
- Well suited for applications requiring both sequential and direct access
- Access to specific records faster than with sequential files
- Use of indexes eliminates need for an algorithm as is required with the direct files
Disadvantage of indexed files
- Storage requirements greater than with sequential files since space is required for the indexes
- Random access is slower than with a direct file
- Require more complex hardware and software than needed with sequential files
Differences between direct & indexed files
Records may be accessed randomly on a directly organised file, whereas records may be accessed sequentially or randomly from an indexed organised file. Direct-organised files utilise algorithm to determine the location of a record, whereas indexed organised files utilise indexes to locate a record to be randomly accessed.
An algorithm is a mathematical method devised to convert the key field value into an address.
Examples of files
Payroll master file; typical contents are:
- Clock number
- Tax code
- National Insurance number
- National Insurance category
- Taxable gross to date
- Tax to date
- Taxable gross previous employment
- Holiday credit to date
- Sickness holiday credit weeks to date
- Employee’s national insurance to date
- Total National Insurance contributions to date
- Fixed deductions; charities, overalls, savings, loans
- Weakly salary amount (as appropriate)
- Hourly rate (as appropriate)
- Holiday credit flat rate
- Employee bank details
Customer file: typical contents include:
- Account number
- Name and address
- Credit limit
- Account balance
- Category/discount rate
- Age analysis of account balance
Customer sales history file: may be structured as:
- Account number
- Representative code
- Customer type
- Area code
- Turnover this period – analysed by product
- Turnover to date since week 1- analysed by product
- Number of order to date since week 1
- Discount received to date
- Cost of sales (if appropriate)
- Profitability (if appropriate)
Product file: could be structured as follows:
- Product code
- Product description
- Pack size
- VAT code
- Recommended retail selling
- Location code
- Quantity in stock
- Reorder level
- Reorder quantity
- Ordered but outstanding
Stock file: raw materials or components:
- Stock number
- Location in stores
- Actual stock
- Control parameters; reorder level, maximum stock level, minimum stock level
- Used on
- Quantity: free stock
- Quantity allocated stock
Orders file: may be structured as follows:
- Order number
- Account number
- Product code
- Customer name (online)
- Representative code (online)
- Customer type (online)
- Further product may be added from the product file
Plant register file – Contents of this file provide information for asset accounting requirements, giving details relating to specific items of plant equipment.
- Type of asset
- Plant code
- Rating (horsepower or kilowatt rating)
- Floor area
- Date of installation
- Date of purchase
- Original cost
- Installation cost
- Depreciation class
- Annual amount of depreciation
- Cumulative depreciation
- Written down book value
- Maintenance cost
- Cost of additions
- Disposal value
DATA STRUCTURES OR MODELS
Structure of data in a database are either (1) physically or (2) logically arranged. Physical data structure refers to the physical arrangement of data on a secondary storage device. Physical structure is the concern of the specialist who designs Database Management systems (DBMSs).
Logical data structure concerns how the data ‘seems’ to be arranged and the meanings of the data elements in relation to one another. The concern of users and programmers. Database management system packages are designed to use a specific data structure to provide end users with quick, easy access to information stored in the database.
Types of logical data structures
Logical Data structures or models are generally defined in terms of schema. Schema is a conceptual view of the logical relationships between the data elements in the database. Includes: names of major elements, their attributes, and logical relationships between them.
Hierarchical structures, also called tree structure:
All records are dependent and arranged in multilevel structures consisting of one root record (parent) and any number of subordinate levels i.e. one to many relationship. Any data element can be accessed by moving progressively downward from a root and along the branches of the tree until the desired record is located. Application programs process hierarchical databases one record at a time.
Limitations include that it was a natural model for the databases used for structured, routine types of transaction processing. Records with multiple relations or those that lack hierarchical relations cannot be represented by this model. E.g. employees from more than one department can work on more than one project.
Similar to the hierarchical structure except that in this structure a node may have more than one parent. Can represent more complex logical relationships and it is still used by some mainframe DBMS package. It allows many to many relationships among records i.e. the model can access data by following one of several paths. E.g. departmental records.
More flexible than hierarchical model in support of databases for many types of business operations. One limitation is that the model cannot easily handle ad hoc requests formation because relationships must be specified in advance just like the hierarchical model.
Most popular of the three database structures. Used by most microcomputer DBMS packages as well as by most mid-range and mainframe systems. Organises data in terms of dimensional tables. All data elements within the database are viewed as being stored in the form of a simple table. Database management systems such as Access create new tables of data relationships using parts of the data from several tables.
Allow users to easily receive information in response to ad hoc requests. Relational databases are easier for programmers to work with and easier to maintain than hierarchical and network models. One limitation is that relational database management systems cannot process large amounts of business transactions as quickly and efficiently like other models.