Commonwealth Computer Navigator's Certificate/Milestone 4/Re-structure of template/Example Tutorial What is a Database?
Database Concepts |
What is a Database | How a Database is Organized | Data Types | Normalization | Self Assessment | Summary & FAQs |
Contents
Tutorial 5.1.1.1 - What is a Database
Upon completion of this tutorial the learner will be able to:
|
Database described
A database is a collection of related information which is organized into a series of rows (called records) and columns (called fields) that are populated with data. The purpose of organizing data into a database is to answer queries, therefore turning the data into information. The computer program used to manage and query a database is known as a database management system (DBMS). A well constructed database should make it easy to manage information quickly and effectively, and to ensure the data stays accurate over a period of time. The data is the raw information that is entered and makes up the records in the database while the resulting information is the processed data.
It should be noted that database records have existed for ages, any form of recording data in ledgers, notebooks and other collections of data could also be considered databases.
Subject domains
Databases are used in many different subjects domains, a few worth considering would be;
- Personal Banking - whenever you use a bank machine or ATM all the data regarding your transaction is saved to a database. This makes it more efficient for creating bank statements, reports and for auditing bank accounts.
- Health Care - It is frequent that all the data from your visits to the doctor or clinic are recorded in a database. This makes it easier for doctors and health care workers to see your health history and make better decisions regarding treatment and advice.
- WikiEducator - behind this educational wiki is a database. All the data about user accounts, wiki pages and their change history, etc. is all stored in a database. This allows for considerable flexibility when it comes to organizing and reporting on the activity within a wiki.
|
Databases vs. OpenOffice applications
To better understand what the OpenOffice database is (and databases in general), it is good to know what a database isn't. In this sub-section we compare and contrast a database with the other OpenOffice applications.
Database | vs | Application | Compare and Contrast |
---|---|---|---|
Database | vs | Spreadsheet | A spreadsheet provides the ability to store and organize numeric data and allow calculations to be performed upon the data. Like a database a spreadsheet stores data in columns and rows. How these two differ is the data in the spreadsheet is in one or two related sheets of columns and rows. With a database these columns and rows can be related to many other sheets or tables (in database terminology). And the way you join the sheets or tables together allows for a rich query and reporting environment for creating information. A spreadsheet is focused on organizing and graphically representing numeric data. A database is focused on organizing both textual and numeric data and providing abilities to query and report upon this data. |
Database | vs | Word Processor | A word processor provides the ability to store text and graphics used to document, communicate and correspond with individuals, teams and groups. A word processor is an excellent tool for writing and publishing data. In general, the data within a word processed document is unstructured. As unstructured we mean that the data is more free flow and sometimes accompanied with graphics and pictures. Unstructured data is not categorized at the field of 'word' level. Where a database contains structured data, for most every word is categorized by a attribute or field name. |
Database | vs | Presentation | Presentation software provides the ability to present information and graphics. Presentation software is an excellent tool for giving presentations, facilitating discussions and organizing the visual content for group activities. Sometimes the information found in a presentation has been derived from database queries and reports. A database is not meant as a presentation tool, though it does contain data that can be presented. A presentation tool is not meant to store data, it is meant to present already existing information. |
Introduction to database normalization
Database normalization is a method used in designing the structure of your database. In it simplest form database normalization is meant to increase the accuracy and consistency of the data stored in the database. A simple example of normalization is to think about a family living in one home (at one address), see the table below of how this data may be organized. As you can see the table contains duplicate address data. If the family moves you would need to change the address in more than one record. Consider the following scenario;
first_name | last_name | birth_date | street_address | city | province | country | postal_code |
---|---|---|---|---|---|---|---|
Peter | Rawsthorne | 28/12/1963 | 1234 Millar Rd. | Vancouver | BC | CANADA | V0N1B0 |
David | Rawsthorne | 26/10/1961 | 1234 Millar Rd. | Vancouver | BC | CANADA | V0N1B0 |
Lisa | Rawsthorne | 11/03/1965 | 1234 Millar Rd. | Vancouver | BC | CANADA | V0N1B0 |
Malcolm | Rawsthorne | 03/05/1987 | 1234 Millar Rd. | Vancouver | BC | CANADA | V0N1B0 |
Hannah | Rawsthorne | 23/09/1989 | 1234 Millar Rd. | Vancouver | BC | CANADA | V0N1B0 |
Consider the above Families table as not normalized. It contains duplicate data, and if the family moved the address of the five records would have to be updated. And if one record was updated incorrectly, the data would be inaccurate and inconsistent. Now consider the two tables below;
first_name | last_name | birth_date | address_id |
---|---|---|---|
Peter | Rawsthorne | 28/12/1963 | 0062 |
David | Rawsthorne | 26/10/1961 | 0062 |
Lisa | Rawsthorne | 11/03/1965 | 0062 |
Malcolm | Rawsthorne | 03/05/1987 | 0062 |
Hannah | Rawsthorne | 23/09/1989 | 0062 |
address_id | street_address | city | province | country | postal_code |
---|---|---|---|---|---|
0062 | 1234 Millar Rd. | Vancouver | BC | CANADA | V0N1B0 |
When you normalize a database design you increase the accuracy and consistency of the data. As you can see in the above two tables the address was moved out of the Families table and created in a new table called Address. These two tables (People and Address) are linked via the address_id. If you change the address at address_id = 0062, the address would be changed for all the family members. These two tables are known as entities and the fields are the attributes. The relationship between the address and people entities is one-to-many. One address for many people. If you wanted to then see the data as it was in the Families table in this example you would join the two tables via the address_id and build a query upon this join. This is the fundamental concept of normalization. It gets much more complicated and is based upon a foundation of relational mathematics. We will cover this further as we continue through this chapter.
Wikipedia has a good number of normalization based entries, here are a few of the most relevant;
Don't get scared away by such heavy sounding terms like normalization and calculus. We will explore these concepts slowly and gently. An understanding of normalization and relational theory will pay huge dividends when you start to build your databases. |
Entity and Attribute identification
|
View our sample solution for the above activity
|