CCNC/CCNC Module 5/The database application/Database Concepts

From WikiEducator
Jump to: navigation, search
Tutorial.png Database Concepts 

What is a Database | How a Database is Organized | Data Types | Normalization | Self Assessment | Summary & FAQs


What is a Database

Icon objectives.jpg

Outcomes

Upon completion of this tutorial the learner will be able to:
  • Describe what is a database
  • Explain how databases are used within different subject domains
  • Compare and contrast the database from other OpenOffice applications
  • Discuss database normalization and identify different levels of normalization
  • Recognize entities and their attributes from a subject domain


Database described

Figure 1. Rows and Columns

A database is a collection of related information which is organized into a series of rows (called records) and columns (called fields) that are populated with data. The purpose of organizing data into a database is to answer queries, therefore turning the data into information. The computer program used to manage and query a database is known as a database management system (DBMS). A well constructed database should make it easy to manage information quickly and effectively, and to ensure the data stays accurate over a period of time. The data is the raw information that is entered and makes up the records in the database while the resulting information is the processed data.

It should be noted that database records have existed for ages, any form of recording data
in ledgers, notebooks and other collections of data could also be considered databases.

Subject domains

Databases are used in many different subjects domains, a few worth considering would be;

  1. Personal Banking - whenever you use a bank machine or ATM all the data regarding your transaction is saved to a database. This makes it more efficient for creating bank statements, reports and for auditing bank accounts.
  2. Health Care - It is frequent that all the data from your visits to the doctor or clinic are recorded in a database. This makes it easier for doctors and health care workers to see your health history and make better decisions regarding treatment and advice.
  3. WikiEducator - behind this educational wiki is a database. All the data about user accounts, wiki pages and their change history, etc. is all stored in a database. This allows for considerable flexibility when it comes to organizing and reporting on the activity within a wiki.



Icon reflection.gif

Reflection

  • Identify another subject domain where a database could be used. Ask yourself if having a database available could improve the data management for the subject domain. Also ask yourself what other important factors would determine if using a database would be appropriate for the identified subject domain.



Databases vs. OpenOffice applications

To better understand what the OpenOffice database is (and databases in general), it is good to know what a database isn't. In this sub-section we compare and contrast a database with the other OpenOffice applications.

Database vs Application Compare and Contrast
Database vs Spreadsheet A spreadsheet provides the ability to store and organize numeric data and allow calculations to be performed upon the data. Like a database a spreadsheet stores data in columns and rows. How these two differ is the data in the spreadsheet is in one or two related sheets of columns and rows. With a database these columns and rows can be related to many other sheets or tables (in database terminology). And the way you join the sheets or tables together allows for a rich query and reporting environment for creating information. A spreadsheet is focused on organizing and graphically representing numeric data. A database is focused on organizing both textual and numeric data and providing abilities to query and report upon this data.
Database vs Word Processor A word processor provides the ability to store text and graphics used to document, communicate and correspond with individuals, teams and groups. A word processor is an excellent tool for writing and publishing data. In general, the data within a word processed document is unstructured. As unstructured we mean that the data is more free flow and sometimes accompanied with graphics and pictures. Unstructured data is not categorized at the field of 'word' level. Where a database contains structured data, for most every word is categorized by a attribute or field name.
Database vs Presentation Presentation software provides the ability to present information and graphics. Presentation software is an excellent tool for giving presentations, facilitating discussions and organizing the visual content for group activities. Sometimes the information found in a presentation has been derived from database queries and reports. A database is not meant as a presentation tool, though it does contain data that can be presented. A presentation tool is not meant to store data, it is meant to present already existing information.

Introduction to database normalization

Database normalization is a method used in designing the structure of your database. In it simplest form database normalization is meant to increase the accuracy and consistency of the data stored in the database. A simple example of normalization is to think about a family living in one home (at one address), see the table below of how this data may be organized. As you can see the table contains duplicate address data. If the family moves you would need to change the address in more than one record. Consider the following scenario;

Families
first_name last_name birth_date street_address city province country postal_code
Peter Rawsthorne 28/12/1963 1234 Millar Rd. Vancouver BC CANADA V0N1B0
David Rawsthorne 26/10/1961 1234 Millar Rd. Vancouver BC CANADA V0N1B0
Lisa Rawsthorne 11/03/1965 1234 Millar Rd. Vancouver BC CANADA V0N1B0
Malcolm Rawsthorne 03/05/1987 1234 Millar Rd. Vancouver BC CANADA V0N1B0
Hannah Rawsthorne 23/09/1989 1234 Millar Rd. Vancouver BC CANADA V0N1B0

Consider the above Families table as not normalized. It contains duplicate data, and if the family moved the address of the five records would have to be updated. And if one record was updated incorrectly, the data would be inaccurate and inconsistent. Now consider the two tables below;

People
first_name last_name birth_date address_id
Peter Rawsthorne 28/12/1963 0062
David Rawsthorne 26/10/1961 0062
Lisa Rawsthorne 11/03/1965 0062
Malcolm Rawsthorne 03/05/1987 0062
Hannah Rawsthorne 23/09/1989 0062
Address
address_id street_address city province country postal_code
0062 1234 Millar Rd. Vancouver BC CANADA V0N1B0

When you normalize a database design you increase the accuracy and consistency of the data. As you can see in the above two tables the address was moved out of the Families table and created in a new table called Address. These two tables (People and Address) are linked via the address_id. If you change the address at address_id = 0062, the address would be changed for all the family members. These two tables are known as entities and the fields are the attributes. The relationship between the address and people entities is one-to-many. One address for many people. If you wanted to then see the data as it was in the Families table in this example you would join the two tables via the address_id and build a query upon this join. This is the fundamental concept of normalization. It gets much more complicated and is based upon a foundation of relational mathematics. We will cover this further as we continue through this chapter.


Icon inter.gif

Web Resources

Wikipedia has a good number of normalization based entries, here are a few of the most relevant;

Don't get scared away by such heavy sounding terms like normalization and calculus. We will explore these concepts slowly and gently. An understanding of normalization and relational theory will pay huge dividends when you start to build your databases.



Entity and Attribute identification

Icon activity.jpg
Activity
  1. Write a paragraph describing a student and their relationship with a teacher
  2. Read over your paragraph and identify the entities within this writing
  3. On a piece of paper write the identified entities in boxes
  4. Draw arrows between the boxes, identifying the relationships
  5. On another piece of paper list the attributes for each entity





Feedback.png