| << 12.1.0- Databases- Are they a Thing of the Past? | Chapter12 | 12.1.2- Universal Access for Databases >> |
How Databases Store Data
Databases store information in tables and records so that, in theory at least, the information contained within is easily accessible to whoever needs it. For example, if you work for a large company, then they probably have a database that contains a table of employee data, with lots of employee records – and one of those records will contain information about yourself. My record in the Employees table of the 'Wrox Press Human Resources' database probably looks something like this:
|
Name |
Address 1 |
Address 2 |
City |
State |
Zip |
Phone |
|
Chris Ullman |
29 LaSalle Street |
Suite B01 |
Chicago |
Illinois |
60603 |
392-893-8004 |
There's a field for my name, two fields for the first two lines of my address, a field for the city…and so on.
In database-speak, a record is a collection of data containing information from each field, relating to a specific entry. For example an Address Book database takes the form of a table consisting of the columns (or fields) "Name", "Address", "Phone Number". Then each row of the table is referred to as a record.
Standardization of Data
How does the database ensure that each employee's record is stored in the same way? We don't want a situation where one employee's name is stored across two fields while another's name is stored in a single field; or where half the employees have their entire address packed into a single field.
Databases standardize the way that information is stored. If you are obliged to put your first and last name information into a single column – then everybody else who uses the database has to adhere to the same template as well. So in this case, each employee record contains a single Name field, in which the employee's full name is stored – no matter how many names they might have. Some employees might have an email address while others don't, but if you include an email field then you will have the option of entering an email address for every employee.
Keys
How is the database designed so that we can tell two different records apart? Principally, we could look at the data contained in each one. Ah, but what if there are two employees with the same name? How do we set about differentiating between the Chris Ullman who works on the fifth floor in the budgeting department, and the Chris Ullman who works in the basement writing technical books? You could look at the telephone numbers, which are likely to be different – but that's not guaranteed.
Instead, we use keys. A key is a unique identifier – each record in the database table has a key that is guaranteed to be unique from all the other keys in the key column:
|
Employee Key |
Name |
Address 1 |
Address 2 |
City |
State |
Zip |
Phone |
|
A100 |
Chris Ullman |
29 LaSalle Street |
Suite B01 |
Chicago |
Illinois |
60603 |
392-893-8004 |
|
A101 |
Chris Ullman |
29 LaSalle Street |
Suite 520 |
Chicago |
Illinois |
60603 |
392-893-8006 |
Key values can be numeric or alphanumeric so long as they're unique within the key column. If your database table has a key, then you can use it in other tables to link them back to other information stored in your database. For example, I can use the Employee Key values in the following Books table of the database to refer to the information contained back in the Employees table:
|
Book Key |
Author Key |
Book Title |
|
LB01 |
A100 |
Instant HTML |
|
LB02 |
A100 |
Beginning Active Server Pages 3.0 |
|
LB03 |
A101 |
Budgeting for Buffoons |
|
LB04 |
A100 |
IE5 Dynamic HTML Programmer's Reference |
Thus, the values in the Author Key field simply refer back to the values in the Employees table – thus identifying the author of each book uniquely and without ambiguity. From this table you should be able to deduce that the author of this chapter was also author of the best-selling titles Instant HTML and IE5 Dynamic HTML Programmer's Reference, and that it was the other Chris Ullman who wrote the less well-known Budgeting for Buffoons. You might also be wondering why there is a second key in the Books table – the Book Key field. This key belongs to the Books table, and is a unique identifier for the books in the table. Remember, an author might have written more than one book and we still need to be able to differentiate each book.
The reason we usually choose to store information in different but related tables like this – and not to store it in one great big table – is because it can help us to avoid logging the same information multiple times. The Book table here doesn't need to store all the personal details of the author of each book – instead, it stores a simple key value. It means that my details aren't repeated three times in the Books table – they're just stored once in the Employees table.
It also means that if my telephone number changes, the database administrator won't need to trawl through the Books table looking for the books that I wrote, in order to update that data multiple times. Instead, they can just go straight to my record in the Employees table, where they need to make exactly one update! So, keeping this kind of repeatable information separate means that you can simplify admin tasks dramatically, by ensuring that you only need to change it in one place. In fact, the process of carrying out this simple piece of common sense is the nuts-and-bolts of the normalization process! See how terminology can often obscure the most straightforward of tasks…
| << 12.1.0- Databases- Are they a Thing of the Past? | Chapter12 | 12.1.2- Universal Access for Databases >> |

RSS

