Saturday, July 31, 2010

One-To-Many Programming: From the Cloud to the Ground

A Programmer’s View of One-to-Many

Let’s return to what David Remsen says about one-to-many data relationships: “When deciding upon how to best split data into tables we often ask "What are the logical units of the data?" In other words, what is best represented by a single row of data. It is often convenient to think of physical representations when applicable. For example, what types of information can we associate with a patient that only appears once. A person usually has one name and social security number so these can be safely placed in a table. We might add date of birth. [On the other hand] a person can have many tests. They might only have one but they might also have fifty. Thus one person might have many tests and this is represented by a One-To-Many relationship…”

As we stated previously, a one-to-many data relationship exists when one record from one data source is related to several records from another data source; a single country in a Country data source will be linked to many entries within a City data source. This much is clear and well known to developers and users alike.
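As a minimal sketch of the Country and City example, the following Python snippet groups the "many" side under the "one" side. The country codes and city names are hypothetical sample data, not taken from any real data source.

```python
from collections import defaultdict

# Hypothetical sample data: each city row carries the code of its one country.
countries = {"FR": "France", "JP": "Japan"}
cities = [
    ("Paris", "FR"),
    ("Lyon", "FR"),
    ("Tokyo", "JP"),
]

# Group the "many" side (cities) under the "one" side (countries).
cities_by_country = defaultdict(list)
for city_name, country_code in cities:
    cities_by_country[country_code].append(city_name)

for code, name in countries.items():
    print(name, "->", cities_by_country[code])
```

Each country appears once, while its code may appear on any number of city rows.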

With the uniPaaS application platform, an application that uses data in a one-to-many relationship is built from two interrelated tasks or programs. In uniPaaS, a program is not an all-encompassing application but a small, discrete task. Most uniPaaS developers lean towards fine-grained programs because they are easy to reuse.

The logic is very straightforward. The first task, or parent task, deals with the main record, and the second task (sometimes called a child task or subtask) deals with the "many" records that are related to the main record. This relationship is often presented to the end-user in two different display formats: the parent program shows the single record in screen mode and the child program shows the multiple records in a table. This is not always the case, since you may decide that the parent program will also be a table.

uniPaaS uses a Subform control to display the child task's form within the parent task's form and refreshes the subtask each time the common variable is changed in the parent task.
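The refresh behavior can be pictured with a small Python sketch. This is not uniPaaS code or its API: `ParentTask`, `ChildTask`, and `refresh` are hypothetical names chosen only to illustrate the idea that the child re-queries its rows whenever the common value changes in the parent.

```python
class ChildTask:
    """Hypothetical stand-in for a subtask that lists the 'many' records."""

    def __init__(self, order_lines):
        self.order_lines = order_lines  # all lines, for every order
        self.visible_rows = []

    def refresh(self, order_number):
        # Re-query the lines that belong to the parent's current record.
        self.visible_rows = [
            line for line in self.order_lines
            if line["order_number"] == order_number
        ]


class ParentTask:
    """Hypothetical stand-in for the parent task showing one record at a time."""

    def __init__(self, child):
        self.child = child
        self._order_number = None

    @property
    def order_number(self):
        return self._order_number

    @order_number.setter
    def order_number(self, value):
        # Refresh the child only when the common value actually changes.
        if value != self._order_number:
            self._order_number = value
            self.child.refresh(value)


lines = [
    {"order_number": 1, "order_line": 1, "product": "Widget"},
    {"order_number": 1, "order_line": 2, "product": "Gadget"},
    {"order_number": 2, "order_line": 1, "product": "Sprocket"},
]
child = ChildTask(lines)
parent = ParentTask(child)

parent.order_number = 1
first_view = [row["product"] for row in child.visible_rows]
parent.order_number = 2
second_view = [row["product"] for row in child.visible_rows]
print(first_view, second_view)
```

Moving the parent from one order to another is enough to change what the child shows; the child never needs to be told which rows to drop.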

A one-to-many data relationship is established between a primary data source and a secondary data source in such a way that each record in the primary data source has several related records in the secondary data source.

In a one-to-many relationship, there is a primary data source (the one) and a secondary data source (the many). The primary data source must contain a unique identifier or index to which the secondary data source can associate. The secondary data source will contain both the unique identifier of the primary data source (the reference index) and its own unique identifier (the secondary index).
This can best be illustrated through an example such as order numbers and order lines in an order entry program.

Primary Data Source

In the following example, the Orders data source is the primary data source.

An order record usually contains basic details regarding the order, for example:
  • Order Number
  • Order Date
  • Customer Code
  • Total Amount
  • Method of Payment

The Order Number is the unique index of the data source, since each order record has its own order number.

Secondary Data Source

The Order Lines data source is the secondary data source. It contains several data items:

  • Order Number
  • Order Line
  • Product
  • Quantity
  • Price

The Order Number and the Order Line together form the unique index of the data source. The same order number can repeat several times, each time with a different order line number. The link between the Orders data source and the Order Lines data source is the Order Number.

    For each record in the Orders data source there are several records in the Order Lines data source. This is the exact definition of a one-to-many data relationship.
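For readers who think in SQL, the two data sources might be sketched as follows, using Python's built-in `sqlite3` module. The table names, column names, column types, and sample values are assumptions made for illustration; uniPaaS defines its data sources through its own repository rather than through code like this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces references only when enabled
conn.executescript(
    """
    CREATE TABLE orders (
        order_number   INTEGER PRIMARY KEY,   -- unique index of the primary source
        order_date     TEXT,
        customer_code  TEXT,
        total_amount   REAL,
        payment_method TEXT
    );
    CREATE TABLE order_lines (
        order_number INTEGER NOT NULL REFERENCES orders(order_number),
        order_line   INTEGER NOT NULL,
        product      TEXT,
        quantity     INTEGER,
        price        REAL,
        PRIMARY KEY (order_number, order_line)  -- composite unique index
    );
    """
)

# Hypothetical sample order with two lines.
conn.execute(
    "INSERT INTO orders VALUES (1001, '2010-07-31', 'C042', 59.5, 'Credit card')"
)
conn.executemany(
    "INSERT INTO order_lines VALUES (?, ?, ?, ?, ?)",
    [(1001, 1, "Widget", 2, 10.0), (1001, 2, "Gadget", 1, 39.5)],
)

# A second row with the same (order number, order line) pair is rejected.
try:
    conn.execute("INSERT INTO order_lines VALUES (1001, 1, 'Duplicate', 1, 1.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

line_count = conn.execute(
    "SELECT COUNT(*) FROM order_lines WHERE order_number = ?", (1001,)
).fetchone()[0]
print(duplicate_rejected, line_count)
```

The order number alone identifies an order, while an order line is identified only by the order number and line number taken together.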

Without one-to-many data sources, all of the information would have to be held in a single record. This causes problems: the end-user has to type the same information repeatedly, and the records grow large because of the repeated data. The result is a large data source that takes up more disk space and slows down the program's performance.

    The solution to this problem is to define two data sources according to the following rules:

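  • All of the columns whose values would otherwise be repeated on every line (such as Order Date and Customer Code) should be defined in the primary data source.
  • All of the columns whose values are unique to each line (such as Product and Quantity) should be defined in the secondary data source.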
  • Both data sources should be connected using a common column.
  • The common column is the only column that will be repeated in the secondary data source.

Separating data into two data sources, as we did in the Orders and Order Lines example, has two main advantages: it saves the end-user from having to type the same data several times, and it reduces the size of the records.
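The effect of the split can be seen in a short Python sketch. The flat rows, column names, and values below are hypothetical; the point is only that the order details are stored once per order after the split, and the order number is the one column that still repeats.

```python
# Hypothetical flat rows: the order details are retyped on every line.
flat_rows = [
    {"order_number": 1001, "order_date": "2010-07-31", "customer_code": "C042",
     "order_line": 1, "product": "Widget", "quantity": 2, "price": 10.0},
    {"order_number": 1001, "order_date": "2010-07-31", "customer_code": "C042",
     "order_line": 2, "product": "Gadget", "quantity": 1, "price": 39.5},
    {"order_number": 1002, "order_date": "2010-07-31", "customer_code": "C007",
     "order_line": 1, "product": "Sprocket", "quantity": 5, "price": 3.0},
]

HEADER_COLUMNS = ("order_date", "customer_code")
LINE_COLUMNS = ("order_line", "product", "quantity", "price")

orders = {}       # primary data source, keyed by order number
order_lines = []  # secondary data source

for row in flat_rows:
    key = row["order_number"]
    # Header values are stored once per order, however many lines it has.
    orders.setdefault(key, {col: row[col] for col in HEADER_COLUMNS})
    # Each line keeps only its own columns plus the common column.
    order_lines.append(
        {"order_number": key, **{col: row[col] for col in LINE_COLUMNS}}
    )

print(len(orders), "orders,", len(order_lines), "order lines")
```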

    Step-by-step procedures on the creation of a program that utilizes one-to-many relationships are available in Module 18 of the Getting Started with uniPaaS course. For a description of what we mean by a one-to-many application platform, see our previous entry.
  • One-To-Many Application Platform: Is Cloud Computing In Need of New Infrastructure?

An Architectural View of One-to-Many

    uniPaaS is emerging as the industry’s first one-to-many application platform. We are all familiar with the one-to-many data relationship as the workhorse of relational databases. As David Remsen states: “When deciding upon how to best split data into tables we often ask "What are the logical units of the data?" In other words, what is best represented by a single row of data. It is often convenient to think of physical representations when applicable. For example, what types of information can we associate with a patient that only appears once. A person usually has one name and social security number so these can be safely placed in a table. We might add date of birth. [On the other hand] a person can have many tests. They might only have one but they might also have fifty. Thus one person might have many tests and this is represented by a One-To-Many relationship…”


    So what is a one-to-many application platform? A one-to-many application platform is an application platform designed to optimize the delivery of many applications, each of which may have many implementations. And of course, each implementation may have many users.

    In the past, client-server applications tended to be bound by notions of physical location and ownership. My company owns an application server in Chicago and any of the local area network clients can use it. Using the medical record analogy above, this situation is a bit like building a doctor’s office inside of every one of your company offices just in case someone gets sick. Shouldn’t you send them to a clinic where the needs of many in the community can be served? Why monopolize the skills and equipment of a doctor for only one or two patients per day?

    Even with the introduction of wide area networks, VPNs, the Internet, and the like, we still got hung up with the concept of ownership. My company bought a software license and only our people can use our software on our server, but yes they can sit anywhere on the web when they do it. Now granted, this was very liberating. Physical space dissolved when using software applications. Networking, T3, WiFi and a host of other technologies have conquered the notion of space separating us from use of computer applications.

So along comes the idea of Software-as-a-Service (SaaS) to break down another barrier: ownership. Through a concept known as multi-tenancy, many different implementations of the same software application can exist within the same database. Each tenant’s data is firewalled from the others so that ACME Corp has no idea that ABC Corp is even using the same application. In this model the application is not installed on the client computers and does not run on the customer's own servers. Only a web browser or client runtime engine need be installed; the real application runs remotely on someone else’s physical servers. One-to-many computing is better served when the client side is capable of Rich Internet Applications (RIA).
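One common way to keep tenants apart inside a single database is to tag every row with a tenant identifier and scope every query by it. The Python sketch below assumes that approach; the `contacts` table, the `tenant_id` column, and the `contacts_for` helper are hypothetical, and real SaaS products may instead use separate schemas or separate databases per tenant.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts ("
    " tenant_id TEXT NOT NULL,"
    " contact_id INTEGER NOT NULL,"
    " name TEXT,"
    " PRIMARY KEY (tenant_id, contact_id))"
)
# Two tenants share one table; contact_id 1 exists for both without conflict.
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [("acme", 1, "Alice"), ("acme", 2, "Bob"), ("abc", 1, "Carol")],
)


def contacts_for(tenant_id):
    """Return only the rows that belong to the given tenant."""
    rows = conn.execute(
        "SELECT name FROM contacts WHERE tenant_id = ? ORDER BY contact_id",
        (tenant_id,),
    )
    return [name for (name,) in rows]


print(contacts_for("acme"))
print(contacts_for("abc"))
```

Because every query goes through the tenant filter, neither tenant sees any sign of the other's rows.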

    Gradually and then all of a sudden, SaaS became a popular trend as successes like Salesforce.com bedazzled customers and Wall Street investors. The SaaS story sounded somewhat familiar to people who had been promoting hosted computing services. As the notion of grid computing sort of floundered for want of a customer base, the industry searched for a new hype cycle and found it in the notion of cloud computing. Cloud computing sort of combines the SaaS idea of Software-as-a-true-service with Software-as-a-do-it-yourself-service. Our application on your hosting server fits some people’s notions of what cloud computing is all about.

Think back to the medical records analogy we used at the beginning to explain one-to-many data relationships. SaaS is a bit like having access to a doctor’s office. You can go there and see one doctor or maybe a few, but you are pretty much limited by the skills of the doctor and the equipment in the office. Seeing a doctor is great, unless you need the services of a hospital.

A one-to-many application platform doesn’t just serve one customer and it isn’t limited to a single software supplier either. A true one-to-many application platform is independent of the underlying physical computing architecture and capable of sustaining n-applications across n-implementations with n-users. So while the application platform is unitary, it cannot be bound to a single physical CPU or server if it is to support its many applications, implementations and users. The architecture must be capable of application partitioning across multiple physical servers and data storage servers.
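To make the idea of partitioning concrete, here is a minimal Python sketch that spreads application implementations across several servers using a stable hash. It is an illustration only and does not describe how uniPaaS distributes work: the server names, the `platform` structure, and the `server_for` function are all hypothetical.

```python
import hashlib

SERVERS = ["server-a", "server-b", "server-c"]  # hypothetical server names


def server_for(application, implementation):
    """Pick a server from a stable hash of the (application, implementation) pair."""
    key = f"{application}/{implementation}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return SERVERS[int.from_bytes(digest[:8], "big") % len(SERVERS)]


# Hypothetical workload: applications -> implementations -> users.
platform = {
    "crm": {"acme": ["alice", "bob"], "abc": ["carol"]},
    "billing": {"acme": ["dave"]},
}

# Decide where each implementation of each application runs.
placement = {
    (app, impl): server_for(app, impl)
    for app, impls in platform.items()
    for impl in impls
}

for (app, impl), server in placement.items():
    print(f"{app}/{impl} ({len(platform[app][impl])} users) -> {server}")
```

The same pair always maps to the same server, so no single machine has to host everything. A simple modulo scheme like this reshuffles most placements when the server list changes, so a production platform would likely use a more careful strategy.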