1st Quarter 2011

Removing Superfluous Spaces

Although data integrity is a pervasive problem, there are some data integrity issues that can be cleaned up using a touch of SQL. Consider the common data entry problem of extraneous spaces in a name field. Not only is it annoying, sometimes it can cause the system to ignore relationships between data elements. For example, “Craig Mullins” is not equivalent to “Craig Mullins”; the first one has two spaces between the first and last name whereas the second one only has one.

You can write an SQL UPDATE statement to clean up the problem, if you know how to use the REPLACE function. REPLACE does what it sounds like it would do: it reviews a source string and replaces all occurrences of a one string with another. For example, to replace all occurrences of Z with A in the string BZNZNZ you would code:

    REPLACE(‘BZNZNZ’,’Z’,’A’)

And the result would be BANANA. But a simple REPLACE is not sufficient for the task at hand, so let’s create a SQL statement to get rid of any number of unwanted spaces in the NAME column of our EMPLOYEE table:

    UPDATE EMPLOYEE
      SET NAME = REPLACE(
       REPLACE(
        REPLACE(NAME, SPACE(1), '<>')
         '><', SPACE(0))
          '<>', SPACE(1));

What are all of those less-thans and greater-thans and why do you need them? Let me explain. The inside REPLACE statement takes the NAME column and converts every occurrence of a single space into a left/right carat. The next REPLACE (working outward), takes the string we just created, and removes every occurrence of a right/left carat combination by replacing it with a zero length string. The final REPLACE function takes that string and replaces any left/right carats with a single space. The reversal of the carats is the key to removing all spaces except one – remember, we want to retain a single space anywhere there was a single space as well as anywhere that had multiple spaces. Try it, it works.

Of course, you can use any two characters you like, but the left and right carat characters work well visually. Be sure that you do not choose to use characters that occur naturally in the string that you are acting upon.

Finally, the SPACE function was used for clarity. You could have used strings encased in single quotes, but the SPACE function is easier to read. It simply returns a string of spaces the length of which is specified as the integer argument.

Contribution by Craig Mullins

Craig S. Mullins, president and principal consultant of Mullins Consulting, Inc., is a data management strategist, researcher, and consultant. He has nearly three decades of experience in all facets of database systems development and has worked with mainframe DB2 since V1. You may know Craig from his popular books: "DB2 Developer's Guide" (with over 1500 pages of in-depth technical information on DB2 for z/OS) and "Database Administration: The Complete Guide to Practices and Procedures" (the industry's only comprehensive guide to heterogeneous database administration). Craig can be reached via his website at www.craigsmullins.com.

< < < Back to White Papers


  • PRESS
  • BECOMING A PARTNER
  • SUCCESS STORIES
  • UNIQUE TECHNOLOGIES
  • WHY CDB?

CDB Software:The Industry's Leading Authority on z/Series DB2 Data Management

CDB Software is a leading authority on z/Series DB2 environments and how they can be optimized to meet the changing demands of business. From the critical every-day processes of database performance and protection to the implementation of new requirements for security and compliance, CDB Software is the visionary leader for enterprise DB2 environments that want both flexibility and control.

For more information or to schedule a briefing with a CDB expert, contact CDB Press and Analyst Relations .

Press Releases      |      White Papers &Articles     |     Press Contacts     |     About CDB