Google
 

Saturday, April 18, 2009

SQL Server UDF to pad a string

Problem
Unlike other relational database management systems that shall remain nameless, SQL Server's underlying coding language, T/SQL, does not have a built-in function for padding string values. I recently took it upon myself to create my own and as you'll see I got a little carried away.

Solution
The task seems simple enough: create a user-defined function that will allow padding of a string value with a finite count of a desired character. What ended up as an experiment out of necessity became a slightly more-involved function once I decided that padding position should be customizable to meet the end user's needs.

While Transact-SQL (T/SQL) does not offer a comparable function similar to LPAD or RPAD available in other RDBMSs, the SQL Server Professional does have the REPLICATE() function that can be used to build a simple user-defined function that can be used to pad a string. Let's take a look at the REPLICATE() function and what it offers before moving on to the code for the custom padding function.

REPLICATE (string_expression ,integer_expression) will allow you to replicate a character string (the string_expression parameter the number of times consecutively per the integer_expression parameter).

A simple example of this function is presented below:

SELECT REPLICATE('ABCDE|', 3)

------------------
ABCDE|ABCDE|ABCDE|

(
1 row(s) affected)

Whereas the REPLICATE() function will allow you to return a string to a maximum size of 8000 bytes, the function I'll be creating will be based upon an output value of varchar(100). You will be able to modify this value to fit your needs, however I very rarely have a need to pad a string value greater than even 20 characters. I thought it worthwhile to create a single function for padding either to the left or right of the unpadded string. Then it became interesting, what if for some reason you wished to pad the center of the string? What about padding both the left and right sides of the string evenly? Whether you use those options or not, the functionality is there. The code below represents the function I've created.

CREATE FUNCTION [dbo].[usp_pad_string]
(
@string_unpadded VARCHAR(100),
@pad_char VARCHAR(1),
@pad_count tinyint,
@pad_pattern INT)
RETURNS VARCHAR(100)
AS
BEGIN
DECLARE
@string_padded VARCHAR(100)

SELECT @string_padded =
CASE @pad_pattern
WHEN 0 THEN REPLICATE(@pad_char, @pad_count) + @string_unpadded --pad left
WHEN 1 THEN @string_unpadded + REPLICATE(@pad_char, @pad_count) --pad right
WHEN 2 THEN
--pad center
LEFT(@string_unpadded, FLOOR(LEN(@string_unpadded)/2))
+
REPLICATE(@pad_char, @pad_count)
+
RIGHT(@string_unpadded, LEN(@string_unpadded) - FLOOR(LEN(@string_unpadded)/2))
WHEN 3 THEN
--pad edges
REPLICATE(@pad_char, FLOOR(@pad_count/2))
+
@string_unpadded
+ REPLICATE(@pad_char, @pad_count - FLOOR(@pad_count/2))
END
RETURN
@string_padded
END

The function expects four parameters:

  • @string_unpadded - the raw string value you wish to pad.
  • @pad_char - the single character to pad the raw string.
  • @pad_count - the amount of times to repeat the padding character
  • @pad_pattern - an integer value that determines where to insert the pad character
    • 0 - all pad characters placed left of the raw string value (pad left)
    • 1 - all pad characters placed right of the raw string value (pad right)
    • 2 - all pad characters placed at the midpoint of the raw string value (pad center)
    • 3 - the raw string value will be centered between the pad characters (pad ends)

Notes

  • If either of the length of the supplied @pad_count or @string_unpadded values are odd, centering will be affected. We will show that behavior in the samples presented below.
  • Since the return value of the function is limited to 100 characters, the length of parameter corresponding to the unpadded string must be sized according to allow for padding. The length of the return value and this parameter are completely customizable to fit your needs.
  • While the REPLICATE() function allows you to pad more than a single character, this function expects you'll only pad a single character.

Example 1 Evenly-Distributable Padding

--Even distribution possible
--Pad Left
SELECT '1234' AS [raw string], dbo.[usp_pad_string]('1234', 'X', 4, 0) AS [padded string], '0 - pad LEFT' AS [pad pattern value];

--Pad Right
SELECT '1234' AS [raw string], dbo.[usp_pad_string]('1234', 'X', 4, 1) AS [padded string], '1 - pad RIGHT' AS [pad pattern value];

--Pad Center
SELECT '1234' AS [raw string], dbo.[usp_pad_string]('1234', 'X', 4, 2) AS [padded string], '2 - pad CENTER' AS [pad pattern value];

--Pad Edges
SELECT '1234' AS [raw string], dbo.[usp_pad_string]('1234', 'X', 4, 3) AS [padded string], '3 - pad EDGES' AS [pad pattern value];

The previous set of examples shows what results to expect when the parameters are evenly distributed. What happens though when this is not possible? This occurs under two situations: you are attempting to center-pad a raw string that has an odd number of characters or you're trying to pad an odd number of characters using the pad edges option (@pad_pattern = 3). I am not going to show you some method for splitting an odd string at a quantum level. Sorry, I am not all that brilliant as it is. Therefore I had to make a judgment call on how the behavior is going to occur using the structure of the three dimensions we have to work with here currently. The following examples will present how this function will behave.

Example 2 - Center-Pad an Odd-Length Raw String

--Pad Center
SELECT '12345' AS [raw string], dbo.[usp_pad_string]('12345', 'X', 4, 2) AS [padded string], '2 - pad CENTER' AS [pad pattern value];

Example 3 - Center-Edges an Odd Number of Times

--Pad Edges
SELECT '1234' AS [raw string], dbo.[usp_pad_string]('1234', 'X', 5, 3) AS [padded string], '3 - pad EDGES' AS [pad pattern value];

As you can see, the padding will be right-heavy in both cases. Though I doubt you'll have much use for center or edge-padding strings in Transact-SQL, you never really know.

Padding is a frequent need in returning string results to the end user of a RDBMS. This function allows you to do so as if the functionality existed inherently in T-SQL.

No comments: