• Skip to main content
  • Skip to primary sidebar
  • Home
  • Connect
  • Contact
  • About Us

BareFoot PeopleSoft

Oracle PeopleSoft Tips and Tricks

Tip 067: CSV File Processing with Split

October 28, 2018

PeopleCode | Using the Split function to separate a string into multiple parts

<< Previous | Next >>

A. Split – Basic Usage

The ‘Split’ PeopleCode function is used for splitting a string into smaller parts, divided by whatever delimiter character you choose. The default delimiter is a space character.

So for example, consider the following line of code:

Local array of string &words = Split("This is a sentence.");

This would return an array of four elements:

&words [1] : This
&words [2] : is
&words [3] : a
&words [4] : sentence.

 

B. Split – CSVs

‘Split’ is especially handy when you want to quickly process a CSV file. It provides a simple way to divide a comma-separated line into its individual fields.

Local array of string &words = Split("Field1,Field2," |
               "This is a sentence,Field4", ",");

This would return an array of four elements:

&words [1] : Field1
&words [2] : Field2
&words [3] : This is a sentence
&words [4] : Field4

However, a typical CSV file often contains longer pieces of text that have a comma embedded as part of the text. Consider this example:

Local array of string &words = Split("Field1,Field2," |
                """This is a sentence, with comma added"",Field4",
                ",");

This would produce the following result:

&words [1] : Field1
&words [2] : Field2
&words [3] : "This is a sentence
&words [4] : with comma added"
&words [5] : Field4

This is not correct. The longer text for Field 3 has been split into two parts, when it should have been kept together as one field. This is because the Split has no concept of a CSV file and is simply splitting at whatever point it finds a comma.

To get around this problem, we need to write some PeopleCode in advance of the ‘Split’ that checks for ‘embedded’ commas within a field. If a comma is found, the code then substitutes the comma with a special series of characters (unlikely to be found in the text). After the Split takes place, the special series is then changed back to a comma. In the example below, the comma is substituted for three semi-colons in a row (;;;).

&test_string = "Field1,Field2," |
                """This is a sentence, with comma added"",Field4";
&in_field = False;

/* Loop through each character of the string, one by one */
For &i = 1 To Len(&test_string)
   /* Get the next character */
   &next_char = Substring(&test_string, &i, 1);
   
   Evaluate &next_char
   When = """"
      /* This is a double quote character */
      If Not &in_field Then
         /* We are currently not in a field, so start a new field */
         &in_field = True;
      Else
         /* We have previously found a double quote character, 
            so the field has now ended */
         &in_field = False;
      End-If;
      
      Break;
   When = ","
      /* This is a comma */
      If &in_field Then
         /* We are currently in a field,
            so replace the comma with the special sequence */
         &new_string = &new_string | ";;;";
      Else
         /* We are not in a field, so include the comma as normal */
         &new_string = &new_string | ",";
      End-If;
      
      Break;
   When-Other
      /* All other characters - use as normal */
      &new_string = &new_string | &next_char;
      
      Break;
   End-Evaluate;
End-For;

/* Perform a split on the new string */
Local array of string &words = Split(&new_string, ",");

/* Replace all instances of the special sequence with the comma */
For &i = 1 To &words.Len
   &words [&i] = Substitute(&words [&i], ";;;", ",");
End-For;

We can confirm that the correct result is produced:

&words [1] : Field1
&words [2] : Field2
&words [3] : This is a sentence, with comma added
&words [4] : Field4

 

See Also:
Tip 008: Create Simple CSV Export
Tip 036: Introduction to File Layouts

<< Previous | Next >>

Filed Under: PeopleCode Tagged With: CSV, File, Split, String

Reader Interactions

Primary Sidebar

Categories

  • Administration
  • Application Engine
  • BI Publisher
  • COBOL
  • Data Mover
  • PeopleCode
  • PeopleTools
  • PS Query
  • Security
  • SQL
  • Utilities

Copyright © 2023 Elimbah Consulting Ltd