Bulk Copy Operations

By Silviu, on 2017-11-03

Performing bulk copy on a table

Copy From Reader

To copy from an io.Reader instance, use the CopyFromReader table utility function. For example, a method generated for a cms_tag database table, has the following syntax:

// CopyFromReader performs a bulk copy of csv-like content from the specified reader into the cms_tag table. 
// ...
func (utilRef *tCmsTagUtils) CopyFromReader(r io.Reader, opt *CopyFromReaderOptions, 
                                            columns ...string) (int, error)

The *CopyFromReaderOptions parameter allows customizing the format of a csv line. You can supply a custom separator, a null value placeholder, and whether the primary key columns should be included in the copy operation instead of relying on a database sequence. The null placeholder value makes sense for text fields, when it is not intrinsically clear whether a db null or an empty string should be inserted in the database.

// CopyFromReaderOptions allows customizing the parsing of the bulk copy
// line input when using CopyFromReader.
type CopyFromReaderOptions struct {
	// when not 0, Comment allows skipping lines that start with it (no whitespace allowed before it)
	Comment rune

	Separator        rune
	NullPlaceholders []string

	// if true, include the PK columns when explicit columns are not specified
	IncludePKCols bool
}

When a nil opt parameter is supplied to CopyFromReader, default values will be used from the following unexported variables:

var defCommentRune rune = ZeroRune
var defNullPlaceholders = [2]string{"[null]", "[nil]"}
var defSeparator rune = ','
var defIncludePKCols = false

Examples

Let's assume the following cms_tag table:

CREATE TABLE public.cms_tag
(
  id bigint NOT NULL DEFAULT nextval('cms_tag_id_seq'::regclass), -- PK of the tags table
  tag_name character varying(500) NOT NULL,
  date_created timestamp without time zone NOT NULL DEFAULT now(),
  CONSTRAINT pk_tag PRIMARY KEY (id),
  CONSTRAINT uq_tag_name UNIQUE (tag_name)
)

1. Copy four records from a csv reader source into a cms_tag table, with the default row parsing options:

inputData := `"novelist", "2017-11-03 23:45:00"
"poet", "2017-11-03 23:45:00"
"painter", "2017-11-03 23:45:00"
"sculptor", "2017-11-03 23:45:00"
`
rdr := strings.NewReader(inputData)
copyCount, err := models.Tables.CmsTag.CopyFromReader(rdr, nil)
if err != nil {
	fmt.Printf("CopyFromReader error: %v \n", err)
} else {
	fmt.Printf("A total of %d tags were copied\n", copyCount)
}

We did not have to specify the columns, so the import rows were assumed to contain the tag_name (TagName) and date_created (DateCreated). The id column was skipped, as per the default option to skip primary key columns.

2. Because the date_created column has a default set to "now()", we can skip it. Copy four records from a csv reader source into a cms_tag table, with explicit columns:

inputData := `"miner"
"farmer"
"blacksmith"
"alchemist"
`
rdr := strings.NewReader(inputData)
copyCount, err := models.Tables.CmsTag.CopyFromReader(rdr, nil, "TagName")
if err != nil {
	fmt.Printf("CopyFromReader error: %v \n", err)
} else {
	fmt.Printf("A total of %d tags were copied\n", copyCount)
}

Note how we used the Go friendly column name, TagName. We could have used the database field name, tag_name and it would have worked all the same.

Let's now add a nullable description field to the table:

ALTER TABLE cms_tag ADD COLUMN description varchar(200);

Afterwards, don't forget to regenerate the model package with pgtogogen, so that the new "description" column gets included. We are now ready to try out the null placeholders.

3. Copy six records from a csv reader source into a cms_tag table, with explicit columns, and custom db null placeholders:

inputData := `"car", "modern vehicle"
"plough", "ancient agricultural instrument"
"window",
# we are skipping this line
"door",[!null]
"table",[!nil]
"rotor",""
`
rdr := strings.NewReader(inputData)
opt := models.NewCopyFromReaderOptions(',', []string{"[!null]", "[!nil]"}, '#', false)
copyCount, err := models.Tables.CmsTag.CopyFromReader(rdr, opt, "tag_name", "description")
if err != nil {	
	fmt.Printf("CopyFromReader error: %v \n", err) 
} else {
	fmt.Printf("A total of %d tags were copied\n", copyCount)
}

Note that the "window" row will insert an empty string (just like the "rotor" row). Only the "[!null]" and "[!nil]" values were deemed as null placeholder, thus the only null description values are in the "door" and "table" records.