Importing array values into Postgres from CSV

tags: postgres 

I want to import from a CSV file some array values into a Postgres table, but I didn’t find examples. There are many examples with INSERT statements, but I already know how to do that (because there are examples).

Postgres allows a column to hold multiple values (or almost anything, really). In text, you put array values in braces and separate them with commas. There are plenty of examples of this:

postgres=# create table array_test ( pk integer primary key, members varchar array );

postgres=# insert into array_test values ( 1, '{dog,cat,bird}' );
INSERT 0 1

postgres=# select * from array_test;
 pk |    members
----+----------------
  1 | {dog,cat,bird}
(1 row)

postgres=# select members[2] from array_test;
 members
---------
 cat
(1 row)

No big whoop. Now I have this file in import_test.csv:

2,"{lizard,kangaroo}"

It’s as easy as I thought it would be. That column in the CSV uses the same syntax I used in the INSERT:

postgres=# \copy array_test from 'import.csv' delimiter ',' CSV
COPY 1

postgres=# select members[2] from array_test;
 members
----------
 cat
 kangaroo
(2 rows)

What about elements with commas in them? This works too:

3,"{lizard,""otter, sea"",kangaroo}"

An empty field was harder. Two commas in a row ("{rat,,mouse}") doesn’t work, but quotes around the empty field work:

4,"{rat,"""",mouse}"

Single quotes work around an array element work too:

5,"{rat,'',mouse}"

What if I swap those quotes around so the single quotes are on the outside? This is a bit silly because most CSV tools are going to use “, but I figured I’d try:

6,'{rat,"",mouse}'

The default quote is a “ so this won’t work:

postgres=# \copy array_test from 'import.csv' delimiter ',' CSV
ERROR:  extra data after last expected column
CONTEXT:  COPY array_test, line 1: "6,'{rat,"",mouse}'"

I can change the quote but it’s a bit odd. At first I thought that I could quote the quote, like "'", but that doesn’t work. Nor does "\'" nor various things like that. Inside the single ticks, I double up the single tick:

postgres=# \copy array_test from 'import.csv' delimiter ',' quote '''' CSV
COPY 1

But Postgres also has dollar quoting, where I form the quote delimiter with double $ on each side, with an optional tag between the $:

postgres=# \copy array_test from 'import.csv' delimiter ',' quote $$'$$ CSV
COPY 1

postgres=# \copy array_test from 'import.csv' delimiter ',' quote $foo$'$foo$ CSV
COPY 1

Further reading