====== Import Data from a CSV file into database tables ======
The Concerto database isn't very large, and sometimes you need to test with more data. If you can pull and anonymize production-level data (but not so much that you crash your test server), you can copy it into place with a CSV file.
===== Finessing the Data Prior to Import =====
**Notes:**
* If the table you're inserting into references an org unit, edit your CSV file first so that it uses org unit IDs that exist in Concerto.
* Don't include the ID column from your production data; let the database assign new IDs.
* If the table has unique constraints, check for duplicates in your CSV *first*, before you create your tables. For example, acq.provider has a unique constraint named "provider_name_once_per_owner". If you add a column to your CSV for dupe checking, you can use a spreadsheet formula to count how many times each combination appears and thus find dupes.
=COUNTIFS($A$2:$A$1000,A2,$B$2:$B$1000,B2)
This assumes that Column A is the column that must be unique (name, in this case) and Column B is the org. In other words, count the number of times the name in A2 appears with the same org that's in B2. Fill the formula down the entire column; any result above 1 is a dupe that you can filter for and remove.
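If you'd rather check in SQL, you can run the same duplicate test against the staging table once it's loaded (step 4 below). This is a sketch assuming the tlittle.acqprolist staging table used in the example:

<code sql>
-- Find name/owner combinations that appear more than once
SELECT name, owner, COUNT(*)
FROM tlittle.acqprolist
GROUP BY name, owner
HAVING COUNT(*) > 1;
</code>

Any rows returned would violate the "provider_name_once_per_owner" constraint, so remove or rename them before the final insert.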
**TIP:** If you're not sure if your table includes a uniqueness constraint, here's a grep of the Evergreen repo for the word "CONSTRAINT" which can point you toward where you might encounter a uniqueness constraint on the table you're working on: https://git.evergreen-ils.org/?p=Evergreen.git&a=search&h=HEAD&st=grep&s=CONSTRAINT
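You can also ask PostgreSQL directly which constraints exist on a table. This sketch queries the pg_constraint system catalog for acq.provider and prints each constraint's definition (running **\d acq.provider** in psql shows much the same information):

<code sql>
-- List every constraint on acq.provider, with its full definition
SELECT conname, pg_get_constraintdef(oid)
FROM pg_constraint
WHERE conrelid = 'acq.provider'::regclass;
</code>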
===== Importing the Data =====
1. Log into the database as a superuser (e.g., the postgres user); COPY from a server-side file requires superuser rights.
2. Create a placeholder schema for your project.
CREATE SCHEMA tlittle;
3. Create a placeholder (staging) table for your data to start in, naming each column and its datatype to match your target table.
CREATE TABLE tlittle.acqprolist (
    name text,
    owner integer,
    currency_type text,
    code text,
    active boolean,
    prepayment_required boolean,
    default_copy_count integer
);
Make sure to include, at minimum, any required fields from your final table. This example is based on acq.provider, so these are the required fields in acq.provider. Each column name is followed by its datatype, e.g. prepayment_required (the column name) boolean (the datatype).
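If you're not sure which fields are required, you can check for columns that are NOT NULL with no default, since those must be supplied by your CSV. A sketch using information_schema:

<code sql>
-- Columns in acq.provider that you must supply a value for
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'acq'
  AND table_name = 'provider'
  AND is_nullable = 'NO'
  AND column_default IS NULL;
</code>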
4. Now we're going to populate the staging table from the CSV file. Make sure your CSV file is somewhere on the server that you know the path to. In this case, my file is in /home/tlittle and the filename is pines_acqprolist.csv.
Note: If your CSV file does not have a header row, leave off the **header** keyword at the end of this command.
COPY tlittle.acqprolist FROM '/home/tlittle/pines_acqprolist.csv' CSV HEADER;
Then if you do a **select * from tlittle.acqprolist** you'll see it's now populated with the data from your CSV file.
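If you can't place the file on the database server itself, psql's client-side **\copy** meta-command is an alternative worth knowing: it reads the file from the machine you're running psql on and doesn't require superuser rights. A sketch, assuming the same path on the client machine:

<code sql>
-- Client-side variant: the file is read by psql, not by the server
\copy tlittle.acqprolist from '/home/tlittle/pines_acqprolist.csv' csv header
</code>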
5. To get that data into our Evergreen table, we need to copy it over.
INSERT INTO acq.provider
    (name, owner, currency_type, code, active,
     prepayment_required, default_copy_count)
SELECT name, owner, currency_type, code, active,
       prepayment_required, default_copy_count
FROM tlittle.acqprolist;
The second half is the SELECT (select all of these columns from tlittle.acqprolist), whose results the first portion then INSERTs into acq.provider.
If it's successful, you'll get an INSERT 0 [NUMBER OF ROWS] message. You can then do a **select * from [YOUR TABLE]** to see all your new data.
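It can be worth wrapping the insert in a transaction so that a constraint violation rolls everything back cleanly, and dropping the staging schema when you're finished. A sketch, reusing the example table:

<code sql>
BEGIN;
INSERT INTO acq.provider
    (name, owner, currency_type, code, active,
     prepayment_required, default_copy_count)
SELECT name, owner, currency_type, code, active,
       prepayment_required, default_copy_count
FROM tlittle.acqprolist;
-- Inspect the result, then:
COMMIT;   -- or ROLLBACK; if something looks wrong

-- Once you're done with the project, clean up the staging objects
DROP SCHEMA tlittle CASCADE;
</code>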
====== More Resources ======
* [[https://roganhamby.github.io/emeraldelephant/load_csv.html|Loading CSVs with a Bash Script (Emerald Elephant)]]