SQL*Loader is an Oracle tool used for loading data into tables from external files. This blog assumes you have some, but not comprehensive knowledge of SQL*Loader. You can do some quick reading of Oracle’s documentation if you are completely unfamiliar.
Data Source
In this blog, I am going to assume my data is stored in a CSV (comma separated values) file, with data headings. If you are loading your data from another file source, you will need to alter steps in this blog accordingly.
Create Table
The first thing to do is create a target table. For simplicities sake, create it the same as the column order in your file source. This will allow us to do a straight one for one load, and we can worry about transformation and data quality later. You will also need to take into account the data type and length of your source data.
Here’s a quick tip to find the max length of data in a column using Excel:
Insert a new row at the top of your spreadsheet and insert the formula =MAX(LEN(A3:A100)) and hit ctrl + shift + enter (it won’t work without this), where A3 is the first row of actual data and A100 is the last row.
File Directory
You should also take the time to set up your file directories. I have created separate folders to house the batch, control, log, bad and discarded files (I’ve stored bad and discarded in the same folder).
Control File
Now it’s time to create your control file. The control file contains all of the instructions for SQL*Loader to load data into a table. There are many different options you can implement via the control file that suit different data sources and different ways to load tables. I would advise having a quick read through of Oracle’s SQL*Loader page if you are after something slightly different.
Line 1 – tells SQL*Loader to skip the first line in the data file. This is because I have data headers in my data set which I do not want to be inserted into the table. If you do not have data headers then you can omit this line from your control file, or alternatively set skip=0.
Line 2 – is a SQL*Loader command.
Line 3 – tells SQL*Loader to replace all data in the table. This means that every time I run my batch file, the table will first be emptied before being repopulated. This is great for the one off loads or debugging, but if you wish to add additional data in the future, you should use APPEND instead.
Line 4 – is the destination table. Use INTO TABLE followed by your table name.
Line 5 – describes how my data is delimited. In this case it is by a comma.
Line 6 – tells SQL*Loader to treat any relatively positioned columns without data in the source file as null columns.
Lines 7 to 12 – The rest of the lines are the column names of the target table. You will notice that the DOB column has the date defined. This tells SQL*Loader the format of the incoming date data.
Save your control file as a “.ctl” file type, i.e. Person.ctl
Batch File
Now it is time to create the batch file.
This is the executable file that will run your SQL*Loader job.
c:cd %LOCALAPPDATA%MicrosoftAppVClientIntegration756e-3RootBINsqlldr nic/password01@my_db control='C:Data LoadSQL LoaderCTLPerson.ctl' data='C:Data LoadSQL LoaderDATAPerson Data.csv' log='C:Data LoadSQL LoaderLOGPerson.log' bad='C:Data LoadSQL LoaderBADPerson.bad' discard='C:Data LoadSQL LoaderBADPerson.dsc'pause
You will need to edit the above batch script to suit your environment. The change directory (cd) takes me to the database connection information. This may not be necessary for you.
The third section, which should all be one line in your script has the bulk of the information for SQL*Loader:
- Sqlldr initiates the SQL*Loader program.
- Next is my username, password and database.
- Control, data, log, bad and discard all point to their relative directories. You should have your control and data files already stored there. The other files will get created when you run SQL*Loader.
Save this file with a “.bat” file extension and execute away!
That’s about all there is to it. You will likely have a slightly different set up than this, so just make adjustments where needed. Check the log after each run to make sure the data has been loaded successfully. You can see data that was not loaded in the .bad and .dsc files to help you debug.
Cheers, Nic