In my work as a consultant I often have the task of getting data from MS Excel spreadsheets into tables in the database.
If this is a one-time load, the process is simple – use either SQL Developer or, easier yet, use the Data Load Workshop in APEX: SQL Workshop –> Utilities –> Data Workshop.
If the requirement is for a repeated load or many files, I need to set up some automated process for reading in the data. I do not want to ask my users to open the XLSX files, save the data to CSV, then upload the CSV files. That makes for unhappy users, at best.
I am sure there are several options for load XSLX files into Oracle data tables. I am going to focus on a two I have used (and continue to use) successfully across several clients, across many projects. And another option that used to be a very good option, that goes away with APEX 5.0 and higher.
Let’s do Bad News First.
The Option That Goes Away:
With APEX Listener, now Oracle REST Data Services, ORDS, we gained a wonderful utility for loading XLS or XLSX files into the database.
I will not elaborate on that here, for reasons which will be obvious in a bit. I did cover that option, and several other Data Load options, in this presentation, How Do I Load Data, Let Me Count the Ways.. Most of these options still apply. Yes, it is a PowerPoint, but gives you enough detail to get going.
This ORDS-based utility works fabulous for APEX versions less than APEX 5.0. It loaded data into a collection. You can process your data however you wish from there. However, as of APEX 5.0, APEX uses JSON to move data, and the ORDS Excel file upload function no longer works. Joel Kallman explains this clearly in his blog, Let’s Wreck This Together, here. It will still work for APEX versions < 5.0 – that is, APEX 4.2, APEX 4.1. But beware, if you are using this Excel-Upload option of ORDS in APEX 4.1 or APEX 4.2 applications, you will need to make some changes when you upgrade those applications to APEX 5.0 and above.
The XLSX Option That Still Works – READ_XLSX
Thanks to Anton Scheffer from AMIS, we have a PL/SQL package that reads .XSLX files from a file in a file server folder that has been mapped to an Oracle directory. That sounds like a mouthful, but it is simple.
You have files in a folder. You or your system administrators arrange for that file system folder to be available to the database – usually a mapping of the physical folder to a virtual folder on the database file server. Then your DBA creates an Oracle directory that points to the folder, physical or virtual, on the database file system. And grants appropriate grants so you can see and read and/or write from /to that Oracle directory.
Now Anton’s package, READ_XLSX comes in. download the package, read Anton’s blog post. I am only going to summarize here.
READ_XLSX consists of two main functions, one to read the contents of an XLSX file into a BLOB (named file2blob), and another to read the contents of that BLOB into records (this one is named “read“). Cast the results of the read function as a TABLE, and voila, you have the results of your XLSX file.
This statement is the guts of it:
SELECT * FROM TABLE( as_read_xlsx.read( as_read_xlsx.file2blob('MY_ORA_DIRECTORY', 'MyExcelWorkbook.xlsx' ) ) );
Now if you are like me, you do not want ot have to keep typing that SELECT statement. So I turn it into a view:
CREATE VIEW MY_SPREADSHEET_V AS SELECT * FROM TABLE( as_read_xlsx.read( as_read_xlsx.file2blob('MY_ORA_DIRECTORY', 'MyExcelWorkbook.xlsx' ) ) );
Now I can query from the view the same as I query from any other table or view I have access. And incorporate queries from that view into any packages, procedures and functions that I build to meet the requirements of the task at hand.
When I have a series of files in a folder, I use a version of GET_DIR_LIST to read files from the Oracle directory. (GET_DIR_LIST is a java utility, wrapped in PL/SQL, that returns the list of file names in a folder in a one-column table). You probably have your own method of doing the same. Then, in a loop, I use dynamic SQL to point my view to the next file in the loop, call my loading procedure to process the contents of that file, then move on to the next file. OF course I have all kinds of validations and error checking in there, as much is needed depending on the task at hand. Once configured, the process is clean, reusable, easily customizable and – best of all – works in APEX 5.0 and above.
I did not use Anton’s READ_XLSX directly. Why? At the time I downloaded it (years ago now), Anton’s package handled cells of up to 4000 characters. The spreadsheets I had to load contains some cells with > 4000 characters. So I made a slight adjustment to handle CLOBs. I described those customizations here. I suspect most persons will not need such a customization.
Another (Better?) Option – EXCEL2COLLECTION Plugin:
Is READ_XLSX the only way? No. Seems like too much work? Yup, it can be. Anton apparently thought so too, because he came up with EXCEL2COLECTION, a utility APEX Plugin that, with less coding than the READ_XLSX package, will read data from an XLSX, XSL, XML 2003 or CSV file into a PL/SQL collection. I have also used this plugin successfully across several customers and projects.
I have used the EXCEL2COLLECTION plugin in conjunction with a customized instance of the APEX Data Load Wizard to enable simpler (from a user perspective) upload of XLSX files. I describe more of that process here. The APEX Data Load Wizard does not out-of-the-box allow upload of XLSX files – one needs to cut/paste. Integrating the EXCEL2COLLECTION plugin into a customized Data Load Wizard series of pages makes use of other features of the Data Load Wizard – like the column mapping page – possible. Beware, that customizing the Data Load Wizard pages and process flow means that your work is subject to break upon any and every APEX upgrade. This customization of the Data Load Wizard process is unsupported territory. Note that the EXCEL2COLLECTION plugin itself is supported (by Anton). My use of it in customizing the Data Load Wizard is not.
Anton’s READ_XLSX remains one of the best methods of reading XLSX data into an Oracle database, when the requirement is to read many XLSX files. When using APEX, investigate using the EXCEL2COLLECTION plugin; that will make your XLSX load tasks easier. For one-of data loads, I still use SQL Developer or the APEX SQL Workshop Data Workshop utility. Need to script the load for Production? Use SQL Developer in a DEV environment, then use SQL Developer to generate a load script, there are several options for that.
Happy Data Loading!