Code

Coding, Programming & Algorithms, Tips, Tweaks & Hacks
Search

Sponsored Ads

Export MySQL data to Excel compatible CSV format

One of the most often used feature in front-end GUI tools like phpMyAdmin is to export your data to an Microsoft Excel compatible CSV format. Excel compatible because, CSV functions in Python, PHP, Java or C# can read CSV files created using Excel. Because there is no CSV standard, we'll use Microsoft Excel's CSV format. Microsoft Excel uses 2 double-quotes to indicate a double-quote within a column-value enclosed by double-quotes.
For example, a Hello"World in Excel is internally "Hello""World" if you open the file in Notepad.

When you're dealing with large sets of data, its better to use MySQL in command-line mode for export. A stand-alone Py/PHP/Java script may do the job but it would still be faster in MySQL command-line mode.

Lets take a simple person-info table type as an example. `persons` is a table in database `community`.

MySQL
CREATE TABLE IF NOT EXISTS `persons` (
  `ID` bigint(20) NOT NULL auto_increment,
  `Name` varchar(255) NOT NULL,
  `Address` varchar(255) NOT NULL,
  `City` varchar(255) NOT NULL,
  `State` varchar(255) NOT NULL,
  `Zip` varchar(255) NOT NULL,
  `Phone` varchar(255) NOT NULL,
  PRIMARY KEY  (`ID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
MySQL 5.0.45
  1. mysqldump

    Dumping SQL via the mysqldump command-line tool is the most common use for database backup. Using mysqldump, it is possible to export to CSV but there are some limitations :

    • Its not possible to select columns, so you end up exporting all columns. And I can't see options for columns in the newer versions, 5.1 or 6.0, so I guess theres no way out here.
    • When exporting to CSV format, we indicate the column data is to be seperated by a comma by specifying options such as --fields-terminated-by=",". When using --fields-xx options, its mandatory to use the --tab=<dir name> option which specifies the folder to dump to. --tab accepts a foldername because it'll export the tables in a separate file text format to that folder. Even if we select only one table to dump, it'll still have to dump to the specified folder. So we can't output to a file by using pipe (mysqldump ... > los-angeles-persons.csv).
      Note: --result-file=<filename> option is used to output to a specified file but not in conjunction with the --tab option.
    • Its not possible to sort the result, expect by the PK (Primary Key) column by specifying the --order-by-primary option.
    mysqldump community persons --user=root --where="`City` LIKE '%Los Angeles%'" --tab="csv data" --fields-terminated-by="," --fields-enclosed-by="\"" --fields-escaped-by="\"\"" --lines-terminated-by="\r\n" --no-create-db --no-create-info -p

    This will export (dump) all persons located in Los Angeles into a folder named csv data (an existing folder relative to your current path).
    --fields-escaped-by="\"\"" : specify 2 double-quotes("") to indicate a double-quote(") - since the column values are enclosed by double-quotes(").
    Internally, this executes a SELECT INTO OUTFILE query.

  2. SELECT ... INTO OUTFILE

    Enter mysql in command-line mode.

    C:\MySQL\bin>mysql --user=root -p
    Enter password:
    Welcome to the MySQL monitor.  Commands end with ; or \g.
    Your MySQL connection id is 81
    Server version: 5.0.45-community-nt-log MySQL Community Edition (GPL)
    
    Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
    
    mysql> USE community;
    Database changed
    mysql> SELECT `Name`, `Address`, `City`, `State`, `Zip`, `Phone`
        -> FROM `persons`
        -> WHERE `City` LIKE '%Los Angeles%'
        -> ORDER BY `Name` ASC, `Zip` ASC
        -> INTO OUTFILE 'los-angeles-persons.csv' FIELDS ESCAPED BY '""' TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\r\n';
    Query OK, 299 rows affected (0.05 sec)
    
    mysql>

    This is probably the best option because you're inside the MySQL console.
    This selects only `Name`, `Address`, `City`, `State`, `Zip`, `Phone` columns and again filtered by Los Angeles city.
    The in INTO OUTFILE is relative to the database directory in the MySQL data directory. MySQL data directory is the one assigned to datadir= in the configuration file (my.ini or my.cnf).
    My path is D:\Data\MySQL\community\los-angeles-persons.csv (for default setup : C:\MySQL\data\community\los-angeles-persons.csv)

    Unfortunately, export options (FIELDS ESCAPED BY '""' TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\r\n') can only be used with INTO OUTFILE option, so something like this in command-prompt is not possible:

    C:\MySQL\bin>mysql --user=root --database=community --execute="SELECT `Name`, `Address`, `City`, `State`, `Zip`, `Phone` FROM `persons` WHERE `City` LIKE '%Los Angeles%' ORDER BY `Name` ASC, `Zip` ASC FIELDS ESCAPED BY '\"\"' TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\r\n';" -p > los-angeles-persons.csv
  3. CONCAT()

    Because of limitations of the first two methods, piping output to a file using > is not possible.
    The only other way is to join all the column-values using the CONCAT() function.
    In script.sql enter

    USE `community`;
    
    DROP FUNCTION IF EXISTS toXLCell;
    delimiter ///
    CREATE FUNCTION toXLCell(s VARCHAR(1000)) RETURNS VARCHAR(1000) DETERMINISTIC
    BEGIN
        SET @XLs = CONCAT('"', REPLACE(s,'"','""'), '",');
        RETURN @XLs;
    END;
    ///
    delimiter ;
    
    SELECT
    CONCAT(toXLCell(`Name`), toXLCell(`Address`), toXLCell(`City`), toXLCell(`State`), toXLCell(`Zip`), toXLCell(`Phone`)) AS `Row`
    FROM `persons`
    WHERE `City` LIKE '%Los Angeles%'
    ORDER BY `Name` ASC, `Zip` ASC;
    C:\MySQL\bin>mysql --user=root -p < script.sql > los-angeles-persons.csv

    Using CONCAT & REPLACE may be tedious, but helps in batch scripts.

    This will create an extra empty column at the end because of the trailing comma. If default parameter value was supported , the function could've been defined as toXLCell(s VARCHAR(1000), bLast BIT(1) = FALSE) and toXLCell(`Phone`) would be changed to toXLCell(`Phone`, TRUE)

    MySQL doesn't support variable number of parameters, otherwise we could've had a function
    toXLLine(`Name`, `Address`, `City`, `State`, `Zip`, `Phone`)
    accepting variable number of column names.

Vanakkam !

0 comments: