网站最近更新

© 乙回庐 2019. All rights reserved.

MySQL数据库导出成SQLite3

不同的数据库实现中SQL的语法稍有区别,所以从一个实现到另一个实现直接可能需要一点功夫。MySQL数据库功能强悍,不过我发现sqlite3对我这种小运用够够的,而且还简单方便。所以就开始在网上查怎么把MySQL导出成sqlite3,发现了不少的帖子,竟然都不怎么好用,最后找到了一个好用的,推荐给大家。

这个实现起源与一个“神一般”的gist:https://gist.github.com/bign8/9055981/05e65fd90c469c5eaa730823910c0c5f9de40ab4,然后那个gist被不断的更新,修订,最后终于有人受不了了,做了个项目来管理这个脚本:https://github.com/dumblob/mysql2sqlite

因此,就有了这个叫做mysql2sqlite的项目,用法也很简单

After dumping the MySQL DB somehow (e.g.

mysqldump --skip-extended-insert --compact [options]... DB_name
or

mysqldump --no-data -u root -pmyPassword myDB
), run just

mysql2sqlite.sh dump_mysql.sql | sqlite3 mysqlite3.db

其实代码就是一个AWK脚本,完全没有必要放到一个sh文件里,直接把内容提取出来:

BEGIN {
  FS=",$"
  print "PRAGMA synchronous = OFF;"
  print "PRAGMA journal_mode = MEMORY;"
  print "BEGIN TRANSACTION;"
}

# CREATE TRIGGER statements have funny commenting. Remember we are in trigger.

/^\/\*.*(CREATE.*TRIGGER|create.*trigger)/ {
  gsub( /^.*(TRIGGER|trigger)/, "CREATE TRIGGER" )
  print
  inTrigger = 1
  next
}
# The end of CREATE TRIGGER has a stray comment terminator

/(END|end) \*\/;;/ { gsub( /\*\//, "" ); print; inTrigger = 0; next }
# The rest of triggers just get passed through

inTrigger != 0 { print; next }

# CREATE VIEW looks like a TABLE in comments

/^\/\*.*(CREATE.*TABLE|create.*table)/ {
  inView = 1
  next
}
# The end of CREATE VIEW

/^(\).*(ENGINE|engine).*\*\/;)/ {
  inView = 0;
  next
}
# The rest of view just get passed through

inView != 0 { next }

# Skip other comments

/^\/\*/ { next }

# Print all `INSERT` lines. The single quotes are protected by another single quote.

( /^ *\(/ && /\) *[,;] *$/ ) || /^(INSERT|insert)/ {
  prev = "";
  gsub( /\\\047/, "\047\047" )  # single quote

  gsub( /\\\047\047,/, "\\\047," )
  gsub( /\\n/, "\n" )
  gsub( /\\r/, "\r" )
  gsub( /\\"/, "\"" )
  gsub( /\\\\/, "\\" )
  gsub( /\\\032/, "\032" )  # substitute

  # sqlite3 is limited to 16 significant digits of precision

  while ( match( $0, /0x[0-9a-fA-F]{17}/ ) ) {
    hexIssue = 1
    sub( /0x[0-9a-fA-F]+/, substr( $0, RSTART, RLENGTH-1 ), $0 )
  }
  print
  next
}

# CREATE DATABASE is not supported

/^(CREATE.*DATABASE|create.*database)/ { next }

# Print the `CREATE` line as is and capture the table name.

/^(CREATE|create)/ {
  if ( $0 ~ /IF NOT EXISTS|if not exists/ || $0 ~ /TEMPORARY|temporary/ ){
    caseIssue = 1
  }
  if ( match( $0, /`[^`]+/ ) ) {
    tableName = substr( $0, RSTART+1, RLENGTH-1 )
  }
  aInc = 0
  prev = ""
  firstInTable = 1
  print
  next
}

# Replace `FULLTEXT KEY` (probably other `XXXXX KEY`)

/^  (FULLTEXT KEY|fulltext key)/ { gsub( /.+(KEY|key)/, "  KEY" ) }

# Get rid of field lengths in KEY lines

/ (PRIMARY |primary )?(KEY|key)/ { gsub( /\([0-9]+\)/, "" ) }

aInc == 1 && /PRIMARY KEY|primary key/ { next }

# Replace COLLATE xxx_xxxx_xx statements with COLLATE BINARY

/ (COLLATE|collate) [a-z0-9_]*/ { gsub( /(COLLATE|collate) [a-z0-9_]*/, "COLLATE BINARY" ) }

# Print all fields definition lines except the `KEY` lines.

/^  / && !/^(  (KEY|key)|\);)/ {
  if ( match( $0, /[^"`]AUTO_INCREMENT|auto_increment[^"`]/)) {
    aInc = 1;
    gsub( /AUTO_INCREMENT|auto_increment/, "PRIMARY KEY AUTOINCREMENT" )
  }
  gsub( /(UNIQUE KEY|unique key) `.*` /, "UNIQUE " )
  gsub( /(CHARACTER SET|character set) [^ ]+ /, "" )
  gsub( /DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP|default current_timestamp on update current_timestamp/, "" )
  gsub( /(COLLATE|collate) [^ ]+ /, "" )
  gsub( /(ENUM|enum)[^)]+\)/, "text " )
  gsub( /(SET|set)\([^)]+\)/, "text " )
  gsub( /UNSIGNED|unsigned/, "" )
  gsub( /` [^ ]*(INT|int)[^ ]*/, "` integer" )
  # field comments are not supported

  gsub( / (COMMENT|comment).+$/, "" )
  # Get commas off end of line

  gsub( /,.?$/, "")
  if ( prev ){
    if ( firstInTable ){
      print prev
      firstInTable = 0
    }
    else print "," prev
  }
  else {
    # FIXME check if this is correct in all cases

    if ( match( $1,
        /(CONSTRAINT|constraint) \".*\" (FOREIGN KEY|foreign key)/ ) )
      print ","
  }
  prev = $1
}

/ ENGINE| engine/ {
  if (prev) {
    if (firstInTable) {
      print prev
      firstInTable = 0
    }
    else print "," prev
    # else print prev

  }
  prev=""
  print ");"
  next
}
# `KEY` lines are extracted from the `CREATE` block and stored in array for later print 

# in a separate `CREATE KEY` command. The index name is prefixed by the table name to 

# avoid a sqlite error for duplicate index name.

/^(  (KEY|key)|\);)/ {
  if (prev) {
    if (firstInTable) {
      print prev
      firstInTable = 0
    }
    else print "," prev
    # else print prev

  }
  prev = ""
  if ($0 == ");"){
    print
  } else {
    if (  match( $0, /`[^`]+/ ) ) {
      indexName = substr( $0, RSTART+1, RLENGTH-1 )
    }
    if ( match( $0, /\([^()]+/ ) ) {
      indexKey = substr( $0, RSTART+1, RLENGTH-1 )
    }
    # idx_ prefix to avoid name clashes (they really happen!)

    key[tableName]=key[tableName] "CREATE INDEX \"idx_" tableName "_" indexName "\" ON \"" tableName "\" (" indexKey ");\n"
  }
}

END {
  # print all `KEY` creation lines.

  for (table in key) printf key[table]

  print "END TRANSACTION;"

  if ( hexIssue ){
    print "WARN Hexadecimal numbers longer than 16 characters has been trimmed." | "cat >&2"
  }
  if ( caseIssue ){
    print "WARN Pure sqlite identifiers are case insensitive (even if quoted\n" \
          "     or if ASCII) and doesnt cross-check TABLE and TEMPORARY TABLE\n" \
          "     identifiers. Thus expect errors like \"table T has no column named F\"." | "cat >&2"
  }
}

然后按照正常的AWK脚本运行就OK啦!

不要去看网上那几个基于Python的实现,打开源码看都会发现都是将内容读入到内存再写出去,不如这个AWK来得直接,而且这个脚本不但省内存,效率也相当不错。

此文文长5455字,君不评论一二?
亲,给点评论吧!

展开本分类索引