Migrating from PostgreSQL to MySQL at Cocolog

Migrating from PostgreSQL to MySQL
at Cocolog
Naoto Yokoyama, NIFTY Corporation
Garth Webb, Six Apart
Lisa Phillips, Six Apart
Credits:
Kenji Hirohama, Sumisho Computer Systems Corp.
Agenda




1. What is Cocolog
2. History of Cocolog
3. DBP: Database Partitioning
4. Migration From PostgreSQL to MySQL
1. What is Cocolog
What is Cocolog

NIFTY Corporation





Cocolog




Established in 1986
A Fujitsu Group Company
NIFTY-Serve (licensed and interconnected with CompuServe)
One of the largest ISPs in Japan
First blog community at a Japanese ISP
Based on TypePad technology by SixApart
Several hundred million PV/month
History



Dec/02/2003: Cocolog for ISP users launch
Nov/24/2005: Cocolog Free for free launch
April/05/2007: Cocolog for Mobile Phone launch
Cocolog (Screenshot of home page)
2008/04
700 Thousand Users
Cocolog (Screenshot of home page)
Cocolog
TypePad
Cocolog template sets
Cocolog Growth (User) ■Cocolog ■Cocolog Free
phase1
phase2
phase3
phase4
Cocolog Growth (Entry) ■Cocolog ■Cocolog Free
phase1
phase2
phase3
phase4
Technology at Cocolog
 Core System






Linux 2.4/2.6
Apache 1.3/2.0/2.2 ＆ mod_perl
Perl 5.8+CPAN
PostgreSQL 8.1
MySQL 5.0
memcached/TheSchwartz/cfengine
 Eco System
 LAMP,LAPP,Ruby+ActiveRecord, Capistrano
 Etc...
Monitoring

Management Tool


APL


response time of each post
number of spam comments/trackbacks
number of comments/trackbacks
source IP address of spam
number of entries
number of comments via mobile devices
page views via mobile devices
time of batch completion
amount of API usage
bandwidth usage
DB



DB
Hard
Monitoring points (order of priority)










Service
Proprietary in-house development with PostgreSQL, PHP,
and Perl
Disk I/O
Memory and CPU usage
time of VACUUM analyze
APP



number of active processes
CPU usage
Memory usage
2. History of Cocolog
Before DBP
10servers
Phase1 2003/12～(Entry: 0.04Million)
Postgre
SQL
Register
TypePad
Static contents
Published
NAS
WEB
Phase2 2004/12～ (Entry: 7Million)
Publish Book
Tel Operator
Support
Rich template
Postgre
SQL
2005/5～
Podcast
Portal
Profile
Etc..
Before DBP
50servers
Register
2004/12～ TypePad
Static contents
Published
NAS
WEB
Phase2 - Problems
 The system is tightly coupled.
 Database server is receiving from multiple
points.
 It is difficult to change the system design and
database schema.
Phase3 2006/3～ (Entry: 12Million)
Publish Book
Rich template
Before DBP
200servers
Web-API
memcached
Tel Operator
Support
Postgre
SQL
Register
TypePad
Podcast
Portal
Profile
Etc..
Static contents
Published
NAS
WEB
Phase4 2007/4～ (Entry: 16Million)
Publish Book
Rich template
Before DBP
300servers
Web-API
Typepad
Tel Operator
Support
memcached
Postgre
SQL
Register
Static contents
Published
Atom
Mobile
WEB
Now 2008/4～
Publish Book
Rich template
After DBP
150servers
Web-API
memcached
Tel Operator
Support
Multi
MySQL
Register
Typepad
Static contents
Published
Atom
Mobile
WEB
3. TypePad Database Partitioning
Steps for Transitioning
•
•
•
•
•
•
•
•
Server Preparation
Hardware and software setup
Global Write
Write user information to the global DB
Global Read
Read/write user information on the global DB
Move Sequence
Table sequences served by global DB
User Data Move
Move user data to user partitions
New User Partition
All new users saved directly to user partition 1
New User Strategy
Decide on a strategy for the new user partition
Non User Data Move
Move all non-user owned data
TypePad Overview (PreDBP)
Internet
Mobile Blog
Readers
Blog Owners
https(443)
Blog Readers
smtp(25) / pop(110)
http(80)
Web
Server
Application
Server
TypeCast
Server
Mail
Server
smtp(25) / pop(110)
memcached(11211)
postgres(5432)
nfs(2049)
Storage
Static Content (HTML,
Images, etc)
http(80) : atom api
Database
(Postgres)
Data Caching servers to
reduce DB load
MEMCACHED
Dedicated Server for
TypeCast (via ATOM)
ATOM
Server
ADMIN(CRON)
Server
Cron Server for periodic
asynchronous tasks
Why Partition?
TypePad
TypePad
TypePad
TypePad
TypePad
TypePad
TypePad
TypePad
User Role
(User1)
User Role
(User0)
NonUser Role
Current setup
All inquires (access) go to one
DB(Postgres)
Global
Role
Non-User
Role
User Role
(User2)
User Role
(User3)
After DBP
Inquiries (access) are divided among
several DB(MySQL)
Server Preparation
Information that does not
need to be partitioned
(such as session
information)
TypePad
Global
Role
User Role
(User1)
DB(PostgreSQL)
User Role
(User2)
Non-User
Role
Schwartz
DB
Maintains user mapping
and primary key generation
Stores job
details
User information is
partitioned
Server for
executing Jobs
User Role
(User0)
NonUser Role
Current Setup
User Role
(User3)
DB(MySQL) for partitioned data
Job Server
+ TypePad
+ Schwartz
Asynchronous Job Server
New expanded setup
※Grey areas are not used in current
steps
Global Write
Creating the user map
TypePad
①
②
DB(PostgreSQL)
User Role
(User0)
NonUser Role
Global
Role
User Role
(User1)
Non-User
Role
Schwartz
DB
Maintains user mapping
and primary key generation
User Role
(User2)
User Role
(User3)
Job Server
+ TypePad
+ Schwartz
DB(MySQL) for partitioned data
Asynchronous Job Server
Explanation
①：For new registrations only, uniquely identifying user data is written to the global DB
②：This same data continues to be written to the existing DB
※Grey areas are not used in current steps
Global Read
Use the user map to find the user partition
TypePad
②
Global
Role
③
Non-User
Role
Schwartz
DB
①
Migrate existing
user data
DB(PostgreSQL)
User Role
(User0)
NonUser Role
User Role
(User1)
Maintains user mapping
and primary key generation
User Role
(User2)
User Role
(User3)
DB(MySQL) for partitioned data
Job Server
+ TypePad
+ Schwartz
Asynchronous Job Server
Explanation
①：Migrate existing user data to the global DB
②：At start of the request, the application queries global DB for the location of user data
③：The application then talks to this DB for all queries about this user. At this stage the global DB points to
※Grey areas are not used in current steps
the user0 partition in all cases.
Move Sequence
Migrating primary key generation
TypePad
②
Global
Role
Non-User
Role
Schwartz
DB
①
Migrate sequence
management
DB(PostgreSQL)
User Role
(User0)
NonUser Role
User Role
(User1)
Maintains user mapping
and primary key generation
User Role
(User2)
User Role
(User3)
DB(MySQL) for partitioned data
Job Server
+ TypePad
+ Schwartz
Asynchronous Job Server
Explanation
①：Postgres sequences (for generating unique primary keys) are migrated to tables on the global DB that
act as “pseudo-sequences”.
② Application requests new primary keys from global DB rather than the user partition.
※Grey areas are not used in current steps
User Data Move
Moving user data to the new user-role partitions
②
TypePad
Global
Role
③
User Role
(User1)
Non-User
Role
Schwartz
DB
Maintains user mapping
and primary key generation
Stores job
details
④
User Role
(User2)
DB(PostgreSQL)
User information is
partitioned
Server for
executing Jobs
User Role
(User0)
NonUser Role
Migrating each
user data
①
User Role
(User3)
DB(MySQL) for partitioned data
Job Server
+ TypePad
+ Schwartz
DB(MySQL) for partitioned data
Explanation
①：Existing users that should be migrated by Job Server are submitted as new Schwartz jobs. User data is
then migrated asynchronously
②：If a comment arrives while the user is being migrated, it is saved in the Schwartz DB to be published later.
③：After being migrated all user data will exist on the user-role DB partitions
※Grey areas are not used in current steps
④：Once all user data is migrated, only non-user data is on Postgres
New User Partition
New registrations are created on one user role partition
TypePad
①
②
DB(PostgreSQL)
User Role
(User0)
NonUser Role
Global
Role
User Role
(User1)
User Role
(User2)
Non-User
Role
Schwartz
DB
Maintains user mapping
and primary key generation
User information is
partitioned
User Role
(User3)
DB(MySQL) for partitioned data
Explanation
①：When new users register, user data is written to a user role partition.
②：Non-user data continues to be served off Postgres
Job Server
+ TypePad
+ Schwartz
Asynchronous Job Server
※Grey areas are not used in current steps
New User Strategy
Pick a scheme for distributing new users
TypePad
①
②
DB(PostgreSQL)
User Role
(User0)
NonUser Role
Global
Role
User Role
(User1)
User Role
(User2)
Non-User
Role
Schwartz
DB
Maintains user mapping
and primary key generation
User information is
partitioned
User Role
(User3)
DB(MySQL) for partitioned data
Job Server
+ TypePad
+ Schwartz
Asynchronous Job Server
Explanation
①：When new users register, user data is written to one of the user role partitions, depending on a set
distribution method (round robin, random, etc)
②：Non-user data continues to be served off Postgres
※Grey areas are not used in current steps
Non User Data Move
Migrate data that cannot be partitioned by user
Information that does not
need to be partitioned
(such as session
information)
TypePad
Global
Role
User Role
(User1)
User Role
(User2)
DB(PostgreSQL)
User Role
(User0)
NonUser Role
Migrate non-User
data
Non-User
Role
Schwartz
DB
Maintains user mapping
and primary key generation
User information is
partitioned
User Role
(User3)
①
DB(MySQL) for partitioned data
Job Server
+ TypePad
+ Schwartz
Asynchronous Job Server
Explanation
①：Migrate non-user role data left on PostgreSQL to the MySQL side.
※Grey areas are not used in current steps
Data migration done
TypePad
①
Information that does not
need to be partitioned
(such as session
information)
①
Global
Role
User Role
(User1)
DB(Postgres)
User Role
(User2)
Non-User
Role
Schwartz
DB
Maintains user mapping
and primary key generation
Stores job
details
User information is
partitioned
Server for
executing Jobs
User Role
(User0)
NonUser Role
User Role
(User3)
Job Server
+ TypePad
+ Schwartz
DB(MySQL) for partitioned data
②
Explanation
Asynchronous Job Server
①：All data access is now done through MySQL
②：Continue to use The Schwartz for asynchronous jobs
※Grey areas are not used in current steps
The New TypePad configuration
Internet
Blog Readers
Mobile Blog
Readers
Blog Owners
(management
interface)
http(80)
Web
Server
https(443)
smtp(25) / pop(110)
Application
Server
TypeCast
Server
Mail
Server
http(80) : atom api
smtp(25) / pop(110)
memcached(11211)
MySQL(3306)
nfs(2049)
Storage
Static Content
(HTML,
Images, etc)
Database
(MySQL)
Data Caching servers to
reduce DB load
MEMCACHED
Dedicated Server for
TypeCast (via ATOM)
ATOM
Server
ADMIN(CRON)
Server
Cron Server for periodic
asynchronous tasks
Job
Server
TheSchwartz server for
running ad-hoc jobs
asynchronously
4. Migration from PostgreSQL to MySQL
History of scale up PostgreSQL server, Before DBP
 DB Node Spec History
Time
OS(RedHat)
CPU Xeon
MEM
DiskArray
2003/12
7.4(2.4.9)
1.8GHz/512k×1
1GB
No
ES2.1(2.4.9)
3.2GHz/1M×2
4GB
No
ES2.1(2.4.9)
3.2GHz/1M×2
4GB
Yes
AS2.1(2.4.9)
3.2GHz/1M×4
12G
B
Yes
AS4 (2.6.9)
3.2GHz/1M×4
12G
B
Yes
AS4 (2.6.9)
MP3.3GHz/1M×4
〔2Core×4〕
16G
B
Yes
2007/11
History of scale up PostgreSQL server, Before DBP
 DB DiskArray Spec
 [FUJITSU ETERNUS8000]
http://www.computers.us.fujitsu.com/www/pro
ducts_storage.shtml?products/storage/fujitsu/
e8000/e8000




Best I/O transaction performance in the world
146GB (15 krpm) * 32disk with RAID - 10
MultiPath FibreChannel 4Gbps
QuickOPC (One Point Copy)
 OPC copy functions let you create a duplicate copy
of any data from the original at any chosen time.
Scale out MySQL servers, After DBP
 A role configuration
 Each role is configured as HA cluster
 HA Software: NEC ClusterPro
 Shared Storage
Scale out MySQL servers, After DBP
heart beat
FibreChannel SAN
DiskArray
Postgre
SQL
MySQL
Role1
MySQL
Role2
MySQL
Role3
…
TypePad
Application
Scale out MySQL servers, After DBP
 Backup
 Replication w/ Hot backup
Scale out MySQL servers, After DBP
heart beat
FibreChannel SAN
DiskArray
Postgre
SQL
MySQL
Role1
MySQL
Role2
MySQL
Role3
mysqld
mysqld
mysqld
…
rep
rep
rep
mysqld
mysqld
mysqld
TypePad
Application
opc
MySQL
BackupRole
Troubles with PostreSQL 7.4 – 8.1
 Data size


over 100 GB
40% is index
 Severe Data Fragmentation

VACUUM




“VACUUM analyze” cause the performance problem
Takes too long to VACUUM large amounts of data
dump/restore is the only solution for de-fragmentation
Auto VACUUM

We don’t use Auto VACUUM since we are worried about
latent response time
Troubles with PostgreSQL 7.4 – 8.1
 Character set
 PostgreSQL allow the out of boundary UTF-8
Japanese extended character sets and multi
bytes character sets which normally should
come back with an error - instead of
accepting them.
“Cleaning” data
 Removing characters set that are out of the
boundries UTF-8 character sets.
 Steps




PostgreSQL.dumpALL
Split for Piconv
UTF8 -> UCS2 -> UTF8 & Merge
PostgreSQL.restore
dump
Split restore
UTF8->UCS2->UTF8
Merge
Migration from PostgreSQL to MySQL using TypePad script
 Steps
 PostgreSQL -> PerlObject & tmp publish
-> MySQL -> PerlObject & last publish
 diff tmp ＆ last Object （data check）
 diff tmp ＆ last publish （file check）
data check
Object
Object
TypePad
TypePad
PostgreSQL
Document
Document
tmp
File check
last
Troubles with MySQL
 convert_tz function
 doesn't support the input value outside the
scope of Unix Time
 sort order
 different sort order without “order by” clause
Cocolog Future Plans
 Dynamic
 Job queue
Consulting by
 Sumisho Computer Systems Corp.
 System Integrator
 first and best partner of MySQL in Japan
since 2003
 provide MySQL consulting, support, training
service
 HA
 Maintenance
 online backup
 Japanese character support
Questions

Download Report