The storage requirements for each of the data types supported by MySQL are listed here by category.
The maximum size of a row in a MyISAM table is
65,534 bytes. Each BLOB and
TEXT column accounts for only five to nine
bytes toward this size.
Important: For tables using the
NDBCluster storage engine, there is the factor
of 4-byte alignment to be taken into
account when calculating storage requirements. This means that all
NDB data storage is done in multiples of 4
bytes. Thus, a column value that — in a table using a
storage engine other than NDB — would
take 15 bytes for storage, requires 16 bytes in an
NDB table. This requirement applies in addition
to any other considerations that are discussed in this section.
For example, in NDBCluster tables, the
TINYINT, SMALLINT,
MEDIUMINT, and INTEGER
(INT) column types each require 4 bytes storage
per record.
In addition, when calculating storage requirements for Cluster
tables, you must remember that every table using the
NDBCluster storage engine requires a primary
key; if no primary key is defined by the user, then a
“hidden” primary key will be created by
NDB. This hidden primary key consumes 31-35
bytes per table record.
You may also find the ndb_size.pl utility to
be useful in such cases. This Perl script connects to a current
MySQL (non-Cluster) database and creates a report on how much
space that database would require if it used the
NDBCluster storage engine. See
Section 15.9.14, “ndb_size.pl — NDBCluster Size Requirement Estimator”, for more
information.
Storage Requirements for Numeric Types
| Data Type | Storage Required |
TINYINT |
1 byte |
SMALLINT |
2 bytes |
MEDIUMINT |
3 bytes |
INT, INTEGER
|
4 bytes |
BIGINT |
8 bytes |
FLOAT( |
4 bytes if 0 <= p <= 24, 8 bytes if 25
<= p <= 53 |
FLOAT |
4 bytes |
DOUBLE [PRECISION], REAL
|
8 bytes |
DECIMAL(,
NUMERIC(
|
Varies; see following discussion |
BIT( |
approximately (M+7)/8 bytes |
Values for DECIMAL (and
NUMERIC) columns are represented using a binary
format that packs nine decimal (base 10) digits into four bytes.
Storage for the integer and fractional parts of each value are
determined separately. Each multiple of nine digits requires four
bytes, and the “leftover” digits require some
fraction of four bytes. The storage required for excess digits is
given by the following table:
| Leftover Digits | Number of Bytes |
| 0 | 0 |
| 1 | 1 |
| 2 | 1 |
| 3 | 2 |
| 4 | 2 |
| 5 | 3 |
| 6 | 3 |
| 7 | 4 |
| 8 | 4 |
Storage Requirements for Date and Time Types
| Data Type | Storage Required |
DATE |
3 bytes |
DATETIME |
8 bytes |
TIMESTAMP |
4 bytes |
TIME |
3 bytes |
YEAR |
1 byte |
Storage Requirements for String Types
| Data Type | Storage Required |
CHAR( |
bytes, 0 <=
255 |
VARCHAR( |
L + 1 bytes, where
and 0
<= 255 (see note below)
or L + 2
bytes, where and 256
<= 65535 (see note below). |
BINARY( |
bytes, 0 <=
255 |
VARBINARY( |
L + 1 bytes, where
and 0
<= 255 (see note below)
or L + 2
bytes, where and 256
<= 65535 (see note below). |
TINYBLOB, TINYTEXT
|
L+1 byte, where L
< 28
|
BLOB, TEXT
|
L+2 bytes, where L
< 216
|
MEDIUMBLOB, MEDIUMTEXT
|
L+3 bytes, where L
< 224
|
LONGBLOB, LONGTEXT
|
L+4 bytes, where L
< 232
|
ENUM(' |
1 or 2 bytes, depending on the number of enumeration values (65,535 values maximum) |
SET(' |
1, 2, 3, 4, or 8 bytes, depending on the number of set members (64 members maximum) |
For the CHAR, VARCHAR, and
TEXT types, the values
L and M in
the preceding table should be interpreted as number of characters,
and lengths for these types in column specifications indicate the
number of characters. For example, to store a
TINYTEXT value requires
L characters plus one byte.
VARCHAR, VARBINARY, and the
BLOB and TEXT types are
variable-length types. For each, the storage requirements depend
on these factors:
The actual length of the column value
The column's maximum possible length
The character set used for the column
For example, a VARCHAR(10) column can hold a
string with a maximum length of 10. Assuming that the column uses
the latin1 character set (one byte per
character), the actual storage required is the length of the
string (L), plus one byte to record the
length of the string. For the string 'abcd',
L is 4 and the storage requirement is
five bytes. If the same column was instead declared as
VARCHAR(500), the string
'abcd' requires 4 + 2 = 6 bytes. Two bytes
rather than one are required for the prefix because the length of
the column is greater than 255 characters.
To calculate the number of bytes used to
store a particular CHAR,
VARCHAR, or TEXT column
value, you must take into account the character set used for that
column. In particular, when using the utf8
Unicode character set, you must keep in mind that not all
utf8 characters use the same number of bytes.
For a breakdown of the storage used for different categories of
utf8 characters, see
Section 10.7, “Unicode Support”.
Note: The
effective maximum length for a
VARCHAR or VARBINARY column
is 65,532.
The NDBCLUSTER storage engine in MySQL 5.1
supports variable-width columns. This means that a
VARCHAR column in a MySQL Cluster table
requires the same amount of storage as it would using any other
storage engine, with the exception that such values are 4-byte
aligned. Thus, the string 'abcd' stored in a
VARCHAR(50) column using the
latin1 character set requires 8 bytes (rather
than 6 bytes for the same column value in a
MyISAM table). This represents a change in
behavior from earlier versions of NDBCLUSTER,
where a VARCHAR(50) column would require 52
bytes storage per record regardless of the length of the string
being stored.
The BLOB and TEXT types
require 1, 2, 3, or 4 bytes to record the length of the column
value, depending on the maximum possible length of the type. See
Section 11.4.3, “The BLOB and TEXT Types”.
TEXT and BLOB columns are
implemented differently in the NDB Cluster storage engine, wherein
each row in a TEXT column is made up of two
separate parts. One of these is of fixed size (256 bytes), and is
actually stored in the original table. The other consists of any
data in excess of 256 bytes, which stored in a hidden table. The
rows in this second table are always 2,000 bytes long. This means
that the size of a TEXT column is 256 if
size <= 256 (where
size represents the size of the row);
otherwise, the size is 256 + size +
(2000 – (size – 256) %
2000).
The size of an ENUM object is determined by the
number of different enumeration values. One byte is used for
enumerations with up to 255 possible values. Two bytes are used
for enumerations having between 256 and 65,535 possible values.
See Section 11.4.4, “The ENUM Type”.
The size of a SET object is determined by the
number of different set members. If the set size is
N, the object occupies
( bytes,
rounded up to 1, 2, 3, 4, or 8 bytes. A N+7)/8SET can
have a maximum of 64 members. See Section 11.4.5, “The SET Type”.

User Comments
I cannot see why a TIMESTAMP type uses 4 bytes where a DATETIME type uses 8. Elsewhere in the Manual, it states that the ranges are identical, and in some cases the variables are treated identically. If this is not a typo, someone should document the reason for the large difference in storage requirements.
My tip is that if the above table is accurate, it is more efficient to use type TIMESTAMP over type DATETIME, saving 4 bytes per use. Even using seperate DATE and TIME columns will save 2 bytes per use.
TIMESTAMP is an integral number of seconds since epoch (Jan 1, 1970). Its range is limited to about 1970-2038 (2**31 seconds). DATETIME has a much larger range: 1000-9999 A.D. I think it's entirely plausible that a MySQL server (or database) will last to 2038 -- our mainframes running COBOL lasted into the third millenium.
This is quite an interesting topic. But then, to save space, you would need the date, and the year columns, which will be 4 bytes long as well. If "time of event" is also needed, add the time column type, and you're left with 7 bytes instead of 8.
But, how far into the future can the date / year column types go? And wouldn't it be possiblem for the server developers to update the timestamp column type?
Unix Timestamp columns are actually more space-efficient than regular columns. Right now, MySQL stores TIMESTAMPs with 4 bytes, which is pretty much equivalent to a signed integer. This gives you a range of roughly 2^32 / 2 seconds, which is equal to the 68 years that a MySQL timestamp spans. As the need for post-2038 dates increases, it will probably be pretty easy in the MySQL code to turn TIMESTAMP columns into the equivalent of a unsigned integer, which will last into the year 2106. Applying this concept even further, if MySQL were to store timestamps internally as an 8 byte unsigned integer (same size as a current Date/Time Column) or if a user was to use an unsigned bigint column to store their timestamps, instead of MySQL's TIMESTAMP field, it would be able to store timestamps through the middle of the year 584,542,048,060. The reason why DATETIME, TIME, and DATE columns store data so inefficiently is due to the nature of dates. Times are rather hard to fit into base 1, which is how binary numbers are stored. If you want to store just the minute of a time, you need to be able to store values between 1 and 59. However, the smallest number of bits (binary digits) required to store that data is 6. However, 6 bits are able to store values between 1 and 64. Since the values between 61 and 64 are never needed or used, the ability to have those numbers amounts to wasted space. With the exception of the year information, hours, minutes, seconds, months and days, do not require a number of values exactly equal to a power of 2, and thus waste data. When the data is formed in a unix timestamp, no such loss occurs as the number of seconds since 1970 can be equal to any possible integral value.
Had a lot of trouble finding the maximum table size in bytes for capacity planning. More specifically it was InnoDB tables that I had a problem with. Average row size is good, but I wanted maximum row size.
I checked several products and could not find what I wanted. Some of the tables I deal with are 300+ fields and so manual calculation was not practical.
So I wrote a little perl script that does it. Thought it might be of some use, so I include it here...it does all field types except enum/set types. It does not calculate anything regarding index size.
Just do a mysqldump -d (just the schema) of your DB to a file, and run this perl script specifying the schema file as the only argument.
----------------------------------------------------------------
#!/usr/bin/perl
use Data::Dumper;
use strict;
$| = 1;
my %DataType =
("TINYINT"=>1,
"SMALLINT"=>2,
"MEDIUMINT"=>3,
"INT"=>4,
"BIGINT"=>8,
"FLOAT"=>'if ($M <= 24) {return 4;} else {return 8;}',
"DOUBLE"=>8,
"DECIMAL"=>'if ($M < $D) {return $D + 2;} elsif ($D > 0) {return $M + 2;} else {return $M + 1;}',
"NUMERIC"=>'if ($M < $D) {return $D + 2;} elsif ($D > 0) {return $M + 2;} else {return $M + 1;}',
"DATE"=>3,
"DATETIME"=>8,
"TIMESTAMP"=>4,
"TIME"=>3,
"YEAR"=>1,
"CHAR"=>'$M',
"VARCHAR"=>'$M+1',
"TINYBLOB"=>'$M+1',
"TINYTEXT"=>'$M+1',
"BLOB"=>'$M+2',
"TEXT"=>'$M+2',
"MEDIUMBLOB"=>'$M+3',
"MEDIUMTEXT"=>'$M+3',
"LONGBLOB"=>'$M+4',
"LONGTEXT"=>'$M+4');
my $D;
my $M;
my $dt;
my $fieldCount = 0;
my $byteCount = 0;
my $fieldName;
open (TABLEFILE,"< $ARGV[0]");
LOGPARSE:while (<TABLEFILE>)
{
chomp;
if ( $_ =~ s/create table[ ]*([a-zA-Z_]*).*/$1/i )
{
print "Fieldcount: $fieldCount Bytecount: $byteCount\n" if $fieldCount;
$fieldCount = 0;
$byteCount = 0;
print "\nTable: $_\n";
next;
}
next if $_ !~ s/(.*)[ ]+(TINYINT[ ]*\(*[0-9,]*\)*|SMALLINT[ ]*\(*[0-9,]*\)*|MEDIUMINT[ ]*\(*[0-9,]*\)*|INT[ ]*\(*[0-9,]*\)*|BIGINT[ ]*\(*[0-9,]*\)*|FLOAT[ ]*\(*[0-9,]*\)*|DOUBLE[ ]*\(*[0-9,]*\)*|DECIMAL[ ]*\(*[0-9,]*\)*|NUMERIC[ ]*\(*[0-9,]*\)*|DATE[ ]*\(*[0-9,]*\)*|DATETIME[ ]*\(*[0-9,]*\)*|TIMESTAMP[ ]*\(*[0-9,]*\)*|TIME[ ]*\(*[0-9,]*\)*|YEAR[ ]*\(*[0-9,]*\)*|CHAR[ ]*\(*[0-9,]*\)*|VARCHAR[ ]*\(*[0-9,]*\)*|TINYBLOB[ ]*\(*[0-9,]*\)*|TINYTEXT[ ]*\(*[0-9,]*\)*|BLOB[ ]*\(*[0-9,]*\)*|TEXT[ ]*\(*[0-9,]*\)*|MEDIUMBLOB[ ]*\(*[0-9,]*\)*|MEDIUMTEXT[ ]*\(*[0-9,]*\)*|LONGBLOB[ ]*\(*[0-9,]*\)*|LONGTEXT[ ]*\(*[0-9,]*\)*).*/$2/gix;
$fieldName=$1;
$_=uc;
$D=0;
($D = $_) =~ s/.*\,([0-9]+).*/$1/g if ( $_ =~ m/\,/ );
$_ =~ s/\,([0-9]*)//g if ( $_ =~ m/\,/ );
($M = $_) =~ s/[^0-9]//g;
$M=0 if ! $M;
($dt = $_) =~ s/[^A-Za-z_]*//g;
print "$fieldName $_:\t".eval($DataType{"$dt"})." bytes\n";
++$fieldCount;
$byteCount += eval($DataType{"$dt"});
}
print "Fieldcount: $fieldCount Bytecount: $byteCount\n";
A timestamp uses less space, but keep in mind that when you make an UPDATE to your row, the first timestamp you have, WILL be updated with the current time.
Automatic updating of the first TIMESTAMP column in a table occurs under any of the following conditions:
* You explicitly set the column to NULL.
* The column is not specified explicitly in an INSERT or LOAD DATA INFILE statement.
* The column is not specified explicitly in an UPDATE statement and some other column changes value. An UPDATE that sets a column to the value it already has does not cause the TIMESTAMP column to be updated; if you set a column to its current value, MySQL ignores the update for efficiency.
http://dev.mysql.com/doc/mysql/en/TIMESTAMP_pre-4.1.html
Previously: "I cannot see why a TIMESTAMP....." A timestamp is always "now" or with regard to the *nix epoch, is always in the future. Therefore, the db never has to store a date "before" the *nix epoch. My guess is a time stamp is stored as *nix "seconds from the epoch". Currently, *nix stores this as a 4 byte integer (to be 8 byte in the future at some point). However, DATETIME types have to hold any date, even those before the *nix epoch. Hence, I guess they can be very large for BC dates or even larger for dates close to the end of the Universe!
Oh, and don't use TIMESTAMP over DATETIME to save space if you ever plan on running an update query on the record! Why? rtfm
IRT: Rudi Ahlers on June 6 2004 8:22am
The DATE type already contains the year portion. And the supported range is '1000-01-01' to '9999-12-31'. (storage: 3 bytes)
TIME values may range from '-838:59:59' to '838:59:59'. (storage: 3 bytes).
This means that the DATETIME type should require 6 bytes of storage, instead of 8. The 2 remaining bytes "could" be used for storing fractions of a second.
Notice the BLOB's max size is only 64K. MySQL's JDBC trunicated my .wav file at 64K w/ no Exception being generated. Look at a bigger BLOB like MEDIUMBLOB.
Follow up on previous comment. Hibernate told me that the BLOB column had been trunicated but using the JDBC driver did not generate an Exception.
I suppose the 8 bytes size of DATETIME is for better performances (CPU like 2's power memory alignments). If you use DATETIME instead of two separated columns DATE and TIME, it may be because you need to process them at once for comparing, etc... So, you loose few memory space but you increase processing performances.
If you think that "a byte is a byte" and should not be lost this way, have a look at M$ implementation of SYSTEMTIME struct (somewhere in VC98/Include/WTYPES.h):
typedef struct _SYSTEMTIME
{
WORD wYear;
WORD wMonth;
WORD wDayOfWeek;
WORD wDay;
WORD wHour;
WORD wMinute;
WORD wSecond;
WORD wMilliseconds;
} SYSTEMTIME;
And yes, WORD type is a 16bits unsigned word... :o).
This thing takes twice the DATETIME memory space!
I suppose some peoples did not want to make the "640KB is enough" mistake again.
Here's a modification of Marc's script above that also handles ENUM's. Enjoy.
#!/usr/bin/perl
use Data::Dumper;
use strict;
$| = 1;
my %DataType =
("TINYINT"=>1, "SMALLINT"=>2, "MEDIUMINT"=>3,
"INT"=>4, "BIGINT"=>8,
"FLOAT"=>'if ($M <= 24) {return 4;} else {return 8;}',
"DOUBLE"=>8,
"DECIMAL"=>'if ($M < $D) {return $D + 2;} elsif ($D > 0) {return $M + 2;} else {return $M + 1;}',
"NUMERIC"=>'if ($M < $D) {return $D + 2;} elsif ($D > 0) {return $M + 2;} else {return $M + 1;}',
"DATE"=>3, "DATETIME"=>8, "TIMESTAMP"=>4, "TIME"=>3, "YEAR"=>1,
"CHAR"=>'$M', "VARCHAR"=>'$M+1',
"ENUM"=>1,
"TINYBLOB"=>'$M+1', "TINYTEXT"=>'$M+1',
"BLOB"=>'$M+2', "TEXT"=>'$M+2',
"MEDIUMBLOB"=>'$M+3', "MEDIUMTEXT"=>'$M+3',
"LONGBLOB"=>'$M+4', "LONGTEXT"=>'$M+4');
my ($D, $M, $dt);
my $fieldCount = 0;
my $byteCount = 0;
my $fieldName;
open (TABLEFILE,"< $ARGV[0]");
LOGPARSE:while (<TABLEFILE>) {
chomp;
if ( $_ =~ s/create table[ ]`*([a-zA-Z_]*).*`/$1/i ) {
print "Fieldcount: $fieldCount Bytecount: $byteCount\n" if $fieldCount;
$fieldCount = 0;
$byteCount = 0;
print "\nTable: $_\n";
next;
}
next if $_ !~ s/(.*)[ ]+(TINYINT[ ]*\(*[0-9,]*\)*|SMALLINT[ ]*\(*[0-9,]*\)*|MEDIUMINT[ ]*\(*[0-9,]*\)*|INT[ ]*\(*[0-9,]*\)*|BIGINT[ ]*\(*[0-9,]*\)*|FLOAT[ ]*\(*[0-9,]*\)*|DOUBLE[ ]*\(*[0-9,]*\)*|DECIMAL[ ]*\(*[0-9,]*\)*|NUMERIC[ ]*\(*[0-9,]*\)*|DATE[ ]*\(*[0-9,]*\)*|DATETIME[ ]*\(*[0-9,]*\)*|TIMESTAMP[ ]*\(*[0-9,]*\)*|TIME[ ]*\(*[0-9,]*\)*|YEAR[ ]*\(*[0-9,]*\)*|CHAR[ ]*\(*[0-9,]*\)*|VARCHAR[ ]*\(*[0-9,]*\)*|TINYBLOB[ ]*\(*[0-9,]*\)*|TINYTEXT[ ]*\(*[0-9,]*\)*|ENUM[ ]*\(*['A-Za-z_,]*\)*|BLOB[ ]*\(*[0-9,]*\)*|TEXT[ ]*\(*[0-9,]*\)*|MEDIUMBLOB[ ]*\(*[0-9,]*\)*|MEDIUMTEXT[ ]*\(*[0-9,]*\)*|LONGBLOB[ ]*\(*[0-9,]*\)*|LONGTEXT[ ]*\(*[0-9,]*\)*).*/$2/gix;
$fieldName=$1;
$_=uc;
$D=0;
($D = $_) =~ s/.*\,([0-9]+).*/$1/g if ( $_ =~ m/\,/ );
$_ =~ s/\,([0-9]*)//g if ( $_ =~ m/\,/ );
($M = $_) =~ s/[^0-9]//g;
$M=0 if ! $M;
($dt = $_) =~ s/\(.*\)//g;
$dt =~ s/[^A-Za-z_]*//g;
print "$fieldName $_:\t".eval($DataType{"$dt"})." bytes\n";
++$fieldCount;
$byteCount += eval($DataType{"$dt"});
}
print "Fieldcount: $fieldCount Bytecount: $byteCount\n";
"A timestamp uses less space, but keep in mind that when you make an UPDATE to your row, the first timestamp you have, WILL be updated with the current time."
It doesn't have to be so. You can create the table with all timestamp columns defaulted to 0000-00-00 00:00:00, not CURRENT_TIMESTAMP
Add your own comment.