MY SITUATION:
I came across a situation where I need to store a value for user selection in one of my columns of a table. Now my options would be to:
Either declare the Column as char(1) and store the value as 'y' or 'n'
Or declare the Column as tinyint(1) and store the value as 1 or 0
The column, whichever way it is declared, may also be indexed for use within the application.
MY QUESTIONS:
So I wanted to know, which of the above two types:
Leads to faster query speed when that column is accessed (for the sake of simplicity, let's leave out mixing other queries or accessing other columns, please).
Is the most efficient way of storing and accessing data and why?
How does the access speed vary if the columns are indexed and when they are not?
My understanding is that since char(1) and tinyint(1) each take up only 1 byte of space, storage will not be an issue in this case. What remains is access speed. As far as I know, numeric indexing is faster and more efficient than anything else.
But this case is a tough one to decide, I think. I would definitely like to hear your experience on this one.
Thank you in advance.
                        Rate  insert tinyint(1)  insert char(1)  insert enum('y', 'n')
insert tinyint(1)      207/s                 --             -1%                   -20%
insert char(1)         210/s                 1%              --                   -19%
insert enum('y', 'n')  259/s                25%             23%                     --

                        Rate  insert char(1)  insert tinyint(1)  insert enum('y', 'n')
insert char(1)         221/s              --                -1%                   -13%
insert tinyint(1)      222/s              1%                 --                   -13%
insert enum('y', 'n')  254/s             15%                14%                     --

                        Rate  insert tinyint(1)  insert char(1)  insert enum('y', 'n')
insert tinyint(1)      234/s                 --             -3%                    -5%
insert char(1)         242/s                 3%              --                    -2%
insert enum('y', 'n')  248/s                 6%              2%                     --

                        Rate  insert enum('y', 'n')  insert tinyint(1)  insert char(1)
insert enum('y', 'n')  189/s                     --                -6%            -19%
insert tinyint(1)      201/s                     7%                 --            -14%
insert char(1)         234/s                    24%                16%              --

                        Rate  insert char(1)  insert enum('y', 'n')  insert tinyint(1)
insert char(1)         204/s              --                    -4%                -8%
insert enum('y', 'n')  213/s              4%                     --                -4%
insert tinyint(1)      222/s              9%                     4%                 --
It seems that, for the most part, enum('y', 'n') is the fastest to insert into (it had the highest rate in three of the five runs above).
                        Rate  select char(1)  select tinyint(1)  select enum('y', 'n')
select char(1)         188/s              --                -7%                    -8%
select tinyint(1)      203/s              8%                 --                    -1%
select enum('y', 'n')  204/s              9%                 1%                     --

                        Rate  select char(1)  select tinyint(1)  select enum('y', 'n')
select char(1)         178/s              --               -25%                   -27%
select tinyint(1)      236/s             33%                 --                    -3%
select enum('y', 'n')  244/s             37%                 3%                     --

                        Rate  select char(1)  select tinyint(1)  select enum('y', 'n')
select char(1)         183/s              --               -16%                   -21%
select tinyint(1)      219/s             20%                 --                    -6%
select enum('y', 'n')  233/s             27%                 6%                     --

                        Rate  select tinyint(1)  select char(1)  select enum('y', 'n')
select tinyint(1)      217/s                 --             -1%                    -4%
select char(1)         221/s                 1%              --                    -2%
select enum('y', 'n')  226/s                 4%              2%                     --

                        Rate  select char(1)  select tinyint(1)  select enum('y', 'n')
select char(1)         179/s              --               -14%                   -20%
select tinyint(1)      208/s             17%                 --                    -7%
select enum('y', 'n')  224/s             25%                 7%                     --
Selecting also seems to be fastest with the enum (it had the highest rate in all five runs above). Code can be found here
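As a reading aid for the tables above: each percentage appears to be the row's rate relative to the column's rate. The sketch below is only an assumption about how the cells were derived, not the benchmark code itself; the function name `comparison_table` is made up for illustration. It recomputes the cells of the first insert run from the three printed rates.

```python
def comparison_table(rates):
    """Return {row: {col: percent}}: how much faster (+) or slower (-) row is than col."""
    table = {}
    for row, row_rate in rates.items():
        table[row] = {}
        for col, col_rate in rates.items():
            if row == col:
                continue  # the diagonal is shown as "--" in the tables
            table[row][col] = round((row_rate / col_rate - 1) * 100)
    return table


# Rates (operations per second) copied from the first insert run.
first_insert_run = {
    "tinyint(1)": 207,
    "char(1)": 210,
    "enum('y', 'n')": 259,
}

table = comparison_table(first_insert_run)
for row, cells in table.items():
    print(row, cells)
```

Because the printed rates are already rounded, a recomputed cell in other runs could differ by a point from the published one.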
I think you should create the column as ENUM('n','y'). MySQL stores this type compactly (one byte per row for an ENUM with this few members). It also helps ensure that only allowed values are stored in the field.
You can also make it more human-friendly with ENUM('no','yes') without affecting performance, because the strings 'no' and 'yes' are stored only once, in the ENUM definition. MySQL stores only the index of the value in each row.
Also note how sorting by an ENUM column works:
ENUM values are sorted according to the order in which the enumeration members were listed in the column specification. (In other words, ENUM values are sorted according to their index numbers.) For example, 'a' sorts before 'b' for ENUM('a', 'b'), but 'b' sorts before 'a' for ENUM('b', 'a').
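The sorting rule quoted above can be modeled outside the database. The following is a minimal Python sketch, not MySQL code: it assumes each value is represented by its 1-based position in the definition list, and the helper name `enum_sort_key` is hypothetical.

```python
def enum_sort_key(members):
    """Build a sort key that orders values by their position in the ENUM definition."""
    index = {value: position for position, value in enumerate(members, start=1)}
    return lambda value: index[value]


rows = ["a", "b", "a", "b"]

# Same data, two different definitions: the definition order decides the sort order.
sorted_ab = sorted(rows, key=enum_sort_key(["a", "b"]))
sorted_ba = sorted(rows, key=enum_sort_key(["b", "a"]))

print(sorted_ab)  # ['a', 'a', 'b', 'b']
print(sorted_ba)  # ['b', 'b', 'a', 'a']
```

This is why ENUM('n','y') and ENUM('y','n') sort differently even though they hold the same strings.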
Using tinyint is more standard practice, and will allow you to more easily check the value of the field.
// Using tinyint 0 and 1, you can do this:
if ($row['admin']) {
    // user is admin
}

// Using char y and n, you will have to do this:
if ($row['admin'] == 'y') {
    // user is admin
}
I'm not an expert in the inner workings of MySQL, but intuitively, retrieving and sorting integer fields should be faster than character fields (I just get a feeling that 'a' > 'z' is more work than 0 > 1), and integers feel much more familiar from a computing perspective, in which 0s and 1s are the standard on/off flags. So the storage for integers seems to be better, it feels nicer, and it is easier to use in code logic. 0/1 is the clear winner for me.
You may also note that, to an extent, this is MySQL's official position, as well, from their documentation:
BOOL, BOOLEAN: These types are synonyms for TINYINT(1). A value of zero is considered false. Nonzero values are considered true.
If MySQL goes so far as to equate TINYINT(1) with BOOLEAN, it seems like the way to go.
To know it for sure, you should benchmark it. Or know that it probably will not matter that much in the grander view of the whole project.
Char columns have encodings and collations, and comparing them could involve unnecessary switches between encodings, so my guess is that an int will be faster. For the same reason, I think that updating an index on an int column is also faster. But again, it won't matter much.
CHAR can take up more than one byte, depending on the character set and table options you choose. Some characters can take three bytes to encode, so MySQL sometimes reserves that space, even if you only use y and n.
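To make the byte-count point concrete, here is a small Python sketch that measures how many bytes individual characters need in UTF-8. It only illustrates the encoding side; how much space MySQL actually reserves for a CHAR column depends on the character set, storage engine, and row format, which this sketch does not model. The helper name `utf8_byte_length` is made up for illustration.

```python
def utf8_byte_length(text):
    """Return the number of bytes needed to encode text as UTF-8."""
    return len(text.encode("utf-8"))


samples = ["y", "n", "é", "€"]
lengths = {ch: utf8_byte_length(ch) for ch in samples}

print(lengths)  # {'y': 1, 'n': 1, 'é': 2, '€': 3}
```

'y' and 'n' need one byte each, but a column declared with a multi-byte character set has to be able to hold characters like '€' as well.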
If you specify the types BOOL or BOOLEAN as a column type when creating a table in MySQL, it creates the column as TINYINT(1). Presumably this is the faster of the two.
Documentation
Also:
We intend to implement full boolean type handling, in accordance with standard SQL, in a future MySQL release.
While my hunch is that an index on a TINYINT would be faster than an index on a CHAR(1) due to the fact that there is no string-handling overhead (collation, whitespace, etc), I don't have any facts to back this up. My guess is that there isn't a significant performance difference that is worth worrying about.
However, because you're using PHP, storing as a TINYINT makes much more sense. Using the 1/0 values is equivalent to using true and false, even when they are returned as strings to PHP, and can be handled as such. You can simply do an if ($record['field']) with your results as a boolean check, instead of converting between 'y' and 'n' all the time.