sql - Tinyint vs Bit? - Stack Overflow

Collectives™ on Stack Overflow

Find centralized, trusted content and collaborate around the technologies you use most.

Learn more about Collectives

Teams

Q&A for work

Connect and share knowledge within a single location that is structured and easy to search.

Learn more about Teams

I don't want to touch-off a religious war here, but there seem to be two schools of thoughts in how to represent boolean values in a database. Some say bit is the appropriate data type, while others argue tinyint is better.

The only differences I'm aware of are these:

bit

: storage size is 1 bit, possible values are 0 or 1


    tinyint

: storage size is 1 byte, possible values are 0-255

Which data type is better when you need to represent boolean values? Is tinyint worth the extra overhead "just in case" you need to values > 1?

“Just in case” seems like a pretty fluid database design. Why not store everything as NVARCHAR(MAX) and cover all your bases? – Stuart Ainsworth Jul 21, 2018 at 18:54 TinyInt is my preference. Then, when doing aggregated counts against the field, you don't have to cast it. Also, some front-end languages interpret a Bit differently than others, and using a TinyInt makes validation checks universal for any front-end language. – Gregory Hart Jul 28, 2018 at 4:43 I just encountered an oddity with bit in phpMyAdmin. When I tell it to let the field be NULL and no default value is set, it defaults to <em>NULL</em> instead of NULL. +1 for tinyint btw – Vörös Imi Feb 20, 2019 at 17:55 when importing form csv file 1 works in case of tinyint(1), but in case of bit(1) you have to replace it to b'1' – Rajat Jun 5, 2019 at 14:14 in SQL server bit data type is 1 byte and tinyint is also 1 byte. you can cross check it with builtin function DATALENGTH(@MyVariable) – Abubakar Riaz Jun 15, 2021 at 5:12

When you add a bit column to your table it will occupy a whole byte in each record, not just a single bit. When you add a second bit column it will be stored in the same byte. The ninth bit column will require a second byte of storage. Tables with 1 bit column will not gain any storage benefit.

Tinyint and bit can both be made to work, I have used both successfully and have no strong preference.

That's a very helpful comment and your reputation is quite good but do you have any references to support it? Is it an implementation detail or do all engines handle it the same way? – Jon z Apr 16, 2013 at 15:35 It is quite clear from the reference of @shmosel that 1 bit(1) column takes 1 byte, but it is not so clear that two, three, four... until eight bit(1) columns take the same byte. I've searched that online without success. Could you reference that too? I'm interested just to know if, in case I have let's say four boolean columns that I need for my table, it may be worth to use bit(1) column instead of tinyint(1)s to save storage space. Thank you. – assensi Dec 30, 2020 at 1:25 @assensi Good point. You can always use a single


    BIT(n)

in place of

fields. Or you can use a regular

INT

and store each boolean as a bit. But if you're going with separate fields, I think


    TINYINT

is usually preferred to

BIT

in MySQL. – shmosel Dec 30, 2020 at 5:13

Bit...unless you're of the "true / false / file not found" clan

In case you didn't get the reference...

And in the case of Linq2SQL, bit works with true/false which makes it easier to program for. There's advantages to both.

And there's also programming maintenance to consider. What happens if you (or a junior intern programmer) uses a 2, 3, 25, 41, 167, 200 etc? Where is that documented? Bits are self-documenting and pretty universal.

@Pratik the problem is NULL means there is no value in the database. It do not mean file not found. Do this and you start to implicitly encode states into your rows that are hard to document and confusing. Kind of like having a table of items. How do i see if a item has been sold? I could look to see if it have a sales price, a sell date, a buyers name etc. Or i could enforce all that with a check constraint and create a bit field for Items sold. – CodeMonkey Aug 11, 2015 at 6:31

I use bits when appropriate. Aside from it being semantically the correct type (semantics count!), multiple bit fields (up to 8) in a single row (on SQL Server, anyway) can be consolidated into a single byte of storage. After the eighth, an additional byte is needed for the next 8, and so on.

References:

SQL 2000

SQL 2005

SQL 2008

fixed: Noted in 5.0.23, 5.1.12 changelogs. BIT columns in a table could cause joins that use the table to fail. – Antti Rytsölä Sep 23, 2014 at 12:16

A previous StackOverflow post: What is the difference between BIT and TINYINT in MySQL?

When adding a new "BOOL" column, MySQL actually uses TINYINT.

I'd just stick with BOOL (aka TINYINT ) and move on with life.

Zero Space for False

Whatever your choice, you can set to NULL instead of 0 and it will take up no extra space (since the database almost always has a NULL flag for every field of every row, just sitting there; more info here ). If you also make sure the default/most likely value is false , you'll save even more space!

Some Space for True

The value to represent true requires the space defined by the field type; using BIT will only save space if a table has multiple such columns, since it uses one byte per 8 fields (versus TINYINT which uses one byte per field).

TINYINT has the advantage of allowing you to customize an 8-value bitmask without worrying about managing a bunch of extra columns, and searching is theoretically faster (a single integer field versus several bit fields). But there are some disadvantages such as slower ordering, fancy cross-indexing stuff, and lack of field names. Which to me, is the biggest loss; your database would require external documentation to note which bits did what in which bitmasks.

In any case, avoid the temptation to use TEXT fields to store booleans or sets of them. Searching through text is a lot more work for the server, and arbitrary naming schemes like "on, off, off" can hurt interoperability.

All these theorentical discussions are great, but in reality, at least if you're using MySQL and really for SQLServer as well, it's best to stick with non-binary data for your booleans for the simple reason that it's easier to work with when you're outputting the data, querying and so on. It is especially important if you're trying to achieve interoperability between MySQL and SQLServer (i.e. you sync data between the two), because the handling of BIT datatype is different in the two of them. SO in practice you will have a lot less hassles if you stick with a numeric datatype. I would recommend for MySQL to stick with BOOL or BOOLEAN which gets stored as TINYINT(1). Even the way MySQL Workbench and MySQL Administrator display the BIT datatype isn't nice (it's a little symbol for binary data). So be practical and save yourself the hassles (and unfortunately I'm speaking from experience).

I don't think I saw it mentioned above, but there's the issue of not being able to aggregate BIT columns (e.g. MIN, MAX, and especially SUM). I just tested using 2008 and the issue is still there. That's the biggest reason I use tinyint lately - the other being I like how tinyint scales - it's always a pain when your "two-value" bit flag suddenly needs more possible values.

You can aggregate them by casting them to another datatype - Why would you need to sum true/false though? – Martin Smith Mar 26, 2011 at 20:24 We frequently group on one field and sum up how many of another field is true for each group by result, the alternative to sum would be to return the whole result to code and loop it there, sometimes resulting in returning 1000x more data to the client. But casting eliminates that so it's not a problem. – David Mårtensson Jun 30, 2011 at 8:10 You could just make all of your tables a single TEXT column and put everything in there comma-delimited. Then you would never have to change your data model. – Tom H Jan 28, 2009 at 19:44 We have a somewhat unique environment. We have extremely large datasets AND 4 9's uptime, so altering tables is rather prohibitive (double that where replication is involved). We track all the bits in a centralized location, which helps avoid the maintenance issue. – Joe Jan 28, 2009 at 20:56

@Kevin: I believe you can use group by on bit fields (SQL Server 2005):

declare @t table (
    descr varchar(10),
    myBit1 bit, 
    myBit2 bit
insert into @t values ('test1', 0, 1)
insert into @t values ('test2', 1, 0)
insert into @t values ('test3', 1, 1)
insert into @t values ('test4', 0, 0)
select myBit1, count(myBit1) from @t group by myBit1
select myBit2, count(myBit1) from @t group by myBit2
Results:
myBit1 
------ -----------
0      2
1      2
myBit2 
------ -----------
0      2
1      2
                You can (and should) easily add a constraint to the column to only allow "T" and "F". That being said, the reporting layer should be COMPLETELY SEPARATE from the database. You should not alter your database schema just for the purposes of how a column will be displayed.
– Tom H
                Jan 28, 2009 at 19:48
                I agree with Darryl. Given the lack of support for boolean types in general RDBMS systems (MySQL is not alone here) T/F (actually I prefer Y/N) is much more readable.  While I agree in principle with Tom H's comments, I think that readability is much more important than he gives credit for.  Database developers don't look at the front end when changing someone else's code!  Also, it's not always necessarily clear which way round a developer considers 1 and 0 to be.  If we were all doing it the 'proper' old-fashioned way, we'd be using -1 to represent true, and 0 to represent false.
– cartbeforehorse
                Nov 22, 2012 at 14:24
                To my previous comment, I should add that it seems as though MySQL doesn't support CHECK constraints, which would complicate the T/F option, since you can't prevent the column with being populated by any other character of the alphabet.  Not nice.
– cartbeforehorse
                Nov 22, 2012 at 14:35
        Thanks for contributing an answer to Stack Overflow!
Please be sure to answer the question. Provide details and share your research!
But avoid …
Asking for help, clarification, or responding to other answers.
Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.