Collection of insane and fun facts about SQLite
SQLite is the most deployed and most used database. There are over one trillion (1000000000000 or a million million) SQLite databases in active use.
It is maintained by three people. They don’t allow outside contributions.
SQLite is likely used more than all other database engines combined. Billions and billions of copies of SQLite exist in the wild. It’s everywhere.
It is also probably one of the top five most deployed software modules.
Hwaci is the company behind SQLite. They are also into music(?)
SQLite originated from a US warship. D. Richard Hipp (DRH) was building software for the USS Oscar Austin, a Navy destroyer. The existing software would just stop working whenever the server went down (this was in the 2000s). For a battleship, this was unacceptable.
So DRH asked the question: what if the database just worked without any server? In DRH words:
Why do we even need a server? Why can’t I pull this directly off the disk drive? That way if the computer is healthy enough, it can run our application at all, we don’t have dependencies that can fail and cause us to fail, and I looked around and there were no SQL database engines that would do that, and one of the guys I was working with says, “Richard, why don’t you just write one?” “Okay, I’ll give it a try.” I didn’t do that right away, but later on, it was a funding hiatus. This was back in 2000, and if I recall correctly, Newt Gingrich and Bill Clinton were having a fight of some sort, so all government contracts got shut down, so I was out of work for a few months, and I thought, “Well, I’ll just write that database engine now.”
Unlike most open source project that uses standard licenses (like MIT or GPL), SQLite does not come with a OSI-approved license.
Instead, SQLite is released into the public domain and it has even fewer restrictions.
Also note that this can be a problem in some countries/jurisdictions where public domain is not recognized. Hence, SQLite sells licenses, called ‘Warranty of Title’.
They don’t allow outside contributions. You cannot just send a pull request and hope the patch will be accepted.
Open Source, Not Open Contribution
Contributing to SQLite is invite-only (I don’t have a source). Only after you are invited and have signed an affidavit dedicating your contribution to the public domain can you submit patches.
How do they cook?
There are over 600 lines of test code for every line of code in SQLite. Tests cover 100% of branches (and 100% MC/DC) in the library. The test suite is extremely diverse, including fuzz tests, boundary value tests, regression tests, and tests that simulate operating system crashes, power losses, I/O errors, and out-of-memory errors.
Interestingly, some SQLite tests are proprietary. The test suite called TH3 (Test Harness 3), which achieves 100% branch coverage of the code, is proprietary and is not open to access.
I don’t know any other project which has made code free, but test suites are paid.
However, they could not sell a single copy of TH3. DRH said in a podcast:
The 100% MCD tests, that’s called TH3. That’s proprietary. I had the idea that we would sell those tests to avionics manufacturers and make money that way. We’ve sold exactly zero copies of that so that didn’t really work out.
In order to get access, one needs to be part of SQLite Consortium, which costs $120K/yearly.
It’s an interesting business model. They generate revenue through licenses, paid support, maintenance services, consortium membership, and commercial extensions.
SQLite does not have a Code of Conduct (CoC), rather Code of Ethics derived from “instruments of good works” from chapter 4 of The Rule of St. Benedict
In SQLite: In place of a legal notice, here is a blessing:
All the of the source code files come with a blessing.
SQLite is so fast, they compete with
fopen
. For some use cases, you can use SQLite instead of a filesystem, that can be 35% faster.SQLite vs Redis (guess which is faster?)
For some usecases, the SQLite can be faster than Redis due to network stack and (de)serialisation overhead.
But, unlike most databases, SQLite has a single writer model. You cannot have more than one concurrent writer.
This was also changed recently in 2010 by adding WAL mode. Before that, you could have either readers or a writer, but never together.
There are other things which are very common in other databases but not in SQLite:
- The default is rollback journal mode, which restricts you to have either multiple readers or a single writer
- Foreign Keys are disabled; they are opt-in
- It is “weakly typed”. SQLite calls it “type affinity”. Meaning you can insert whatever in a column even though you have defined a type. Strong typed columns are opt-in (by
STRICT
tables). - Many of the
ALTER
commands you expect in other databases don’t work. For example, you cannot add a contraint to an existing column. (They recently added ability to rename a column name)
There is a whole list of quirks here.
I hate that it doesn’t enforce types. It’s totally YOLO:
CREATE TABLE user(id INTEGER); INSERT into user VALUES ("YOLO!"); --- This works!
Not only that, it does not throw any error if you give some random type.
CREATE TABLE t(value TIMMYSTAMP);
There is no
TIMMYSTAMP
type, but SQLite accepts this happily.SQLite has five types:
NULL
,INTEGER
,REAL
,TEXT
,BLOB
. Want to know something cursed? The type affinity works by substring match!CREATE TABLE t(value SPONGEBLOB) --- This is BLOB type!
So yeah, this happens too:
Note that a declared type of “FLOATING POINT” would give INTEGER affinity, not REAL affinity, due to the “INT” at the end of “POINT”.
This is one my favorite lore. SQLite had to change the default prefix from
sqlite_
toetilqs_
when users started calling developers in the middle of the nightSQLite takes backward compatibility very seriously
All releases of SQLite version 3 can read and write database files created by the very first SQLite 3 release (version 3.0.0) going back to 2004-06-18. This is “backwards compatibility”. The developers promise to maintain backwards compatibility of the database file format for all future releases of SQLite 3.
But they take backward compatibility so seriously that even if they have shipped a bug, they won’t fix it
SQLite’s author D. Richard Hipp (DRH) did not find existing version control systems suitable. So he wrote his own called Fossil. Fossil is powered by SQLite, of course.
This reminds me of how Linus wrote Git.
DRH also wrote his own parser generator called Lemon.
DRH wrote the B-Tree based on the algorithm in the book TAOCP by Donald Knuth, coding it on a plane while traveling (super based)
SQLite is pronounced as “Ess-Cue-El-ite”. There is no official guideline though. DRH mentioned in the SQLite forums:
I wrote SQLite, and I think it should be pronounced “S-Q-L-ite”. Like a mineral. But I’m cool with y’all pronouncing it any way you want.
:-)
That’s it for today! If I missed any, let me know. Happy holidays and Happy New Year! 🎄 ☃️
1. Sources: Most Deployed, Public Domain and Contributions, Testing, Paid support, Faster than filesystem, SQLite History.
2. I posted this as thread on Twitter, where bunch of people provided more sources. Thanks to them: 1, 2, 3, 4.
3. After following the discussions on Reddit and HN, I have rephrased the license section (#6). Previously, it read: SQLite is not open source in the legal sense, as “open source” has a specific definition and requires licenses approved by the Open Source Initiative (OSI).