I have often gotten into discussions about how to store binary data, say PDF files, images, or even Word documents, in a database.
While technically it's pretty easy to do (this article shows you how to store binary data in SQL Server), just using some sort of BLOB column, I always reply with a question back: why would you EVER want to do such a thing?
First of all, my criticism of such a solution has nothing to do with performance. A blob column is not stored inside the "real" data rows; the data rows simply contain a pointer to the blob data, so there is not much of a performance hit. Some have suggested that queries selecting columns other than the blob would access more pages than needed, but that's not the case.
It is more of a principle. My point is that the database should only be used for storing data. A PDF file consists of data AND formatting instructions, so therefore don't put it into your database.
A PDF file in a database would be of no use for, say, a custom query.
On the contrary, by leaving it on disk (and saving the filename reference in the database) it might very well be of use, say by using some sort of indexing system capable of indexing PDF files. While you might set up an indexing system that fetches the PDF from the database and then feeds it to the indexer, that's not my point. My point is that the file is generally of no use in the database and should therefore be kept out of it.
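To make the "file on disk, reference in the database" idea concrete, here is a minimal sketch in Python. It is only an illustration under a few assumptions of mine: it uses an in-memory SQLite database as a stand-in for whatever database you actually run, and the `documents` table, the `store_document` and `load_document` helpers, and the hash-prefixed file naming are all hypothetical names I made up for the example.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def store_document(conn, storage_dir, name, content):
    """Write the bytes to disk and record only the path in the database."""
    storage_dir.mkdir(parents=True, exist_ok=True)
    # Prefix with a content hash so two uploads with the same name don't collide.
    digest = hashlib.sha256(content).hexdigest()
    path = storage_dir / f"{digest[:16]}_{name}"
    path.write_bytes(content)
    cur = conn.execute(
        "INSERT INTO documents (name, path, size_bytes) VALUES (?, ?, ?)",
        (name, str(path), len(content)),
    )
    conn.commit()
    return cur.lastrowid


def load_document(conn, doc_id):
    """Look up the stored path and read the file back from disk."""
    row = conn.execute(
        "SELECT path FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    if row is None:
        raise KeyError(doc_id)
    return Path(row[0]).read_bytes()


# Assumption: SQLite in memory stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "id INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, "
    "path TEXT NOT NULL, "
    "size_bytes INTEGER NOT NULL)"
)

# Assumption: a temporary directory stands in for the real file store.
storage = Path(tempfile.mkdtemp()) / "docs"
payload = b"%PDF-1.4 example bytes"
doc_id = store_document(conn, storage, "report.pdf", payload)
```

The database row holds only things you can actually query (name, path, size), while the file itself stays on disk where an indexer or any other tool can get at it directly. The obvious trade-off is that the file and the row are no longer written in one transaction, so you have to take care of keeping the two in sync yourself.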