Better String library Security Statement
----------------------------------------

by Paul Hsieh

===============================================================================

Introduction
------------

The Better String library (hereafter referred to as Bstrlib) is an attempt to
provide improved string processing functionality to the C and C++ languages.
At the heart of the bstring library is the management of "bstring"s which are
a significant improvement over '\0' terminated char buffers. See the
accompanying documentation file bstrlib.txt for more information.

DISCLAIMER:

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
    AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
    IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
    ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
    LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
    CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
    SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
    INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
    CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
    ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
    POSSIBILITY OF SUCH DAMAGE.

As with any software, there is always a possibility of failure due to a flawed
implementation. Nevertheless a good faith effort has been made to minimize
such flaws in Bstrlib. Also, use of Bstrlib by itself will not make an
application secure or free from implementation failures. However, it is the
author's belief that use of Bstrlib can greatly facilitate the creation of
software meeting the highest possible standards of security.

Part of the reason this document was created is to support security auditing,
or the creation of further "Statements on Security" for software that uses
Bstrlib.
An auditor may check the claims below against Bstrlib, and use this as a basis
for analysis of software which uses Bstrlib.

===============================================================================

Statement on Security
---------------------

This is a document intended to give consumers of the Better String Library
who are interested in security an idea of where the Better String Library
stands on various security issues. Any deviation observed in the actual
library itself from the descriptions below should be considered a bug, and
not a design flaw.

Common security issues:
.......................

1. Buffer Overflows

The Bstrlib API allows the programmer a way to deal with strings without
having to deal with the buffers containing them. Ordinary usage of the
Bstrlib API itself makes buffer overflows impossible.

Furthermore, the Bstrlib API offers a superset of the basic string
functionality found in the C library's char * functions, C++'s std::string
class and Microsoft's MFC based CString class. It also has abstracted
mechanisms for dealing with IO. This is important because it lets developers
migrate all of their code without losing functionality.

2. Memory size overflow/wrap around attack

Bstrlib is, by design, impervious to memory size overflow attacks. The reason
it is resilient to length overflows is that bstring lengths are bounded above
by INT_MAX, instead of ~(size_t)0. Length addition overflows therefore wrap
the integer value around to a negative number, which causes balloc() to fail
before an erroneous operation can occur. Attempted conversions of char *
strings which may have lengths greater than INT_MAX are detected and the
conversion is aborted.

It is unknown if this property holds on machines that don't represent
integers as two's complement. It is recommended that Bstrlib be carefully
audited by anyone using a system which is not two's complement based.

3. Constant string protection

Bstrlib implements runtime enforced constant and read-only string semantics.
That is, bstrings which are declared as constant via the bsStatic() macro
cannot be modified or deallocated directly through the Bstrlib API, and this
cannot be subverted by casting or other type coercion. This is independent of
the use of the const_bstring data type.

4. Aliased bstring support

Bstrlib detects and supports aliased parameter management throughout the API.
The kind of aliasing that is allowed is the one where pointers of the same
basic type may be pointing to overlapping objects (this is the assumption the
ANSI C99 specification makes). Each function behaves as if all read-only
parameters were copied to temporaries which are used in their stead before
the function executes (it rarely actually does this). No function in Bstrlib
uses the "restrict" parameter attribute from the ANSI C99 specification.

5. Information leaking

In bstraux.h, using the semantically equivalent macros bSecureDestroy() and
bSecureWriteProtect() in place of bdestroy() and bwriteprotect() respectively
will ensure that stale data does not linger in the heap's free space after
strings have been released back to the heap. Created bstrings or CBStrings
are not linked to anything external to themselves, and thus cannot cause
deterministic data leaks.

If a bstring is resized, the preimage may exist as a copy that is released to
the heap. Thus for sensitive data, the bstring should be sufficiently
presized before being manipulated so that it is not resized.

bSecureInput() has been supplied in bstraux.c, which can be used to obtain
input securely without any risk of leaving any part of the input image in the
heap except for the allocated bstring that is returned.

6. Memory leaking

Bstrlib can be built using memdbg.h enabled via the BSTRLIB_MEMORY_DEBUG
macro.
User-supplied definitions of malloc, realloc and free can then be provided to
implement strategies for detecting memory corruption or memory leaks.
Otherwise, Bstrlib does not do anything out of the ordinary to attempt to
deal with the standard problem of memory leaking (i.e., losing references to
allocated memory) when programming in the C and C++ languages. However, it
does not compound the problem either, as it has no intrinsic, inescapable
leaks.

Bstrlib does not preclude the use of automatic garbage collection mechanisms
such as the Boehm garbage collector.

7. Encryption

Bstrlib does not present any built-in encryption mechanism. However, it
supports full binary contents in its data buffers, so any standard block
based encryption mechanism can make direct use of bstrings/CBStrings for
buffer management.

8. Double freeing

Freeing a pointer that is already free is an extremely rare but nevertheless
potentially ruthlessly corrupting operation (it's possible to cause Win 98 to
reboot by calling free multiple times on already freed data using the WATCOM
CRT). Bstrlib invalidates the bstring header data before freeing, so that in
many cases a double free will be detected and an error will be reported
(though this behaviour is not guaranteed and should not be relied on).

Using bstrFree pervasively (instead of bdestroy) can somewhat improve the
avoidance of invalid frees. For example:

    struct tagbstring hw = bsStatic ("Hello, world");
    bstring cpHw = bstrcpy (&hw);

    #ifdef NOT_QUITE_AS_SAFE
    bdestroy (cpHw); /* Never fail */
    bdestroy (cpHw); /* Error sometimes detected at runtime */
    bdestroy (&hw);  /* Error detected at run time */
    #else
    bstrFree (cpHw); /* Never fail */
    bstrFree (cpHw); /* Will do nothing */
    bstrFree (&hw);  /* Will lead to a compile time error */
    #endif

9. Resource based denial of service

bSecureInput() has been supplied in bstraux.c. It has an optional upper limit
for input length.
But unlike fgets(), it is also easy to determine whether the input has been
truncated early. In this way, a program can set an upper limit on input sizes
while still allowing for implementing context specific truncation semantics
(i.e., does the program consume but dump the extra input, or does it consume
it in later inputs?).

10. Mixing char *'s and bstrings

The bstring and char * representations are not identical. So there is a risk
when converting back and forth that data may be lost. Essentially bstrings
can contain '\0' as a valid non-terminating character, while char * strings
cannot and in fact must use the character as a terminator. The risk of data
loss is very low, since:

    A) the simple method of only using bstrings in a char * semantically
       compatible way is both easy to achieve and pervasively supported.
    B) obtaining '\0' content in a string is either deliberate or indicative
       of another, likely more serious problem in the code.
    C) the library comes with various functions which deal with this issue
       (namely: bfromcstr(), bstr2cstr(), and bSetCstrChar()).

Marginal security issues:
.........................

11. 8-bit versus 9-bit portability

Bstrlib uses CHAR_BIT and other limits.h constants to the maximum extent
possible to avoid portability problems. However, Bstrlib has not been tested
on any system that does not represent char as 8 bits. So whether or not it
works on 9-bit systems is an open question. It is recommended that Bstrlib be
carefully audited by anyone using a system in which CHAR_BIT is not 8.

12. EBCDIC/ASCII/UTF-8 data representation attacks

Bstrlib uses ctype.h functions to ensure that it remains portable to
non-ASCII systems. It also checks range to make sure it is well defined even
for data that ANSI does not define for the ctype functions.

Obscure issues:
...............

13. Data attributes

There is no support for a Perl-like "taint" attribute; however, an example of
how to do this using C++'s type system is given.
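
To illustrate item 13, the following is a minimal sketch of how a Perl-like
"taint" attribute might be modelled with C++'s type system. It is not taken
from Bstrlib: the names Tainted, untaint_if, is_alnum_only and run_query are
hypothetical, and std::string is used only as a stand-in so that the sketch
compiles without the library (a CBString could presumably be wrapped in the
same way).

```cpp
#include <cctype>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical wrapper: a value held in Tainted<S> is considered untrusted.
// It deliberately offers no implicit conversion to the wrapped type S.
template <typename S>
class Tainted {
public:
    explicit Tainted(S raw) : raw_(std::move(raw)) {}

    // The only way to get the data out: the caller supplies a validator.
    // On success the value is copied to 'out' and true is returned;
    // on failure 'out' is left untouched and false is returned.
    template <typename Validator>
    bool untaint_if(Validator ok, S &out) const {
        if (!ok(raw_)) return false;
        out = raw_;
        return true;
    }

private:
    S raw_;
};

// Example validator (an assumption of this sketch, not a general policy):
// accept only non-empty, purely alphanumeric input.
inline bool is_alnum_only(const std::string &s) {
    if (s.empty()) return false;
    for (unsigned char c : s) {
        if (!std::isalnum(c)) return false;
    }
    return true;
}

// A sink that accepts only untainted data. Passing a Tainted<std::string>
// here does not compile, because no conversion to std::string exists.
inline std::string run_query(const std::string &user) {
    return "SELECT * FROM users WHERE name = '" + user + "'";
}

// The compiler itself confirms that tainted data cannot reach the sink
// by implicit conversion.
static_assert(!std::is_convertible<Tainted<std::string>, std::string>::value,
              "tainted data must not convert implicitly");
```

The design choice here is that the taint is carried by the type rather than
by a runtime flag, so forgetting to validate shows up as a compile time error
instead of a runtime check that may be skipped.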