When reading and writing database records, be aware that behavior differs slightly depending on whether your database supports duplicate records. Two or more database records are considered to be duplicates of one another if they share the same key. The collection of records sharing the same key is called a duplicates set. In DB, a given key is stored only once for a single duplicates set.
By default, DB databases do not support duplicate records. Where duplicate records are supported, cursors (see below) are typically used to access all of the records in the duplicates set.
DB provides two basic mechanisms for the storage and retrieval of database key/data pairs:
The DB->put() and DB->get() methods provide the easiest access for all non-duplicate records in the database. These methods are described in this section.
Cursors provide several methods for putting and getting database records. Cursors and their database access methods are described in Using Cursors.
Records are stored in the database using whatever organization is required by the access method that you have selected. In some cases (such as BTree), records are stored in a sort order that you may want to define (see Setting Comparison Functions for more information).
In any case, the mechanics of putting and getting database records do not change once you have selected your access method, configured your sorting routines (if any), and opened your database. From your code's perspective, a simple database put and get is largely the same no matter what access method you are using.
You use DB->put() to put, or write, a database record. This method requires you to provide the record's key and data in the form of a pair of DBT structures. You can also provide one or more flags that control DB's behavior for the database write.
Of the flags available to this method, DB_NOOVERWRITE is the one you are most likely to need. This flag disallows overwriting (replacing) an existing record in the database. If the provided key already exists in the database, then this method returns DB_KEYEXIST, even if the database supports duplicates.
For example:
    #include <db.h>
    #include <string.h>
    ...
    char *description = "Grocery bill.";
    DBT key, data;
    DB *my_database;
    int ret;
    float money;

    /* Database open omitted for clarity */

    money = 122.45;

    /* Zero out the DBTs before using them. */
    memset(&key, 0, sizeof(DBT));
    memset(&data, 0, sizeof(DBT));

    key.data = &money;
    key.size = sizeof(float);

    data.data = description;
    data.size = strlen(description) + 1;

    ret = my_database->put(my_database, NULL, &key, &data, DB_NOOVERWRITE);
    if (ret == DB_KEYEXIST) {
        my_database->err(my_database, ret,
          "Put failed because key %f already exists", money);
    }
You can use the DB->get() method to retrieve database records. Note that if your database supports duplicate records, then by default this method returns only the first record in a duplicates set. For this reason, if your database supports duplicates, the common solution is to use a cursor to retrieve records from it. Cursors are described in Using Cursors.
(You can also retrieve a set of duplicate records using a bulk get. To do this, you use the DB_MULTIPLE flag on the call to DB->get(). For more information, see the DB Programmer's Reference Guide).
By default, DB->get() returns the first record found whose key matches the key provided on the call to this method. If your database supports duplicate records, you can change this behavior slightly by supplying the DB_GET_BOTH flag. This flag causes DB->get() to return the first record that matches both the provided key and the provided data.
If the specified key and/or data does not exist in the database, this method returns DB_NOTFOUND.
For example:
    #include <db.h>
    #include <string.h>
    ...
    DBT key, data;
    DB *my_database;
    int ret;
    float money;
    /* DESCRIPTION_SIZE is assumed to be defined elsewhere. */
    char description[DESCRIPTION_SIZE + 1];

    /* Database open omitted for clarity */

    money = 122.45;

    /* Zero out the DBTs before using them. */
    memset(&key, 0, sizeof(DBT));
    memset(&data, 0, sizeof(DBT));

    key.data = &money;
    key.size = sizeof(float);

    data.data = description;
    data.ulen = DESCRIPTION_SIZE + 1;
    data.flags = DB_DBT_USERMEM;

    ret = my_database->get(my_database, NULL, &key, &data, 0);
    if (ret != 0) {
        my_database->err(my_database, ret, "Get failed for key %f", money);
    }
    /*
     * On success, description is set into the memory that we supplied.
     */
Note that in this example, the data.size field would be automatically set to the size of the retrieved data.
You can use the DB->del() method to delete a record from the database. If your database supports duplicate records, then all records associated with the provided key are deleted. To delete just one record from a list of duplicates, use a cursor. Cursors are described in Using Cursors.
You can also delete every record in the database by using DB->truncate().
For example:
    #include <db.h>
    #include <string.h>
    ...
    DBT key;
    DB *my_database;
    float money = 122.45;

    /* Database open omitted for clarity */

    /* Zero out the DBT before using it. */
    memset(&key, 0, sizeof(DBT));

    key.data = &money;
    key.size = sizeof(float);

    my_database->del(my_database, NULL, &key, 0);
When you perform a database modification, your modification is made in the in-memory cache. This means that your data modifications are not necessarily flushed to disk, and so your data may not appear in the database after an application restart.
Note that as a normal part of closing a database, its cache is written to disk. However, in the event of an application or system failure, there is no guarantee that your databases will close cleanly. In this event, it is possible for you to lose data. Under extremely rare circumstances, it is also possible for you to experience database corruption.
Therefore, if you care whether your data is durable across system failures, and you want to guard against the rare possibility of database corruption, you should use transactions to protect your database modifications. Every time you commit a transaction, DB ensures that the data will not be lost due to application or system failure. Transaction usage is described in the Berkeley DB Getting Started with Transaction Processing guide.
If you do not want to use transactions, then the assumption is that your data is of a nature that it need not exist the next time your application starts. You may want this if, for example, you are using DB to cache data relevant only to the current application runtime.
If, however, you are not using transactions for some reason and you still want some guarantee that your database modifications are persistent, then you should periodically call DB->sync(). Syncs cause any dirty entries in the in-memory cache and the operating system's file cache to be written to disk. As such, they are quite expensive and you should use them sparingly.
Remember that by default a sync is performed any time a non-transactional database is closed cleanly. (You can override this behavior by specifying DB_NOSYNC on the call to DB->close().) That said, you can manually run a sync by calling DB->sync().
If your application or system crashes and you are not using transactions, then you should either discard and recreate your databases, or verify them. You can verify a database using DB->verify(). If your databases do not verify cleanly, use the db_dump command to salvage as much of the database as is possible. Use either the -R or -r command line options to control how aggressive db_dump should be when salvaging your databases.