Unspecified Errors in SqlServerCE
Wow… I can’t believe it has been so long since my last post.? I guess that’s what happens when you spend all your time trying to figure out ONE STINKING BUG.
This is the CF application I have been working on for over a year.? It is a huge application for CF, well outside the normal realm of expectiations, I would say.? Dozens of forms, user controls, custom classes, and more than 10 database tables.? Things were actually going quite well, until… unpredictably, seemingly at random, the software would throw an “Unspecified Error Occurred” message.? No further detail was available, but following the stack trace we would discover that it would happen the application was trying to read the database.? Not every time, and not always in the same place.? On the same dataset, it would happen once and not again for a while, then start happenging again and again and again and… need I go on?
It actually started about 6 months or so ago.? Our testers would be able to work for awhile, but once the error happened they had to shut the app down and restart.? Now you might be inclined to think (as I did) “no problem, just put it in debug and find out whats going on”.? Aye… there’s the rub!? You could NOT recreate it in Debug.? In fact, we apparently couldn’t recreate in the office at all!? The user would go to the field and within minutes have the error.? He would bring it back to the office and sit at my desk to show it to me and he could go hours, processing hundreds of records, without a single error.? Over the course of several weeks I processed thousands of records myself and never once received the error.
Naturally, I spent a good bit of time online trying to find some other poor sap who had had the same problem, but found precious little information.? I found a couple of references to open database connections, disposed objects, and Garbage Collection, so I restructered all the database connectivity, added a ton of memory management code, and littered my classes with GC.Collect() and GC.WaitForPendignFinalizers().? Everythign worked great for me, so I sent it back to the field.? No Joy… same error, same problems.
All indications were that it was a memory management problem, but not being a C++ pocket-protector type, I couldn’t find anything else to do.? What I did know, or so I thought, was that the problem was with SqlServerCE.? So I talked to a buddy of mine, and he pointed me to SQLite, a small footprint database that I could use to replace SqlServerCE.? And there is a .NET Managed Provider I can get for it too… and it’s all open source and no cost… so I researched it, and since there were some additional advantages for my particular project (and I had no other tricks in my bag), I spent a couple of weeks tearing SqlServerCE completely out of my project and replacing it with SQLite.? Fortunately, thanks to the consistency of .NET Managed Providers, I was able to complete the work with little trouble.? Again, my testing went just fine, so happy as a clam thinking I had corrected the issue, I sent it back to the field for testing.
The first record on the first day of testing threw the error… well, sort of.? Instead of “Unspecified Error”, SQLite began throwing “Could not insert to database. Database is full.” Huh?? This caused my angst in a number of ways: first, where the error was thrown, nothing was being inserted.? Second, while the project is large, the database is decidedly NOT.? A grand total of 1500 records or so in all 10 tables: I had read about users with tables in the MILLIONS in SQLite.? The internal memory of the device was no where near capacity, and neither was the storage capacity of the Compact Flash card the database was on.
I almost gave up on the project right then and there.? But since that would have meant also giving up my gainful employment, back to the Internet I went.? One nice thing about using something like SQLite is that I was able to communicate with an active community who uses the product, including the guy who actually develops and maintains it.? I won’t get into the gory details, but there was a lot of head scratching.? Eventually, someone suggested that it sounded like a problem with the CF Card.? It turns out that like Rewriteable CDs, CF Cards have a limited number of times they can be written over before they fail.? I was quite interested in this notion… I hadn’t considered a hardware problem, and all the testing was being done on a particular device with the same CF card over and over.? Why, this must be it!? So I trotted off to Staples, bought a new card, shoved the database on it and back out to the field goes the device, sure to work this time!
Do I really have to say what happened?? The tester was actually scared to call me and tell me that he got the error message almost immediately.? OK, now I AM ready to quit.? There is nothing left I can do… I’m physically and emotionally spent at this point.? I can’t sleep for the constant thinking and worrying about this problem.? The only thing left to do is contact the guy who wrote SQLite and pay him to fix the problem.? So I did… contact him that is.? And he replied.? We even spoke on the phone, and he politely told me that he had been following my problem on line and that he had never seen anything like it and that there was no explanation for why the database would do that given the circumstances.? He said that he would take my money but didn’t know if he could get anywhere.? He also suggested that I check first with the guys who wrote the managed provider for SQLite (since it wasn’t him and he isn’t a .NET guy).
So I did that next.? I got on the forums for the managed provider.? I went around the barn again, but to no avail.? We did, however seem to reach the same conclusion: there is something wrong with the memory management.? And the came THE POST. Someone posted that they had seen a problem with CF Cards not remounting correctly after hibernation.? Was I handling the database connection before and afterthe device shut down?
Ummm… no I wasn’t.? What a silly thing to have to do, I thought.? The poster said that he had had a similar problem, and that Microsoft changed somewhere back in 2003 how Windows Mobile handles remounting storage cards after shut down.? It used to operate the way I thought it should, but MicroSoft changed it… on purpose… because the user *might* have replaced the memory card while it was shut down.
And the light bulb went on.? Even back to the original SqlServerCE database, this is what was happening.? The application creates a single database connection when the user starts the application to the database on the CF Card.? User functions for a while, no problem.? Device shuts itself down in a period of idleness.? Application may or may not be able to maintain its connection to the database.? When it can, no probem.? When it can’t: bombo, weird messages ensue.? The problem only happened when the device shut down automatically to conserve power and was restarted.? That’s why when I was in DEBUG it would never happen: plugged in to my PC, the device never shutdown.? It’s why it wouldn’t happen in my office: the user was continuously processing the application, never letting it rest naturally between records as it would in the field.
I, timidly, began to think that this time we might actually have found something useful.? I spent a couple of days rewriting the software to move the database to internal memory (which does not suffer from the same remounting issues) when the application starts, and moving it back to the CF Card when it closes.? In essence, the CF Card becomes the transport mechanism and backup.
It worked great.? In fact, it was faster and more responsive according to the field testers… and nary an error in sight!? I haven’t felt so elated in years.? A great weight shifted off my shoulders.? It has completely reinvigorated me, and I am back working on the project 110%.? Which is a good thing, too, since the deadline is next week!
This unfortunately looks like it could be the answer to my current random error. In the field I’m getting logs of “Unspecified Error Occurred” with no stack trace.
Thanks for the pointer, I’ll direct myself down this path.
Have the same problem, but had to move my DB to the storage card as it exceeded space on device. Now have to find ways of detecting wakup, and then if storage card is present, before tearing down and reopenning the DB connections.