Archive

Archive for May, 2010

Clearing a SQL Server Database

May 28, 2010 3 comments

UPDATE: Be sure to check out the follow-up article, Clearing a SQL Server Database, Take 2.


I posted recently in Project Greenfield: Testing the TDD Waters about a small conversion project I’ve been using as a test bed for some of the techniques and technologies I’ll be using in Project Greenfield. I know SQL and am very comfortable in DB2 on the IBM System i, but this is the most extensive work I have done with SQL Server in recent memory.  I have really appreciated the ease of database management provided by the tools integrated into Visual Studio.  I especially appreciate it since I cannot seem to get SQL Server Management Studio installed on my development machine, but I won’t go into that right now.

The Database Schema

As background for the rest of this article, here is the schema we’ll be discussing.  The diagram itself is too hairy to post, but the simplified version below should suffice.  In this case, “->” means “Parent Of” and indicates a 1 to many relationship.

RealEstateMaster
->  CardDetail
    ->  CardImprovementDetail
->  TransferHistory
->  LandDetail

This collection of tables and relationships is exposed by the Entity Framework as a RealEstateMaster Entity.  In the database, these tables also hold numeric Codes for various elements, each with a corresponding table, a “look up” table for normalization. There are well over a dozen of these, so I’ll not list them all, but they all function like so:

CodeTable
->  CardDetail
->  RealEstateMaster
->  LandDetail

From an Entity standpoint, these are not child and parent relationships, but from a database standpoint they do enforce the same type of Foreign Key constraints.  In other words, each code in the CardDetail table must exist in its corresponding CodeTable.

Starting Fresh

I have several scenarios where the conversion process requires a “fresh start”, in other words a clean database with no data in the tables.  This means that on demand I need to be able to wipe out all the data from all the tables in the database.  This seemingly simple task turned out to take a lot more effort to figure out than I originally anticipated.

Using Entity Framework

At first, I assumed (wrongly) that since I was using Entity Framework for all of my database access, there would be a way to do this built into the Entity Context.  I made the rookie mistake of equating my EF classes to direct access to the database and all its functionality.  I also made the rookie mistake of equating the EF classes to database tables: this one-to-one mapping is in no way necessary, so in hindsight I understand why there is no “TableName.Clear()” kind of an option.

I believe this problem can be solved using the EF classes, but it would be very cumbersome.  As I see it, it would require you to loop through the entire collection of RealEstateMaster entities and delete each one.  Each delete operation would also need to loop through its children and delete those records as well.  Afterwards, you could then do the same to each code table, which at that point should have no constraining records.

NOTE: The statements above are theoretical: I did not try this because it seemed like way too much work and not really a proper application of EF.  I chose EF because it provides an easier way to work with SQL Server, but when EF actually gets in the way, it tells me I should find a different solution.
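
Just to make the idea concrete, here is a rough, untested sketch of what that loop might have looked like.  It assumes an EF 4 ObjectContext, and the context, entity set, and navigation property names are hypothetical stand-ins based on the schema above, so treat it as an illustration rather than working code:

// Untested sketch only: "ConversionEntities" and the navigation property
// names below are assumptions based on the schema described above.
// Requires: using System.Linq;
using (var context = new ConversionEntities())
{
    var masters = context.RealEstateMasters
                         .Include("CardDetails.CardImprovementDetails")
                         .Include("TransferHistories")
                         .Include("LandDetails")
                         .ToList();

    foreach (var master in masters)
    {
        // Children have to be removed before their parents.
        foreach (var card in master.CardDetails.ToList())
        {
            foreach (var improvement in card.CardImprovementDetails.ToList())
                context.DeleteObject(improvement);
            context.DeleteObject(card);
        }

        foreach (var transfer in master.TransferHistories.ToList())
            context.DeleteObject(transfer);

        foreach (var land in master.LandDetails.ToList())
            context.DeleteObject(land);

        context.DeleteObject(master);
    }

    context.SaveChanges();
    // The code tables could then be emptied the same way.
}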

Back to SQL Server

Having explored my EF options, I decided the best thing to do was create a Stored Procedure in SQL Server to perform this task.  Having never written a Stored Procedure in SQL Server, I wasn’t sure exactly what I was getting into, so I reached out for help to a known SQL Server guru: Andy Leonard.  One of the great things about being involved with the community is knowing people who know things! 

Andy graciously tolerated my newbie questions and with his guidance via an email exchange he led me to the solution I finally implemented.  With his permission, I’m going to share a little of our exchange.  I’m going to leave it unedited, mostly because I love Andy’s way of putting things, but also so you can get the same undiluted experience I did.

ME: having explained the situation and schema above …

So I guess I have several questions:
1) How do the relationships affect the deletes?  Does the order matter?
2) Is there a way to do a "cascading" delete that will loop through the relationships and delete the related table rows automatically?
3) Am I making this harder than it needs to be?  Is there a better way?

ANDY:

1. Usually 1-many indicates parent-child. The parent is usually on the "one" side of this relationship; child is usually on the "many" side. Order matters. You want to remove the child(ren) first. If you want to be extra cool about it, remove the child(ren) and then the parent in a transaction. That way, if something "bad" happens (like crossing the streams </GhostBusters>) during the parent delete, the child delete can rollback. Transactions are built for this. You are set up here for something "bad" to happen – you have multiple children for a given parent. If you miss one and get all the rest – and there’s data in that parent-child relationship you miss – your best outcome is a failure with rollback. Everything will return to its pre-transaction state. Without a transaction, you risk introducing data pollution (half the data for a given entity is missing).

2. There is a way to set up cascading referential integrity. It’s rare in practice and has to be in place before you begin your delete statements.

3. This is rocket surgery. You are not adding complexity, the complexity was here when you arrived.

   My solution would be something like:

begin tran

delete sh

from SalesHistory sh

inner join MasterRecord mr on mr.ID = sh.MasterRecordID

where mr.ID in (…<list of MasterRecord table IDs>…)

<cut more examples of the same approach>

— commit tran

— rollback tran

   Notice I worked from the bottom of the list to the top – that’s intentional. Most people think of entity construction "top down." Deletes need to work in the opposite order.

  If everything appears to work, execute the commit statement. If not, you can execute the rollback and put everything back just like it was before you started. As a precaution, always execute the commit or rollback at least twice – you want to make sure you close all the transactions you opened. And it’s easy to start a new one accidentally – and it becomes a nested transaction when you do (if you close the SSMS window and leave an open transaction, the tables involved are locked. That’s "bad"…). You want to highlight the commit or rollback in SSMS and keep clicking Execute until you get an error indicating there are no open transactions to end. It’s a best practice.

ME:

My first question would be why this:

delete sh

from SalesHistory sh

inner join MasterRecord mr on mr.ID = sh.MasterRecordID

where mr.ID in (…<list of MasterRecord table IDs>…)

instead of this:

delete sh

from SalesHistory sh

Here is why I ask:
1) The purpose here is really just to clear out all the tables, completely disregarding the current data. A total purge, if you will.
2) Using the first statement leaves open the possibility of orphaned data – or does it?  If the relationships are defined, what happens when there are SalesHistory rows with no associated MasterRecord row?
3) It seems like additional complexity: won’t the joins be a performance hog?

ANDY:

If you’re just after clearing all the tables, a simple DELETE statement – starting with the children – will work. There is a popular myth that JOINs slow down performance. It’s akin to saying a farm of web servers slow down performance because there’s all that time lost deciding which server to send the request and then managing the distributed session management.

   The truth is Joins can improve performance as much as hurt it. They’re a tool. Proper indexing and server management are the keys to performance.

   That said, you can use Truncate Table to clear them. That does a couple things:

1. Wipes out the data.

2. Is not logged (so it flies).

3. Resets identity columns to the initial seed value (usually 1).

4. Requires the ddl_admin role for permission.

   That’s a nice middle ground between dropping/recreating and deleting.

Order Matters

Andy’s first response talked about the best practice for doing an operation of this nature, which I rejected only because I just wanted a total purge of the data: if I was doing something more production oriented I would have taken the approach Andy suggested.

So the idea of just issuing a bunch of DELETE commands over all the tables does what I need.  The first lesson here, though, is that Order Matters.  Because of the relationships I created between the tables I could not simply issue a DELETE against the parent tables until there were no longer any children constraining them.

Recall the relationships listed above:

RealEstateMaster
->  CardDetail
    ->  CardImprovementDetail
->  TransferHistory
->  LandDetail

I had to start at the deepest point in the hierarchy and work my way up, so the final order looks like this:

  1. DELETE CardImprovementDetail
  2. DELETE CardDetail
  3. DELETE TransferHistory
  4. DELETE LandDetail
  5. DELETE RealEstateMaster
  6. DELETE all the Code tables mentioned above

Using TRUNCATE instead of DELETE

From Andy’s last email, I decided that TRUNCATE might be a better option.  I had never heard of TRUNCATE before, but using it is very simple, e.g. “TRUNCATE TABLE CardDetail”.

Unfortunately, when I changed all my DELETEs to TRUNCATEs, I discovered a little foible.  Apparently, TRUNCATE will not work on a table that is referenced by a FOREIGN KEY constraint, even when there is no data affected.  So in other words, I can issue TRUNCATE TABLE CardImprovementDetail, because no other table references it with a FOREIGN KEY.  I can NOT, however, issue TRUNCATE TABLE CardDetail, because CardImprovementDetail references it (and the corresponding Code tables cannot be truncated for the same reason).  This held true even after CardImprovementDetail had been truncated itself and held no data.

The Final Solution

So the final solution was to use TRUNCATE when possible, and DELETE when necessary.  I wrapped all of these up in a Stored Procedure, and now when I need to clear the entire database I can simply execute it.

Remembering that Order Matters, the final procedure execution looks like this:

ALTER PROCEDURE dbo.ClearDatabase
AS
BEGIN
    TRUNCATE TABLE CardImprovementDetail;
    DELETE FROM CardDetail;
    TRUNCATE TABLE Land;
    TRUNCATE TABLE TransferHistory;
    DELETE FROM RealEstateMaster;
    DELETE FROM Carport;
    DELETE FROM Condition;
    DELETE FROM Easement;
    DELETE FROM ExteriorWall;
    DELETE FROM Floor;
    DELETE FROM Foundation;
    DELETE FROM Garage;
    DELETE FROM Heat;
    DELETE FROM InteriorWall;
    DELETE FROM MagisterialDistrict;
    DELETE FROM Occupancy;
    DELETE FROM RightOfWay;
    DELETE FROM RoofMaterial;
    DELETE FROM RoofType;
    DELETE FROM Sewer;
    DELETE FROM SiteCharacteristic;
    DELETE FROM SiteTerrain;
    DELETE FROM Water;
    RETURN;
END

The Nuclear Option

Before I close, I did want to mention the Nuclear Option: I could just drop the database and recreate it at runtime.  I considered it briefly because while the Entity Context does not have a Clear Table kind of option, it does have CreateDatabase().  It also has CreateDatabaseScript(), which you can use to extract the schema into an executable SQL script.  It seems to me that you could just nuke the database (probably with a Stored Procedure) and use some combination of these to recreate it.
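
For completeness, a minimal sketch of what that might look like with those EF 4 ObjectContext methods.  I never actually wrote or ran this, and “ConversionEntities” is a hypothetical context name:

// Hypothetical "nuclear option" sketch; not the approach I took.
using (var context = new ConversionEntities())
{
    // Capture the DDL first if you want to inspect or store it.
    string schemaScript = context.CreateDatabaseScript();

    if (context.DatabaseExists())
    {
        context.DeleteDatabase();   // drop the existing database
    }

    context.CreateDatabase();       // recreate the schema EF knows about
}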

I only considered it for a moment, because it seems heavy handed.  On top of that, if something were to go wrong it could leave the application in an unusable state.  It also assumes that the SQL Script generated by EF will match the standards required by the application or the client.  I’m not saying the generated schema would not function, but there could be outside factors.  I suppose you could store the schema in a file locally and use it to recreate the database, outside of EF, but it just feels ill advised.

Back to Entity Framework

At the end of all this, what I was left with was a Stored Procedure defined in my SQL Server that will do the task required.  Unfortunately, if I leave it as is it means I will need to use a tool like SQL Server Management Studio to execute the procedure manually.  Since my users will occasionally need to do this themselves, I don’t think that is a viable option.

Instead, I need to be able to run the Stored Procedure programmatically, but doing so means using traditional ADO.NET methods.  I would then turn around and create an Entity Context, and it feels silly to do both in the same program.  To solve that problem, I added the Stored Procedure to my Entity Context.  And that’s where we’ll pick up next time: using Stored Procedures in Entity Framework.
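
In the meantime, for reference, the traditional ADO.NET call is only a few lines.  A minimal sketch, assuming the connection string comes from your application’s configuration:

// Plain ADO.NET sketch for executing the stored procedure directly.
using System.Data;
using System.Data.SqlClient;

public static class DatabaseMaintenance
{
    public static void ClearDatabase(string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("dbo.ClearDatabase", connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}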

Code Snippet Basics – now with NUnit!

May 27, 2010 6 comments

Download the NUnit Code Snippets.


As I was working through using NUnit for the first time, I started to notice that the majority of the tests I wrote followed this pattern:

var expected = someValue;
var result = methodUnderTest();
Assert.AreEqual(expected, result);

Realizing that I was liable to write hundreds, if not thousands, of these tests, I decided this was an excellent opportunity to finally try my hand at writing Code Snippets.

Code Snippets

If Code Snippets are old hat to you, skip on to the next section.  If you haven’t used Code Snippets yet, you are in for a real treat.

I remember the first time I saw Code Snippets.  I was at a presentation at VSLive! in 2005, and the presenter kept making code templates (seemingly) appear out of nowhere.  Then he was able to quickly tab through them and fill in bits of the code.  I immediately recognized how cool this was, and a few code snippets in particular have become second nature to me.

If you aren’t sure yet what I’m talking about, try this out. Open a project in Visual Studio.  Go to a class, and just inside the class declaration type the letter “c”.  Intellisense should popup with something like this:

[Screenshot: IntelliSense popup showing the Code Snippet entry]

What we are looking for is anything with the orange box icon: this indicates a Code Snippet.  If you select that item and press Tab, Visual Studio will place a template in your code.  This template may even have defined sections that you can navigate with the Tab key and fill in with the correct data.

You can search through the library of snippets by going to Tools –> Code Snippet Manager.  This will open a window that will allow you to browse your snippets.  For starters, expand the C# folder and you will see it is chock full of goodies.  Here are a few that I use all the time:

  • ctor – this snippet will insert an empty Constructor
  • prop – this snippet will create an automatic property and allow you to easily fill in the return type and name (also check out propg, which will make the property setter private)
  • foreach – lays out the template for a foreach loop (you can also use for to insert a traditional for loop.)
  • try – inserts a try…catch… block template.

There are many more shipped by default with Visual Studio.  You can also import snippets from a different source, which we’ll discuss in a little bit.

Writing your own Code Snippet

A Code Snippet is an XML file that lays out the rules for inserting this block of code.  To create one, just create an XML file and name it whatever you like with a .snippet extension.  The name of the file doesn’t matter, but I named mine based on the snippet shortcut, so “nutm.snippet” for Code Snippet nutm.  When you edit it, of course it has to conform to the snippet standard.  I don’t claim to be an expert: I just found a couple of samples online and butchered them until they did what I wanted.  All in all it was pretty easy. If you are really interested in trying this yourself, I suggest you read the MSDN Documentation – Creating Code Snippets.

For my purposes, I created two Code Snippets.

  1. nutf – creates an NUnit Test Fixture (the class that holds the Unit Tests)
  2. nutm – inserts an NUnit Test method. 

Here is what the snippet XML looks like for nutm, the Code Snippet for inserting a Test Method:

<?xml version="1.0" encoding="utf-8"?>
<CodeSnippets xmlns="http://schemas.microsoft.com/VisualStudio/2005/CodeSnippet">
  <CodeSnippet Format="1.0.0">
    <Header>
      <Title>NUnit Test Method</Title>
      <Author>Joel Cochran</Author>
      <Shortcut>nutm</Shortcut>
      <Description>Inserts an NUnit Test method</Description>
      <SnippetTypes>
        <SnippetType>Expansion</SnippetType>
      </SnippetTypes>
    </Header>
    <Snippet>
      <Declarations>
        <Literal Editable="true">
          <ID>method</ID>
          <Default>Method</Default>
          <ToolTip>Insert the Method name you are testing.</ToolTip>
        </Literal>
        <Literal Editable="true">
          <ID>scenario</ID>
          <Default>Scenario</Default>
          <ToolTip>Insert the name of the scenario you are testing.</ToolTip>
        </Literal>
        <Literal Editable="true">
          <ID>expectedBehavior</ID>
          <Default>ExpectedBehavior</Default>
          <ToolTip>Insert the expected behavior of your test.</ToolTip>
        </Literal>
        <Literal Editable="true">
          <ID>expectedValue</ID>
          <Default>ExpectedValue</Default>
          <ToolTip>Insert the expected return value for this test.</ToolTip>
        </Literal>
      </Declarations>
      <Code Language="CSharp">
        <![CDATA[
      [Test]
      public void $method$_$scenario$_$expectedBehavior$()
        {
            var expected = $expectedValue$;
            var result = _instance.$method$();
            Assert.AreEqual(expected, result);
        }

      $end$]]>
      </Code>
    </Snippet>
  </CodeSnippet>
</CodeSnippets>
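
For example, after inserting nutm and tabbing through the four literals, you end up with a test along these lines (the method name and values here are just placeholders):

[Test]
public void GetOwnerName_ValidParcel_ReturnsExpectedName()
{
    var expected = "SMITH, JOHN";
    var result = _instance.GetOwnerName();
    Assert.AreEqual(expected, result);
}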

Importing a Code Snippet

There are lots of Code Snippets available online for download, including my NUnit Code Snippets.  Whether you are writing your own snippets or importing them from somewhere else, you will need to go through the same process to make them available to Visual Studio.  Once you have the snippet on your local machine, just follow the documentation.

I will share a frustration with you: I had no way of knowing whether or not my Code Snippet was valid until I tried to import it into Visual Studio.  If the format is invalid, the Code Snippet will simply not import, and that is all the help Visual Studio will give you.  Once the format is acceptable, the import goes off without a hitch.

Using Your Code Snippet

Now that your Code Snippet is installed, you use it just as we defined before: enter the snippet shortcut, press Tab, and watch the magic!  Unfortunately, there is one bit of bad news.  Intellisense does not show your snippet in its listing.

At least, it doesn’t for me: that’s not to say it can’t, but more that I don’t know how to make it show up in Intellisense.  It’s possible that there is a way to define it in the XML, or perhaps it is ReSharper intruding on Intellisense a little (although I doubt it.)  In either case, I don’t know how to do it: but if you do, please post it in a comment below!

Categories: .NET

Project Greenfield: Testing the TDD Waters

May 24, 2010 2 comments

NOTE: if you are just here for the video, the link is here: http://www.developingfor.net/videos/TDD1Video/


I’ve mentioned recently in Developer Growth Spurts and Project Greenfield that I am trying my hand at Test Driven Development (TDD).  I’ve been reading a lot about it and have given it a go on a couple of occasions. I’ve been sidelined the last week or so by a billable project with a deadline (I suppose paying the bills is kind of important), so I’m not focused on Project Greenfield right now, but I don’t see that as an excuse to completely halt my progress.

Taking Advantage of the Unexpected

The good news is that the side project fits very well into the overall goals of Project Greenfield. The project is a pretty straightforward conversion project, reading data from our legacy database and writing it to a SQL Server database.  I even get to design the schema of the target database. 

I suppose a SQL Server guru would use SSIS or something like that to accomplish this task with no code, but that is well beyond my SQL Server skills at the moment.  The project does, however, give me the chance to experiment with a few other technologies that I will be using in Project Greenfield, so I am trying some new things out and only billing half-time to make up for it with my client.

SQL Server

This is my first real project using SQL Server, small though it may be. I’ve messed around with it in the past, creating some tables and relationships for a co-worker, but this is something that will actually be going into the field so it is different.  The first thing I did was build the schema based on the client’s specifications.  As I was doing so, I realized it was wrong, but I finished it anyway because I didn’t want to stop progress to wait on a response.  Once I was able to communicate with them, they agreed with my concerns and now I am fixing the problems, which are largely normalization issues.

I will share though, that I think I screwed up.  My first instinct was to use a SQL Server project in Visual Studio, largely so it would be under version control.  Unfortunately, when using such a project failed to be intuitive, I quickly gave up and went with what I know.  In Visual Studio I connected to my local SQL Server Express, created a Database Diagram, and used it to create my schema. 

This works just fine, except I now have no way to get to that database to extract the schema for my client.  I know the answer is supposed to be to use SQL Server Management Studio, which I have installed for SQL Server 2005, but I need one that works with SQL Server 2008 Express.  I found it online and downloaded it, but it won’t install.  I’ll have to spend some time soon fixing this or come up with another solution.  I do have a couple of ideas that would involve using  …

Entity Framework 4

The next thing I am doing differently is using Entity Framework 4 for all of the SQL Server database access.  Don’t get me wrong, I’m not doing anything really complex: all I need to do is connect to the database and write new records in a handful of files.  But it has given me the opportunity to understand how to work with the Entity Context, how to manage object creation, experiment with how often to write records, learn about object relationships, and more.  I feel much more confident with EF now.
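
As a rough illustration of the kind of code involved (the context, entity, and legacy reader names here are invented stand-ins, not the actual project’s), the conversion loop amounts to creating entities, adding them to the context, and deciding how often to call SaveChanges:

// Illustrative sketch only; "ConversionEntities", "RealEstateMaster",
// and "legacyDal" are stand-ins for the real classes.
using (var context = new ConversionEntities())
{
    foreach (var legacyRecord in legacyDal.GetRealEstateRecords())
    {
        var master = new RealEstateMaster
        {
            ParcelNumber = legacyRecord.ParcelNumber,
            OwnerName = legacyRecord.OwnerName
        };
        context.RealEstateMasters.AddObject(master);
    }

    // One of the things worth experimenting with is whether to call
    // SaveChanges once per record, per batch, or once at the very end.
    context.SaveChanges();
}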

This was helped by spending some time this weekend at Richmond Code Camp with Dane Morgridge.  I was able to sit in his EF4 presentation and we spent some time coding together later, so I learned a bunch more about EF in the process.  But he gave me some great guidance, and as it gels more I’m sure I will write about it.  We also talked about Dependency Injection and some other stuff: folks, THIS is why I love community events so much!

Test Driven Development

If you’ve managed to read this far you are surely asking yourself “I thought this was supposed to be about TDD?”  Fair enough, I just wanted to lay some of the ground work for the project.

I started this project with the intent of implementing TDD.  I felt that a small project like this would be ideal to get my feet wet, and I will say so far so good.  I’m sure I’m not doing it “just right”, but I am doing it which is a huge step forward.  A buddy of mine said this weekend that just trying TDD puts me far ahead of most .NET developers when it comes to testing.  I’ll take that with a grain of salt, but in a way I’m sure he’s correct.

As usual, I really started with the best of intentions.  I began with an empty solution and created two projects: the working project and the testing project.  I began writing code in my Test class first, then allowed the magic of ReSharper to help me create the classes and methods I was testing.  I also used the NUnit Code Snippets I wrote to speed production.

Mocking

I quickly ran into my first need for a mock object.  I have a huge pre-existing DAL project that handles all of the legacy database work.  The main class I would be using is about 3500 lines of code, so naturally I wasn’t about to reinvent the wheel. I also thought at first that mocking this class up would be inordinately difficult, but I was willing to go down the rabbit hole for a little while to see where it led.

Where I ended up, at least at first, was actually not that bad.  I used ReSharper once again to extract an interface from this huge class.  At first, I thought I found a ReSharper bug: my entire computer froze for about 4 minutes.  The mouse disappeared, the keyboard would not respond, windows would not focus, etc. I was basically locked out of my machine.  I let it sit for a while and sure enough it came back and the new Interface file was created.

Now for my mock object: I created a test class in my test project that implemented the same interface.  I did make one mistake: I allowed the implementation to throw Not Implemented exceptions.  This caused some issues later, so I changed it to just create default auto properties.

Now for one of the beauties of TDD: because I was committed to writing the tests first and then writing just enough code to make it pass, I did NOT try to implement all the properties and methods of my test class.  Instead, I implemented each one as I was testing it!  This helped with the mocking a lot since there were plenty of properties and methods I was not using in this project.
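
A stripped-down version of what that looked like, with invented member names standing in for the real interface:

// The interface ReSharper extracted (member names here are made up).
public interface ILegacyRealEstateDal
{
    string ParcelNumber { get; set; }
    string OwnerName { get; set; }
}

// The hand-rolled test double: default auto-properties instead of
// throwing NotImplementedException, filled in only as the tests need them.
public class FakeLegacyRealEstateDal : ILegacyRealEstateDal
{
    public string ParcelNumber { get; set; }
    public string OwnerName { get; set; }
}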

And Not Mocking

This worked great for a while, but I admit that it eventually began breaking down.  Or rather, I began breaking down. 

I ran into what I considered a practicality issue.  It may fly in the face of TDD, Unit Testing, and Code Coverage, but I got the feeling that there are just some things I don’t need to test.  The legacy database DAL has been used extensively: I know it reads data from the database correctly, so I’m not going to try to fit in a bunch of tests after the fact.  If I was starting from scratch perhaps I would, but at this point in the game there just isn’t enough ROI.

I came to the same conclusion with Entity Framework: I’m pretty sure that I don’t need to test that putting a string into a string variable in an EF class actually works.  And for about 90% of this project, that’s all I’m doing: moving strings from the legacy database DAL to my new Entity Framework classes.  So I decided that when that’s all I’m doing, moving one piece of data from old to new, with no reformatting, type conversions, or anything like that, then I was not going to write tests for those operations.

So the tests I did write for that first class were only for times when I had to convert or reformat the data.  This was good because it severely limited the number of test scenarios I needed to cover.  I expect this is an issue I will have to figure out at some point: I know the goal is to test everything, but surely there must be a line drawn somewhere.
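
As a made-up example of the kind of test that did make the cut, one where the data gets reformatted on the way across (the converter method and date format here are invented):

[Test]
public void ConvertLegacyDate_SevenDigitCyymmdd_ReturnsDateTime()
{
    // Hypothetical conversion: assume the legacy system stores dates as CYYMMDD strings.
    var expected = new DateTime(2010, 5, 24);
    var result = _instance.ConvertLegacyDate("1100524");
    Assert.AreEqual(expected, result);
}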

And then I ran into an issue where Mocking didn’t seem feasible.  And before I go any further, I recognize that I am not talking about mocking frameworks or auto mocking or anything like that: I guess what I am really doing is called stubbing. 

As I got further into the conversion, I began to rely on data from the legacy database.  I could have faked all these classes, but it would have taken a lot of time and effort for very little reward.  Fortunately, and one of the reasons it became difficult to mock all of this out, is that much of the data I needed at this point is static system data.  Faking this stuff out would have just been a nightmare, so instead I chose to integrate a single database connection into my unit tests.

I realize this breaks a few rules.  It means I have to be connected to my network at my office to run these particular tests.  It means that my tests, and ultimately my code, are brittle because of this dependency, which means that I should probably be using a mocking framework and Dependency Injection to solve some of these problems.  Not to worry, I’ll get there!

I’m sure the TDD and testing purists would have a field day with my decision.  And I’m cool with all of that, I welcome the comments.

Houston, we have Video!

During these adventures I thought it would be interesting if I shared some of the Project Greenfield content as videos.  As a result, I am happy to announce the first ever Developing For .NET Video, available for viewing at http://www.developingfor.net/videos/TDD1Video/

Rather than walk through some Hello World/Calculator TDD example, this video contains, among other things, a walk through of a real world TDD sample.  I have a method I need to create, so I write a Unit Test first, use it to create the Method, write enough code to compile but fail, then write enough code to pass, all in a real production project!

I would love to hear your comments about the video, so please add them to this post.

Categories: Project Greenfield

Richmond Code Camp 2010.1 Coolness

May 22, 2010 1 comment

As I write this, I am at Richmond Code Camp 2010.1.  If you read this blog regularly, you’ll know that I think code camp is a lot of fun, and Richmond is no exception.  This is my 4th Richmond Code Camp and it is always a top notch event.  This time in particular is special for me because I was asked to be on the Planning Committee. 

It’s very cool to watch an event like this build from the inside.  It’s probably a little cliche, but you really can’t appreciate all the work it takes to successfully pull off an event like this until you’ve seen it from inside the ropes.  And it is truly rewarding to know that I contributed something that was helpful and appreciated.  Kudos to the rest of the team: they are very very good at running this event.

I Love Learning

I can’t tell you how much I enjoy a good presentation.  I admire people who are knowledgeable enough in a tech area to teach others and are willing to freely do so. I especially appreciate it when that knowledge comes from real world experience.  I also think it’s fantastic that so many developers are passionate enough about their craft to spend precious free time coming to code camps and user group meetings.

Even when the presentations are not specifically applicable to my current projects, I still get a lot out of them, but there is nothing like sitting in a presentation on Saturday and being able to use the information on Monday.  This time at Richmond, I hit the trifecta: I sat in 3 sessions in a row that will increase my skills and abilities on Monday.  Considering I could only attend three because I was presenting at two, this was a real treat.

First was Andy Leonard’s “Database Design for Developers”, where I saw some really cool scripting tricks for SQL database generation.  Perfect timing since I am working on just such a project.  Second was Dane Morgridge’s presentation entitled “Getting Started with Entity Framework 4” – also perfect timing since I am using EF4 for the first time.  Dane and I even got together later and geeked out over using StructureMap for Dependency Injection and a new CodePlex project he’s working on (write up coming soon!)  Finally, a great time was had by all in Curtis Mitchell’s session on Distributed Version Control using Mercurial, which I wrote about recently.  Thanks to him and Dugald Wilson I finally got my head around branching!

More Rewards

As much as I enjoy presentations, I have to admit I enjoy giving presentations even more.  I get a lot out of the experience: honing my skills, considering questions that never occurred to me, camaraderie, meeting people, and the list goes on.  Today, though, I experienced some extra coolness in the form of two encounters.

This morning I was standing near the speaker wall looking at the schedule when someone came up and stood next to me.  They looked at the wall, they looked at my badge, looked at the wall, looked at my badge, and asked if that was me on the wall.  I said yes, and he said “I read your blog!”  I think it was the first time I was ever recognized in relation to this blog, and best of all he was very excited about meeting me.  He even recalled the series I wrote about taking the Graphics Design class at the local community college.  It was a nice experience: he was happy and it made me happy!

Another bit of coolness happened after lunch when I ran into a coder I’ve known for a couple of years but hadn’t seen for a while.  I asked how he was doing and what he was working on, and he said that he is the only one in his office who knew anything about Blend, so he was in charge of their current WPF application development.  Then he said something way cool: he said he had seen me present a couple of times on Blend, and that it was because of me that he was doing what he is doing now.  Wow!

Personally, I’m just blown away: thanks to both of these guys for making today a great day.  It means a lot to think that my little contributions are helping people.  It really is a true reward.

Categories: .NET

Project Greenfield: Implementing Source Version Control

May 20, 2010 3 comments

As promised, I have begun down the path outlined recently in Project Greenfield.  As I discussed in that post, one thing my company has sorely lacked has been Version Control.  Yes, there are backups.  Yes, there are development copies.  Yes, we have source escrowed with a third party.  No, I don’t think any of those things count as source or version control.

I’ve discussed this topic many times with fellow geeks, and the conclusion is always the same: even as a 1 person team, I should absolutely be operating under source version control.  I’ll admit for a while I thought it seemed like overkill for what I do, but over the years I have come to understand that it really is a fundamental part of the development environment.  So, with Project Greenfield, I have finally implemented a Version Control System (VCS).

Choosing a Solution

Starting with a blank slate is nice: I am free to select whatever system I wish to use. And in the beginning, I will be the only one using it, so I have the opportunity to set the standard and get my feet wet before I need to bring anyone else into the fold.  The problem was I had no idea what I was looking for or what I needed.

Naturally, I spent a bunch of time researching, and my friends will tell you I spent a lot of time asking pretty basic questions.  I realize now that the solution isn’t really all that important.  The most important thing is to use VCS: any VCS is better than no VCS!  You can always change systems later by starting fresh in a new one, at least that’s how I see it.  In fact, it appears that some people use multiple systems.  I know one person who uses one system locally for his development work, but his company uses an entirely different system, so he updates his changes to that when he is done locally.

If you are new to VCS

If you are an old hand at VCS, you can safely skip this section. Or you can keep reading it if you want to laugh at me, I really don’t mind.

For the rest of you (us), I want to share a little of my research.  My one cursory experience with VCS was in the late 90’s at an AS/400 shop.  The system was built around a check-out, check-in, approval model.  It made sense to me because it was very linear.  There were several layers of approval required to get code back into the code base: supervisor, testing and quality assurance, documentation, and final review (or something like that – it’s been a while.)  At each step of the way you had to deal with conflicts, rejections, etc.  It was a lengthy and tedious process.

Expecting the same sort of experience, I was surprised to find that the world of VCS is not so straightforward.  I learned that the choices partitioned themselves into two camps: Traditional Version Control Systems and Distributed Version Control Systems (DVCS).  Frankly, I don’t feel qualified to discuss the differences between the two approaches, but I’ll try to hit the highlights.

A traditional VCS uses a central repository that contains all the code.  Developers check out the code they need to work on and then check it back in when they are done.  Because this is all done over the wire, it can be a little slow and cumbersome.

DVCS, on the other hand, distributes complete copies of the repository, so every developer machine becomes a full fledged version control system in its own right.  All developer changes are then made to the local repository.  This leaves the developer free to create new branches, experiment, refactor code, or what have you, without even pulling in code from the central repository.  The repository can easily be reset to any point in its history at any time in the future, so you can abandon changes if they don’t work out.  This is very powerful and is frequently called “time travel.”

When the developer is ready to post changes, he first pulls down the current version of the repository and merges his changes with it locally.  This means all the conflict resolution is also handled locally by the developer who caused the conflict.  Once all is right with the code again, it gets pushed back to the central repository where other developers can now go through the same process.

One nice thing about this approach is that there are no locks on the repository and no expectation that code must be “checked back in.”  Another nice thing about DVCS is that if something happens to the central repository, it can be rebuilt from the developer copies.  Finally, DVCS *really* only moves around the changes to the repository, not the entire repository.  This means that updates are much smaller: combined with the fact that almost all the work is done locally, you end up with a lightning fast system.

From the reading I did and the polls I took, DVCS was the hands down winner, although one traditional VCS had a good showing.

The Choices

SVN

This is totally a guess on my part, but it seems to me that the most prevalent system out there is Subversion, more commonly known as SVN.  SVN is a very popular open source VCS.  It’s free and supposedly easy to setup and use.  It is a traditional VCS, so it has a Server component and a Client component.

There are some downsides of SVN, and traditional systems in general.  Committing changes to the server is slower because complete files are being transferred and analyzed.  Also, merging is more of a hassle because you have to download the complete files from the central server.  The comparison and merging methods are different from DVCS, so conflicts are far more common.  Additionally, SVN treats file and folder renames as deletes and adds, meaning you can lose revision history data.

I spoke with a lot of developers who use SVN, either as a personal choice or more often because their company uses it. All of the problems notwithstanding, the overall opinion of SVN was very positive.  It appears to work well, supports large numbers of developers, has lots of tooling available, and is generally regarded as very stable.  The same could not be said of the alternative VCSs out there.

VSS and TFS

Microsoft’s classic entry in this space is Visual Source Safe (VSS).  VSS is famous as the source control developers love to hate.  When I was at PDC09 I picked up a pretty cool shirt from a vendor: it has a picture of a woman screaming in surrealistic agony, and at the bottom are the words “VSS Must Die.”  Naturally, the shirt is from a source control vendor, but it seems to sum up the community opinion of VSS.  I can’t say I’ve ever heard a single positive remark about VSS, except that people positively hate it.

Fortunately, it seems that Microsoft agrees, and is attempting to replace VSS with Team Foundation Server.  To be fair, TFS is much more than just version control, it is a complete code management system, with bug tracking, administrative control, rule enforcement, Visual Studio integration, and so on.  I’ve heard questionable things about the source control but great things about the rest of the system.  One suggestion I’ve heard was that Microsoft should allow any source control system to integrate with TFS, and that would make TFS ideal.

Git

Git is a DVCS.  I see a lot of talks about Git at code camps and conferences and it seems to be getting a lot of attention in the .NET community.  By virtue of the fact that it was written (at least partly) by Linus Torvalds, it has already become the de rigueur choice of Linux and open source geeks.  Many Git users are almost fanatical about their devotion to this tool, which I think says a lot (some good and some bad.)

The good thing about Git is that it just seems to work, and work well.  It is built for speed and from all accounts it delivers.  As a distributed system it has all the benefits I mentioned above and then some.  Finally, I learned about a most compelling feature for me: Github.  Github is a web based hosting service for Git repositories, which made immediate sense to me in a distributed environment.  I almost chose Git then and there because everything I heard about Github was fantastic: I think people are more fanatical about Github than Git itself.  Of course, once I calmed down a bit I learned that other systems have similar hosting services available, so I did not allow that alone to be the deciding factor.

The bad thing about Git is that it really seems oriented towards gear heads. I don’t mean that as a derogatory term at all. To me, a gear head is someone who is comfortable operating closer to the metal, using things like shell scripts, command lines, configuration files, etc.  I have nothing but respect for that, because while I can function at that level I really prefer not to. Instead, I want to see those complexities wrapped up in a nice, user-friendly interface that I can rely on to flawlessly enter the twelve switches of some cryptic command (but that’s just me).

Mercurial (Hg)

The product I finally selected is Mercurial, commonly abbreviated to Hg after mercury’s symbol on the Periodic Table of Elements (element 80): the terms Mercurial and Hg are used interchangeably.  Hg is similar to Git: it is a distributed system with all that entails.  In fact, the two projects have some interesting parallels.  They were inspired by the same event (the withdrawal of the free version of Bitkeeper), they were begun at virtually the same time, and they share many of the same goals.

They were also both originally designed to run on Linux, but it seems that Hg adapted to Windows faster and Git has been playing catch up in the cross platform arena.  I don’t see that as much of a concern today since both systems function perfectly well in a Windows environment.  That being said, I consider the fact that CodePlex uses Hg as a pretty solid endorsement.

For me and my purposes, the best thing about Hg is that it feels less complex and seems more Windows friendly.  This is really because the supporting Windows software, which I’ll cover shortly, is more advanced.  The overall impression I got was that if I “just want to do source control”, then I can get up and running faster and easier with Mercurial, without the need to learn a ton of command line stuff.  Since I have not implemented Git I cannot compare, but I was able to get Hg up and running pretty easily.

Implementing Mercurial

Python

Hg is built on Python, so you will need to at least install the Python Windows Binary before you can install Hg. Python is free and open source, so just download it from the Python homepage.  I chose the 2.6.5 Windows Installer (binary only) because I don’t want or need the source, but feel free to dig as deeply as you like.  Also, as of this writing there is a newer version of Python, but the download page states “If you don’t know which version to use, start with Python 2.6.5; more existing third party software is compatible with Python 2 than Python 3 right now.”

Installing Hg

Remembering that DVCS means each install is a full repository, there is no Hg Client vs. Hg Server installation.  Instead, you simply install Hg.  If you plan on just using the Command Line interface, you can download and install the latest version.

BUT WAIT!

If you plan on using the Windows Integration features, which I would recommend, then skip this step and proceed to the next section on TortoiseHg.

TortoiseHg

TortoiseHg is a Windows Shell Extension that makes working with Hg in Windows a breeze. Once installed, you can access the source control tools directly from Windows Explorer by right-clicking on folders: the tools will be integrated into the context menus.

The reason we skipped the step above is that installing TortoiseHg will also install the latest version of Mercurial, so for a Windows developer this is where I would start.

VisualHg

If you do not use Visual Studio, you now have all you need to easily and quickly get started with Hg.  If you do use Visual Studio, there is one other tool you will want to install: VisualHg.

VisualHg integrates most of the TortoiseHg features into Visual Studio, so you can manage your repository from directly within the IDE.  Additionally, it adds icons to your Solution Explorer letting you know when files and projects in your Solution need to be committed to your local repository.  It’s built on and tightly integrated with TortoiseHg, so that is a prerequisite.

Bitbucket

Hg’s answer to Github is Bitbucket, which doesn’t have the reputation that Github has but seems to have the same basic toolset and abilities at the same price.  For several reasons, I was very keen to host my source elsewhere, so I went ahead and created a free account to experiment.  Using the service has been really easy, and linking my local repository to the private repository I created on Bitbucket is very simple: since it just uses HTTP, all I have to do is provide Hg with the link to the repository on Bitbucket.

Now how the heck do I use this thing?

As a stone cold newbie, I needed some guidance.  A site I found very helpful, both in deciding to use Hg and in learning how it works, is Joel Spolsky’s excellent HgInit.com.  This is probably the best non-video training I’ve seen on DVCS.  It is command line oriented, but I suggest you go through it (probably more than once) to help understand what the GUI tools are doing for you.  I know I will be returning to this site again in the future.

Also, I don’t often plug services you have to pay for, but TekPub.com is worth every penny.  The videos are fantastic and widely varied.  In this case, they have a series of videos called “Mastering Mercurial” by Rob Conery, a very well known figure in .NET land.  This series uses TortoiseHg and Visual Hg and is a superb walk-through of Mercurial in a real world environment.  If you have TekPub, go watch this series.  If you don’t have TekPub, buy it, then go watch this series!

After that, the best thing I can recommend is to simply try it out.  A buddy of mine and I have used Bitbucket and played around with making simultaneous changes to files, merging, multiple heads, etc.  I think like a lot of things it will just take practice.  In my case, as a lone developer, it is very simple: I make changes, I commit those changes to my local repository, and I update (or Push) those changes to the central repository on Bitbucket.

Some Closing Thoughts

I have a tendency to suffer from “paralysis by analysis”, so this process took me far longer than it probably should have.  Once I finally decided to do something about it, though, actually getting up and running was a pretty short exercise.  I’d say it took me roughly half a day to get everything installed, figure out how to use Bitbucket, watch some videos, and learn how to use TortoiseHg and Visual Hg.

I want to make it clear that I am not advocating any particular solution.  It does seem obvious to me that DVCS is the way of the future, which at this point means choosing between Git or Mercurial.  Right after I selected Mercurial and got it up and running, I came across this article that has me wary of my choice.  I’m not going to switch or anything like that, but I will proceed with a watchful eye.  And I will continue to study Git and DVCS in general.

I have plenty left to learn: branching, multiple heads, sub-repositories, merging, and more.  For now, I am just happy to be using source control: progress has been made!

Categories: Project Greenfield

Project Greenfield

May 10, 2010 2 comments

I am in a theoretically enviable position: I am beginning a “green field” project.  A green field project is one that begins with a completely blank slate: no preconceptions about what technologies to use, what methodologies to employ, or what the final product will look like. This is the project we all dream about: total freedom and total control.  I am no longer hobbled by an existing database. I am no longer restricted to “how we’ve always done things.”  Paraphrasing Sarah Connor from the original Terminator, for the first time the future is unclear to me.

At first glance, this sounds like a developer’s dream come true, and in the end it probably is, but as I near the beginning of the project I begin to see it as an embodiment of the saying “be careful what you wish for, you just may get it.”  This is why I say my position is theoretically enviable.  While I have complete freedom, I also have complete responsibility.  And to top it all off this project is make or break for the company.  If this project fails, we might as well close the doors.  And no, I am not overdramatizing.

My plan, for what it’s worth, is to document this undertaking.

Where to begin…

I had a long section written here about the history of my company, our software, our customers, and why we were tackling this project.  Then I realized that, in fact, this is what I am trying NOT to do: focus on the past.  I don’t want to rehash where we’ve been because I don’t want it to taint where we are going.  And so far that is the hardest thing: I met with our domain expert to discuss some of the target goals, and I had to steer the conversation away from the existing product several times.

While this is really, truly, everything new from the beginning, there are some decisions that have already been made, so let’s get them out of the way.

  1. We will use SQL Server.  I’ve long believed that data is king.  I always start with the data: the database, schema, relationships, etc.  Ultimately it is the reason we are in this business.  Almost every RFP we have received in the last 5-7 years has required SQL Server: it is becoming the de facto standard in our market.  Since we have never offered a SQL Server solution we are frequently unable to bid for contracts.  This fact is the driving force behind this project.  It’s bad enough when the other kids make fun of you, but far worse when you’re not even allowed on the playground.
  2. We will use .NET.  If we are going to make the jump from IBM to Microsoft, from Green Screen to GUI, from DB2 to SQL Server, then we’re going whole hog.  Knowing that SQL Server is our target database, what better decision could you possibly make than to develop the rest of the application on the Microsoft Stack?
  3. We will use Version Control.  Our current software was originally written in the mid 80’s.  I realize that’s longer than some of you readers have been alive, so it may be a shock to you, but yes software that old does work.  The software has been continuously modified, upgraded, and maintained over that period, but it has never been in source control.  Our first action will be to implement version control, which I will cover in my next post.
  4. We will use Unit Testing.  It probably goes without saying, but our existing software has exactly 0 unit tests.  The nature of the platform and the development environment do not lend themselves to unit testing, TDD, mocking, etc.  Don’t get me wrong, the software is thoroughly tested, but not in any kind of a “best practices” sense of the word.  While the verdict is not yet in on TDD, I’m definitely feeling pulled in that direction.  Again, I’ll be posting about that when the time comes.
  5. We will use Agile Techniques.  At least, we’ll use some parts of Agile.  Company owners, users, and domain experts aside, this is essentially a one man operation, so that naturally means no pair programming.  I’m also not sure what a one man stand up would look like.  That being said, I’ve consulted some practitioners and there are things I can do.  I have a couple of books to read and I bought a bunch of post-it notes, so we’ll see.

With the exception of .NET, everything in the list above is a new endeavor for me and my company.  And none of the above mentions the technical specifics: there are a lot of decisions to be made there, many of which will be new for us as well.  This is a huge undertaking, so I expect to encounter some failure along the way.  I’m OK with that: we all know you learn more from your mistakes than your successes.

Where we go from here

I’ll be spending the next couple of weeks in project preparation: setting up version control, writing specifications, developing guidance, establishing processes, etc.  Along the way I’ll be posting about what I’m going through, what’s going through my head, and what decisions I’ve made. 

Given the scope of the project, I expect to be writing about it for quite a while.  Along the way, if you are interested, I encourage you to participate in the comments.  I will place every post in this ongoing series in the Project Greenfield category.  It should be fun!

Categories: Project Greenfield

XAML Formatting in Visual Studio

May 7, 2010 1 comment

A question came up last night at RVNUG about manually editing XAML, something I avoid as much as reasonably possible.  When I have to edit XAML though, I almost always jump over to Visual Studio, so I was asked why I prefer to edit XAML in Visual Studio over the Blend editor.  Besides the fact that I’ve always just done it that way because originally we did not have Intellisense in Blend, I have two other reasons.

The first reason is that I use Blend to write XAML and Visual Studio to code XAML.  It sounds like splitting hairs, but let me explain.  Blend is the best darn XAML Editor ever written, primarily because it allows me to write and edit XAML without actually typing the XAML.  It magically translates the design I have on the screen into its XAML representation: that’s what makes it so awesome.  When I find a situation where I actually need to code the XAML and make textual changes to it myself, then I use Visual Studio, because it is the best darn Code Editor ever written.  To sum up: Design = Blend, Code = Visual Studio.

The second reason is far less highbrow: I dislike the default XAML formatting that Blend produces.  Don’t get me wrong, the XAML code itself is wonderful, almost pristine, but it compresses it to as few lines as possible.  That means lots of properties on a single line, which becomes an issue when you have a bunch of properties full of Binding references and more complicated structures.  So my preference is to see one property per line in the XAML.  This makes it much more palatable on those rare moments when I must code the XAML manually.

Setting up Visual Studio

I saw the question today on the Expression Blend forums about how to get Blend or Visual Studio to do exactly that, so I thought I would put up a quick post for the archives.

While Blend cannot format the XAML in this fashion, Visual Studio can:

  1. In VS2008 or VS2010 go to Tools –> Options
  2. Expand Text Editor -> XAML -> Formatting –> Spacing
  3. Under "Attribute Spacing" check "Position each attribute on separate line".
  4. If you prefer, you can also put the first attribute on the same line as the tag by checking the box.  I use this setting because it isn’t offensive and still saves a little space.

Now, whenever you edit XAML in Visual Studio, press "Ctrl+K+D" and Visual Studio will reformat the XAML as desired. 

Enjoy!

Categories: Blend, Visual Studio, XAML

Developer Growth Spurts

May 5, 2010 3 comments

As a kid, I remember my parents always talking about the “growth spurts” I would go through.  Mom always said you could tell if I was about to have a growth spurt by the way I would eat – she used to claim I had hollow legs.  As I recall, the biggest growth spurts always happened during the summer, and then you would return to school and your friends and teachers were always amazed at how much you’d grown. 

I remember one summer in particular, between 9th and 10th grade.  I had always been kind of a small kid, a little behind the rest, and I was an easy target for bullying and harassment. That summer I caught up with a vengeance: I grew over 4 inches and gained about 50 lbs.  On returning to school some of the kids didn’t even recognize me as I was now bigger than most everyone else.  My life really changed as the bullies decided it was time to move on to smaller prey.  That summer marks my entrance into adulthood, at least physically speaking, and things were never quite the same afterwards.

Growing as a Developer

My development as a developer has seen several similar growth spurts: the move from Procedural to Object Oriented Programming; the move from developing fixed format text screens to interactive GUI and the Event Driven model; and more recently the move from Windows Forms to WPF.  I consider each one of these to be a Paradigm Shift, and each represents huge advancements in my abilities as a developer.

Of course, what I remember most was how difficult each task seemed at the time.  I speak frequently about WPF, Silverlight, Blend, and other XAML related technologies.  Perhaps the most common question I am asked is “how long does it take to learn this stuff?”  While I naturally do not have a real answer (I usually say about 6 months of immersion to become competent), I understand what prompts the question – anticipation of the growth spurt and the accompanying growth pains.

While our adolescent growth was completely beyond our control, our developer growth is just the opposite.  We have complete responsibility over our growth as developers, and that is a bit of a scary thought.  The old adage “if you ain’t growing you’re dying” never applied more than to technology professions.  I think what this means for us is that we need to constantly make the effort to place ourselves in one of two states: either we need to be preparing ourselves for a growth spurt or we need to be in the midst of one at all times.

Preparing for Growth

Remember Mom said you could tell I was getting ready for a growth spurt by how I ate?  In development, information is our nourishment.  You should be able to tell if you are preparing for growth by what you are reading.  For that matter, are you reading at all?

I’ve always been a fan of tech books and I have a tendency to read them cover to cover.  I even like to read books about stuff I think I already know because inevitably I don’t know it as well as I could.  I bought a Kindle just for tech books.  I take it with me almost everywhere so I can fill dead moments with reading.  And of course the web is overflowing with blogs, articles, white papers, etc.  I probably spend 30% of my work time reading or looking up information on the web.  The point here is to read, read, read, and then read some more.

A lot of the time this is reading just for the sake of it, like reading about a new technology or device just because you are curious.  This sort of non-targeted reading is great and necessary to stay aware of trends and general goings-on in our profession.  And doing so primes the pump, so to speak: it keeps your cognitive juices flowing so that you will be in the right state to grow.  To really prepare for a growth spurt, however, we need targeted study.

Targeting Growth

So once we make the conscious decision to grow, where do we begin?  The first thing to do is select the area you want to advance.  Personally, I’ve recently decided that Software Craftsmanship is where I want to improve.  Like many developers, I’ve mostly coded by the seat of my pants.  Get it coded, make it work, push it out the door: after all, this is a business we’re trying to run here and productivity is everything. 

I see now that this is a very short-sighted way of developing. I’ve almost always been a lone wolf programmer so I never had an environment or mentor that would train me otherwise.  Since I’ve been involved with the developer community, however, I’ve been exposed to different ways of thinking.  I’ve seen presentations on Unit Testing, learned about Agile practices, adopted coding tools, and more, but never in any targeted way: until now.

Summer Reading List

I have put together a reading list to begin the process.

I’ve already begun reading some of these. I’ve already been trying to learn unit testing and I think I’m finally starting to understand TDD: now I just need to learn how to actually put it into practice.  The fun part is that I will be implementing these ideas as I go: I’ll be writing soon about my new project, which is very ambitious from a development standpoint.

Choose Your Next Growth Spurt

So now it is up to you to choose your next growth spurt. Where would you most like to improve as a developer?  Once you answer that the challenge becomes “what are you going to do about it?”  Who knows, before long people may not recognize you anymore.

As always, feel free to post in the comments below: I’d especially like to see what’s on your summer reading list.

Categories: Miscellaneous