Published: 10 Aug 2009
By: Andrew Siemer

Andrew Siemer will walk you through, from start to finish (in a series of articles), how he would go about creating a StackOverflow style knowledge exchange.

The Stack Overflow Inspired Knowledge Exchange Series

  • TOC: Check out the project homepage of this series to follow our journey in recreating the famous StackOverflow website.
  • Introduction

    In looking at all the hype surrounding the great site StackOverflow (currently my third place), I have found a lot of people wondering how that site was built. There is quite a lot of information about how it was built from a high-level, generic view. But I have yet to find any site that details an actual implementation of a StackOverflow style knowledge exchange from start to finish (though there are some copies of their idea out there: cnprog.com, code.google.com/p/cnprog, code.google.com/p/stacked).

    While it would be nice for Jeff and Joel to open up their code base to the world (as they did with their data, and their WMD WYSIWYM Markdown Editor), it is highly unlikely as they are offering their software to generate revenue. And so, with the permission of Jeff Atwood himself (via email), I will walk you through, from start to finish (in a series of articles), how I would go about creating a StackOverflow style knowledge exchange.

    Egoless Programming

    The idea of egoless programming came up while researching this series. I first read about it on Jeff Atwood’s site, but I also found other references to it. The basic idea, as stated by Johanna Rothman, is this:

    Note:

    Egoless programming occurs when a technical peer group uses frequent and often peer reviews to find defects in software under development. The objective is for everyone to find defects, including the author, not to prove the work product has no defects. People exchange work products to review, with the expectation that as authors, they will produce errors, and as reviewers, they will find errors. Everyone ends up learning from their own mistakes and other people's mistakes. That's why it's called egoless programming. My ego is not tied to my "perfect" or "imperfect" work product. My ego is only tied to my attempts to do the best job I know how, and to learn from my mistakes, not the initial result of my work.

    Following this guideline, I will attempt to do my best while designing and building this knowledge exchange software in full view of the public. I plan to build this software in a manner that uses all of the latest and greatest industry buzzwords and technologies such as n-tier, TDD, DDD, continuous integration, MVC, LINQ to SQL, AutoMapper, MvcContrib, SOLID, DRY, IoC, StructureMap, SketchFlow, etc. I want to admit up front, however, that I do not claim to be a rocket scientist with all of these, and so I expect to learn along with you in some cases. I fully expect a great many of you to give me coarse corrections (word play?) along the way where you think I am wrong, and I will make adjustments where possible. I expect some of you to send full-on flames instead of suggestions. And this is where the egoless programming will come in!

    What information is currently available about StackOverflow?

    StackOverflow architecture

    There is some information regarding the StackOverflow architecture in a readable form, but a good majority of it is buried in a podcast here or there. In a blog post on blog.stackoverflow.com entitled “What Was Stack Overflow Built With?” you will see a list of technologies the SO team used.

    Table 1: StackOverflow technology stack

    Stack                        Technology
    ---------------------------  -----------------------------------
    framework                    Microsoft ASP.NET (version 3.5 SP1)
    language                     C#
    development environment      Visual Studio 2008 Team Suite
    web framework                ASP.NET MVC
    browser framework            jQuery
    database                     SQL Server 2008
    data access layer            LINQ to SQL
    source control               Subversion
    compare tool                 Beyond Compare 3
    source control integration   VisualSVN 1.5

    And then there are the other tools that are used by StackOverflow.

    Table 2: Other dependencies

    Type            Dependency
    --------------  ---------------------------------
    Captcha         ReCaptcha
    Authentication  DotNetOpenID
    Editor          wmd initially (rewritten by them)
    jQuery charts   flot

    There is also a good article on highscalability.com regarding StackOverflow, the stack, the hardware, and the stats for their site. We will use these stats as something to shoot for in our site's design. There is also a great list of “lessons learned” that you might be interested to read if you plan on having a site even remotely as popular as theirs!

    Stats excerpt from highscalability.com

    • 16 million page views a month
    • 3 million unique visitors a month (Facebook reaches 77 million unique visitors a month)
    • 6 million visits a month
    • 86% of traffic comes from Google
    • 9 million active programmers in the world and 30% have used Stack Overflow.
    • Cheaper licensing was attained through Microsoft's BizSpark program. My impression is they pay about $11K for OS and SQL licensing.
    • Monetization strategy: unobtrusive ads, job placement ads, DevDays conferences, extend the software to target other related niches (Server Fault, Super User), develop StackExchange as a white label and self hosted version of Stack Overflow, and perhaps develop some sort of programmer rating system.
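
    To turn these monthly figures into something we can design against, a quick back-of-the-envelope calculation helps. The sketch below is only an illustration: it assumes a 30-day month and traffic spread evenly across it (real traffic peaks well above the average), and the variable and function names are my own rather than anything from the highscalability.com article.

```python
# Assumption: a 30-day month with traffic spread evenly across it.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60

# Monthly figures taken from the stats excerpt above.
page_views = 16_000_000
visits = 6_000_000
unique_visitors = 3_000_000


def average_per_second(monthly_count: int) -> float:
    """Convert a monthly total into an average per-second rate."""
    return monthly_count / SECONDS_PER_MONTH


avg_page_views_per_second = average_per_second(page_views)
pages_per_visit = page_views / visits
visits_per_visitor = visits / unique_visitors

print(f"Average page views per second: {avg_page_views_per_second:.1f}")
print(f"Pages viewed per visit:        {pages_per_visit:.2f}")
print(f"Visits per unique visitor:     {visits_per_visitor:.1f}")
```

    Under those assumptions the average load works out to roughly six page views per second, with each visitor coming back about twice a month and viewing between two and three pages per visit. Any real capacity target would need headroom for peak hours on top of that average.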

    StackOverflow public database

    We will discuss the database for this application in a future chapter. However, it is interesting to take a look at the database that StackOverflow has made public. There are many people data mining this to see what sort of coolness can be found. A good article on this topic is at sqlserverpedia.com entitled “Understanding the StackOverflow Database Schema”. We will return to this subject later.

    Semi-controversial architecture and design decisions

    I am a big podcast fan and listen to them for roughly 3 hours a day (as I drive 75 miles to and from work in Los Angeles). I love to listen to the greats such as Hanselminutes, Polymorphic Podcast, and many more. Two Hanselminutes episodes are especially relevant to readers of this series, as they really drove my decision to follow known design patterns and best practices to the best of my ability. In the first, Scott Hanselman interviews Jeff Atwood and team to discuss StackOverflow. During that interview Scott unearths several things that certainly took his breath away. You could sort of tell that he was a bit shocked at what he heard regarding some decisions to not use known best practices. Listen to it - it was sort of funny. In the second podcast…which wasn’t scheduled…Scott continued to record the behind-the-scenes discussion that took place after the initial interview of the StackOverflow team. In this discussion Scott really dug into what he had heard in the first interview. I found this to be quite funny too!

    I think that Jeff Atwood and his team are probably way better programmers than I am. Having said that, I can’t bring myself to develop a project of this nature in a non-best-practices fashion. Having a great and very successful site up and running quickly was their goal, and so making decisions to do things in a non-standard fashion met their needs. Having a great series of current-trend-related tutorials, followed by a great OSS platform available for others to use and modify, is my goal. For that reason everything that I build will be done in a manner that, to the best of my abilities, follows today’s best practices and design patterns. For example: there will be no direct dips to the database from the presentation tier in our implementation!
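
    To make the "no direct dips to the database from the presentation tier" rule concrete, here is a minimal sketch of the layering I have in mind. It is written in Python purely for brevity (the series itself targets C# and ASP.NET MVC), and every name in it (Question, QuestionRepository, InMemoryQuestionRepository, QuestionService, render_question_page) is hypothetical, not code from StackOverflow or from this project's repository.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class Question:
    """A tiny domain object, just enough for the illustration."""
    id: int
    title: str


class QuestionRepository(Protocol):
    """Data access contract: the only layer allowed to touch storage."""

    def get_by_id(self, question_id: int) -> Optional[Question]: ...


class InMemoryQuestionRepository:
    """Stand-in for a database-backed repository (hypothetical)."""

    def __init__(self, questions: list) -> None:
        self._by_id = {question.id: question for question in questions}

    def get_by_id(self, question_id: int) -> Optional[Question]:
        return self._by_id.get(question_id)


class QuestionService:
    """Business layer: receives its repository through the constructor."""

    def __init__(self, repository: QuestionRepository) -> None:
        self._repository = repository

    def get_title(self, question_id: int) -> str:
        question = self._repository.get_by_id(question_id)
        if question is None:
            raise LookupError(f"No question with id {question_id}")
        return question.title


def render_question_page(service: QuestionService, question_id: int) -> str:
    """Presentation tier: talks to the service, never to storage."""
    return f"Question: {service.get_title(question_id)}"


repository = InMemoryQuestionRepository(
    [Question(1, "What is egoless programming?")]
)
service = QuestionService(repository)
page = render_question_page(service, 1)
print(page)
```

    Because the service only sees the repository contract, the in-memory stand-in can be swapped for a real data access implementation (or a mock in a unit test) without touching the presentation code.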

    What will be covered?

    I will admit up front that I think that this series will be a long one. I am going to do my best to attack this project in the manner that I would any other project. I want to show the programmer who has not done a project of this nature ALL of the steps that go into it, not just code snippets. This will include setting up some infrastructure aspects, configuring continuous integration, automated builds, source control, test suites, logging, and many other aspects of a software project that generally don’t get mainstream attention (but should).

    We will take a look at creating wireframes and screen mockups using the new SketchFlow tool in Microsoft’s Expression product. We will discuss solution and project structure. We will cover some of the great tools out there such as NAnt, StructureMap, AutoMapper, Elmah, NUnit, Rhino Mocks, and CruiseControl.NET. We might also discuss some platforms for managing a project such as this using Zen or VersionOne. Pretty much anything that falls between A and Z on this project will get at least an article’s worth of coverage!

    For that reason I am unable to create a specific article list of a 1, 2, … 30 nature outlining what is to come. We will instead create an index page that gives anyone interested in jumping around the series a starting place (once it is completed, that is!). The articles will reference back to this index to keep you going.

    What is required?

    I will do my best to only use technologies, frameworks, and other tools that you will have access to. For that reason I will ensure that there is a trial version, free version, or open source version of whatever we use in our discussions. This way you can walk along with me side by side to develop your own version of this knowledge exchange. Also, a copy of the source code will be maintained for each article. Along with that you will also have access to the knowledgeexchange.codeplex.com site where my incremental check-ins will be stored. That repository will be further along than the article releases, so if you want to jump ahead, that is the place to do it.

    Some additional information

    While none of this is required reading by any means, the following videos, blog posts, and articles are worth a look. These are things that I found interesting as I started to research this project.

    Summary

    This article was primarily an introduction to the upcoming series of articles. It introduced why I chose to create a StackOverflow style knowledge exchange and gave a high-level look at how StackOverflow currently runs. We discussed egoless programming and the fact that I fully expect my readers to help keep me straight with what we are going to build. I also loosely discussed some of the technologies that we will use in our implementation of this knowledge exchange. I then stated that this series will be built using publicly available software so that anyone can follow along with me. Lastly, I listed a few articles that I think might be worth reading to get your mind in the right place to understand what StackOverflow is, along with its core features.

    In the next article we will take a look at setting up our development environment. We will get into setting up a version control project on CodePlex. We will also discuss an appropriate file and folder structure to support a big project like this. Then we will create our initial solution which will include an ASP.NET MVC 2 web application and its test project. Next we will get the TortoiseSVN client set up to communicate with our version control repository on CodePlex. And at the end of the article we will get to perform our first commit into our code repository and make our StackOverflow inspired knowledge exchange project public.


    About Andrew Siemer

    I am a 33 year old, ex-Army Ranger, father of 6, geeky software engineer that loves to code, teach, and write. In my spare time (ha!) I like playing with my 6 kids, horses, and various other animals.


