Published: 04 Feb 2008
This article explores some of the key performance issues that can occur while developing a Web 2.0 portal using server side multithreading and caching. It also demonstrates model driven application development using Windows Workflow Foundation.
Performance is a vast area, and great results can never be achieved with a silver bullet. Within the scope of this article, we can explore only a few of the points that matter when developing a Web 2.0 portal. Web 2.0 applications are everywhere; even the DotNetSlackers website you are browsing right now is a good example of one. These applications often consume third-party content, aggregate it, and turn it into something useful and meaningful for their users. Over the past few years many developers have taken on such projects, and a lot of the resulting websites do not address performance at all, giving users an unpleasant experience.
The Web 2.0 portal
The Web 2.0 portal you will develop in this article is a mashup of the Eventful.com, Upcoming.org, Zvents.com and Yahoo Geocode APIs. The main interface of the application consists of a Button and a Text Box where the user will type the location of his/her interest and press the Button. It will validate the user's location by using Yahoo Geocode and will let the user know (with corresponding messages) whether the location is valid or not. Then it will go through all three popular Web 2.0 services that store local events by location, and it will display only the present and future local events to the user, sorted by date with no duplicates. In this article you will build the Local Events search engine shown in figure 1, on top of the .NET Framework 3.5.
Figure 1: The Local Events Web 2.0 portal
After I built this application with no performance optimization in place, I tested it using a dialup internet connection, to simulate the experience of low-bandwidth Internet users. I found that it often failed, and sometimes it took 80-90 seconds to load. The question was: why wasn't it as fast as any other search service? The answer: the search result, in this case, is the combined result of searches performed on three different services over the Internet, hosted on three different servers and perhaps on different continents. We need them, since the application itself has no data and yet we want to offer users a search feature. So a certain amount of data download is inevitable. However, we can still decide, design, and apply tricks to make it faster.
The Developer APIs
We will need four third party APIs in this application. One is the Yahoo Geocode API, which validates the location the user submits. The other three services make up our data source for this application. Documentation and API keys are available at the following addresses:
All four services offer open APIs with numerous methods over several protocols. We choose REST because it is the simplest of them and can be consumed with plain HTTP GET or POST requests.
Performance Improvement #1: Make it AJAX
As I said before, the data download is inevitable and it will take time to load. Making the application postback-less decreases the waiting time for the user. After the user types a location, the application simply displays a message saying that it is locating the events. When available, it displays them in a grid - without reloading the whole page. This removes the overhead caused by reloading a whole page, and makes the website more interactive.
Performance Improvement #2: Remove burden from client side
In this application, we will make an AJAX call to a single Web Service that is hosted by our application. The Web Service will perform all of our business logic, including complex operations, and return only an array of
LocalEvent objects to the client side. After that, an AJAX control takes over the responsibility of rendering them. Appropriate use of caching can also improve performance a lot; we will discuss it soon.
Model-driven Development: Windows Workflow Foundation
Windows Workflow Foundation, which we will use as part of our Model-driven development, is a core capability that lets you explicitly or declaratively model the control flow of your application. Rather than embedding your application logic in code, in a workflow the logic is represented declaratively. As a result, you can inspect the application logic, visualize it, track its execution, and even change the logic at runtime. Workflow Foundation provides a higher level of abstraction and visual representation of your business processes that makes them easier to understand and design, by both developers and business domain experts. It's easy to change the flow and rules associated with business processes, often without having to recompile.
Compared to their UML counterpart Activity Diagrams, Workflow diagrams are first-class software artifacts that do not become outdated and diverge from business process logic because they are the business process logic. On the other hand, the Windows Workflow runtime provides a robust, scalable environment for your workflows to execute. Workflows can be persisted to a database when they become idle and reactivated when an external stimulus occurs.
The following figure shows the Workflow of our application.
Figure 2: The SearchWorkflow for searching events in three different sources
This Workflow does the following:
- It checks the Cache for a previous entry against the location passed to the Workflow
- If there is any, it directly retrieves the
LocalEvent array from the Cache and returns it to the invoker
- If there is none, it verifies the location with the Yahoo Geocode API
- If it fails to verify, it sets the ErrorMessage property and moves to the Terminate state of the Workflow
- If the location is valid, it shifts to a ParallelActivity made up of three CodeActivity branches. Each of them performs the search against one particular event service
- The last CodeActivity consolidates all the results into a single
LocalEvent array and returns it to the invoker.
Note: ParallelActivity does not ensure that the activities underneath will run asynchronously. It is a nice representation of what we intend to do though.
The Solution Structure
The solution structure is similar to the one shown in figure 3. It is divided into two projects. The LocalEvents.Business project contains the LocalEvent class, SearchWorkflow and other helper classes. On the other hand, the LocalEvents.Portal project is a standard ASP.NET application with a Style.css, Web.config, Global.asax and a WebService file named LocalEventsWS.asmx.
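The LocalEvents.Business project revolves around the LocalEvent class mentioned above. A minimal sketch of what that DTO might look like is shown below; the actual property names in the sample code may differ:

```csharp
// Hypothetical sketch of the LocalEvent DTO carried from the Workflow,
// through the WebService, to the client. Property names are assumptions.
[Serializable]
public class LocalEvent
{
    public string Title { get; set; }
    public string Description { get; set; }
    public DateTime StartTime { get; set; }     // used for date sorting
    public string Venue { get; set; }
    public double Latitude { get; set; }        // filled from Yahoo Geocode
    public double Longitude { get; set; }       //   when the service omits them
    public string SourceService { get; set; }   // e.g. "Eventful", "Upcoming", "Zvents"
}
```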
Figure 3: The Solution Structure
Performance Improvement #3: Initialize Workflow Runtime Engine Once and then Reuse It
The WorkflowHelper class encapsulates the hosting runtime management for our application. The Start method of this class creates a runtime and puts it in the Application state. Before doing so, it looks for a previously created runtime: if one is found, it is reused. By default, the Workflow scheduler service is DefaultWorkflowSchedulerService, which dynamically spawns a thread for each Workflow, meaning that Workflows run asynchronously. In our case, however, we need the Workflows to run synchronously, since we will expose the Workflow through a WebService, and the WebService needs to wait for the Workflow to finish processing in order to get the results. To achieve this, a ManualWorkflowSchedulerService instance is added to the runtime.
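A minimal sketch of such a Start method follows. The class name WorkflowHelper comes from the article; the "WorkflowRuntime" Application key is an assumption:

```csharp
using System.Web;
using System.Workflow.Runtime;
using System.Workflow.Runtime.Hosting;

public static class WorkflowHelper
{
    // Creates the runtime once and caches it in Application state;
    // subsequent calls reuse the existing engine.
    public static WorkflowRuntime Start()
    {
        HttpApplicationState app = HttpContext.Current.Application;
        WorkflowRuntime runtime = app["WorkflowRuntime"] as WorkflowRuntime;
        if (runtime != null)
            return runtime;                       // reuse the previous one

        runtime = new WorkflowRuntime();
        // Run workflows on the caller's thread instead of pool threads,
        // so the WebService can wait for the result synchronously.
        runtime.AddService(new ManualWorkflowSchedulerService());
        runtime.StartRuntime();

        app["WorkflowRuntime"] = runtime;
        return runtime;
    }
}
```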
The following method is responsible for executing the workflow, which takes an instance of a Dictionary that contains the necessary parameters, including output parameters, and then runs the Workflow through the Scheduler service we added before:
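A sketch of what that method might look like under the assumptions above (the method name Run and the completion-handler shape are assumptions, not the sample's exact code):

```csharp
using System;
using System.Collections.Generic;
using System.Workflow.Runtime;
using System.Workflow.Runtime.Hosting;

// Starts a workflow instance with the given input parameters and drives it
// synchronously via ManualWorkflowSchedulerService until it completes.
public static Dictionary<string, object> Run(Type workflowType,
                                             Dictionary<string, object> parameters)
{
    WorkflowRuntime runtime = Start();
    Dictionary<string, object> output = null;

    // Capture the output parameters (including our LocalEvent[]) on completion.
    runtime.WorkflowCompleted += delegate(object sender, WorkflowCompletedEventArgs e)
    {
        output = new Dictionary<string, object>(e.OutputParameters);
    };

    WorkflowInstance instance = runtime.CreateWorkflow(workflowType, parameters);
    instance.Start();

    // RunWorkflow blocks the current thread until the instance finishes.
    ManualWorkflowSchedulerService scheduler =
        runtime.GetService<ManualWorkflowSchedulerService>();
    scheduler.RunWorkflow(instance.InstanceId);

    return output;
}
```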
The Stop method terminates the Workflow Runtime engine and removes it from the Application state.
Run and terminate the Workflow Runtime engine at the application level, so that the runtime is created and disposed only once in the application's lifetime. Add the following two event handlers in Global.asax.cs:
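A sketch of those two handlers, assuming the WorkflowHelper.Start and Stop methods described above:

```csharp
using System;

// Global.asax.cs — the runtime is built when the application starts and
// torn down when it ends, so it exists exactly once per application lifetime.
public class Global : System.Web.HttpApplication
{
    protected void Application_Start(object sender, EventArgs e)
    {
        WorkflowHelper.Start();   // create and cache the Workflow runtime
    }

    protected void Application_End(object sender, EventArgs e)
    {
        WorkflowHelper.Stop();    // stop the runtime and remove it from Application
    }
}
```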
Inside the SearchWorkflow
The first activity in the Workflow looks for the cached LocalEvent array and, if it is found, terminates the Workflow and returns the array immediately:
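A minimal sketch of what that cache-lookup condition might look like, assuming a declarative condition handler whose e.Result steers the flow; the Location and Events field names are assumptions:

```csharp
using System.Web;
using System.Workflow.Activities;

// Code condition for the first branch: e.Result = true means "not cached,
// keep going"; false means "cache hit, return the array immediately".
private void IsNotCached(object sender, ConditionalEventArgs e)
{
    this.Events = HttpRuntime.Cache[this.Location] as LocalEvent[];
    e.Result = (this.Events == null);
}
```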
Depending on the e.Result property, the Workflow engine determines which way to go. If the LocalEvent array cannot be found in the cache, e.Result is set to true, which causes the flow to move to the next activity, IsInvalidLocation, which performs a check using the Yahoo Geocode API. The GetResponse method fetches data from the specified URL. The code is pretty self-explanatory, so we leave it to you to explore. In this block of code, you can see that if the location can be resolved, we store its longitude and latitude in private variables; we will reuse them for events that come up in the search for that location but carry no valid longitude and latitude of their own.
If the location cannot be resolved, it throws an exception that will be handled in WebService in order to return a meaningful message to the client, telling what has just happened.
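A sketch of a GetResponse-style helper, issuing the plain HTTP GET that REST requires; the real helper in the sample code may differ in signature and error handling:

```csharp
using System.IO;
using System.Net;

// Fetch a REST endpoint over HTTP GET and return the response body as a string.
private static string GetResponse(string url)
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
    request.Method = "GET";

    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    using (StreamReader reader = new StreamReader(response.GetResponseStream()))
    {
        return reader.ReadToEnd();   // raw XML returned by the REST service
    }
}
```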
Performance Improvement #4: Server-side Multithreading
The next thing to do in the Workflow, after validating the location, is invoking the event provider services: InvokeSearchEventful, InvokeSearchUpcoming and InvokeSearchZvents. Wait, we have got something to work on here. Instead of executing these searches sequentially, how about invoking each in a different thread? Remember when we talked about ParallelActivity? Do not forget that it is not responsible for spawning the activities onto separate threads. So how do we achieve this?
We initialized a couple of variables to keep track of the threads to be spawned:
As soon as the control flow comes to the InvokeSearchEventful activity, it adds a reference for this particular thread to the wait handles ("locks") that we will use to keep track of the threads in our code. Then it queues the SearchEventful method to the ThreadPool, which assigns a worker thread to execute it.
The other two methods, InvokeSearchUpcoming and InvokeSearchZvents, are similar, so we are not going into them. The SearchEventful method retrieves data from the service and populates each LocalEvent object using LINQ to XML, as shown in the following listing. As you can see at the end of the code block, the call to evt.Set() marks the thread as done. The other two search methods are the same, so they are intentionally left out.
Listing 9: SearchEventful method
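The thread bookkeeping in this listing can be sketched as follows, assuming ManualResetEvent handles are the "locks" the text refers to; field and handler names are assumptions:

```csharp
using System;
using System.Threading;

// One reset event per search service; signalled when that search is done.
private readonly ManualResetEvent eventfulDone = new ManualResetEvent(false);

// The InvokeSearchEventful activity queues the search on the ThreadPool.
private void InvokeSearchEventful_ExecuteCode(object sender, EventArgs e)
{
    eventfulDone.Reset();
    ThreadPool.QueueUserWorkItem(SearchEventful);
}

private void SearchEventful(object state)
{
    try
    {
        // ... call the Eventful REST API via GetResponse, parse the XML with
        // LINQ to XML, and append the resulting LocalEvent objects to a
        // shared list ...
    }
    finally
    {
        eventfulDone.Set();   // mark this thread as done, even on failure
    }
}
```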
The next thing to do in the Workflow is to wait for the search methods to complete:
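A sketch of that wait step, assuming one ManualResetEvent field per search as described above:

```csharp
using System;
using System.Threading;

// Block until all three search threads have signalled their handles.
// Field names (eventfulDone, upcomingDone, zventsDone) are assumptions.
private void WaitForSearches_ExecuteCode(object sender, EventArgs e)
{
    WaitHandle.WaitAll(new WaitHandle[]
    {
        eventfulDone, upcomingDone, zventsDone
    });
    // From here on, the consolidated LocalEvent list is safe to read.
}
```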
As I said, when I first created this application it took 80-90 seconds to complete a search. After I implemented the server-side multithreading, performance improved dramatically, to 25-28 seconds per search. Multithreading on the server side can definitely boost a website's performance from the client's point of view, but it might not be appropriate in every case, since it is quite stressful on the server. You will have to make tradeoffs depending on your business problems.
Performance Improvement #5: Cache the result
This is the last part of the ConsolidateResult method, which implements a caching technique. It caches the LocalEvent array for one hour; you might want to configure the timeout depending on your needs. The next time the Workflow starts, its first activity looks for a cached item for this particular location. If the item is found, the control flow doesn't go through the long tunnel of the Workflow; instead, it simply returns the cached result.
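That caching step can be sketched with the ASP.NET cache API; the variable names are assumptions:

```csharp
using System;
using System.Web;
using System.Web.Caching;

// Store the merged array against the location key with a one-hour
// absolute expiry, so repeat searches skip the three remote services.
HttpRuntime.Cache.Insert(
    this.Location,                 // cache key: the searched location
    consolidatedEvents,            // the merged LocalEvent[]
    null,                          // no cache dependency
    DateTime.Now.AddHours(1),      // absolute expiration: one hour
    Cache.NoSlidingExpiration);
```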
One of the biggest performance improvements lies here. Caching spares us the retrieval of events from all three data sources, which is of course much faster, since the data is now delivered directly, and only, from our server. Just before this improvement, a search took at most 28 seconds (28536ms to be exact). Now the same operation requires 1302ms, which corresponds to about 1.3 seconds.
Performance Improvement #6: Client side validation
The Ajax Data Controls
You will certainly notice that we used a GridView AJAX control, which is part of a DotNetSlackers-hosted project called AjaxDataControls. The reason behind choosing this control library is that these controls have fantastic client-side programmability, which makes the life of an AJAX developer much easier.
The following snippet simply binds a
LocalEvent array to this GridView:
The Web.config changes
The following code needs to be edited with your own API keys, User Name and Password. These credentials are used by the methods in the SearchWorkflow.
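A sketch of what that settings block might look like; the actual key names in the sample's Web.config may differ:

```xml
<!-- Hypothetical appSettings entries for the four services' credentials -->
<appSettings>
  <add key="YahooAppId" value="YOUR_YAHOO_APP_ID" />
  <add key="EventfulUserName" value="YOUR_EVENTFUL_USER" />
  <add key="EventfulPassword" value="YOUR_EVENTFUL_PASSWORD" />
  <add key="UpcomingApiKey" value="YOUR_UPCOMING_KEY" />
  <add key="ZventsApiKey" value="YOUR_ZVENTS_KEY" />
</appSettings>
```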
Performance Improvement #7: Cache on the client side
Now let us look at the WebService code that resides on the server side. It is invoked from the client, deals with the server-side cache, and invokes the web services of the event providers. It is a bridge between the data and the client. It reads the Web.config file, prepares a
Dictionary with some necessary objects and executes the
SearchWorkflow we saw before. One significant thing to note here is that the WebService methods that can be invoked through AJAX should use the
ScriptMethod attribute. The reason why we set the
UseHttpGet parameter is that we want to issue an AJAX HTTP GET request. This will be discussed in the next section.
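A sketch of that WebService shape; the Search method body and the SearchHelper name are assumptions:

```csharp
using System.Web.Services;
using System.Web.Script.Services;

// ScriptService exposes the service to ASP.NET AJAX; UseHttpGet = true makes
// the generated proxy issue a GET request instead of a POST.
[WebService]
[ScriptService]
public class LocalEventsWS : System.Web.Services.WebService
{
    [WebMethod]
    [ScriptMethod(UseHttpGet = true)]
    public LocalEvent[] Search(string location)
    {
        // Read the API keys from Web.config, build the parameter Dictionary,
        // and run the SearchWorkflow, returning its consolidated result.
        return SearchHelper.RunSearch(location);   // hypothetical helper
    }
}
```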
Now for the tricky part. Before we return a response from the WebService method to the client, we need to cache it so that once the WebService is fired up from the AJAX code, it will show the data cached in the browser instead of making a round trip to the server. This results in an awesome performance optimization. The following code is supposed to do exactly what we need:
This again sets a one-hour expiry time on the cache. Unfortunately, though, SetMaxAge didn't work: when used with ASP.NET AJAX, it demands a special Reflection hack in order to alter the max-age value that the AJAX runtime overwrites.
The browser now saves the request query and the associated response as a pair in its cache, so every time this method is invoked with the same location parameter within one hour, it returns the data directly from the browser cache. Now my response time came down to 117ms, roughly 0.1 seconds.
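The Reflection workaround mentioned above can be sketched as follows. It pokes the private _maxAge field of HttpCachePolicy directly, because ASP.NET AJAX resets max-age to zero after the WebMethod runs; relying on a private field name like this is fragile and may break between framework versions:

```csharp
using System;
using System.Reflection;
using System.Web;

// Force a one-hour max-age header despite ASP.NET AJAX overwriting it.
public static void CacheForOneHour(HttpContext context)
{
    TimeSpan maxAge = TimeSpan.FromHours(1);
    context.Response.Cache.SetCacheability(HttpCacheability.Public);
    context.Response.Cache.SetMaxAge(maxAge);

    // Overwrite the private field that backs the max-age header.
    FieldInfo maxAgeField = typeof(HttpCachePolicy).GetField(
        "_maxAge", BindingFlags.Instance | BindingFlags.NonPublic);
    if (maxAgeField != null)
        maxAgeField.SetValue(context.Response.Cache, maxAge);
}
```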
This project is hosted at CodePlex: http://www.codeplex.com/LocalEvents for those of you who would like to develop it further. One enhancement you might want to implement is integrating a Virtual Earth or Google Maps map to pinpoint the events. It's very easy to do, because longitude and latitude information is available for each event passed to the client side.
We saw how a Web 2.0 portal's response time was brought down from 90 seconds to 117ms by applying 7 performance tips. There can be numerous other ways to improve performance depending on the business problems you are trying to solve; even in this application there might be more, and not every issue can be addressed within such a short scope. I hope to discuss them in the near future.
In this article you explored some of the key performance issues in developing a Web 2.0 portal using server-side multithreading and caching. You also learned how to do model-driven application development using Windows Workflow Foundation.