Part 1 In this series of articles, I would like to first introduce search engine related concepts and technologies, and then, through an in-site search module of an ASP.NET 4.0 sample Web site (a simple Question and Answer site), show readers how to put all the SEO related goodies into practice.
Part 2 In the last article we addressed the importance of an in-site search engine, the technical difficulties in developing one, and solutions for building a usable in-site search engine. In this article, we'll turn to another important topic, SEO (search engine optimization), together with a lot of related details and tips.
Part 3 In the first two articles of this series we mainly dwelled upon SEO related theory. What really attracts our interest may be the details and tips involved in building a practical ASP.NET 4.0 based web application. Starting from this article, we'll focus on the practical side: developing a commonly used Q&A module of the kind a real ASP.NET website frequently contains.
Part 4 In this part, I will introduce the backend sub modules of the Q&A Web application.
Part 5 In the last part of this series you learned about the backend sub modules that make up the small Q&A sample application, as well as some SEO related techniques under the ASP.NET 4.0 environment. In this part, we will shift our attention to the foreground part.
Part 6 In the last several parts of this series you learned about the backend and front-end modules of the Q&A sample application, as well as some SEO related techniques under the ASP.NET 4.0 environment. In this part, we will shift our attention to how to construct the internal search module, which techniques you need to build such a module, and which SEO actions should be taken.
Part 7 Building on the previous articles, this article details how to create a universal caching module.
Part 8 This article will use the universal cache modules to reconstruct the question and answer modules, as well as the internal search engine module.
Introduction
In the previous article, we discussed some theoretical ideas around cache design in complex development environments. In particular, we built some universal cache modules for use in the Q&A sample application of this series. This article uses those universal cache modules to reconstruct the question and answer modules, as well as the internal search engine module. By adding cache support to modules that already provide their basic functions, readers will better understand how caching is developed and deployed. Due to space limitations, we will focus on the ideas, methods, and processes for creating cache modules; areas outside this focus will not be covered in detail.
NOTE: The sample test environment in this series includes:
1. Windows 7;
2. .NET 4.0;
3. Visual Studio 2010;
4. SQL Server 2008 Express Edition & SQL Server Management Studio Express.
Determine Cached Objects
Based on the design methods and principles explained in the previous articles, this section adds cache support to some representative pages in the Q&A system.
Determine partially cached objects
In the Q&A application, we can partially cache the pages and user controls listed in Table 1.
Table 1: Partially cached objects
Pages | User Controls | Expire Time (minutes) | Other Expire Conditions |
Default.aspx | SolvedQuestionList.ascx | 5 | Asked questions related info changed in the database |
Default.aspx | UnSolveQuestionList.ascx | 5 | Asked questions related info changed in the database |
NewQuestion.aspx | CatalogDropDownList.ascx | 60 | Question catalog related info changed in the database |
Determine data cache objects
According to the analysis in the preceding article, we can apply either the cache-all or the partial caching approach to the data in Table 2.
Table 2: Data caching objects
Data | All/Partially Caching | Expire Time (minutes) | Other Expire Conditions |
User data: QAUserInfo | All; can be switched to partial caching when the data volume is large | 60 | User data is updated |
Global setting data: QAConfig | All | 30 | Global setting data is updated |
Question catalog: QACatalog | All | 1000 | Question catalog data is updated |
Questions and answers data: QAQuestion and QAAnswer | Partially | 100 | none |
Searching data, together with QAQuestion and QAAnswer | Partially | 100 | none |
We need to distinguish between data in the different database tables. For example, data in the tables QAUserInfo, QAConfig, and QACatalog is relatively stable and will not change within a short time, so we can use the cache-all policy for it. Data in the two tables QAQuestion and QAAnswer is updated more often, so we apply the partial caching solution to it. That is, when the user runs a query, we put the related data in the cache for some time.
Develop Cache Classes
Apart from the page cache, the data cache mechanisms designed in this section target the Q&A module. All the data cache classes are built on the base classes established previously. Specifically, the cache classes we are going to create involve the following tables:
- QAUserInfo
- QAConfig
- QACatalog
- QAQuestion
- QAAnswer
Of the items in the above list, questions and the internal queries are the key and most difficult parts of the entire cache solution.
Optimizing page cache
Since the page cache is simpler than the data cache, here we will focus only on two pages, default.aspx and NewQuestion.aspx.
1. Optimizing the home page default.aspx
The page cache for the home page default.aspx targets the two user controls, SolvedQuestionList and UnSolveQuestionList. We set the cache duration to 5 minutes (300 seconds) for both controls, and we also make the cache depend on updates to the database table QAQuestion. Now, let's first look at the code for the control SolvedQuestionList.ascx.
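The original listing is not reproduced here. As a hedged reconstruction based only on the attribute values discussed in this section, the caching-related lines near the top of SolvedQuestionList.ascx might look like the following fragment; the rest of the control's markup is omitted, and the exact layout is an assumption.

```aspx
<%-- Cache this user control's output for 300 seconds, and expire it early
     when the QAQuestion table of the QADemo database entry changes. --%>
<%@ OutputCache Duration="300" VaryByParam="None" SqlDependency="QADemo:QAQuestion" %>

<%-- Rendered once per cache cycle, so the value shows when the control was cached. --%>
Buffer Time: <%= DateTime.Now.ToString() %>
```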
Above, <%@ OutputCache %> sets up output cache support. Duration="300" indicates that the expiration time is 300 seconds. VaryByParam="None" declares that the cached output does not vary by any request parameter; omitting this attribute causes an error. SqlDependency="QADemo:QAQuestion" indicates that the cache depends on the table QAQuestion, as registered in the caching section of the file Web.config. So, when rows are inserted, updated, or deleted in the table QAQuestion, the cache expires.
Also note that the statement Buffer Time: <%= DateTime.Now.ToString()%> is inserted on purpose to show the time at which the control was cached, so that we can see when the user control is cached again.
The other user control, UnSolveQuestionList.ascx, is quite similar, so we will not dwell on it. The code-behind for the two user controls is an ordinary LINQ to Entities implementation, so we do not list it either.
Now, let's look at the interesting part of the markup code in the home page Default.aspx.
The first bold statement displays the current time. Since we have not set up caching for the page Default.aspx itself, this time can be compared with the times shown inside the two user controls.
Now, let's look at a screenshot of the page default.aspx when it first runs, as shown in Figure 1.
Figure 1: The initial time of starting up default.aspx

Next, when you press F5 to refresh the page, the current time on the page default.aspx changes immediately. However, the time ('Buffer Time') shown within the two user controls stays the same, as Figure 2 illustrates.
Figure 2: The buffer time does not change when the current page is refreshed

Subsequently, if you create a new question, you will find that the cached output of both lists, the solved questions list and the unsolved questions list, expires at the same time. Hence, when you open the page default.aspx again, the related caches are rebuilt.
2. Optimizing the question catalog tree
Much like the two user controls inside the page default.aspx, the question catalog tree within the page NewQuestion.aspx is implemented as a user control. One point worth noticing is that the selected value of the DropDownList control is read inside the page NewQuestion.aspx itself, rather than inside the user control.
Another fact is that once the user control CatalogDropDownList.ascx is cached, it is little more than a string of HTML; you cannot use server control properties to read its value. To obtain the value, we can store the client-side name of the ddlCatalogs control in Page.Cache["CatalogDropDownListClientName"], so that when the page NewQuestion.aspx needs the selected value of ddlCatalogs it can look it up in Request.Form by that cached name.
To achieve this, we first need to store the info about the control ddlCatalogs before the control CatalogDropDownList.ascx is cached. So, we can put code like Listing 3 in the Page_Load event handler in the file CatalogDropDownList.ascx.cs.
There are several points worth noting:
- ddlCatalogs.ClientID represents the client-side ID for the ddlCatalogs control.
- The statement Page.Cache["CatalogDropDownListClientName"] = ddlCatalogsClientName stores the client-side name of the control ddlCatalogs in the cache item CatalogDropDownListClientName.
- The catalog data, in this case, still comes from the database. Once the cache class for the catalog tree is finished, you can also read the data from the cache here to improve efficiency.
Correspondingly, the control CatalogDropDownList.ascx also needs partial page caching set up.
Next, we need to add a reference to the control CatalogDropDownList.ascx in the page NewQuestion.aspx; we omit the explanation since this has already been covered in the previous articles.
With the control CatalogDropDownList.ascx ready, we can write the related method in the file NewQuestion.aspx.cs, as follows.
There are still some points to notice here:
- Page.Cache["CatalogDropDownListClientName"].ToString() reads the cache item CatalogDropDownListClientName, whose content is the client-side name of the control ddlCatalogs in the user control CatalogDropDownList.ascx.
- Request.Form[string.Concat("ctl00$", ddlCatalogsClientName)] reads the value posted under that name from the PostBack data, where the name corresponds to the client-side name of ddlCatalogs in the user control CatalogDropDownList.ascx.
In this way, the DropDownList data of the catalog tree is served from the cache and is refreshed automatically when it expires. We can also read the value of the server control from the static HTML through Request.Form.
Starting from the next section, we are going to build several kinds of data cache classes, all of which derive from the base class BaseCache.
Create the Cache Classes
We'll create several concrete classes.
Create the UserInfoCache class
The UserInfoCache class provides all operations on the user info, for which we use the cache-all solution. If there is a great deal of user info, we can change the solution to partial caching. Listing 6 shows the complete code for the class UserInfoCache.
Several points need to be explained:
- base._timeout sets the cache expiration time, in minutes. This value is used as the default expiration time for the cache.
- The variable CACHE_KEY sets the cache key of the cache item. Because it is public, other code can access the cache content directly through UserInfoCache.CACHE_KEY.
- public UserInfoCache(): base(CACHE_KEY) specifies that when the UserInfoCache() constructor runs, the constructor of the base class BaseCache also runs. Because the constructor of the class BaseCache needs a string parameter, we pass CACHE_KEY as the cache key of the cache item. Note also that the base constructor invoked by base(CACHE_KEY) runs before the body of the constructor UserInfoCache().
- In the Update() method, we first create a database cache dependency item, sqlCacheDependency, whose database entry is QADemo and whose table is QAUserInfo. Hence, when rows are added, edited, or deleted in the table QAUserInfo, the corresponding cache expires.
- base.SetData is used to store data in the cache.
- return userInfos; returns the freshly queried data that has just been put into the cache. This improves efficiency because the cache does not have to be queried again.
Most of the above techniques will be used again in the later cache classes.
Create the ConfigCache class
Another class, ConfigCache, provides cache support for the global setting info (QAConfig). As you know, there is only one record of configuration info in the table QAConfig, so the cache-all policy is the only sensible choice for this object.
On the whole, there is mainly one point that needs to be explained:
public override QAConfig Data overrides the Data accessor in the base class. The Data accessor in the base class only supports querying and updating the cache; it does not update the database at the same time. Because we want the database to be updated in step with the cache, we must override the accessor in this subclass.
Database synchronization here means that when the program updates the cache, it writes the latest information in the cache to the database at the same time. The more common approach in the past was to update the database first and then refresh the cache from the database. The advantage of synchronizing is obvious: users always query up-to-date data, and the update is seamless.
Also note that in the set accessor of the member Data, we use a delegate, UpdateWithBataBase, to trigger the database update.
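The synchronization idea described above can be illustrated without any ASP.NET dependency. The following is a minimal sketch, not the article's ConfigCache: the class name WriteThroughCache, the private field standing in for the real cache item, and the two delegates passed to the constructor are all assumptions made for this example.

```csharp
using System;

// Hypothetical stand-in for a cache class whose Data setter also updates the database.
public class WriteThroughCache<T> where T : class
{
    private T _cached;                          // stands in for the real cache item
    private readonly Func<T> _loadFromDatabase; // called when the cache is empty
    private readonly Action<T> _updateDatabase; // plays the role of the update delegate

    public WriteThroughCache(Func<T> loadFromDatabase, Action<T> updateDatabase)
    {
        if (loadFromDatabase == null) throw new ArgumentNullException("loadFromDatabase");
        if (updateDatabase == null) throw new ArgumentNullException("updateDatabase");
        _loadFromDatabase = loadFromDatabase;
        _updateDatabase = updateDatabase;
    }

    public T Data
    {
        get
        {
            // Load once on a miss, then serve from memory.
            if (_cached == null)
            {
                _cached = _loadFromDatabase();
            }
            return _cached;
        }
        set
        {
            // Design choice of this sketch: persist first, so a failed
            // database write leaves the cached value untouched.
            _updateDatabase(value);
            _cached = value;
        }
    }
}
```

With this shape, a page only assigns to Data; the database write happens inside the setter, so callers cannot update one store and forget the other.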
Create the CatalogCache class
The CatalogCache class provides cache support for the question catalog. Because there is not much catalog data and it is queried very frequently, we can use the cache-all policy for it as well. The following shows the complete source code for the class.
As you can see, the database cache dependency item is the table QACatalog, so when rows are created, updated, or deleted in the table QACatalog, the cache item 'QACatalog' expires.
Create Cache Class for Question and Internal Searching Module
The class QuestionAndAnswersCache is the most important and most difficult part of the overall Q&A application. It is the most important because this data is the core of the entire module, in terms of both quantity and significance. It is the most difficult because handling such a large amount of data appropriately while keeping efficiency high is a hard problem, both in design and in application. In addition, the complexity of the internal search module calls for a careful and meticulous implementation.
First of all, the data handled by the class QuestionAndAnswersCache is rather complicated, so we create an entity model class to hold a question together with its answers and keywords. This entity model class is defined as follows.
Here, the property Question records the question info (whose state is "Solved"). The property Answers records all the answers to a question, sorted on the field BestAnswer in descending order. The property Keywords records the keywords contained in the question and in its best answer, so as to facilitate searching. Note that the keywords here come from keywords that have been searched for or that have a higher search frequency.
Next, let's create the class QuestionAndAnswersCache. Since it contains many members with lengthy code, we only outline the main function of each member:
- A cache key member that specifies the name of the cache key associated with the cache item.
- A method that checks whether global cache keywords are stored in the cache and, if so, adds the related keywords.
- A method that checks whether all the keywords are included in the questions.
- A property that overrides the Data accessor in the base class.
- A method that overrides the Update method in the base class. The return value is the latest data in the cache.
- A method that obtains the questions and answers data for the given question ID.
- A method that obtains the info related to the given question ID and converts it into QuestionAndAnswers data. Note that the result is held in a Dictionary<int, QuestionAndAnswers> whose key equals the value of the parameter id.
- A method that synchronizes the database with the specified item in the cache. The parameter id corresponds to the question ID.
- A method that searches for results matching the given keywords. This method is the core of the whole internal search module. The parameter keywords is a list of keywords. The return type is List<QuestionAndAnswers>, which holds the search results matching the specified condition.
NOTE: Using LINQ to Entities to call stored procedures is quite similar to handling table entities. In the method SearchQuestionByKeyWord, there is the following statement:
List<QAQuestion> questionIdList = ctx.sp_SearchQuestion(keyword).ToList();
For details, please refer to the stored procedure definition in the database.
Integrate the Data Cache Classes into the Q&A Modules
With the subclasses ready, in this section we use them to improve some of the important pages in the Q&A application.
The backend page Users.aspx
The backend page Users.aspx displays the info of all the registered users. As pointed out previously, the cache class for user info is the UserInfoCache class. Now, let's look into the new Page_Load event handler for this page.
The logic above is easy to follow. After creating an instance of the class UserInfoCache, we use userInfoCache.Data to obtain the data from the cache, and then set it as the data source of the GridView control.
Note that userInfoCache.Data returns all the user info from the cache because we use the cache-all policy for the UserInfoCache class. Also note that the old ctx.QAUserInfo is lazily loaded while userInfoCache.Data is not. Since reading from the cache is very fast, we do not need lazy loading for it.
With the cache policy in place, the code no longer depends on the QAEntities class at all, thanks to better encapsulation of database access. In fact, all the database related operations are now performed inside the cache class itself.
The backend page Catalogs.aspx
Now that we've created the cache class CatalogCache, we no longer need to fetch the catalog data directly from the database. With the help of the method BindToDdlCatalogs defined in the class Common, we can obtain the data from the cache instead. Note that the method BindToDdlCatalogs relies on the cache-all policy; we omit the complete code since it was briefly introduced in the previous article in this series.
Now, in the code-behind of the page Catalogs.aspx, we can rewrite the related code, as given below.
Since the code above is simple to understand, we will not go into detail.
The foreground page Question.aspx
The question detail page and the search result page both use the cache class QuestionAndAnswersCache. Now, let's first look at the page Question.aspx (which belongs to the front end).
The main function here is to filter out question IDs that do not meet the requirements. In particular, although there is data in the cache under ID 0, that special cache item records the global keywords info, so we must block queries for it.
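The filtering rule described above can be sketched as a small helper. This is only an illustration under stated assumptions: the class and method names are hypothetical, and treating negative values as invalid is an assumption of this sketch (the article only states that ID 0 is reserved).

```csharp
using System;

public static class QuestionIdHelper
{
    // Returns true only for IDs that may be looked up in the question cache.
    public static bool TryGetQuestionId(string rawId, out int id)
    {
        // Reject missing or non-numeric values.
        if (!int.TryParse(rawId, out id))
        {
            id = 0;
            return false;
        }

        // ID 0 is reserved for the global keywords item; negative IDs
        // are treated as invalid in this sketch.
        if (id <= 0)
        {
            id = 0;
            return false;
        }

        return true;
    }
}
```

In a page, the raw value would typically come from the query string, and a false result would end the request or redirect to an error page.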
Next, the helper method LoadInfo() is also updated (the bold part), as follows.
Note that the method GetQuestionAndAnswersItem gets the question and its answers from the cache. If the data is not in the cache, it retrieves the data from the database and updates the cache accordingly. Note also that inside the method GetQuestionAndAnswersItem we've added a check for the "Solved" state, so even if "Unsolved" info is put in the cache, it will not be found through the "Search" function in the page.
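The lookup-then-load behavior described above can be sketched as follows. This is not the article's QuestionAndAnswersCache: the dictionary stands in for the real cache item, the loader delegate stands in for the database query, the QuestionAndAnswers class is reduced to two illustrative properties, and expiration and the "Solved" check are left out.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the article's entity class.
public class QuestionAndAnswers
{
    public int QuestionId { get; set; }
    public string Title { get; set; }
}

public class QuestionCacheSketch
{
    private readonly Dictionary<int, QuestionAndAnswers> _items =
        new Dictionary<int, QuestionAndAnswers>();
    private readonly Func<int, QuestionAndAnswers> _loadFromDatabase;
    private readonly object _sync = new object();

    public QuestionCacheSketch(Func<int, QuestionAndAnswers> loadFromDatabase)
    {
        if (loadFromDatabase == null) throw new ArgumentNullException("loadFromDatabase");
        _loadFromDatabase = loadFromDatabase;
    }

    public QuestionAndAnswers GetQuestionAndAnswersItem(int id)
    {
        lock (_sync)
        {
            QuestionAndAnswers item;
            if (_items.TryGetValue(id, out item))
            {
                return item; // cache hit: no database access
            }

            // Cache miss: query the database and remember the result.
            item = _loadFromDatabase(id);
            if (item != null)
            {
                _items[id] = item;
            }
            return item;
        }
    }
}
```

Only the questions that users actually open end up in memory, which is the point of the partial caching policy chosen for QAQuestion and QAAnswer.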
There is another way to improve efficiency further, which readers can explore on their own: establish a dedicated cache item that stores the data for the question detail page. You can even create relationships between the two cache items.
Finally, the following code:
simply ends the execution. In practice, however, it is better to redirect to a friendlier page that shows the related error info.
Redesign the Master Page and Searching Result Page
When designing the class QuestionAndAnswersCache, we deliberately provided support for a list of multiple keywords, and accordingly developed the method SearchQuestionByKeyWord(List<string> keywords). This means that the search result page needs to recognize the multiple keywords passed to it and support highlighting all of them.
Unlike the method HighlightKeyword, the static method HighlightKeywords highlights multiple keywords. As with HighlightKeyword, HighlightKeywords is an extension method on the type string. Extension methods are a feature introduced in C# 3.0; we will not delve into them here.
In the above method HighlightKeywords, some points are worth noticing. First, the parameter keywords is a single string made up of multiple keywords joined end to end, with "+" as the separator. Note that the plus separator should be used consistently throughout the application, whether in the class QuestionAndAnswersCache, the search result page, or the stored procedures in the database. Second, keywords is first split into a string array and then joined back into a string using String.Join, rather than converted directly with keywords.Replace("+", "|"). The reason is that empty strings can be excluded more easily via the StringSplitOptions.RemoveEmptyEntries parameter of the method String.Split.
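The split-then-join approach described above might be sketched like this. It is an illustration rather than the article's listing: the containing class name, the span markup used for highlighting, the case-insensitive matching, and the use of Regex.Escape are assumptions of this sketch.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class HighlightExtensions
{
    // Wraps every occurrence of any "+"-separated keyword in a highlight span.
    public static string HighlightKeywords(this string text, string keywords)
    {
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(keywords))
        {
            return text;
        }

        // Split on "+" and drop empty entries such as those produced by "a++b".
        string[] parts = keywords.Split(new[] { '+' }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length == 0)
        {
            return text;
        }

        // Join into a regex alternation; escaping keeps characters such as
        // "." or "(" in a keyword from being read as regex syntax.
        string pattern = string.Join("|", parts.Select(p => Regex.Escape(p)).ToArray());

        return Regex.Replace(
            text,
            pattern,
            m => "<span class=\"highlight\">" + m.Value + "</span>",
            RegexOptions.IgnoreCase);
    }
}
```

For example, "How to cache pages".HighlightKeywords("cache++pages") wraps both "cache" and "pages", and the doubled separator produces no empty alternative in the pattern.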
A new searching bar
To allow a fair comparison between searching the database directly and searching the cache, we deliberately create a new search result page. Accordingly, we also set up a new search bar for the cache search and for clearing the cache, to make testing easier.
Now, let's look again at the related markup code in the master page Site.Master.
The functions of the above three controls are easy to guess: txtKeyword2 is used to enter a keyword string made up of multiple keywords; Button2 submits the keywords; Button3 clears the cache item associated with the QuestionAndAnswersCache class.
Next, let's look at the click event handlers of the two buttons Button2 and Button3.
As you've seen, because the key logic is encapsulated in other pages or in the QuestionAndAnswersCache class, the above code is not difficult to understand.
A new searching result page
SearchResult2.aspx is the new search page for the cache search. Since the markup code for the page SearchResult2.aspx is lengthy, we do not list it here. However, there are two important differences between the page SearchResult2.aspx and the page SearchResult.aspx:
- We added the code Search Time: <asp:Label ID="lblSearchCost" runat="server"></asp:Label> milliseconds, with which we can easily compare the efficiency of the two search methods.
- The old method HighlightKeyword is replaced by the new one, HighlightKeywords.
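One way to produce the value shown in the lblSearchCost label is to time the search call with a Stopwatch. The helper below is a sketch under stated assumptions: the class and method names are hypothetical, and the article does not show how it measures the search time.

```csharp
using System;
using System.Diagnostics;

public static class SearchTimer
{
    // Runs the given search and reports how long it took, in milliseconds.
    public static T Measure<T>(Func<T> search, out long elapsedMilliseconds)
    {
        if (search == null) throw new ArgumentNullException("search");

        Stopwatch watch = Stopwatch.StartNew();
        T result = search();
        watch.Stop();

        elapsedMilliseconds = watch.ElapsedMilliseconds;
        return result;
    }
}
```

In the code-behind of the result page, the search call would be wrapped in SearchTimer.Measure and the elapsed value assigned to lblSearchCost.Text, so the same helper can time both the database search and the cache search.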
The last question is: how efficient is the cache search compared with the direct database search? At first glance this may seem like a secret yet to be unlocked. However, with everything explained above (and in the previous articles), you can easily find the answer yourselves.
Summary
Well, after so long a journey we can finally rest a little! This series aimed to bring you at least three things: search engine optimization knowledge, a LINQ to Entities based internal search engine, and a universal cache module. Because of my limited knowledge and other reasons, there may be many gaps in the contents of these articles; readers are sincerely welcome to leave comments and corrections.
About Xianzhong Zhu
I'm a college teacher and also a freelance developer and writer from WeiFang China, with more than fourteen years of experience in design, and development of various kinds of products and applications on Windows platform. My expertise is in Visual C++/Basic/C#, SQL Server 2000/2005/2008, PHP+MyS...
This author has published 81 articles on DotNetSlackers.